Arduino Multiprocessor SAMD board

I'm not sure if this will be of interest to anyone, but I did a small project over the holidays to build a small 16/8 CPU SAMD21-based, Arduino-interfaced board. The goal was to build something that would allow for experimentation with interconnect architectures used in supercomputing. I designed the PCB pretty quickly in DipTrace, sent it off to JLCPCB, and a week later the boards were delivered. To my surprise, they worked great after assembly!

I made a board and platform configuration that works inside the Arduino IDE and can flash code to all of the processors in one button press, which is convenient. I have a basic message passing system that takes advantage of the interconnects to allow data transfer and a simple form of barrier synchronization.
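The thread does not say how the barrier is implemented. As one possible illustration only, here is a host-side simulation of a dimension-exchange barrier, assuming the 16 CPUs are numbered 0 to 15 and wired as the 4D hypercube described in the next paragraph. The function name `runBarrier` and the bitmask bookkeeping are hypothetical and not taken from the project code.

```cpp
#include <array>
#include <cstdint>

// Assumption: 16 nodes, IDs 0..15, where link d joins nodes whose IDs
// differ only in bit d. Illustrative numbering, not the board's wiring.
constexpr int kDimensions = 4;
constexpr int kNodes = 1 << kDimensions;

// Simulates a dimension-exchange barrier on the host.
// known[n] is a bitmask of the nodes that node n knows have arrived.
// In round d every node swaps its mask with its neighbor across dimension d.
std::array<uint16_t, kNodes> runBarrier(int rounds = kDimensions) {
    std::array<uint16_t, kNodes> known{};
    for (int n = 0; n < kNodes; ++n) {
        known[n] = static_cast<uint16_t>(1u << n);  // each node knows only itself
    }
    for (int dim = 0; dim < rounds; ++dim) {
        std::array<uint16_t, kNodes> next = known;
        for (int n = 0; n < kNodes; ++n) {
            int partner = n ^ (1 << dim);  // neighbor across this dimension
            next[n] = static_cast<uint16_t>(known[n] | known[partner]);
        }
        known = next;
    }
    return known;
}
```

After four rounds every mask has all 16 bits set, so each node can leave the barrier having exchanged only four messages. On the real hardware each swap would be a send and receive on one of the four node-to-node serial links.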

I used the SAMD21G18s, which have 6 serial ports (SERCOMs). That makes it possible to use 4 of the serial ports for node-to-node communication, one for a connection to the supervisor CPU, and one for external data display. With 16 CPUs and 4 interconnects per node it is easy to build a 4D hypercube interconnect, which is really easy to navigate: every node is at most 4 hops from any other.
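To show why a hypercube is easy to navigate, here is a minimal routing sketch. It assumes the nodes are numbered 0 to 15 and that link d of a node connects to the node whose ID differs only in bit d; that numbering and the function names are assumptions for illustration, not the actual board layout or code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed numbering: IDs 0..15, link d flips bit d of the node ID.
constexpr int kDimensions = 4;
constexpr int kNodes = 1 << kDimensions;

// Neighbor reached through link `dim` (0..3).
inline uint8_t neighbor(uint8_t node, int dim) {
    assert(node < kNodes && dim >= 0 && dim < kDimensions);
    return static_cast<uint8_t>(node ^ (1u << dim));
}

// Link to take for the next hop from `here` toward `dest`,
// or -1 if the message has already arrived.
inline int nextLink(uint8_t here, uint8_t dest) {
    assert(here < kNodes && dest < kNodes);
    uint8_t diff = here ^ dest;  // set bits mark dimensions still to cross
    for (int dim = 0; dim < kDimensions; ++dim) {
        if (diff & (1u << dim)) return dim;
    }
    return -1;
}

// Full path from src to dest, including both endpoints.
inline std::vector<uint8_t> route(uint8_t src, uint8_t dest) {
    std::vector<uint8_t> path{src};
    uint8_t here = src;
    for (int link = nextLink(here, dest); link >= 0; link = nextLink(here, dest)) {
        here = neighbor(here, link);
        path.push_back(here);
    }
    return path;
}
```

Each hop fixes one differing bit, so the hop count equals the number of bits in which the two IDs differ, and a node only needs its own ID and the destination ID to pick an outgoing port.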

A few pictures attached, plus I did a small video:

I'm happy to put the code, schematics, etc. on GitHub if anyone is interested. It was a pretty fun holiday project, and I enjoyed learning a bit about the board configuration, bootloader operation, and flashing tools for the SAMD Arm Arduino boards. I have not seen any other 8 or 16 CPU Arduino projects, although I'm sure there are some out there. Cheers! -Jeff

Hi Jeff,

Nice project. Thanks for sharing and for the accompanying video.

I'm intrigued by how you implemented the bootloader process for each CPU with a single Arduino IDE upload. Is the code uploaded to the 4 CPUs over their serial port, rather than native USB?

Linking each CPU to adjacent processors with 4 serial ports reminds me of the Transputer parallel computing system, back in the 1980s.

MartinL:
I'm intrigued by how you implemented the bootloader process for each CPU with a single Arduino IDE upload. Is the code uploaded to the 4 CPUs over their serial port, rather than native USB?

Linking each CPU to adjacent processors with 4 serial ports reminds me of the Transputer parallel computing system, back in the 1980s.

I tried to do it with the minimal amount of code changes. The supervisor processor (the 5th SAMD on each board) has a USB connection that goes to the upload host (which is running the Arduino software). That supervisor is connected to each of the main CPUs over serial, and it is wired such that on the main CPU side the connection to the supervisor comes in on PA10/PA11. That means I can run a standard bootloader (which on the SAMD can listen on both USB and serial) on the main CPUs, and simply copy traffic from the USB port to the serial port for the particular processor that is getting flashed.

There are a few other minor tweaks to make this work: I needed to turn off the 1200bps USB port-open reset (a simple flag in boards.txt) so the supervisor doesn't get flashed instead. Since there is more than one destination, I made a simple 2 byte protocol as well as an 'escape sequence' that can be sent over the USBSerial connection to change which CPU the traffic is routed to. There is also a control line from the supervisor to the reset line of each main CPU.
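The exact bytes of the protocol are not given in the thread, so the following is only a sketch of how a supervisor-side router of this kind could look. It assumes a hypothetical escape byte 0x1B followed by a one-byte CPU index, with a doubled escape byte standing for a literal 0x1B; the class name `UploadRouter` and these values are invented for illustration.

```cpp
#include <cstdint>

constexpr uint8_t kEscape = 0x1B;   // hypothetical escape byte
constexpr uint8_t kNumTargets = 4;  // main CPUs per board

// Tracks which main CPU the incoming USB byte stream is routed to.
class UploadRouter {
public:
    // Feed one byte received from the USB side.
    // Returns true when `out` holds a byte to forward to target().
    bool feed(uint8_t in, uint8_t& out) {
        if (escaped_) {
            escaped_ = false;
            if (in == kEscape) {  // escaped escape: pass one literal byte
                out = in;
                return true;
            }
            if (in < kNumTargets) target_ = in;  // select a new destination
            return false;                        // command bytes are consumed
        }
        if (in == kEscape) {
            escaped_ = true;
            return false;
        }
        out = in;  // ordinary payload byte
        return true;
    }

    uint8_t target() const { return target_; }

private:
    uint8_t target_ = 0;
    bool escaped_ = false;
};
```

On the supervisor, a loop would read each byte from the USB serial port, pass it to `feed`, and write any forwarded byte to the hardware serial port wired to the CPU given by `target()`. A real implementation would also need to handle the return path and make sure the escape sequence cannot be confused with binary upload data.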

The bootloaders only required a single change, somewhat unrelated, which was to support an external 32.768 kHz oscillator instead of a crystal.

The upload tool for SAMD (bossac) did need an additional flag added to force XMODEM even when going over USB, as that is what the bootloader expects when coming in over serial. (I suspect I could have changed the bootloader instead.)

The final step was a small script that is specified in the platform.txt file (tools.zzz) that sends the escape sequence, reset sequence, and connection sequence to the supervisor CPU before running bossac for each upload.

I did debate having the supervisor cache the entire upload flash image and do the flashing directly, but that would really add a lot of complexity.

Cross-board flashing works in a similar way using the supervisor ring bus (serial ports).

Indeed, it is just like the Transputer of the 80s! A simple message passing architecture with an interconnect on top. Kinda cool old school.

Jeff

Hi Jeff,

Thank you for taking the time to explain how your multi-processor bootloader works, much appreciated.

Your video made the upload look so seamless, but I guessed that there had to be some elegant code choreography going on behind the scenes with the 5th supervisor processor.

Interesting to see that it's possible to tweak the Arduino system to allow for multi-processor architectures.

Absolutely. When I started I wasn't sure how much additional code I would need to write, but the way the board and platform tags work, it is possible to do much of it with things out of the box.

Long term, if I get this better developed, I could see about merging back in the few tweaks needed to do this so others could leverage it. While a 16 CPU setup is a bit large, a two-processor Arduino could have many uses.
