Use Portenta H7 as peripheral (slave) SPI device

Short version: could anyone point me to some applicable resources/examples for how to use SPI on an Arduino Portenta H7 in peripheral (slave) mode? I found useful examples for setting this up in controller mode (for instance, here: Minimal example of SPI for Portenta H7), but I am having a hard time adapting the non-Portenta Arduino code I've found for setting up a peripheral SPI device to the Portenta case.

That being said, the reason I want to do this is to transfer data (float values) between two Portentas as quickly as possible. If someone has a suggestion that would be faster (or as fast and easier to implement), please let me know.

Some background: I'm fairly new to Arduino programming. I am working on a project in which I gather a small set of analog inputs and perform some moderately complex computations on each of them to detect an event, which then triggers a TTL pulse. A key part of this project is that the time between input and output be kept minimal (I'm aiming at ~3000 Hz). Given that a lot of the computation can be performed in parallel across the different input streams, I thought I could have individual Portentas getting the input, performing the pre-processing, and then sending that output to a 'controller' Portenta which makes a decision based on their inputs. I tried I2C, which does work but is much slower than I'd like and therefore takes a big bite out of the per-cycle time left to do the actual computations.

Thanks!

Am I missing something? Would it not be easier to do the input and number crunching on the M7 core and the output display handling on the M4? To me this would resolve the latency issue between your input results and the output display.

There's no doubt that using both cores would help, but I don't think it would solve the problem. The calculations I'm doing (on the M7 core) already take up the majority of the 333 microseconds (corresponding to 3 kHz) available per cycle for just one input channel. I have to process at least a second, and perhaps a third, channel as well. Therefore, the only solution that I can think of is to distribute this computation over several Arduinos and quickly feed the results to a 'monitoring' Arduino that integrates the data produced by the several 'input' Arduinos. That requires fast data transfer.
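To make the budget arithmetic concrete, here is a rough sketch in plain C++ (it compiles on a PC as well as in an Arduino sketch). The function names are made up for illustration, and any compute or transfer times passed in are placeholders, not measurements.

```cpp
#include <cassert>

// Length of one processing cycle in microseconds for a given cycle rate.
// At 3000 Hz this is 1e6 / 3000, i.e. about 333 us.
double cyclePeriodUs(double rateHz) {
    assert(rateHz > 0.0);
    return 1.0e6 / rateHz;
}

// Time left in a cycle after the per-channel computation and the data
// transfer have run. A negative result means the cycle is over budget.
// computeUs and transferUs are caller-supplied estimates (assumptions).
double remainingUs(double rateHz, double computeUs, double transferUs) {
    return cyclePeriodUs(rateHz) - computeUs - transferUs;
}
```

For example, calling remainingUs(3000.0, computeUs, transferUs) with measured values for one channel shows directly how much of the ~333 µs is still free.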

That being said, over the weekend I started using 11 digital connections (8 signal, 2 'byte address', and 1 interrupt) to 'bit bang' a solution to this. It works well enough: I can send 1 byte in about 6 microseconds. I was just hoping that a hardware-based SPI solution would be faster (and also not take up as many of the already limited pins on the Portenta).
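For anyone trying the same thing, this is roughly how a float can be split into 4 bytes (which is presumably what the 2 'byte address' lines select) and reassembled on the other side. It is a minimal sketch in plain C++, assuming a 32-bit float and the same byte order on both boards, which should hold between two Portentas. The helper names are hypothetical, and the pin I/O itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Copy the raw bytes of a float into a 4-byte buffer for byte-wise sending.
// memcpy avoids the undefined behaviour of casting float* to uint8_t* math.
void packFloat(float value, std::uint8_t out[4]) {
    assert(out != nullptr);
    std::memcpy(out, &value, sizeof(float));
}

// Rebuild the float from 4 received bytes (same byte order as the sender).
float unpackFloat(const std::uint8_t in[4]) {
    assert(in != nullptr);
    float value;
    std::memcpy(&value, in, sizeof(float));
    return value;
}

// Estimated link time for nFloats values at a given per-byte cost.
double transferTimeUs(std::size_t nFloats, double usPerByte) {
    return static_cast<double>(nFloats * sizeof(float)) * usPerByte;
}
```

At about 6 microseconds per byte, transferTimeUs(1, 6.0) gives 24 microseconds per float, which is a noticeable slice of a 333 microsecond cycle once several values are sent.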

That’s some heavy-duty processing. Thanks for the update. I hate to say it, but maybe you should be looking for a different board. A different brand offers a board with a Cortex-M7 running at a nominal 600 MHz that, with proper heat sinking, can be clocked to a little over 1 GHz. It might help.