Portenta H7 fast ADC sampling but too slow data transmission to serial monitor

Long time listener, first time caller. I am by no means an expert with microcontrollers. I've done a lot of googling to try to solve this issue but alas, I suspect the solution may be something that I don't know enough to look for.

Description of my setup and what I'm trying to do: I have a function generator (used for debugging; it will eventually be replaced with our actual detector) connected to the Portenta's A0 pin. The goal is a 100 kHz sampling rate of 16-bit ADC values. Currently, the code samples data into a 1000-sample buffer, prints each data point as it is sampled, and then repeats the whole process until a counter hits its maximum value and data collection stops.

I previously tried printing the buffer only after it filled up but the time required to print resulted in huge awkward gaps between data sets. I want to be able to collect ~1 minute of data, but more is better. Since we need a 100 kHz sampling rate, one issue is that I can't simply create a buffer long enough to hold 1 minute of 100 kHz data.

Now, we've reached our desired sampling rate of 100 kHz by following this very helpful thread: "Portenta H7 ADC DMA first steps". My code is taken directly from that thread, though I removed or modified a few sections to suit my project.

I measured the sampling speed using Serial.print and micros(): I recorded micros() before sampling, subtracted it from micros() after sampling, and divided the elapsed time by the number of data points (1000). This gives a little over 2 us per data point.

I am currently printing data points as they are collected. This slows our sampling considerably, to ~10 kHz, because each Serial.print takes around 100 us. I have also tried writing to an SD card, but that takes anywhere from 100-200 us per data point. Writing bytes with Serial.write and receiving them with the Processing app takes 50 us per Serial.write. However, I need to send 3 separate Serial.write messages: a start byte, the lowByte of our 16-bit reading, and the highByte. This gives a total of ~150 us to write one reading in binary.

By my math, a baud rate of 2 million and a sampling rate of 100 kHz give a transfer budget of 20 bits per sample. Each ADC reading is between 0 and 65535, so that's 2 bytes per number in binary, or up to 5 characters plus a line ending when printed as text. One question: although I set the baud rate to 2 million, is it actually operating at 2 million?
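The per-sample budget can be checked with a few lines of compile-time arithmetic. This is only a sketch: it assumes a real UART with 8N1 framing (10 bits on the wire per byte), and the function and constant names are made up for illustration. On the Portenta's USB serial port the configured baud rate may not apply at all.

```cpp
#include <cstdint>

// Bytes per second a UART can carry, ASSUMING 8N1 framing:
// 1 start bit + 8 data bits + 1 stop bit = 10 bits per byte.
constexpr std::uint32_t uartBytesPerSecond(std::uint32_t baud) {
  return baud / 10;
}

// How many bytes may be spent on each sample at a given sample rate.
constexpr std::uint32_t byteBudgetPerSample(std::uint32_t baud,
                                            std::uint32_t sampleRateHz) {
  return uartBytesPerSecond(baud) / sampleRateHz;
}

// Worst-case length of a 16-bit value sent with Serial.println():
// up to 5 digits ("65535") plus the "\r\n" line ending.
constexpr std::uint32_t kAsciiBytesPerSample = 5 + 2;

// Length of the same value sent as raw binary.
constexpr std::uint32_t kBinaryBytesPerSample = sizeof(std::uint16_t);

// At 2 Mbaud and 100 kHz the budget is exactly 2 bytes per sample.
static_assert(byteBudgetPerSample(2000000, 100000) == 2, "budget per sample");
static_assert(kBinaryBytesPerSample <= byteBudgetPerSample(2000000, 100000),
              "raw binary just fits");
static_assert(kAsciiBytesPerSample > byteBudgetPerSample(2000000, 100000),
              "ASCII text does not fit");
```

Under these assumptions raw binary fits with no headroom (an extra start byte per reading would already exceed the budget), while text output needs roughly 3.5 times the available bandwidth.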

I would love to be able to utilize this beautifully fast ADC sampling rate but the speed at which I can send that reading via Serial.print, etc, is a huge bottleneck.

And, of course, here is the entirety of my code:

/*
  Based off this forum thread: https://forum.arduino.cc/t/portenta-h7-adc-dma-first-steps/931669/12
  For my setup, we are reading the A0 pin
*/

#include "main.h"
#include <SD.h>

ADC_HandleTypeDef hadc1;
TIM_HandleTypeDef htim16;

void SystemClock_Config(void);
static void MX_ADC1_Init(void);
static void MX_TIM16_Init(void);


const int   TxPin        = 0;               // transmitter pin
const int   ledPin       = LED_BUILTIN;     // pin to use for the LED


uint16_t    raw = 0;
const uint32_t    data_len = 1000;
uint16_t    data[data_len];
uint16_t    data_bool[data_len];
uint8_t     adc_delay = 2;


File myFile;

void setup() {

  HAL_Init();
  //  SystemClock_Config();
  MX_ADC1_Init();
  MX_TIM16_Init();
  MX_GPIO_Init();


  HAL_TIM_Base_Start(&htim16);



  pinMode(ledPin, OUTPUT);

  Serial.begin(2000000);
  while (!Serial);

  //   Initialize SD card and open file

  //  Serial.print("Initializing SD card...");
  //  pinMode(7, OUTPUT);
  //  if (!SD.begin(7)) {
  //    Serial.println("initialization failed!");
  //    return;
  //  }
  //  Serial.println("initialization done.");
  //  myFile = SD.open("testfile1.txt", FILE_WRITE);


}


//Since we can only hold so much data in data[] at once, run through the void loop a certain number of times
//During each cycle, send collected ADC values to SD card/print to serial monitor/whatever you're doing

int counter = 0;
int mycount = 100;


void loop() {

  counter = counter + 1;

  while (raw < 1000) {                                        // stay in here until threshold is reached
    HAL_ADC_Start(&hadc1);                                    // Start ADC conversion
    HAL_ADC_PollForConversion(&hadc1, HAL_MAX_DELAY);         // Wait until conversion is finished
    raw = HAL_ADC_GetValue(&hadc1);                           // Get ADC value

  }
  raw = 0;


  for (uint32_t i = 0; i < data_len; i++) {               // uint32_t matches data_len (avoids signed/unsigned compare)

    //    Serial.println("reset timer 16 to 0");
    __HAL_TIM_SET_COUNTER(&htim16, 0);                     // Reset Timer 16 to 0
    HAL_GPIO_WritePin(GPIOH, GPIO_PIN_15, GPIO_PIN_SET);   // Set PH15 to HIGH
    HAL_ADC_Start(&hadc1);                                 // Start ADC conversion
    HAL_ADC_PollForConversion(&hadc1, HAL_MAX_DELAY);      // Wait until conversion is finished
    data[i] = HAL_ADC_GetValue(&hadc1);                    // Get ADC value


    Serial.println(data[i]);
    //    Serial.println(micros()); // use to test the time needed to print to serial monitor


    while (__HAL_TIM_GET_COUNTER(&htim16) < adc_delay) {}      // wait until counter has reached adc_delay counts
    HAL_GPIO_WritePin(GPIOH, GPIO_PIN_15, GPIO_PIN_RESET); // Reset PH15 to LOW

  }


  //If code has run for a certain number of cycles==mycount, stop collecting data

  if (counter > mycount)
  {
    //      myFile.close();
    digitalWrite(LEDG, LOW);
    delay(500);
    digitalWrite(LEDG, HIGH);
    Serial.println("done recording");
    exit(0);
  }

}



//void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim16){
//  HAL_GPIO_TogglePin(GPIOH, GPIO_PIN_15);
//  }

// -------------------------------------------------------------------------------
// -------------------------------------------------------------------------------
static void MX_TIM16_Init(void)
{


  // TIM_ClockConfigTypeDef sClockSourceConfig;
  // TIM_MasterConfigTypeDef sMasterConfig;

  __HAL_RCC_TIM16_CLK_ENABLE();                       // required: enables the TIM16 peripheral clock


  htim16.Instance = TIM16;
  htim16.Init.Prescaler = 200 - 1;                    // Prescale the system frequency of the processor -> SystemCoreClock is 200 MHz (probably)... at least with 200-1 the timings are correct
  htim16.Init.CounterMode = TIM_COUNTERMODE_UP;
  htim16.Init.Period = 65536 - 1;                     // count to maximum (16 bit timer)
  htim16.Init.ClockDivision = 0;
  // htim16.Init.ClockDivision = TIM_CLOCKDIVISION_DIV5;
  htim16.Init.RepetitionCounter = 0;
  htim16.Init.AutoReloadPreload = TIM_AUTORELOAD_PRELOAD_ENABLE;



  if (HAL_TIM_Base_Init(&htim16) != HAL_OK)
  {
    Error_Handler();
  }


}
// -------------------------------------------------------------------------------
// -------------------------------------------------------------------------------
static void MX_ADC1_Init(void)
{

  /* USER CODE BEGIN ADC1_Init 0 */

  /* USER CODE END ADC1_Init 0 */

  ADC_MultiModeTypeDef multimode = {0};
  ADC_ChannelConfTypeDef sConfig = {0};

  /* USER CODE BEGIN ADC1_Init 1 */

  /* USER CODE END ADC1_Init 1 */
  /** Common config
  */
  hadc1.Instance = ADC1;
  //  hadc1.Init.ClockPrescaler = ADC_CLOCK_ASYNC_DIV1;
  hadc1.Init.ClockPrescaler = ADC_CLOCK_SYNC_PCLK_DIV1;
  hadc1.Init.Resolution = ADC_RESOLUTION_16B;
  hadc1.Init.ScanConvMode = ADC_SCAN_DISABLE;
  hadc1.Init.EOCSelection = ADC_EOC_SINGLE_CONV;
  hadc1.Init.LowPowerAutoWait = DISABLE;
  hadc1.Init.ContinuousConvMode = DISABLE;
  hadc1.Init.NbrOfConversion = 1;
  hadc1.Init.DiscontinuousConvMode = DISABLE;
  hadc1.Init.ExternalTrigConv = ADC_SOFTWARE_START;
  hadc1.Init.ExternalTrigConvEdge = ADC_EXTERNALTRIGCONVEDGE_NONE;
  hadc1.Init.ConversionDataManagement = ADC_CONVERSIONDATA_DR;
  hadc1.Init.Overrun = ADC_OVR_DATA_OVERWRITTEN;
  hadc1.Init.LeftBitShift = ADC_LEFTBITSHIFT_NONE;
  hadc1.Init.OversamplingMode = DISABLE;
  if (HAL_ADC_Init(&hadc1) != HAL_OK)
  {
    Error_Handler();
  }
  /** Configure the ADC multi-mode
  */
  multimode.Mode = ADC_MODE_INDEPENDENT;
  if (HAL_ADCEx_MultiModeConfigChannel(&hadc1, &multimode) != HAL_OK)
  {
    Error_Handler();
  }
  /** Configure Regular Channel
  */
  sConfig.Channel = ADC_CHANNEL_0;
  sConfig.Rank = ADC_REGULAR_RANK_1;
  sConfig.SamplingTime = ADC_SAMPLETIME_1CYCLE_5;
  sConfig.SingleDiff = ADC_SINGLE_ENDED;
  sConfig.OffsetNumber = ADC_OFFSET_NONE;
  sConfig.Offset = 0;
  sConfig.OffsetSignedSaturation = DISABLE;
  if (HAL_ADC_ConfigChannel(&hadc1, &sConfig) != HAL_OK)
  {
    Error_Handler();
  }
  /* USER CODE BEGIN ADC1_Init 2 */

  /* USER CODE END ADC1_Init 2 */

}
// -------------------------------------------------------------------------------
// -------------------------------------------------------------------------------

void MX_GPIO_Init()
{
  /* GPIO Ports Clock Enable */

  __HAL_RCC_GPIOH_CLK_ENABLE();

  GPIO_InitTypeDef GPIO_InitStruct = {0};  // zero-initialize so no field is left undefined

  /*Configure GPIO pin : PH15 */
  GPIO_InitStruct.Pin = GPIO_PIN_15;       // Pin15
  GPIO_InitStruct.Mode = GPIO_MODE_OUTPUT_PP; // digital Output, push-pull configuration
  GPIO_InitStruct.Pull = GPIO_NOPULL;      // no pull-up/pull-down on a push-pull output
  GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
  HAL_GPIO_Init(GPIOH, &GPIO_InitStruct);  // GPIO-H
}
// -------------------------------------------------------------------------------
// -------------------------------------------------------------------------------

void Error_Handler(void)
{
  /* USER CODE BEGIN Error_Handler_Debug */
  /* User can add his own implementation to report the HAL error return state */
  // __disable_irq();
  while (1)
  {
    digitalWrite(LEDR, LOW);
    delay(500);
    digitalWrite(LEDR, HIGH);
    delay(500);                // hold the second state too, otherwise the blink is not visible
  }
  /* USER CODE END Error_Handler_Debug */
}

Thank you very much for your help.

How about using a server over WiFi? The Murata module supports 5 GHz, if I remember what I read correctly, so the serial link would no longer limit your ADC speed. Another solution: use the M4 core to acquire the data and the M7 to transfer it. That way your ADC still samples at the given frequency, though you will still be limited in the amount of data you can send through serial on the M7. However, if you don't need to analyze the data on your computer and it is just for display purposes, sending every sample is pointless, considering how little of it the eye can resolve.

My comments:
First of all: the UART on the Portenta H7 (Arduino library code) is really very slow!
At 1843200 baud, I see the characters trickle in very slowly (certainly no faster than sending at 4800 baud).

I assume the UART runs in software polling mode (not interrupt mode, not DMA mode). On other STM MCUs at 1843200 baud in interrupt mode, output appears in an immediate splash (I'd guess 10x faster). But the Portenta H7 UART sends every single character with huge gaps in between, regardless of the baud rate.

OK, there is the suggestion to use the Murata WiFi/BT module (with its 5 GHz RF frequency). But have you realized that the WiFi chip also gets the data to send/receive via a UART (UART7)? Or maybe via SDIO (SDC1).
I would NOT expect WiFi transmission to be any faster than the regular UART. If it is slow via UART over the USB cable, assume the same speed over the WiFi connection.

The only way for me to make it fast (I also need very fast data transmission to a host): use ETH (wired network, LwIP) and UDP over ETH. This is reasonably fast for me.

Using the M4: why? It runs even slower than the CM7. And running both cores in parallel requires splitting the code into two flash memory regions (linker script) so that the code fetch of one core does not stall the other core.

My impression is this:

  • UART is really slow (certainly much slower than it could be)
  • the WiFi/BT chip cannot really be faster (it also uses UART or SDIO mode, plus overhead), and potentially uses the same slow UART implementation

The only fast interfaces to a host, for me:

  • use SPI (but tricky: the Portenta H7 would be an SPI slave)
  • use wired ETH - this is reasonably fast

Sometimes I think: do it in "offline mode", e.g. record to the SD card and read from it afterwards.
The SD card uses a 4-bit SDIO mode (not so bad), but it goes through the FatFS file system. The overhead can also be pretty "remarkable".

But the biggest "problem" with the SD card:
how to read the SD card content on a host PC?

  • I have not seen, and could not figure out, how to set up a USB-based memory device on the Portenta, so that plugging in USB makes an external memory device pop up on the PC and I can copy and paste files (it does not seem to be possible).
  • The only option: implement TFTP (e.g. via wired ETH) and read/write files on the SD card with a TFTP client tool. This works.

But recording data to the SD card can be as slow as sending via UART.
Or: enable the SDRAM and record to SDRAM. This can be the fastest way to store results. Just think about how to transfer the results in SDRAM to a host afterwards (in "offline mode", e.g. via UART).

Anyway, once the results arrive faster than you can transfer them to a host in "real time", you have to think about storing them locally on the MCU and transferring them later, at a slower speed, in "offline mode".
The implication is that the amount of data is limited; it is not endless real-time recording (for hours). But it has the charm that you can record and store a limited amount of data in true real time on the local MCU.
How to bring high-speed data to a host is simply a "system design" question when all the pipes from MCU to host are slower than the process generating the data. That cannot be solved directly (a bottleneck somewhere cannot be overcome).

Suggestion:
Try to use SDRAM and local recording, for a limited time. If it works, think afterwards about how to get the data out of local storage to the host. If that is not possible in real time, think about "offline mode" and handling the data in bursts. If there is still a bottleneck on the way to the host, ask: "what could you already process on the local MCU?" (the ARM CM7, with ARM DSP, can be really powerful).
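To judge how long "a limited time" is, the recording length that fits in SDRAM can be estimated up front. The sketch below assumes 8 MB of SDRAM (the figure for the standard Portenta H7; check your board variant) and that the whole memory is free for samples, which will not quite be true in practice. All names are illustrative.

```cpp
#include <cstdint>

// ASSUMPTION: 8 MB of external SDRAM, all of it available for samples.
constexpr std::uint64_t kSdramBytes = 8ull * 1024 * 1024;

constexpr std::uint64_t kSampleRateHz = 100000;  // 100 kHz, from the question
constexpr std::uint64_t kBytesPerSample = 2;     // one 16-bit ADC reading

// Storage needed to record for a given number of seconds.
constexpr std::uint64_t bytesNeeded(std::uint64_t seconds) {
  return seconds * kSampleRateHz * kBytesPerSample;
}

// Whole seconds of recording that fit into a given capacity.
constexpr std::uint64_t maxWholeSeconds(std::uint64_t capacityBytes) {
  return capacityBytes / (kSampleRateHz * kBytesPerSample);
}

// One minute of data is 12 MB, which exceeds the assumed 8 MB.
static_assert(bytesNeeded(60) == 12000000, "one minute of samples");
static_assert(bytesNeeded(60) > kSdramBytes, "one minute does not fit");

// About 41 seconds fit under these assumptions.
static_assert(maxWholeSeconds(kSdramBytes) == 41, "recording time that fits");
```

So under these assumptions a full minute needs either a second burst or some on-board reduction of the data.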

Thanks for the instructive answer.
Do you think it is doable to use SPI on the Portenta H7 with cables such as:
https://ftdichip.com/product-category/products/cables/usb-mpsse-spi-i2c-jtag-master-cable-series/
and then communicate data between the board and the computer? Or am I just wasting my time, and should I use SPI directly between the Portenta H7 and an external WiFi module (or an ESP)?

What I see: this cable is actually an active SPI cable. It is a master device, driven from a PC via USB, based on an FTDI chip.

It can work, but only if you configure the SPI on your Portenta H7 as a SLAVE!
Then the P7 could receive and respond to master SPI commands from this cable. The P7 is then an SPI slave connected to a PC. But I assume that is not your intention.

No idea what your intention is:
if you want to connect an external chip/device via SPI to the P7, the P7 needs to be the SPI master (and this cable is not needed, not helpful, and will not work).

Yes, if this WiFi module (or an ESP) is an SPI slave: configure the P7 as SPI master and use a direct cable (not too long). The FTDI cable would take the place of the P7, letting you talk from the PC to this external module.
The cable is out of the picture if you want to use the P7.

So in the end, is there really no convincing way (in terms of speed) to communicate directly with a computer from the Portenta? That is kind of disappointing, to be honest.
I will test wired ETH.

Not really true: there are several options, but some require hardware or specific code:

  • UART - OK, it is there. Because it runs over USB, it can potentially be pretty fast (a baud rate up to 10 Mbps might be possible; do not assume only standard baud rates work. I use 1,843,200 baud, and it could surely work up to 8 Mbps; do not assume 9600 is the limit)
  • ETH: the better option for me anyway: more speed, more flexible, more user friendly and easier to use with host scripts (e.g. Python)
  • potentially, you could also implement "your own USB device": audio transfer to a USB host, or any proprietary USB-to-host transfer, should be possible (you just need to know how to set up such a USB device)
  • you can also try to connect PC with MCU via SPI (potentially up to 100 Mbps with short cables)

The options for "connectivity" are often limited by the PC (host), not the MCU. What your PC provides is the determining factor, not the MCU (imagine you could read directly from the MCU's 16/32-bit SRAM interface with a PC, or your MCU behaved like an SDIO card ...).

Based on my experience, the interfaces available on the PC (laptop, host) are the limiting factor. And nowadays your MCU has to follow, e.g. by acting as a USB device or a network server.

Suggestion: try to use UART via USB (which is there) and figure out the max. throughput. The beauty: the baud rate, whether 9600 or 115200, does not matter (anymore). As long as the MCU can handle the fastest rate (USB Full Speed is 12 Mbps, so maybe 8 Mbps is achievable), it might work already.

And when I need UART but fast transfer, I change to BINARY mode (my own protocol that sends data as raw bytes, e.g. with a command and a length field per packet). I do not need to convert to ASCII (and convert back on the receiver/host side). This BINARY mode already increases the throughput by a factor of 2. Be creative in thinking about how to transfer data fast enough for your needs (and, as a compromise, how much HW and SW effort to invest).
There is always a solution (e.g. in the end you could connect a fiber optic chip to the MCU and send data via dark fiber to the PC/host, though MCU core performance might be the limit in the end).
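A binary packet with a command and a length field could look roughly like the sketch below. The frame layout, the 0xA5 sync byte and the name buildFrame are all hypothetical choices for illustration, not the protocol actually used in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// HYPOTHETICAL frame layout (not taken from the thread):
//   byte 0    : sync marker 0xA5
//   byte 1    : command / packet type
//   byte 2..3 : payload length in bytes, little-endian
//   byte 4..  : payload, 16-bit samples, little-endian
constexpr std::uint8_t kSync = 0xA5;
constexpr std::size_t kHeaderSize = 4;

// Packs sampleCount samples into out and returns the total frame size.
std::size_t buildFrame(std::uint8_t command,
                       const std::uint16_t* samples, std::size_t sampleCount,
                       std::uint8_t* out, std::size_t outCapacity) {
  const std::size_t payloadBytes = sampleCount * 2;
  assert(payloadBytes <= 0xFFFF);                     // must fit the 16-bit length field
  assert(outCapacity >= kHeaderSize + payloadBytes);  // caller provides enough room

  out[0] = kSync;
  out[1] = command;
  out[2] = static_cast<std::uint8_t>(payloadBytes & 0xFF);
  out[3] = static_cast<std::uint8_t>(payloadBytes >> 8);
  for (std::size_t i = 0; i < sampleCount; ++i) {
    out[kHeaderSize + 2 * i]     = static_cast<std::uint8_t>(samples[i] & 0xFF);
    out[kHeaderSize + 2 * i + 1] = static_cast<std::uint8_t>(samples[i] >> 8);
  }
  return kHeaderSize + payloadBytes;
}
```

On the Arduino side the finished frame would go out in a single call, e.g. Serial.write(buf, n), and the host script would look for the sync byte, read the length, and then read that many payload bytes.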

you can also try to connect PC with MCU via SPI (potentially up to 100 Mbps with short cables)

I have looked it up a little and I found this component :

Yet I am reluctant to buy it since, if I understand correctly, the communication speed remains limited by the speed of the RS232 communication on the computer side? Furthermore, the datasheet only talks about "Hi-Speed" communication, which is not a precise figure. What component would you use if you had to do it? I get that any MCU with an SPI interface can act as a go-between (PC and Portenta H7), but I'll be at a loss if that MCU has a baud rate slower than or equal to the Portenta H7's.

EDIT: That component is based on the FT232H chip

I found more information on speed inside, but what I have yet to figure out is whether I can use it as a slave to the Portenta H7.

Yes, you could try to connect via such a "bridge" (device).
BTW:
On another board, a NUCLEO-H743ZI, I have used "USB Audio" for fast transfer to a host PC.

The requirements were:

  • no driver and no extra hardware needed on the PC: use what Windows (or macOS) already provides. "USB Audio" Class 1, as used for sound cards, is supported without any special driver. No need for my clients to install any third-party driver or software.

  • USB Audio Class 1 can transmit up to 960 bytes every 1 ms. That is a throughput of 7.68 Mbps. Plenty for most purposes.

Instead of real audio, the USB Audio packets are filled with my digital data. A Python script can open and listen on a USB Audio device (like a sound card) and get the data (OK, there are some traps, like avoiding an audio mixer in between and getting a noise-free, i.e. bit-error-free, audio transfer, which is possible when choosing the kernel driver for this sound device).

I am thinking of trying to implement a USB Audio device from the P7 to a host PC, behaving like a digital audio input. It "just" needs a USB Audio device set up on the P7. Ideally, I merge the code from another STM32H7 board into the Portenta H7 project.
If this works on the P7 too (it is fine on NUCLEO boards), I can fill in my high-speed data and bring it to the host PC.

Why not use the wired ETH network? This should be the fastest option. (Not WiFi: I think it uses a slow serial interface inside the P7, so throughput might be very low.) Just note: ETH needs the main board (but it works fine; I use it in my current project).

Or: the Serial device in the P7 FW is just so slow. No idea why (it is not a baud rate issue); I guess it sends every byte in SW polling mode, with huge gaps between bytes.
If we had a real USB VCP interface (and it actually should be one), or set up our own USB VCP for UART, it could transmit up to 6..8 Mbps (on USB VCP the baud rate does not matter, only whether it is Full Speed USB).

An SPI connection between the PC and the P7 (via this adapter cable) might be possible, but it needs additional HW on the PC, drivers on the PC, and an SPI slave set up on the P7. This could be pretty fast, assume 40 Mbps with short cables, but it takes some work (on both sides).

But an SPI slave on the P7, although certainly possible, is a bit tricky: the P7 could never send data on its own initiative. The host PC has to poll and query via SPI whether the P7 has data (the PC is the SPI master). That makes the FW design a bit complicated in terms of the "protocol" to implement.
And you also need SW on the PC, e.g. Python scripts, the drivers for this HW...

BTW:
when you want to achieve 100 kHz sampling with 16-bit values (one channel), it needs just 1.6 Mbps. This should still be possible with a simple UART interface (but in BINARY mode).
I run my UART at a 1.8432 Mbps baud rate. (Why the hell does the Arduino library not use DMA or interrupts for UART transfer?)

This "USB Audio" interface with 7.68 Mbps has another beauty: it uses isochronous transfers with predictable, constant timing and no jitter: every 1 ms you get data. The latency between the P7 and the host PC would be very small (1 ms), constant and predictable.
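Putting the figures from this post side by side shows how much headroom such a link would leave. The sketch only restates the numbers quoted above (960 bytes per 1 ms packet, 100 kHz, 16-bit samples); the constant and function names are made up.

```cpp
#include <cstdint>

constexpr std::uint32_t kSampleRateHz = 100000;    // 100 kHz, one channel
constexpr std::uint32_t kBytesPerSample = 2;       // 16-bit values
constexpr std::uint32_t kPacketsPerSecond = 1000;  // one packet every 1 ms
constexpr std::uint32_t kMaxPacketBytes = 960;     // USB Audio figure quoted above

// Raw data rate the ADC stream produces.
constexpr std::uint32_t requiredBitsPerSecond() {
  return kSampleRateHz * kBytesPerSample * 8;
}

// Rate the link offers when every 1 ms packet is filled completely.
constexpr std::uint32_t linkBitsPerSecond() {
  return kMaxPacketBytes * 8 * kPacketsPerSecond;
}

// Bytes that must go into each 1 ms packet to keep up with the ADC.
constexpr std::uint32_t bytesPerPacketNeeded() {
  return kSampleRateHz / kPacketsPerSecond * kBytesPerSample;
}

static_assert(requiredBitsPerSecond() == 1600000, "1.6 Mbps needed");
static_assert(linkBitsPerSecond() == 7680000, "7.68 Mbps available");
static_assert(bytesPerPacketNeeded() == 200, "100 samples per packet");
static_assert(bytesPerPacketNeeded() <= kMaxPacketBytes, "fits in one packet");
```

Each packet would carry 100 samples in 200 of the 960 available bytes, so the stream uses only about a fifth of the link.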

BTW: I've found why my UART was so slow: I was sending single characters. This is really bad.
If I send strings via Serial.write(str, length); it is really fast now.

And, as mentioned: the baudrate on USB UART does not matter:

  • you can configure any baudrate, e.g. also: Serial.begin(5000000); The baudrate value does not have any effect!

  • you can also open the UART on PC, e.g. in TeraTerm, with any baudrate: also here the baudrate does not matter, has no effect.

The USB VCP (often also called USB CDC) has a beauty: there is no real UART device involved. It is a USB protocol (sending in packets of up to 64 bytes). The speed comes purely from USB, e.g. 12 Mbps for Full Speed.
So you can achieve 6 Mbps and more via UART if you make sure to send in larger chunks. 64 bytes per USB packet is possible, but I see in "Serial.h":

const size_t WRITE_BUFF_SZ = 32;

I assume the Arduino/mbed library sends chunks via USB with at most 32 bytes, even though 64 would be possible.
So if you make sure to collect values into chunks and send these chunks (e.g. 32 bytes each), the UART becomes really fast.
Never send single bytes (it slows things down a lot, independent of the baud rate).
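Collecting values into chunks before sending could be sketched like this. The class name ChunkedSender and the sink callback are hypothetical; on the Portenta the sink would simply wrap Serial.write(data, len). The chunk size of 32 follows the WRITE_BUFF_SZ value quoted above, and whether it is the best size is something to measure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Function that actually transmits a chunk.
// On the Portenta this would wrap Serial.write(data, len).
using ChunkSink = void (*)(const std::uint8_t* data, std::size_t len);

class ChunkedSender {
 public:
  // ASSUMPTION: 32 bytes, matching the WRITE_BUFF_SZ value quoted above.
  static constexpr std::size_t kChunkBytes = 32;

  explicit ChunkedSender(ChunkSink sink) : sink_(sink) {
    assert(sink != nullptr);
  }

  // Append one 16-bit sample (little-endian); transmit once the chunk is full.
  void push(std::uint16_t sample) {
    buf_[used_++] = static_cast<std::uint8_t>(sample & 0xFF);
    buf_[used_++] = static_cast<std::uint8_t>(sample >> 8);
    if (used_ == kChunkBytes) flush();
  }

  // Transmit whatever is buffered, e.g. at the end of a recording.
  void flush() {
    if (used_ > 0) {
      sink_(buf_, used_);
      used_ = 0;
    }
  }

 private:
  ChunkSink sink_;
  std::uint8_t buf_[kChunkBytes];
  std::size_t used_ = 0;
};
```

With this, 16 samples go out in one write call instead of 32 single-byte writes. The sampling loop would call push() for every reading and flush() once at the end.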

As a separate topic: Portenta H7: UART as a fast interface to a host
