Topic: Measuring Throughput, Data Transfer Rate, and Memory Bandwidth on SRAM (Read 2631 times)

Lucario448

Quote
If I move the line of code that sets the SRAM to sequential mode so that it runs before the micros() call, I achieve 92833 bytes/second! That is a drastic change.
Wait what? Do you mean that the SRAM skips the address bytes? Or do you mean measuring after setting up the address?


Quote
I've been trying to measure the transfer rate of the microSD card on its own, and I could only get up to ~3500 bytes/second. I guess this is good enough considering that I'm using the SD library.
Or the overhead is huge.

According to a test I made long ago, write speed came out somewhere around 53 KB/s (with that same library, of course). I filled the file with a predefined 512-byte string, multiple times; not with print(), but with write(). I also declared that array as const and PROGMEM to save RAM; however, performance wasn't hurt because fetching data from the internal flash memory is as fast as fetching from RAM.


Quote
Originally, I tried to concatenate the bytes into a string, but I ran into a RAM issue that kept the file from opening, so I figured I would do everything after opening the file.
Most likely that's the way to go. Allocating 1 KB of memory on a system that only has 2 KB is sometimes too much to ask.


------------------------------------------------------------------------
 file = SD.open("test5.csv", O_CREAT | O_APPEND | O_WRITE);

 while(count < 5){

 for (byte i = 0; i < 200; i++) {
   file.print(i);
   file.println(",");
 }
 count++;
 }
------------------------------------------------------------------------
I wouldn't be surprised if the resulting speed isn't close to what I measured before; not because of the obvious overhead, but because dealing with textual data is always slower than dealing with binary ("raw") data.
This is true when the string has to be created on the fly, or when parsing text (numbers to variables, tokenizing CSVs); otherwise, printing an already built string (i.e. a char array) is just a straight binary write.

raygun3000

Quote
Wait what? Do you mean that the SRAM skips the address bytes? Or do you mean measuring after setting up the address?
Sorry for the confusion! I mean measuring after setting up the address, i.e. just before I start issuing the stream of bytes into the SRAM. Essentially, I have a microcontroller with an 8 MHz clock, which allows at most 4 Mbit/s or 500 KB/s over SPI, so at ~93 KB/s I'm still getting somewhat less than 1/4 of the theoretical rate.
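
As a rough check of the figures above, here is a small sketch in plain C++ (not an Arduino sketch). It assumes the SPI clock runs at half the CPU clock, which is what the 8 MHz to 4 Mbit/s step implies, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on the SPI payload rate in bytes/second.
// Assumption: the SPI clock is half the CPU clock, and every byte costs 8 clocks.
uint32_t spiMaxBytesPerSecond(uint32_t cpuHz) {
  return cpuHz / 2 / 8;
}

// Measured rate as a whole-number percentage of the theoretical maximum.
uint32_t percentOfMax(uint32_t measured, uint32_t maxRate) {
  assert(maxRate != 0); // guard against division by zero
  return static_cast<uint32_t>(static_cast<uint64_t>(measured) * 100 / maxRate);
}
```

With an 8 MHz clock the bound is 500000 bytes/second, and the measured 92833 bytes/second comes to about 18% of it.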


Quote
Or the overhead is huge.

According to a test I made long ago, write speed came out somewhere around 53 KB/s (with that same library, of course). I filled the file with a predefined 512-byte string, multiple times; not with print(), but with write(). I also declared that array as const and PROGMEM to save RAM; however, performance wasn't hurt because fetching data from the internal flash memory is as fast as fetching from RAM.
Yeah, that was my issue: using print statements and eating up RAM. That does sound better than what I had originally. If I used the PROGMEM idea (using flash memory) and wanted each byte separated by a comma, would this be a reasonable approach? I know the write() function can also take a buffer and a length, but I also want "," to separate the values in a .csv file. Maybe I could treat the predefined buffer as a string first and convert it back to bytes later?

------------------------------------------------
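// PROGMEM data lives in flash and is fixed at compile time, so it cannot be
// filled in a loop at run time; the values must go in the initializer.
const PROGMEM byte buffer[5000] = {};

file = SD.open("test5.csv", O_CREAT | O_APPEND | O_WRITE);

// The index must be wider than byte: a byte wraps at 255 and never reaches 5000.
for (unsigned int j = 0; j < 5000; j++) {
  file.write(pgm_read_byte(&buffer[j])); // pgm_read_byte() takes the element's address
  file.write(",");
  file.println();
}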

------------------------------------------------

Lucario448

Quote
Essentially, I have a microcontroller with an 8 MHz clock, which allows at most 4 Mbit/s or 500 KB/s over SPI, so at ~93 KB/s I'm still getting somewhat less than 1/4 of the theoretical rate.
And this means the "avoidable" overhead actually isn't the culprit.

And that's not bad: 125 KB/s (a quarter of 500 KB/s) is more or less the most I would expect for SD card read speed.

Looks like DMA (which an Arduino lacks) really would make it faster without overclocking.


Quote
I know the write() function can also take a buffer and a length, but I also want "," to separate the values in a .csv file. Maybe I could treat the predefined buffer as a string first and convert it back to bytes later?
If you want, but be aware that the printing (binary to text) process might become a bottleneck in some cases.

To minimize this effect, you can print directly into a buffer rather than the stream itself:

Code:
sprintf(yourBuffer, "%d, ", value); // for all values but the last

sprintf(yourBuffer, "%d\r\n", value); // for the last value only, it also creates a new line

Although I'm not sure it makes any difference, since both sprintf() and File.print() write to a buffer anyway.
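
A minimal sketch of that idea in plain C++, assuming the values are ints and that the whole line fits in a caller-supplied buffer. It uses snprintf() instead of sprintf() so the buffer cannot overflow, and it tracks the write offset so each value lands after the previous one. The name formatCsvLine is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Formats `count` values as one CSV line ("v0, v1, ..., vN\r\n") into `out`.
// Returns the number of characters written (not counting the final '\0'),
// or 0 if there are no values or the line does not fit in `outSize` bytes.
std::size_t formatCsvLine(const int *values, std::size_t count,
                          char *out, std::size_t outSize) {
  assert(values != nullptr && out != nullptr);
  std::size_t used = 0;
  for (std::size_t i = 0; i < count; i++) {
    const bool last = (i + 1 == count);
    // ", " after every value but the last; "\r\n" after the last one.
    const int n = std::snprintf(out + used, outSize - used,
                                last ? "%d\r\n" : "%d, ", values[i]);
    if (n < 0 || static_cast<std::size_t>(n) >= outSize - used) {
      return 0; // formatting error or buffer too small
    }
    used += static_cast<std::size_t>(n);
  }
  return used;
}
```

On the Arduino side, the returned length could then be passed together with the buffer to a single file.write(buffer, length) call. Whether that beats repeated print() calls would have to be measured.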




PS: I'm not sure how exactly you tested the SD card; even on an 8 MHz system I would expect something close to 30 KB/s. My test was made at 16 MHz, so no wonder mine gave more.

raygun3000

Quote
PS: I'm not sure how exactly you tested the SD card; even on an 8 MHz system I would expect something close to 30 KB/s. My test was made at 16 MHz, so no wonder mine gave more.
As mentioned earlier, I initially got 3600 bytes/second by simply doing this:
------------------------------------------------------------------------
 file = SD.open("test5.csv", O_CREAT | O_APPEND | O_WRITE);

 while(count < 5){

 for (byte i = 0; i < 200; i++) {
   file.print(i);
   file.println(",");
 }
 count++;
 }
------------------------------------------------------------------------
Just having this while loop plus the for loop would slow it down. I haven't actually tested the PROGMEM approach yet, but surely it will improve performance!

Quote
sprintf(yourBuffer, "%d, ", value); // for all values but the last

sprintf(yourBuffer, "%d\r\n", value); // for the last value only, it also creates a new line
Thanks for the awesome tips! I'll take them into consideration.

Lucario448

Quote
As mentioned earlier, I initially got 3600 bytes/second by simply doing this:
------------------------------------------------------------------------
 file = SD.open("test5.csv", O_CREAT | O_APPEND | O_WRITE);

 while(count < 5){

 for (byte i = 0; i < 200; i++) {
   file.print(i);
   file.println(",");
 }
 count++;
 }
------------------------------------------------------------------------
I still believe something is wrong. I was expecting a test like this:

Code:
File f = SD.open("test5.csv", O_CREAT | O_TRUNC | O_WRITE);

if (f) {
unsigned long pTime = micros(); // Start counting from here

for (byte count = 5; count; count--) {

 for (byte i = 0; i < 200; i++) {
   f.println(i);
 }
}
f.flush();

unsigned long cTime = micros(); // Stop counting from here
unsigned long fileSize = f.position();
f.close();

Serial.print("Write speed: ");
Serial.print((fileSize * 1000000UL) / (cTime-pTime));
Serial.println(" Bytes/second.");

} else Serial.println("Nope, file didn't open.");

Try this and tell me the result.

raygun3000

I'm getting around 2700 bytes/second now. I did notice a rare, odd fluctuation where it would sometimes jump to 5000-7000 bytes/second, and then the next run would be back to ~2700 bytes/second.


Lucario448

Quote
I'm getting around 2700 bytes/second now.
Well, I'm still surprised by the result.

Looks like I have to try it myself too, to see who's wrong. My system runs at 16 MHz, though, so I should expect more or less double the data rate (perhaps?).


Quote
I did notice a rare, odd fluctuation where it would sometimes jump to 5000-7000 bytes/second, and then the next run would be back to ~2700 bytes/second.
I do know that micros() has a resolution of 4 microseconds (8 on 8 MHz systems); that is the only source of inconsistency I know of, but I never thought it would lead to such variation.

I don't think the file size should vary between runs, nor the duration of the erase/program cycle of the card's flash (NAND) memory.

Lucario448

Quote
Looks like I have to try it myself too, to see who's wrong.
And I did, so here's what happened:


The first runs yielded 53-54 bytes/second, so... what? That's definitely not right.
First I tried increasing the SPI clock frequency to the maximum (yes, the SD library initializes at "half speed" by default), and the result went up to 64 bytes/second. Really?

Not convinced yet, I modified the code a bit. This time the file size is obtained in a way I trust more, and both it and the elapsed time are now printed.
After getting those values separately, I did the calculation on my computer's calculator; and guess what... 1862.86 bytes/second.

This is getting even more confusing: my result is worse than yours (considering I'm running at 16 MHz).
Binary writing isn't much better either: 2224.52 bytes/second.


So... what's going on? Was I exaggerating? Am I wrong? Well, it looks like SD card writing on Arduino is actually slower than I thought.

Does the module itself matter? Because that's another thing I've changed since that long-ago test. Back then I used the one with the standard-size card slot (no level shifter); now I'm using the one with the micro-size slot (with level shifter).
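
One possible explanation for the gap between the on-board figure (64 bytes/second) and the calculator figure (1862.86 bytes/second): on AVR-based boards unsigned long is 32 bits wide, so fileSize * 1000000UL wraps around once fileSize exceeds 4294 bytes. The sketch below is plain C++ (not an Arduino sketch) that reproduces the arithmetic under two assumptions: the file holds 4450 bytes (what println(i) emits for i = 0..199, five times, with "\r\n" line endings), and the elapsed time was about 2388800 microseconds (back-computed from the 1862.86 bytes/second figure, not a measured value). The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bytes produced by println(i) for i = 0..199, repeated `passes` times:
// the decimal digits of i plus "\r\n".
uint32_t csvFileSize(uint32_t passes) {
  uint32_t bytes = 0;
  for (uint32_t i = 0; i < 200; i++) {
    const uint32_t digits = (i < 10) ? 1 : (i < 100) ? 2 : 3;
    bytes += digits + 2;
  }
  return bytes * passes;
}

// Same formula as in the test sketch, done in 32-bit arithmetic:
// the product wraps modulo 2^32 before the division.
uint32_t rateWrapped(uint32_t bytes, uint32_t elapsedMicros) {
  assert(elapsedMicros != 0);
  const uint32_t scaled = bytes * UINT32_C(1000000); // may wrap around
  return scaled / elapsedMicros;
}

// Same formula with a 64-bit intermediate, which cannot wrap here.
uint32_t rateExact(uint32_t bytes, uint32_t elapsedMicros) {
  assert(elapsedMicros != 0);
  return static_cast<uint32_t>(static_cast<uint64_t>(bytes) * 1000000ULL / elapsedMicros);
}
```

Under those assumptions the wrapped version gives 64 and the 64-bit version gives 1862, which lines up with both reported numbers. Printing the file size and elapsed time separately, or doing the division in floating point, avoids the wraparound on the board.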




Used file print test:
Code:
#include <SD.h>

void setup() {
  Serial.begin(9600);
  
  if (!SD.begin(20000000UL, 10)) { // SCK frequency in Hz (auto-constrained according to hardware capability), CS pin
    Serial.println("SD fail!");
    return;
  }

  File f = SD.open("test5.csv", O_CREAT | O_TRUNC | O_WRITE);

  if (f) {
    unsigned long pTime = micros(); // Start counting from here

    for (byte count = 5; count; count--) {

      for (byte i = 0; i < 200; i++) {
        f.println(i);
      }
    }
    f.close();
    unsigned long cTime = micros(); // Stop counting from here
    f = SD.open("test5.csv", O_READ);
    unsigned long fileSize = f.size();
    f.close();

    Serial.print("File size: ");
    Serial.print(fileSize);
    Serial.println(" bytes.");

    Serial.print("Time elapsed: ");
    Serial.print(cTime - pTime);
    Serial.println(" microseconds.");
    
    Serial.print("Write speed: ");
    Serial.print((fileSize * 1000000UL) / (cTime - pTime));
    Serial.println(" Bytes/second.");

  } else Serial.println("Nope, file didn't open.");

}

void loop() {}



Can't post the binary write version due to character count limit.

Lucario448

Update: the madness doesn't stop here. I've found the code of that "test I made long ago" (naming, comments and messages are in Spanish because I speak that language too; it should compile after a single copy-paste, but watch the assigned CS pin):

Code:
#include <SD.h>
const char texto[] = "Hola mundo de Arduino, aqui probando la velocidad de mi SD....\r\n";
const byte longitud = 64; // Longitud del texto, calculada manualmente (text length, counted by hand)
const unsigned int repeticiones = 65534U;
File archivo;

void setup() {
  Serial.begin(9600);
  if (!SD.begin(20000000UL, 10)) {
    Serial.println(F("Fallo en la SD..."));
    while (1);
  }
  if (SD.exists("Prueba.txt")) {
    SD.remove("Prueba.txt");
  }
  archivo = SD.open("Prueba.txt", FILE_WRITE);
  if (!archivo) {
    Serial.println(F("Fallo en el archivo..."));
    while (1);
  }
  Serial.println(F("Iniciando prueba de velocidad con \"miPrint\"..."));
  unsigned long tAnt = millis();
  for (unsigned int r = 0; r < repeticiones; r++) {
    miPrint(texto);
  }
  archivo.close(); // En teoría, el archivo resultante debe tener un tamaño de 4194176 bytes. (In theory, the resulting file should be 4194176 bytes in size.)
  unsigned long tPos = millis();
  Serial.print(F("Archivo terminado en "));
  Serial.print(tPos - tAnt);
  Serial.println(F(" milisegundos."));
  Serial.println();

  SD.remove("Prueba.txt");
  archivo = SD.open("Prueba.txt", FILE_WRITE);
  if (!archivo) {
    Serial.println(F("Fallo en el archivo..."));
    while (1);
  }
  Serial.println(F("Iniciando prueba de velocidad con \"print\"..."));
  tAnt = millis();
  for (unsigned int r = 0; r < repeticiones; r++) {
    archivo.print(texto);
  }
  archivo.close(); // En teoría, el archivo resultante debe tener un tamaño de 4194176 bytes. (In theory, the resulting file should be 4194176 bytes in size.)
  tPos = millis();
  Serial.print(F("Archivo terminado en "));
  Serial.print(tPos - tAnt);
  Serial.println(F(" milisegundos."));
  Serial.println();
  Serial.println(F("Eso es todo amigos..."));
}

void loop() {
  // Nada por aquí (nothing here)

}

void miPrint(const char *str) {
  // unsigned int longitud = strlen(str);
  archivo.write(str, longitud);
  // return longitud;
}

This sketch measures time in milliseconds, and it gives me... [drum roll]


93823.14 bytes/second (4194176 bytes in 44703 milliseconds) in the first run of the code (using write()). Now I remember why I claimed such a high write speed.
The second run (which uses print()) yields 87045 bytes/second (4194176 bytes in 48184 milliseconds), so even the tiniest overhead does add up in the long run.

Best of all: this time I used the module with level shifter and a micro-sized slot.


Now seriously, WHAT THE HELL IS GOING ON!!!??? THIS IS CRAZY!!!
Or maybe not. My only guesses are: average performance increases with the size of the file being written, or there's a time drift between micros() and millis().





PS: in case you didn't notice:
Code:
SD.begin(20000000UL, CS);
This is how you force the library to operate at "full speed".

raygun3000

Quote
93823.14 bytes/second (4194176 bytes in 44703 milliseconds) in the first run of the code (using write()). Now I remember why I claimed such a high write speed.
The second run (which uses print()) yields 87045 bytes/second (4194176 bytes in 48184 milliseconds), so even the tiniest overhead does add up in the long run.
Wow, that's awesome! I'll go ahead and test using write(). Did you ever get a chance to use PROGMEM to store bytes in a buffer before writing to the SD card? I'm not sure how it determines the free slots in flash memory so as not to overwrite data that is already there.

Quote
SD.begin(20000000UL, CS); This is how you force the library to operate at "full speed".
Are there any references I could look at about this?

Lucario448

Quote
Did you ever get a chance to use PROGMEM to store bytes in a buffer before writing to the SD card?
I haven't tested it yet, but I will... in the "miraculously" faster test.



Quote
I'm not sure how it determines the free slots in flash memory so as not to overwrite data that is already there.
It will always replace old data; that's what re-programming does. It doesn't interfere with the actual code because the compiler and linker know where data and code should be placed (without clashing).

In fact, PROGMEM or not, arrays prefilled with data have to be stored in program memory anyway; if not, how would the MCU remember them after a power cycle?
What's special about the keyword is that the array is never loaded into RAM (at least not automatically), which is why it can be larger than 2 KB (up to 32767 bytes, just under 32 KB, even on an Arduino Mega).




Quote
Are there any references I could look at about this?
Only by checking the source code, located at [IDE installation path]\Arduino\libraries\SD\src, and looking at these files:

  • SD.h (function declaration).
  • SD.cpp (function implementation).
  • Sd2Card.cpp (to see how the clock frequency is changed).

The official reference documentation doesn't mention this at all.

Lucario448

Quote
I haven't tested it yet, but I will... in the "miraculously" faster test.
And here we are once again.



Now, long story short: I've modified the previous sketch to work with a PROGMEM array (it can be reverted to the original just by commenting out a single #define statement):
Code:
#include <SD.h>

#define USE_FLASH

#ifdef USE_FLASH
const char texto[] PROGMEM = "Hola mundo de Arduino, aqui probando la velocidad de mi SD....\r\n";
#else
const char texto[] = "Hola mundo de Arduino, aqui probando la velocidad de mi SD....\r\n";
#endif
const byte longitud = 64; // Longitud del texto, calculada manualmente (text length, counted by hand)
const unsigned int repeticiones = 65534U;
File archivo;

void setup() {
  Serial.begin(9600);
  if (!SD.begin(20000000UL, 10)) {
    Serial.println(F("Fallo en la SD..."));
    while (1);
  }
  if (SD.exists("Prueba.txt")) {
    SD.remove("Prueba.txt");
  }
  archivo = SD.open("Prueba.txt", FILE_WRITE);
  if (!archivo) {
    Serial.println(F("Fallo en el archivo..."));
    while (1);
  }
#ifdef USE_FLASH
  Serial.println(F("Ejecutanto version con texto en memoria flash."));
#endif
  Serial.println(F("Iniciando prueba de velocidad con \"miPrint\"..."));
  unsigned long tAnt = millis();
  for (unsigned int r = 0; r < repeticiones; r++) {
    miPrint(texto);
  }
  archivo.close(); // En teoría, el archivo resultante debe tener un tamaño de 4194176 bytes. (In theory, the resulting file should be 4194176 bytes in size.)
  unsigned long tPos = millis();
  Serial.print(F("Archivo terminado en "));
  Serial.print(tPos - tAnt);
  Serial.println(F(" milisegundos."));
  Serial.println();

  SD.remove("Prueba.txt");
  archivo = SD.open("Prueba.txt", FILE_WRITE);
  if (!archivo) {
    Serial.println(F("Fallo en el archivo..."));
    while (1);
  }
  Serial.println(F("Iniciando prueba de velocidad con \"print\"..."));
  tAnt = millis();
  for (unsigned int r = 0; r < repeticiones; r++) {
#ifdef USE_FLASH
    archivo.print((__FlashStringHelper*)texto);
#else
    archivo.print(texto);
#endif
  }
  archivo.close(); // En teoría, el archivo resultante debe tener un tamaño de 4194176 bytes. (In theory, the resulting file should be 4194176 bytes in size.)
  tPos = millis();
  Serial.print(F("Archivo terminado en "));
  Serial.print(tPos - tAnt);
  Serial.println(F(" milisegundos."));
  Serial.println();
  Serial.println(F("Eso es todo amigos..."));
}

void loop() {
  // Nada por aquí (nothing here)

}

void miPrint(const char *str) {
#ifdef USE_FLASH
  for (byte i = 0; i < longitud; i++)
  archivo.write(pgm_read_byte(str + i));
#else
  archivo.write(str, longitud);
#endif
}

And these are the results:

"miPrint" version: 20323.18 bytes/second (4194176 bytes in 206374 milliseconds).
"Regular print" version (behaving as printing an F() string): 20082.05 bytes/second (4194176 bytes in 208852 milliseconds).

Still faster than our first test attempts, but this suggests that, for some reason, fetching data from flash is not as straightforward as fetching instructions. Or maybe the single-byte version of write() is not as efficient as the multi-byte (buffered) version, despite already having a cache (another kind of buffer).
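
One way to separate those two guesses would be to copy the flash data into a small RAM buffer in chunks and hand each chunk to the multi-byte write(). Below is a minimal sketch of that chunking logic in plain C++, untested against the SD library. CountingSink is a hypothetical stand-in for the file, the 32-byte chunk size is arbitrary, and on an AVR board the memcpy() would be replaced by memcpy_P() from avr/pgmspace.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a file: it only counts write calls and bytes.
struct CountingSink {
  std::size_t calls = 0;
  std::size_t bytes = 0;
  void write(const uint8_t *data, std::size_t n) {
    assert(data != nullptr);
    calls++;
    bytes += n;
  }
};

// Copies `len` bytes from `src` to `sink` in chunks of up to 32 bytes,
// so the sink sees a few multi-byte writes instead of `len` single-byte ones.
template <typename Sink>
void writeChunked(const uint8_t *src, std::size_t len, Sink &sink) {
  uint8_t chunk[32];
  while (len > 0) {
    const std::size_t n = (len < sizeof(chunk)) ? len : sizeof(chunk);
    std::memcpy(chunk, src, n); // memcpy_P(chunk, src, n) for PROGMEM data on AVR
    sink.write(chunk, n);
    src += n;
    len -= n;
  }
}
```

For the 64-byte test string this makes two write calls per repetition instead of 64. If the speed then approaches the RAM version, the per-call overhead of the single-byte write() would be the more likely cause.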

raygun3000

Quote
Still faster than our first test attempts, but this suggests that, for some reason, fetching data from flash is not as straightforward as fetching instructions. Or maybe the single-byte version of write() is not as efficient as the multi-byte (buffered) version, despite already having a cache (another kind of buffer).
For sure better than the two-digit data transfer rate, lol. Thanks for your input!