Fast data logging on SD Card

Hi everyone,
My project involves logging data, between 50 to 70 bytes, at 50hz. I currently using an Arduino Nano and the standard sd card shield. I based the code from :

I’m logging data pretty fast and with no error. In the example code attached, I’m able to store 33kbytes (66bytes, 500 times) in 216 milliseconds. I’m happy with this logging speed.

My issue:
Although fast and reliable, the data logging is not predictable. There are some occasionnal peek in logging time (average is about 2-3 milliseconds) at about 10-15 milliseconds, which I do not like since it might slow down the rest of the code.

I’ve read that by storing the data in a buffer, it might help to smooth out those peeks.

  • Isn’t the SdFat library already doing it ?
  • Could it help if I would do it too ? if so are there any beginnner friendly topics explaining it ? (I’ve never done it)
  • Are there any other solution to make it more reliable ?
  • I’ve ordered a Teensy3.5, would the higher clock speed (120MHz instead of 16MHz for the Nano) help logging the data even faster ?

Any help is truly appreciated.
Thank you

Example.ino (3.24 KB)

SdFat has a 512 byte buffer and all your write go there until you either flush or close (see this in the configuration file) the file or of course if you reach the buffer capacity and writing 512 bytes in one go takes a bit of time... adding your own buffer would just make matter worse. (On arm processors there is a second 512 byte cache to improve performance for large writes that are not a multiple of 512 bytes).

Did you take a peek at the AvrAdcLogger example ?

How many decimal places to you need in the floats that you are saving ?

You might look at the LowLatencyLogger example in the SdFat library. It adopts a completely different approach to fast logging, and ends up writing directly to the flash sectors without updating the file system until the end. And it uses multiple buffers. The idea with the buffers is that with two of them you don't miss any readings or slow down so long as the write time for one buffer is less than the time it takes to fill up the other buffer.

I don't think a faster microcontroller will help much. The delay is in writing to the card, and in the operations the card's controller goes through to carry out the writes.

Hi,

Thanks for all your kind responses.

@ J-M-L
I did took a look at AvrAdcLogger but it’s way above what I understand or what I’m currenty able to understand.

@ UKHeliBob
About 1 or 2 decimal points. I could multiple the floating variables by 10 or 100 and save it as an int, it would be faster and use less space. I used floats just to try out.

@ ShermanP.
Same as for J-M-L, I did try to understand it, but it’s to complicated for me unfortunately. There are too many concepts I’ve never seen before.

Aren’t there any beginner friendly solution which do not involve 500 line long code ?

In any case thanks for help.

I could multiple the floating variables by 10 or 100 and save it as an int, it would be faster and use less space. I used floats just to try out.

That is what I had in mind

RedIron:
Aren’t there any beginner friendly solution which do not involve 500 line long code ?

I’ll give it a short to show how you could architect the code and what it means:

here is a short piece of code which takes an existing buffer (crafted so that you can write a multiple of 10 bytes and it looks nice in the file) and tries to dump it to the SD card every 20ms (50Hz).

const uint8_t SD_CS_PIN = SS;
#include <SdFat.h>              // https://github.com/greiman/SdFat
SdFat SD;

const unsigned long period = 20000ul; // in µs
const byte bufferSizeToDump = 70; // make it a multiple of 10 to get the \n (no more than 100)

const unsigned long acceptableError = period + 1000ul; // 1ms error
unsigned long chrono = 0;
unsigned long nbWrites = 0;
unsigned long nbErrors = 0;
unsigned long timeBucket = 0;

const char* logName = "log.txt";
File logFile;

char byteBuffer[] = "HELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\nHELLO YOU\n"; // 10 bytes x n

void setup() {
  Serial.begin(115200);

  // activate the SD card
  if (!SD.begin(SD_CS_PIN)) {
    Serial.println(F("SD Card error"));
    while (true);
  }

  // open the file for logging
  if (!logFile.open(logName, FILE_WRITE)) {
    Serial.println(F("Can't create logfile"));
    while (true);
  }

  // ready to roll
  Serial.println("enter something in the console to stop the log");
  Serial.flush();
  logFile.flush();
  delay(1000); // wait a bit before getting going
  chrono = micros();
}

void loop() {
  unsigned long deltaT = micros() - chrono;
  if (deltaT > period) { // 50Hz
    logFile.write(byteBuffer, bufferSizeToDump); // write our bytes
    logFile.flush(); // dump to card
    nbWrites++;
    if (deltaT > acceptableError) {
      nbErrors++;
      Serial.print(nbWrites);
      Serial.println("\t>1ms late");
    }
    chrono += period; // trying to keep up the pace, if OK with the lost time, you can try chrono = micros();
    timeBucket += period - (micros() - chrono);
  }

  if (Serial.read() != -1) { // stop collection as soon as user enters something in the console
    logFile.close();
    Serial.print(F("Running at "));
    Serial.print(1000000ul / period);
    Serial.print(F("Hz, "));
    Serial.print(nbWrites);
    Serial.print(F(" writes, generated "));
    Serial.print(nbErrors);
    Serial.print(F(" errors and average time bucket: "));
    Serial.print(timeBucket / nbWrites);
    Serial.println(F(" µs. LOG STOPPED."));

    while (true); // die here
  }
}

There is a test to see if the dump is more than 1ms late against the expected period in which case I print how many lines were recorded till that point.

You’ll notice that I barely do anything else than writing and flushing the 70 byte buffer - the only part of the loop() not related to the file is testing if there is something pending on the Serial line (to correctly close the file and stop logging) and a bit of statistics collection.

The timeBucket variable captures how many microSeconds you have at the end of the write before having to perform another buffer write/flush.

I ran this code on an Arduino Uno with a freshly formatted and empty 2Gb SD card (not UHS-II) that’s what I saw on my Serial Monitor (@ 115200 bauds)

[color=purple]
enter something in the console to stop the log
283	>1ms late
284	>1ms late
561	>1ms late
751	>1ms late
1219	>1ms late
1220	>1ms late
1688	>1ms late
Running at 50Hz, 2012 writes, [color=red]generated 7 errors[/color] and [color=blue]average time bucket: 6958 µs[/color]. LOG STOPPED.
[/color]

so I’m skipping a beat from time to time (7 times on 2012 writes here) and can’t keep up really at 50hz.
this tells you also that you have less than 7ms to complete your data collection on average

I then modified the code and tried writing only 10 bytes by changingconst byte bufferSizeToDump = 10; // make it a multiple of 10 to get the \nand let it run for a while and got

[color=purple]
enter something in the console to stop the log
996	>1ms late
1001	>1ms late
Running at 50Hz, 2163 writes, generated 2 errors and average time bucket: 7580 µs. LOG STOPPED.
[/color]

So although I did get a couple error, clearly writing less data helps and my time bucket has gained half a ms

I tried then the same code again with 70 bytes but at 25Hz by changing const unsigned long period = 40000ul; // in µsand I left it running for recording for a while and I never saw a report that I was more than 1ms late.

[color=purple]
enter something in the console to stop the log
Running at 25Hz, 2711 writes, generated 0 errors and average time bucket: 26961 µs. LOG STOPPED.
[/color]

so clearly with ~27ms in between two consecutive data dump, it’s now plenty enough to avoid running into issues.

==> So - assuming your data collection is say less than 5ms - you could try to reduce the frequency of the write until you find a point where this is acceptable and also work to reduce the size of the data you need to dump as this helps too. Getting an UHS-II SD card might help, I did not have one readily available to test.

feel free to play with the code the two parameters you can change are right at the start

const unsigned long period = 20000ul; // in µs
const byte bufferSizeToDump = 70; // make it a multiple of 10 to get the \n  (no more than 100)

Hi ,
I could reply earlier, had I busy week.

@UKHelibob
Using your technic, I could reduce the data size from 66 bytes to around 36-40bytes. I'm probably gonna be doing that since it make the loggind more reliable and faster.

@ J-M-L
Thanks for your help, it is greatly appreciated. I tried your code and it worked fine. It is very easy to understand and clear.
There are however a few things that I do not fully understand.

  • Using the same code, I achieved an average time bucket of 12'104 microseconds. Does it suggest that the SD card I'm using if faster than yours ? (It's an old one so probable also an UHS-I)
  • Do you know if the difference between an UHS-I vs UHS-II is really that significant ? I've read that UHS-II are faster, but by how much, particulary with an Arduino ?
  • I tried using different data size (which were not multiple of 10, for example 66 bytes) and it still worked fine. Why did you recommand using multiple of 10 ? And why no more than 100 ? (I tried 150 bytes and worked, had 3 error in 20 seconds).
  • It is my understanding that using the flush() command takes about 7-8 miliseconds regardless of the data size right ? If so, wouldn't it be more "effective" to call the flush() command like every 7 cycles (if I were to store 66 bytes: 512/66 =7.75).
  • Using your code, I almost never had errors. What causes thoses random errors (since I have on average 12'104 microseconds left) ?

Thanks for your great answers guys, it always a pleasure to read your responses.

Some answers and guesses for your questions

#1 - probably better UHS has an impact, mine was really old, may be just SD (Not sure the SD shield plays a part). The quality of the flash does matter too.

#2 - the shield is probably just waiting for acknowledgment that data is written, so speed of your card does matter.
See What is UHS-II?

#3 - I recommended multiples of 10 so that you will write text in your file (taken from the canned buffer) up to an ‘\n’ character (new line) so that the file would look “nice”. I said 100 because that was the max size of the buffer I provided (10 times 10 characters HELLO YOU\n). If you go beyond 100 the write will access memory beyond the buffer end. The standard says that once you start accessing memory you did not allocate weird stuff can happen. Most of the time when it’s read only (like you do here) you won’t see bad stuff, but of course you are dumping random bytes in your file (whatever bytes are beyond the 100th one) and very likely non ASCII stuff. So if you open the file on your computer it will look weird.

#4 Well a flush will dump a full block, but as demonstrated smaller buffers did lead to a bit of time improvement. Possibly due to the way 0 and 1 are written and state of your storage unit or other technicalities of the cards themselves. I’m not an expert there… so yes ideally you should group things until the buffer is full which is the default way the library works - but you don’t really control when things happen and sometimes it can take a loooong time… that’s where you need the other complicated technics with extra buffers etc, the purpose of the example was to dumb things down and show what you can achieve with simple code.

#5 I don’t know for sure. Cards’ controller can decide to move things around or erase data. The flash erase groups in SD cards are very large (like 128 KB I if I remember correctly) so re-writing file structures can involve shifting a large amount of data, the card will have a busy status by holding data out low and I think the spec allows a card to go busy for up to 250 ms…

#1 & 2 : I'll be looking into getting a UHS II SD Card an see what difference it might make, great advice.

#3 : Your were right, I did saw weird charaters wenn logging more than 100 bytes.

#4 : I understand now why you chose to flush every time. By doing so you control when data is written to the SD card instead of waiting for the buffer to become full, which might happen in the middle of a cycle, therefore creating peek in logging time.

#5 : ok didn't knew that was a thing.

Thanks for your answers J-M-L

I'm not an expert on this, but I wonder about the benefit of flushing. The library accumulates data going to the SD card until it has 512 bytes, then it writes that to the card. That's one full sector. It can't write less than that. If you try to write a smaller number of bytes, as might happen if you flush when fewer than 512 bytes need to be written, the library will pad with fill. Then if you continue to write data to the card, the library may have to read back in that last sector and figure out where to begin adding the new stuff. I may be completely wrong about how this works, but I don't see why you need to flush at all until you're ready to end the logging session and close the file.

Flushing may also require an extra set of file system updates - updating the file's directory entry, updating the FAT, and updating the second copy of the FAT. All of those would require read/modify/write actions.

The ultimate speed would result from the direct consecutive sector writes, with no file system updates until the end, as is done by the LowLatencyLogger. But short of that, I think the best performance would be to have the data sampling be driven by a timer interrupt, with the ISR saving the raw readings to a circular buffer. Then the main loop would extract those buffer entries and send them (prettified if you like, with commas and such) to the library, which will automatically save that output to the card in 512-byte chunks. The buffer has to be large enough to accomodate the occasional delay that occurs when writing to the card due to the need to allocate the next cluster, or whatever.

If I'm wrong about how this works, I hope someone will correct me.

@ShermanP

Agreed on many front with some caveats:

As seen above in empirical testing, smaller buffer flush were delivering some performance gain: flushing 70 bytes worth of data was faster than flushing 10 bytes of data... So some stuff happens there. I've not explored in details why it behaved this way.

Controlling when the 512 bytes get written helps with data consistency. Say you write 32 bytes per log cycle, if the system crashes you might loose up to 16 recordings that were in memory and never committed to the SD card. Also if you are just using the 512 bytes buffer, and you log say 70 bytes, you will write to the SD part of 8th line and if your system crash your last data recording is not fully there. (this might not matter much is you log at 50Hz, but if you log every hour you might miss a big part of a day).

Would be worth comparing the approach without flush at all

Check out my project for High Frequency Long duration data logging to an SD card
https://www.forward.com.au/pfod/ArduinoProgramming/DataRate/SD_logging/HighFreqLongDur.html

Has both an Uno and Teensy versions.

@ShermanP
As mentionned by J-M-L, my problem was inconsistency in write time, sometime it took 0 milliseconds and ramdomly it went has high as 10 milliseconds.
The idea behind flushing is therefore to have a better control on logging time. The issue now being that the code is "much" slower.
I wanted to logging at 50 Hz, I'm still within my margin and I still have like 11-12 milliseconds to do the rest.

@ J-M-L
As suggested, I tried without flush and the results were even better to my supprise.

enter something in the console to stop the log
Running at 50Hz, 1851 writes, generated 0 errors and average time bucket: 19712 µs. LOG STOPPED.

enter something in the console to stop the log
Running at 50Hz, 7357 writes, generated 0 errors and average time bucket: 19712 µs. LOG STOPPED.

@ drmpf
Your projet looks great. I could'nt found the arduino code, is it linked somewhere ?

Thanks for your help guys

so without the flush you benefit from the standard buffered input. you'll have one call where the data is really flushed which will be much slower than the others, but that might be good enough for your needs

The Uno/Mega version is here
https://www.forward.com.au/pfod/ArduinoProgramming/DataRate/SD_logging/UnoMega_HighFreqLongDur.html

I’ve been trying J-M-L’s code (reply #6) and, even though I kept the structure the same (at least the data logging part), I’m having an issue.

The first 2 writes are much slower that the rest.

1	28844 microseconds late
2	9376 microseconds late

I don’t get any errors after the 2 second write. I’ve tried a bunch of stuff but it never resolved the problem. The weird thing is that I don’t get those errors if I’m using J-M-L’s code.

#include <SPI.h>
#include "SdFat.h"
SdFat SD;

File dataFile;
File myFile;

struct datastore {
  int variable0;
  int variable1;
  int variable2;
  int variable3;
  int variable4;
  int variable5;
  int variable6;
  int variable7;
  int variable8;
  int variable9;
};

struct datastore myData;

const int chipSelect = 8;
const unsigned long period = 20000ul;
const unsigned long acceptableError = period + 1000ul;
unsigned long chrono = 0;
unsigned long nbWrites = 0;
unsigned long nbErrors = 0;
unsigned long timeBucket = 0;
unsigned long deltaT = 0;


void setup() {
  Serial.begin(115200);
  pinMode(10, OUTPUT);

  // Activate the SD card
  if (!SD.begin(chipSelect)) {
    Serial.println("Card failed, or not present");
    while (true);
  }

  // Remove existing files
  SD.remove("logName");
  SD.remove("flightdata.txt");

  // Open the file for logging
  if (!dataFile.open("logName", FILE_WRITE)) {
    Serial.println("Can't create dataFile");
    while (true);
  }

  delay(1000);
  chrono = micros();
}


void loop() {
  deltaT = micros() - chrono;

  if (deltaT > period) {
    GetData();
    dataFile.write((const uint8_t *)&myData, sizeof(myData));

    if (deltaT > acceptableError) {
      nbErrors++;
      Serial.print(nbWrites); Serial.print("\t");
      Serial.print(deltaT - period);
      Serial.println(" microseconds late");
    }
    nbWrites++;
    chrono = chrono + period;
    timeBucket += period - (micros() - chrono);
  }

  if (Serial.read() != -1) {
    dataFile.close();

// Stuff not relevant for the logging part    
//    dataFile = SD.open("logName", FILE_READ);
//    if (dataFile.available()) {
//      myFile = SD.open("flightdata.txt", FILE_WRITE);
//      for (int i = 0; i < nbWrites; i++) {
//        SaveToSd();
//        PrintVariables();
//      }
//    }

    Serial.print(F("Running at "));
    Serial.print(1000000ul / period);
    Serial.print(F("Hz, "));
    Serial.print(nbWrites);
    Serial.print(F(" writes, generated "));
    Serial.print(nbErrors);
    Serial.print(F(" errors and average time bucket: "));
    Serial.print(timeBucket / nbWrites);
    Serial.println(F(" µs. LOG STOPPED."));
//    dataFile.close();
//    myFile.close();
    while (true);
  }
}


void GetData() {
  myData.variable0 = nbWrites;
  myData.variable1 = deltaT - period;
  myData.variable2 = 100 + nbWrites;
  myData.variable3 = 100 + nbWrites;
  myData.variable4 = 100 + nbWrites;
  myData.variable5 = 100 + nbWrites;
  myData.variable6 = 100 + nbWrites;
  myData.variable7 = 100 + nbWrites;
  myData.variable8 = 100 + nbWrites;
  myData.variable9 = 100 + nbWrites;
}

Do you have any idea why ?

in my code I perform a

  logFile.flush();
  delay(1000); // wait a bit before getting going

in the setup() - I imagined that it could help "warm up" the buffer if there is some sort of lazy allocation/handling.

may be that's the difference?

SD cards can 'go to sleep' for upto 2000mS before a write is considered to have timed out.
The 2000mS is the timeout setting in GitHub - greiman/SdFat: Arduino FAT16/FAT32 exFAT Library
In the datalogger an interrupt routine keeps buffering data while trying to write that previous buffer.

@ J-M-L
Adding

logFile.flush();

in the setup didn't change anything. The only thing helping is to change:

chrono += period; // trying to keep up the pace, if OK with the lost time, you can try chrono = micros();

chrono += period; 
to 
chrono = micros();

but by doing that I won't get "precisely" timed 20milliseconds data logging.

@ drmpf
I could found the timeout in the setting. Is it related with the sample rate or something or does it specifically states timeout somewhere ?

I think I found the solution.
I formated the Sdcard (default) and it works fine now. Don't now why it helped but it's working^^