Improving SdCard Write Time

Hello everyone,
I have an FSM that is designed to record two samples to a file on the SdCard at 20 Hz. For the most part the program runs without any issues, except that I occasionally see outliers of 200 ms or more between samples when I review the data in post processing. They appear at random and can show up in runs ranging from 500 samples to 28,000 samples. I am sure it is something in my code but I don’t know where I should focus. This is my first time using SdCards so I thought it would be best to post here.

Some information on the project:

  • Board: Arduino Uno
  • SdCard Shield: Adafruit Datalogging Shield
  • SdCard: Toshiba 2GB SD-C02G JAPAN

File Structure Format:
millis(), ADC Value 1, ADC Value 2

I am checking the sample rate in post processing by computing millis()[n+1] - millis()[n] for each pair of consecutive rows, and this is where I am finding the anomalies.
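The check described above can be sketched on a PC as follows. This is only an illustration: the function name `findLateSamples` and the idea of loading the first column into a vector are assumptions, not part of the original post-processing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the indices i where t[i] - t[i-1] exceeds limitMs.
// The subtraction is done in unsigned 32-bit arithmetic, so the result
// stays correct even if millis() rolled over between two rows.
std::vector<std::size_t> findLateSamples(const std::vector<std::uint32_t>& t,
                                         std::uint32_t limitMs) {
  std::vector<std::size_t> late;
  for (std::size_t i = 1; i < t.size(); ++i) {
    std::uint32_t delta = t[i] - t[i - 1];
    if (delta > limitMs) {
      late.push_back(i);
    }
  }
  return late;
}
```

Called with the timestamp column and a limit of 50, it returns the row numbers of the late samples, which makes it easy to see whether they cluster around buffer flushes.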

Am I misunderstanding the buffer and causing the issue that way? Or is it something unrelated to SdCard use in general?

// Libraries
#include <SdFat.h>
#include <RTClib.h>

RTC_PCF8523 rtc;

enum state {
  _readState,
  _displayState,
  _createState,
  _bufferState,
  _writeState,
  _errorState
};

state _currentState;

// Declare Variables
struct sample {
  const char* _name;  // Label used in the file header (points at a string literal)
  const int pin;      // Input pin
  unsigned int raw;   // Raw value
  float value;        // Calculated value
};

sample Val1 = {"Val1", 2, 0, 0};   // Voltage range 0.5 - 4.5 VDC
sample Val2 = {"Val2", 3, 0, 0};    // Voltage range 0.0 - 5.0 VDC

const int ledRecord = 6;
const int pinSwitch = 7;
const int ledRead =  8;
const int ledError = 9;
const int pinSD = 10;

unsigned long currentMillis;  // Used for storing the latest time
unsigned long previousMillis = 0;      // Store the last time the program ran
const unsigned long interval = 50;     // Sample frequency (milliseconds)

int pinState;

String filename;
char _header[20];
SdFat SD;
File dataFile;    // SD Card

String buffer;  // String to buffer output

void setup() {
  // Start serial
  Serial.begin(57600);
  while (!Serial);
  pinMode(ledError, OUTPUT);
  pinMode(pinSwitch, INPUT_PULLUP);
  pinMode(ledRecord, OUTPUT);
  pinMode(ledRead, OUTPUT);
  pinMode(pinSD, OUTPUT);
  // Start RTC
  Serial.print("\n\nStarting RTC...");
  rtc.begin();
  // Show all lights as part of boot sequence
  digitalWrite(ledRecord, HIGH);
  digitalWrite(ledRead, HIGH);
  digitalWrite(ledError, HIGH);
  // Throw away analog read
  pinState = digitalRead(pinSwitch);
  Val1.raw = analogRead(Val1.pin);
  Val2.raw = analogRead(Val2.pin);
  delay(500);
  digitalWrite(ledRecord, LOW);
  digitalWrite(ledRead, LOW);
  digitalWrite(ledError, LOW);
  Serial.println(" RTC started!");
  Serial.print("Initializing SD card...");
  if (!SD.begin(pinSD)) {
    Serial.println("Card failed, or not present");
    digitalWrite(ledError, HIGH);
    _currentState = _errorState;
    return;
  }
  Serial.println(" SD card initialized!");
  // Test Sd Card before continuing
  filename = "TEST.txt";
  dataFile = SD.open(filename, O_WRITE | O_CREAT);
  if (!dataFile){
    Serial.println("Unable to write to Sd Card.");
    digitalWrite(ledError, HIGH);
    _currentState = _errorState;
    return;
  }
  dataFile.remove();
  filename = "";
  // Create header for files
  sprintf(_header, "Time(ms),%s,%s",Val1._name,Val2._name);
  // Print header for serial logging
  Serial.println("Program ready.");
}

void loop() {
  pinState = digitalRead(pinSwitch);
  digitalWrite(ledRead, HIGH);
  digitalWrite(ledRecord, LOW);
  currentMillis = millis();

  if ((currentMillis - previousMillis) >= interval) {
    switch (_currentState) {

      case _readState:
        _currentState = _readState;
        digitalWrite(ledRead, LOW);   // Green LED OFF
        //Read Values
        Val1.raw = analogRead(Val1.pin);
        Val2.raw = analogRead(Val2.pin);

      case _displayState:
        _currentState = _displayState;
        previousMillis = currentMillis;
        if (pinState != LOW) {
          if (filename != ""){
            dataFile.close();
            filename = "";
            buffer.remove(0);
          }
          _currentState = _readState;
          break;
        };

      case _bufferState:
        _currentState = _bufferState;
        buffer += millis();
        buffer += ",";
        buffer += Val1.raw;
        buffer += ",";
        buffer += Val2.raw;
        buffer += "\r\n";

      case _createState:
        _currentState = _createState;
        if (filename == "") {
          DateTime now = rtc.now();
          filename = String(now.unixtime(), DEC);
          filename = filename + ".txt";
          Serial.print(filename);
          Serial.println(" created!");
          dataFile = SD.open(filename, O_CREAT | O_APPEND | O_WRITE);     // Open filename.txt
          dataFile.println(_header);
        };

      case _writeState:
        _currentState = _writeState;
        digitalWrite(ledRecord, HIGH);
        if (buffer.length() >= 312) {
          dataFile.write(buffer.c_str());
          buffer.remove(0);
          dataFile.flush();
        }
        _currentState = _readState;
        previousMillis = currentMillis;
        break;

      case _errorState:
        Serial.println("Current State: _errorState");
        digitalWrite(ledError, HIGH);
        _currentState = _errorState;
    }
  }
}

Here is SdInfo information from the SdFat Examples

init time: 76 ms

Card type: SD2

Manufacturer ID: 0X2
OEM ID: TM
Product: SD02G
Version: 3.8
Serial number: 0X934F70A9
Manufacturing date: 6/2009

cardSize: 1967.13 MB (MB = 1,000,000 bytes)
flashEraseSize: 128 blocks
eraseSingleBlock: true
OCR: 0X80FF8000

SD Partition Table
part,boot,type,start,length
1,0X0,0XC,137,3841911
2,0X0,0X0,0,0
3,0X0,0X0,0,0
4,0X0,0X0,0,0

Volume is FAT32
blocksPerCluster: 8
clusterCount: 479296
freeClusters: 479110
freeSpace: 1962.43 MB (MB = 1,000,000 bytes)
fatStartBlock: 169
fatCount: 2
blocksPerFat: 3752
rootDirStart: 2
dataStartBlock: 7673
Data area is not aligned on flash erase boundaries!
Download and use formatter from www.sdcard.org!

Also the output from StdioBench

Starting test
uint8_t 0 to 255, 100 times 
fileSize: 116500
print millis: 7685
stdio millis: 3813
ratio: 2.02

uint16_t 0 to 20000
fileSize: 128890
print millis: 8436
stdio millis: 4173
ratio: 2.02

uint32_t 0 to 20000
fileSize: 128890
print millis: 8522
stdio millis: 4287
ratio: 1.99

uint32_t 1000000000 to 1000010000
fileSize: 120000
print millis: 8013
stdio millis: 4054
ratio: 1.98

float nnn.ffff, 10000 times
fileSize: 100000
print millis: 10683
stdio millis: 4069
ratio: 2.63

Done

Please let me know if I have left any information out, this is always a learning process for me. Thank you!

I suspect those occasional delays are occurring within the SD card. The SD card controller does all kinds of stuff behind the scenes, including erasing blocks of sectors before writing data, and moving stuff around when wear leveling. And those things can take a lot of time. Also, I can't think of anything in your controller that would cause delays of 250 ms.

My guess is the only way to be sure no card delays cause data collection problems is to have two buffers, with all the data collection being interrupt driven, and the loop() doing nothing but waiting for a buffer to be filled, then writing it to the card while the interrupts are filling the other buffer. Each buffer would have to be large enough to hold 250ms or more worth of collections. A convenient buffer size, if it works, is 512 bytes since data has to be written to the card in that size chunk.
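A minimal sketch of the two-buffer idea, written as plain C++ so the logic can be tried off the board. All names here (`PingPong`, `push`, `takeFull`, `release`) are hypothetical. On a real Uno, `push` would be called from a timer interrupt, the shared fields would need to be `volatile` and accessed with interrupts briefly disabled, and two 512-byte buffers would take half of the 2 KB of RAM.

```cpp
#include <cstddef>
#include <cstring>

constexpr std::size_t kBufSize = 512;  // one SD sector

struct PingPong {
  char data[2][kBufSize];
  std::size_t len[2] = {0, 0};  // bytes used in each buffer
  int active = 0;               // buffer the sampler is filling
  bool pending = false;         // the other buffer is waiting to be written
  unsigned long dropped = 0;    // records lost because the writer fell behind

  // Sampler side: append one record, swapping buffers when the active one
  // cannot hold it. Returns false if the record had to be dropped.
  bool push(const char* rec, std::size_t n) {
    if (n > kBufSize) { ++dropped; return false; }
    if (len[active] + n > kBufSize) {
      if (pending) { ++dropped; return false; }  // writer has not caught up
      active ^= 1;
      len[active] = 0;
      pending = true;
    }
    std::memcpy(&data[active][len[active]], rec, n);
    len[active] += n;
    return true;
  }

  // Writer side (loop()): returns the buffer to write, or nullptr if none.
  const char* takeFull(std::size_t& n) {
    if (!pending) return nullptr;
    n = len[active ^ 1];
    return data[active ^ 1];
  }

  // Writer side: call after the SD write has completed.
  void release() { pending = false; }
};
```

In loop(), the writer would call `takeFull`, pass the returned pointer and length to the file's write call, then call `release`. The `dropped` counter shows whether a card stall ever outlasted one buffer's worth of samples.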

Edit: Your card info data says your data structure isn't aligned with the erase blocks of the card, and suggests you reformat using the Formatter provided by the SD card association. That might make a difference in performance, and would be worth trying.

ShermanP:
Edit: Your card info data says your data structure isn’t aligned with the erase blocks of the card, and suggests you reformat using the Formatter provided by the SD card association. That might make a difference in performance, and would be worth trying.

I saw this too. Today I used the recommended program to format the card, but it didn’t resolve the issue: the results are similar to before, no better and no worse. The format did change from FAT32 to FAT16. Here is the new output from SdInfo from the SdFat Examples.

init time: 76 ms

Card type: SD2

Manufacturer ID: 0X2
OEM ID: TM
Product: SD02G
Version: 3.8
Serial number: 0X934F70A9
Manufacturing date: 6/2009

cardSize: 1967.13 MB (MB = 1,000,000 bytes)
flashEraseSize: 128 blocks
eraseSingleBlock: true
OCR: 0X80FF8000

SD Partition Table
part,boot,type,start,length
1,0X0,0X6,137,3841911
2,0X0,0X0,0,0
3,0X0,0X0,0,0
4,0X0,0X0,0,0

Volume is FAT16
blocksPerCluster: 64
clusterCount: 60022
freeClusters: 59978
freeSpace: 1965.36 MB (MB = 1,000,000 bytes)
fatStartBlock: 138
fatCount: 2
blocksPerFat: 235
rootDirStart: 608
dataStartBlock: 640

In _writeState, I notice that my buffer length only reaches a max of 463. I changed the condition statement to 512 just to see what the max would be and printed the result to serial. I guess the 463 is not including the trailing '\0' null character.

      case _writeState:
        _currentState = _writeState;
        digitalWrite(ledRecord, HIGH);
        Serial.println(buffer.length());
        if (buffer.length() >= 512) {
          dataFile.write(buffer.c_str());
          buffer.remove(0);
          dataFile.flush();
        }
        _currentState = _readState;
        previousMillis = currentMillis;
        break;

I haven’t explored interrupts yet to this level, but I am wondering if this expectation of 20-Hz samples is beyond the scope of how my code is written. I have read about the binary high frequency sampling programs but I like the human readability aspect of the raw file on the SdCard.

The Toshiba 2GB SD-C02G is an older design SD card with a smaller and slower internal buffer compared to a newer 16GB or higher capacity card. I'm currently building Arduino based dataloggers recording 8.8KB/sec in CSV format. When I tested my older 2GB SD cards they took 125-250 ms to do an internal block write. The newer 16GB SD cards, I'm currently using, do an internal block write in 10-25 ms. The randomness in your data suggests an internal card process, possible erasure of data before a write. I suggest using a newer SD card.

SpaceJohn:
The Toshiba 2GB SD-C02G is an older design SD card with a smaller and slower internal buffer compared to a newer 16GB or higher capacity card. I’m currently building Arduino based dataloggers recording 8.8KB/sec in CSV format. When I tested my older 2GB SD cards they took 125-250 ms to do an internal block write. The newer 16GB SD cards, I’m currently using, do an internal block write in 10-25 ms. The randomness in your data suggests an internal card process, possible erasure of data before a write. I suggest using a newer SD card.

I can definitely try this! Thanks for the suggestion!

SpaceJohn, do you do anything in your code to keep SD internal functions from becoming blocking events that prevent data generation from occurring at strict intervals? It just seems at some point you're going to be waiting on the card to finish erasing, or wear leveling, or whatever. That's why I was thinking of generating the data through timer interrupts so they would take place at a fixed interval. And with multiple buffers, you could be filling one up via the interrupts while the other is being written to the SD card in the main loop(). I believe writing to the card, or waiting for that to complete, could be interrupted with no problem.

There might be other things you could do to keep everything from slowing down. You could start out with an erased card, and maybe create the file in advance, making it very large so that all the FAT entries and directory entries don't need to be made. The other thing that I keep seeing that I think is bad is closing the file after every write. That just creates a lot of extra reading, erasing and rewriting.

Kinser, 512 bytes is important because that's the sector size in the SD card. If you use a different size buffer, you will cross sector boundaries, which may result in needing two erasures to complete the write. Of course your FAT library will make every attempt to buffer things into 512-byte blocks so long as you don't insist on flushing everything and closing the file. But no matter what you do, all the extra stuff the card controller has to do makes SD less than the ideal medium for fast data logging.
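To make the boundary-crossing point concrete, here is a small helper (the name `sectorsTouched` is made up for this illustration) that counts how many 512-byte sectors a write overlaps, given the file offset where it starts. It assumes the file data begins on a sector boundary, which holds for FAT because clusters are whole numbers of sectors.

```cpp
#include <cstdint>

constexpr std::uint32_t kSectorSize = 512;

// Number of 512-byte sectors overlapped by a write of `len` bytes that
// starts at byte `offset` within the file.
std::uint32_t sectorsTouched(std::uint32_t offset, std::uint32_t len) {
  if (len == 0) return 0;
  std::uint32_t first = offset / kSectorSize;
  std::uint32_t last = (offset + len - 1) / kSectorSize;
  return last - first + 1;
}
```

With 312-byte writes, the first one (offset 0) stays inside sector 0, but the second (offset 312) covers bytes 312 to 623 and so touches sectors 0 and 1. With 512-byte writes starting at offset 0, every write touches exactly one sector.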

The other way to do it (without interrupts) is to use the yield() function. SdFat.h will call yield() when it's waiting for the card to do something.

Be aware that yield() gets called before setup(), so don't attempt to do anything like digitalRead() or SPI.transfer() inside yield() until you are satisfied that setup() has been run. Use a global variable.

SdFat may leave the card's CS activated when yielding, but it's not actively using the card so you can de-select the card and use the SPI bus for other purposes - just remember to set it back before you exit yield().
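A rough sketch of the guard-variable pattern described above, written so it can be run without hardware. Every name here is hypothetical: `yieldHook` stands for the body you might put in your own `void yield()`, `nowMs` stands for the value of `millis()`, and `takeSample` stands for "read the ADCs and append to a RAM buffer". Whether and how often SdFat actually calls `yield()` while waiting depends on the library version, so treat that as something to verify.

```cpp
#include <cstdint>

volatile bool setupDone = false;        // set to true on the last line of setup()
std::uint32_t lastSampleMs = 0;         // scheduled time of the latest sample
std::uint32_t samplesTaken = 0;
const std::uint32_t kIntervalMs = 50;   // 20 Hz

// Stand-in for reading the inputs into a RAM buffer.
// It must not touch the SD card, because the card is busy when this runs.
void takeSample() { ++samplesTaken; }

// Body for a user-supplied yield(): sample if the interval has elapsed.
void yieldHook(std::uint32_t nowMs) {
  if (!setupDone) return;               // yield() can run before setup()
  if (nowMs - lastSampleMs >= kIntervalMs) {
    lastSampleMs += kIntervalMs;        // advance by the interval to avoid drift
    takeSample();
  }
}
```

Calling the same function from loop() as well means a sample is taken on schedule whether the sketch is idle or stuck waiting on the card.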

Try an FRAM (Ferroelectric Random Access Memory). Adafruit sells them. You can get them big enough to buffer all of your readings, then dump when you are finished acquiring data. They have practically unlimited read and write cycles. Put it on SPI rather than I2C for much faster speed. You can use more than one if you cannot get one large enough. There is software available that will make it look like a file-based device.
Good Luck & Have Fun!
Gil

I don’t understand yield(). At all.

How would you change kinser86’s code to use yield()?

To use that code with yield, you would have to first separate out the reading and writing into separate functions. So either loop() or yield() could call the data-reader and append new data to the buffer.

Then it would need to be double-buffered, so that you could be reading into one buffer while writing the other out to the SD card.

But really the very first step would be to remove the big-S string buffer. That's wasting a lot of time and memory.
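As one possible way to drop the String buffer, the record can be formatted straight into a fixed char array with `snprintf`. This is a sketch under assumptions, not a drop-in replacement: the names `lineBuf`, `lineLen` and `appendRecord` are invented here, and the 512-byte capacity is just the sector size mentioned earlier in the thread.

```cpp
#include <cstddef>
#include <cstdio>

constexpr std::size_t kCap = 512;
char lineBuf[kCap];        // fixed storage, no heap allocation
std::size_t lineLen = 0;   // bytes currently used (excluding the '\0')

// Appends "ms,v1,v2\r\n" to lineBuf. Returns false and leaves the logical
// contents unchanged if the record does not fit.
bool appendRecord(unsigned long ms, unsigned v1, unsigned v2) {
  std::size_t room = kCap - lineLen;  // always >= 1, see the check below
  int n = std::snprintf(lineBuf + lineLen, room, "%lu,%u,%u\r\n", ms, v1, v2);
  if (n < 0 || static_cast<std::size_t>(n) >= room) {
    lineBuf[lineLen] = '\0';          // undo the truncated partial write
    return false;
  }
  lineLen += static_cast<std::size_t>(n);
  return true;
}
```

When `appendRecord` returns false, the caller would write `lineLen` bytes from `lineBuf` to the file, set `lineLen` back to 0, and append the record again. Because the array is allocated once, its length cannot silently stop growing the way a String can when the heap runs short.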

ShermanP, I start with an erased SD card with one header file. My datalogging does not require precise time intervals. All entries are over sampled and time stamped. Even a 25 ms block write delay does not affect the data analysis. My next datalogger iteration will use a STM32 processor for higher clock speeds, running a variation on RAID 0 with SD cards. Calculations show a 4X increase in recording speed to ~32KB/sec without an external buffer. Addition of a FRAM external buffer will increase data throughput, but at a higher cost compared to Flash.

Closing files after every write is a waste of precious time. I transfer approximately 2.5-3 KBs to the SD card internal buffer between file close commands.

MorganS:
To use that code with yield, you would have to first separate out the reading and writing into separate functions. So either loop() or yield() could call the data-reader and append new data to the buffer.

Then it would need to be double-buffered, so that you could be reading into one buffer while writing the other out to the SD card.

But really the very first step would be to remove the big-S string buffer. That's wasting a lot of time and memory.

I've looked through the entire Arduino IDE v1.8.8 for occurrences of the word "yield". I found a number of example sketches where yield is involved somehow in waiting for the serial monitor to open, or waiting for input over serial. And I found a couple uses in the Arduino source code. Here are some extracts:

from various .ino's
-------------------

  Serial.begin(9600);
  
  // Wait for USB Serial 
  while (!Serial) {
    SysCall::yield();
  }
  delay(1000);

  Serial.println(F("Type any character to start"));
  while (!Serial.available()) {
    SysCall::yield();
  }
 


void error(const char* s) {
  Serial.println(s);
  while (1) {
    yield();
  }
}

void setup() {
  Serial.begin(9600);

  // Wait for USB Serial
  while (!Serial) {
    yield();
  }
}


from Hooks.c
------------

/**
 * Empty yield() hook.
 *
 * This function is intended to be used by library writers to build
 * libraries or sketches that supports cooperative threads.
 *
 * Its defined as a weak symbol and it can be redefined to implement a
 * real cooperative scheduler.
 */
static void __empty() {
 // Empty
}
void yield(void) __attribute__ ((weak, alias("__empty")));



from SysCall.h
--------------
class SysCall {
 public:
  /** Halt execution of this thread. */
  static void halt() {
    while (1) {
      yield();
    }
  }
  /** Yield to other threads. */
  static void yield();
};

And really, I don't understand any of that. But there isn't a single example in which yield() is used for SD logging activity, or for that matter, any SD operation.

My problem with yield is that I don't know how it works or what it does, or even how it could work to prevent a write to SD from becoming blocking. I've never seen any detailed explanation of how it could be used. I don't even understand what effect it has on the Serial operations in which it's used in the above examples. In low level terms, assembly even, what is it actually doing?

I did find LowLatencyLogger.ino which kinser86 might want to look at for coding alternatives. But it's 19K of dense code, so maybe not all that useful. Doesn't use yield for SD.

SpaceJohn:
ShermanP, I start with an erased SD card with one header file. My datalogging does not require precise time intervals. All entries are over sampled and time stamped. Even a 25 ms block write delay does not affect the data analysis. My next datalogger iteration will use a STM32 processor for higher clock speeds, running a variation on RAID 0 with SD cards. Calculations show a 4X increase in recording speed to ~32KB/sec without an external buffer. Addition of a FRAM external buffer will increase data throughput, but at a higher cost compared to Flash.

Closing files after every write is a waste of precious time. I transfer approximately 2.5-3 KBs to the SD card internal buffer between file close commands.

What do you use to erase a card? The official SD Formatter from the SD Association, recommended in the sticky here, has an option to "overwrite" the entire card, but it leaves the card all zeros. My understanding is that erased flash memory is all FFs (is that right?). So the Formatter may optimize the FAT structure, but it appears to leave the rest of the card completely hosed as far as fast writing is concerned - not a single sector left erased. So what's your favorite eraser?

Things certainly are easier if you don't have to log data at a fixed interval. Then it just becomes a question of what you can live with in terms of SD delays. But if you do need a fixed interval, and the longest SD delay would bust that, then it still seems to me that you need multiple buffers, and the fixed interval provided by an interrupt. Well, unless Morgan's mystery yield() could somehow have the same effect. I'm assuming that the actual writing to the card is an interruptible event. I think SPI transmission of a byte proceeds to completion no matter what the processor is doing (it's done in hardware in the module).

I think the FRAM external buffer is pretty slick. In fact, you might want to look at the Texas Instruments MSP430FR microcontrollers. They are von Neumann processors with all the Flash memory replaced by FRAM. So you can write to FRAM directly instead of sending it out via SPI or I2C, and you don't have to erase anything first (under the hood the processor is doing a bit of that for you). And there's a port of the Arduino IDE for some of these MSP430 parts called Energia, which seems to work pretty well.

I see closing the file after each write recommended a lot here. But that sure increases the work load on the SD card. You have to update the file size in the directory entry, and the FAT table, and the second copy of the FAT table, and all of those always require erasure first.

Sherman, you see yield() a lot in ESP32 code, where you have to yield some time to the chip’s operating system so that it can do WiFi and USB tasks.

You don’t need assembly to understand yield(). Look at the SdFat code. Once it has spent more than a certain time waiting for the card, it calls yield().

MorganS:
Sherman, you see yield() a lot in ESP32 code, where you have to yield some time to the chip's operating system so that it can do WiFi and USB tasks.

You don't need assembly to understand yield(). Look at the SdFat code. Once it has spent more than a certain time waiting for the card, it calls yield().

I don't have ESP32 stuff installed. There is some yield stuff in the ESP8266 core, but it appears to deal only with the WDT. Can you point me to the code you're referring to?

In any case, what does the yield function do when it is called? How does it yield time to the operating system? Where is that code?

On most Arduinos, yield() is a weak function. It is empty and just waiting for you to write a "real" one to do what you need during yields.

It originated on the Arduino Due, with the scheduler library. That yields during any delay() to let other scheduled tasks run. It was such a good idea that it's now on all Arduinos.

Think about servicing the WiFi side of the chip during a delay(). That is what it is for.

OK, there is a lot of information in the previous posts that I need to digest, but first let me address the change in SdCard.

Per SpaceJohn, I changed to a different card I had which is a Kingston 8GB SDC10/8GB. I formatted the card with the sdcard.org program and then ran the SdInfo.ino program. The init time is substantially lower (from 76ms to 3ms).

init time: 3 ms

Card type: SDHC

Manufacturer ID: 0X41
OEM ID: 42
Product: SD8GB
Version: 3.0
Serial number: 0XC3045500
Manufacturing date: 1/2015

cardSize: 7969.18 MB (MB = 1,000,000 bytes)
flashEraseSize: 128 blocks
eraseSingleBlock: true
OCR: 0XC0FF8000

SD Partition Table
part,boot,type,start,length
1,0X0,0XB,8192,15556608
2,0X0,0X0,0,0
3,0X0,0X0,0,0
4,0X0,0X0,0,0

Volume is FAT32
blocksPerCluster: 64
clusterCount: 242944
freeClusters: 242941
freeSpace: 7960.69 MB (MB = 1,000,000 bytes)
fatStartBlock: 12586
fatCount: 2
blocksPerFat: 1899
rootDirStart: 2
dataStartBlock: 16384

The +200ms blips in my data are now in the 100ms range. I did a few short tests and the results were as follows:

  • 33,000 samples (10 samples greater than 50-ms)
  • 7,100 samples (1 sample greater than 50-ms)
  • 8,700 samples (0 samples greater than 50-ms)

Is there an opportunity to improve my current code in the _writeState with respect to checking the length of the buffer?

if (buffer.length() >= 312) {

As I mentioned before, it is maxing out at 463. I have noticed that the program performs better (fewer instances of +50ms) when the size is larger, and worse (more instances of +50ms) when it is smaller.
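One way to reason about that observation is to work out how often the threshold triggers a write and flush. The helpers below are only a back-of-the-envelope sketch with invented names, and they assume every record has the same length. For example, a six-digit timestamp and two three-digit readings give "123456,512,512\r\n", which is 16 bytes.

```cpp
#include <cstdint>

// Records needed before the buffer length first reaches `threshold` bytes,
// assuming every record is `recordLen` bytes (recordLen must be > 0).
std::uint32_t recordsPerFlush(std::uint32_t threshold, std::uint32_t recordLen) {
  return (threshold + recordLen - 1) / recordLen;  // ceiling division
}

// Milliseconds between flushes when one record is added every `intervalMs`.
std::uint32_t flushPeriodMs(std::uint32_t threshold, std::uint32_t recordLen,
                            std::uint32_t intervalMs) {
  return recordsPerFlush(threshold, recordLen) * intervalMs;
}
```

Under those assumptions, a threshold of 312 is reached after 20 records, so the card is written and flushed about once per second at 20 Hz. A larger threshold means fewer flushes per minute and therefore fewer chances to hit a slow card operation, which would be consistent with the behaviour described above.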

A general question: am I exceeding the capabilities of an FSM in this scenario? I bet there are endless ways of writing this program, and ultimately this was an exercise for me in learning FSMs. Before I go buying FRAM breakouts and spending money on hardware, am I at a point with this program where I should be writing binary data, or is it still reasonable to expect readable data at 20 Hz?

I will keep reading the previous posts so I can digest the information some more. Thank you everyone for your input; this has been a great help so far.

What FSM? Your code has no states.

Yes there is a variable called state but it is not being used in a FSM.

MorganS:
What FSM? Your code has no states.

Yes there is a variable called state but it is not being used in a FSM.

Maybe phrasing the question this way would be better.

Am I limited with a switch case structure?

I don't know enough about C++ to comment on your switch case code. But I would be curious about what would happen if you did away with the buffer entirely and simply wrote the successive values to the file, and let SdFat handle the buffering. And only flush and close the file when you want to stop logging.