SDfat -> Can't initialize with sd.begin()

Hello there,

I have a problem with my new Arduino MKRZero and the SDfat library.

The MKRZero has an integrated SD card reader, and I would like to read data from the SD card as efficiently as possible.

The reason: during the reading I would like to perform several tasks (Bluetooth communication, stepping motor control, etc).

After some reading, I decided to read the data through the SdFat library. That is where the problem is: the library always reports that the card could not be initialized.

I use the following code parts:

#include <SPI.h>
#include "SdFat.h"
SdFat sd;
void setup() {
  Serial.println(" - Init SD Card");
  // sd.begin() takes the chip-select pin of the card reader.
  if (!sd.begin(SDCARD_SS_PIN)) {
    Serial.println("   failed");
    return;
  }
  Serial.println("   done");
}

I think that the sd.begin() parameter in SdFat is the slave-select pin (as in the documentation), but the SDCARD_SS_PIN variable only works with the default SD library. I have already tested SS and no parameter at all, with the same result: I always get the message "failed". The examples from the SdFat library always fail as well.

The standard SD library works and reaches a read rate of 70.55 kb / sec (382 kb file).

Does anyone have an idea what it could be? :frowning:

The MKRZero has a second dedicated SPI bus connected to the SD card. SdFat currently only supports the first SPI bus on SAMD.

I don't currently have an MKRZero to test support for the on-board SD.

Oh, that’s a pity. So there are only two possibilities.

  1. Use the internal card reader and the SD library
  2. Use an external card reader and the SDfat library

Do you know a way to read quickly with the SD library? In the meantime I have worked with undocumented classes (Sd2Card, SdVolume, SdFile) and could speed up reading to 199 kb/sec with “block” reading (simplified code):

#include <SD.h>
Sd2Card card;
SdVolume volume;
SdFile root;
SdFile testFile;
//Init card, init Volume
//open root
byte byteBuffer[10000];
int  byteBufferLength = sizeof(byteBuffer);
testFile.open(root, fileNameChars, O_READ);
while(...){ byteBufferReadCount = testFile.read(byteBuffer, byteBufferLength); }

However, this way I have no convenience functions, for example to check whether a file exists. I would have to write my own.

But maybe you know a better way to read faster than the 70.55 kb/sec of the default SD functions.

I can't suggest anything to speed up SD. I wrote the base code in 2009.

/* Arduino Sd2Card Library
 * Copyright (C) 2009 by William Greiman

/* Arduino SdFat Library
 * Copyright (C) 2009 by William Greiman

I think it will be easy to modify SdFat to use SPI1 for MKRZero. You need to edit the SdSpiLibDriver class in SdSpiDriver.h

Place this define before the class:

#ifndef SDCARD_SPI
#endif  // SDCARD_SPI

Then replace SPI with SDCARD_SPI in the SdSpiLibDriver class.

I think these lines need to be changed.

Line 43:     SPI.beginTransaction(m_spiSettings);
 Line 47:     SPI.endTransaction();
 Line 57:     SPI.begin();
 Line 64:     return SPI.transfer( 0XFF);
 Line 75:       buf[i] = SPI.transfer(0XFF);
 Line 84:     SPI.transfer(data);
 Line 93:       SPI.transfer(buf[i]);

To this:

Line 43:     SDCARD_SPI.beginTransaction(m_spiSettings);
 Line 47:     SDCARD_SPI.endTransaction();
 Line 57:     SDCARD_SPI.begin();
 Line 64:     return SDCARD_SPI.transfer( 0XFF);
 Line 75:       buf[i] = SDCARD_SPI.transfer(0XFF);
 Line 84:     SDCARD_SPI.transfer(data);
 Line 93:       SDCARD_SPI.transfer(buf[i]);
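
The substitution above can be illustrated on a desktop compiler with mock bus objects. This is only a sketch of the pattern: MockSpi and sdReceive are hypothetical names, and the fallback to the default SPI object when SDCARD_SPI is undefined is an assumption, not something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Mock stand-in for an Arduino SPI bus object; it only counts transfers.
struct MockSpi {
  int transfers = 0;
  uint8_t transfer(uint8_t data) {
    ++transfers;
    return data;  // echo the byte back, enough for this illustration
  }
};

MockSpi SPI;   // stands in for the shared SPI bus
MockSpi SPI1;  // stands in for a dedicated SD card bus

// Assumed fallback: if the board variant does not define SDCARD_SPI,
// the driver keeps using the default bus.
#ifndef SDCARD_SPI
#define SDCARD_SPI SPI
#endif  // SDCARD_SPI

// Driver code refers only to SDCARD_SPI, never to SPI directly.
uint8_t sdReceive() { return SDCARD_SPI.transfer(0xFF); }
```

Compiling the same file with -DSDCARD_SPI=SPI1 would route sdReceive() to the second mock bus without touching the driver code.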

You should then use the SdFatEX class which is very fast with dedicated SPI.

You need to set this in SdFatConfig.h:

#define ENABLE_EXTENDED_TRANSFER_CLASS 1

Here is the bench example with a Zero board:

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency

Starting read test, please wait.

read speed and latency

The Zero is slower than a UNO for bench so I put a scope on SPI clock for this test.

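#include <SPI.h>
void setup() {
  SPI.begin();
  SPI.beginTransaction(SPISettings(50000000, MSBFIRST, SPI_MODE0));
  // Assumed loop body: send dummy bytes back to back so the byte timing shows on the scope.
  while (true) {
    SPI.transfer(0XFF);
  }
}
void loop() {}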

See the attached file, samd.png. SPI clock is 12 MHz but there is a huge gap between bytes. It takes 1.8 µs to transfer a byte.

See uno.png. SPI clock is 8 MHz but the time to transfer a byte is about 1.4 µs.

The SAMD controller is slow and buggy. From the SAMD driver, 12 MHz is the max clock.

  // Even if not specified on the datasheet, the SAMD21G18A MCU
  // doesn't operate correctly with clock dividers lower than 4.
  // This allows a theoretical maximum SPI clock speed of 12Mhz

First of all, many thanks for all the work you have done. The data on the different processors and signal timings is really interesting. I would not have thought that there are such differences. I will bookmark the thread and, if necessary, edit the SdSpiLibDriver.

But these changes are challenging for me, even if they are, as you write, small adjustments.

However, I now have a new SD card reader (Amazon, 3 € :slight_smile: ) and have tested with it.

Interestingly, the SD library ONLY works with the internal card reader.
As already described, the SdFat library (by default) works only with the external card reader.
So I could not run my tests of the two libraries and their classes on the same hardware. But the results are very interesting!

So I tested the different classes with the same data (381909 bytes). Here are the results:

| Library / class | File reading | Performance | SD card reader |
|---|---|---|---|
| SD library (File class) | single-byte read in a while(file.available()) loop | 70.55 kb/sec | internal |
| SD library (File class) | block read, 1000 bytes per call | 153.28 kb/sec | internal |
| SD library (SdFile + Sd2Card + SdVolume) | block read, 1000 bytes per call | 183.50 kb/sec | internal |
| SdFat library (File class) | single-byte read in a while(file.available()) loop | 86.78 kb/sec | external |
| SdFat library (File class) | block read, 1000 bytes per call | 214.34 kb/sec | external |
| SdFat library (SdFile class) | single-byte read in a while(file.available()) loop | 90.21 kb/sec | external |
| SdFat library (SdFile class) | block read, 1000 bytes per call | 214.11 kb/sec | external |

So my choice is the following combination: the SdFat library with the external (cheap) card reader and block-wise reading into a buffer.

In my application the data should not be read completely and then processed at the end. I need to load the data in small pieces and process them in real time. I have at most 45 microseconds to read and process.
Therefore, I tested reading small blocks, and I would not have thought this possible:

| Data length | Duration | Performance |
|---|---|---|
| ----- | 2 µs | |
| 0 bytes | 4 µs | |
| 1 byte | 8 µs | 122 kb/sec |
| 2 bytes | 8 µs | 244 kb/sec |
| 4 bytes | 9 µs | 434 kb/sec |
| 6 bytes | 9 µs | 651 kb/sec |
| 8 bytes | 9 µs | 868 kb/sec |
| 10 bytes | 10 µs | 977 kb/sec |
| 12 bytes | 10 µs | 1172 kb/sec |
| 14 bytes | 11 µs | 1243 kb/sec |
| 16 bytes | 11 µs | 1420 kb/sec |
| 18 bytes | 11 µs | 1598 kb/sec |
| 20 bytes | 12 µs | 1628 kb/sec |
| 30 bytes | 13 µs | 2254 kb/sec |
| 40 bytes | 16 µs | 2441 kb/sec |
| 50 bytes | 18 µs | 2713 kb/sec |
| 100 bytes | 27 µs | 3617 kb/sec |
| 300 bytes | 64 µs | 4578 kb/sec |
| 500 bytes | 102 µs | 4787 kb/sec |
| 512 bytes | 104 µs | 4808 kb/sec |
| 550 bytes | 2497 µs | 215 kb/sec |
| 600 bytes | 2511 µs | 233 kb/sec |
| 750 bytes | 2535 µs | 289 kb/sec |
| 1000 bytes | 2584 µs | 378 kb/sec |

Unbelievable that the performance with 512 bytes or less is so high!

I cannot explain why this is so, but I think there is a 512-byte buffer implemented somewhere (for example, on the SD card reader).

Because I have at most 45 microseconds to load at least 2 bytes and process them, I will probably load about 20-50 bytes as a block (12-18 microseconds). That leaves enough time in the following loops to take care of the Bluetooth communication, the stepper motors, and so on.
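
For anyone checking the numbers: the kb/sec column appears to be bytes per second divided by 1024. That is an inference from the figures, not something stated above. A minimal sketch of the conversion, with kibPerSec as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Throughput in KiB/s for `bytes` transferred in `micros` microseconds.
double kibPerSec(uint32_t bytes, uint32_t micros) {
  assert(micros > 0);                          // avoid division by zero
  double bytesPerSec = bytes * 1e6 / micros;   // microseconds -> seconds
  return bytesPerSec / 1024.0;                 // bytes -> KiB
}
```

With these inputs, 512 bytes in 104 µs gives about 4808 KiB/s and 550 bytes in 2497 µs gives about 215 KiB/s, which agrees with the measured rows.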

Thanks again and Greetings, DrDooom. :slight_smile:


Yes there is a 512 byte cache so when data is in the cache, read is very fast.

When the cache needs to be filled, read of even one byte can take a long time.

Sometimes two 512-byte sectors need to be read: one from the FAT to find the next cluster in the file, plus the data sector.

The scope measurements of the Arduino SAMD SPI driver show that minimum time to read a sector is about one millisecond. This is just the time to transfer the data over the SPI bus. Reading two sectors will take at least two milliseconds.

The bench example for the new SdFat shows that the maximum latency to fill the cache is 2609 microseconds. The minimum latency to fill the cache is 1067 microseconds.

Your program must function with an occasional read latency of at least 2.6 milliseconds.
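
The effect of the cache on sequential reads can be shown with a small desktop model. This is a deliberately simplified sketch, not SdFat's actual implementation: it assumes a single 512-byte sector cache and ignores the extra FAT sector reads mentioned above. SectorCacheModel is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kSectorSize = 512;

// Simplified model: one cached sector, sequential reads only.
struct SectorCacheModel {
  uint32_t position = 0;      // current file offset in bytes
  int64_t cachedSector = -1;  // index of the sector held in the cache

  // Simulate reading `count` bytes from the current position.
  // Returns how many sector fills (slow SPI transfers) this call needed.
  uint32_t read(uint32_t count) {
    uint32_t fills = 0;
    for (uint32_t i = 0; i < count; ++i, ++position) {
      int64_t sector = static_cast<int64_t>(position / kSectorSize);
      if (sector != cachedSector) {  // cache miss: fetch a whole sector
        cachedSector = sector;
        ++fills;
      }
    }
    return fills;
  }
};
```

In this model a 32-byte read needs a fill only once every 16 calls, while every 550-byte read crosses at least one sector boundary and so always pays for at least one fill. That matches the jump in the measurements above between 512 and 550 bytes.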

First, I looked at the current version of SD.h and there is an undocumented call to begin() that will speed up reads about as much as your custom use of hidden classes.

The call is

boolean begin(uint32_t clock, uint8_t csPin);

Just use a very large value for clock and the maximum for your board will be used. Try this.

SD.begin(25000000, SDCARD_SS_PIN);

I get this result for a 550 byte transfer with SD.h

File size 5MB
Buffer size 550 bytes

Read 322.74 KB/sec
Maximum latency: 4383 usec, Minimum Latency: 1465 usec, Avg Latency: 1701 usec

I was able to get an MKRZero for testing. I did the mods to the driver and did a test with 32 byte transfers.

I have updated SdFat on github to support MKRZero and any other cards with a built-in SD using a dedicated SPI controller. You can download it from github and the Arduino library manager will have it soon. Be sure to use the SdFatEX class with the MKRZero for best results.

Here are the results for read using the SdFatEX class for dedicated SPI on MKRZero.

File size 5 MB
Buffer size 32 bytes

read speed and latency

The average speed for reading a 5 MB file is 385 KB/sec. The average read latency is 80 microseconds. Most reads have a latency of 13 microseconds which appears to be 2.46 MB/sec for 32 bytes. Read latency can be as great as 2737 microseconds which would be 11.7 KB/sec.

You must be prepared for occasional long read latency.
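
The figures above can be cross-checked with a little arithmetic. The sketch below assumes that 15 of every 16 reads of 32 bytes are cache hits at 13 µs and that one is a miss at the 1067 µs minimum fill latency reported earlier. That split is an assumption derived from 512 / 32 = 16, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Weighted average latency for a mix of fast (hit) and slow (miss) reads.
double averageLatencyMicros(uint32_t hits, double hitMicros,
                            uint32_t misses, double missMicros) {
  assert(hits + misses > 0);
  return (hits * hitMicros + misses * missMicros) / (hits + misses);
}

// Throughput in decimal KB/s (1 KB = 1000 bytes) for one read.
double kbPerSec(uint32_t bytes, double micros) {
  assert(micros > 0.0);
  return bytes * 1000.0 / micros;  // bytes/µs * 1e6 / 1000
}
```

Under these assumptions the average latency comes out at roughly 79 µs, close to the reported 80 µs. A 32-byte read in 13 µs corresponds to about 2.46 MB/sec, and one in 2737 µs to about 11.7 KB/sec.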

Thanks again fat16lib,

I definitely have to test SdFatEX :slight_smile: But first of all I need to check the new version of SdFat with the internal SD card reader.
Via the Library Manager I could install version 1.0.3. I think this is right, even though you wrote (by PM) that 1.0.2 would be the newest.

Well, I downloaded that version and tested it with the internal card reader.
The result: great, it works, and the performance is equal to the external reader: 214.48 kb/sec, reading 381909 bytes with a 1000-byte buffer.

But since I updated to the new version, I cannot use the external reader. Do I have to change my SD initialization?

while(!sd.begin(PIN_EXT_SD_CS)){ Serial.println("   failed"); delay(1000); } Serial.println("   done");//SDfat -> ext. reader -> CS Pin 7

I do not necessarily need it, since I can use the internal reader. But it would still be interesting to know whether there is a way to select the first or second SPI bus. From the test I just performed, I suspect the performance is identical. By the way, where is the SdFatEX class included? I could not find it in the Arduino Library Manager, and I could not find it here either: GitHub - greiman/SdFat-beta: Beta SdFat for test of new features

EDIT: got it!

I think you could force use of the shared SPI controller by editing SdFatConfig.h and adding this:


In the future I will add an option to select the SPI controller.

For STM32 I use an argument in the SdFat constructor:

SdFat sd1(1);  // use the first SPI controller.

SdFat sd2(2);  // use the second SPI controller.

The other option is an added parameter to begin().

Hello fat16lib,

Hmm... I think I need your help / opinion once again.

Defining the class SdFatEX instead of SdFat was no problem:

#include "SdFat.h"
//SdFat sd;       //SdFatConfig.h  ->  #define ENABLE_EXTENDED_TRANSFER_CLASS 0
SdFatEX  sd;     //SdFatConfig.h  ->  #define ENABLE_EXTENDED_TRANSFER_CLASS 1

Then I checked the available classes.
My previous test with SdFile + Sd2Card + SdVolume does not work (as expected). The classes SdFile + SdVolume do not exist. So I initialized as follows:

while(!sd.begin(PIN_INT_SD_CS)){ Serial.println("   sd failed"); delay(1000); } Serial.println("   done");//SdFatEx

It does not work. I always get "false".

So I looked at the source code and initialized the class Sd2Card:

Sd2Card  sd2Card;
while(!sd2Card.begin(PIN_INT_SD_CS)){ Serial.println("   sd2card failed"); delay(1000); } Serial.println("   done");//SdFatEx

That works without problems. I think the class Sd2Card extends the class SdSpiCard!?

So I read through the class SdSpiCard and simply read the size of the SD card.

uint32_t cardSize = sd2Card.cardSize();//n * 512 byte data blocks
cardSize = cardSize / 2 / 1024;//Blocks -> kb -> mb
Serial.println("Card Size: " + (String)cardSize + " MB");

Card Size: 1910 MB
Test Duration: 85 microsec.

It runs as desired! But I think the class SdSpiCard works at a really low level with the SD card. There are no convenience functions, e.g. for simply reading blocks of bytes. Is that correct?

I have found functions like this:

bool readBlock(uint32_t lba, uint8_t* dst);

But first I would need to identify the logical 512-byte block of a file on the SD card.
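
For a file that is stored contiguously, that mapping is plain arithmetic. The sketch below assumes the first block number of the file is already known; bgnBlock, BlockLocation and locate are hypothetical names, and a fragmented file would need a FAT lookup instead.

```cpp
#include <cassert>
#include <cstdint>

// Where a byte of a contiguous file lives on the card.
struct BlockLocation {
  uint32_t lba;     // logical block address of the 512-byte block
  uint16_t offset;  // byte offset inside that block
};

// Map a file offset to a block, assuming the file occupies consecutive
// 512-byte blocks starting at bgnBlock.
BlockLocation locate(uint32_t bgnBlock, uint32_t fileOffset) {
  BlockLocation loc;
  loc.lba = bgnBlock + fileOffset / 512;
  loc.offset = static_cast<uint16_t>(fileOffset % 512);
  return loc;
}
```

For example, with a file starting at block 1000, byte 550 of the file is byte 38 of block 1001.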

I think that would be too much for my project, so I would limit myself to the SdFat class.

I love getting deeper into techniques / code / performance etc., but that would take me too much time.

It is clear that convenience functions have a cost, so you have to get closer to low-level programming. But that would be too low-level for me (I think). :wink:

Since I want to load the data in small pieces and can't wait ~2 ms, I tried to improve the performance (for my use case) of reading the next 512 bytes.

The last problem:
contentBytesCount += file.read(byteBuffer, byteBufferLength);
//Doing some other stuff until we need more data!
contentBytesCount += file.read(byteBuffer, byteBufferLength);

Duration Block 1: 104 microsec.
Duration Block 2: 2178 microsec.
As I have already seen.

The problem: when I need more data, I have to wait 2074 + 104 microseconds. So I tried to 'inform' the SD card to load the next block and be prepared for reading.

1. Try: Peek one byte!
contentBytesCount += file.read(byteBuffer, byteBufferLength);
file.peek();
contentBytesCount += file.read(byteBuffer, byteBufferLength);

Duration Block 1: 104 microsec.
Duration Peek : 2183 microsec.
Duration Block 2: 104 microsec.
Sure, it is still a read operation, even if peek() resets the position to the previous one.

2. Try: Seek
contentBytesCount += file.read(byteBuffer, byteBufferLength);
// ... seek ...
contentBytesCount += file.read(byteBuffer, byteBufferLength);

Duration Block 1: 104 microsec.
Duration Seek : 6 microsec.
Duration Block 2: 2178 microsec.
OK, it is simple positioning. It does not request anything.

3. Try: Available
contentBytesCount += file.read(byteBuffer, byteBufferLength);
file.available();
contentBytesCount += file.read(byteBuffer, byteBufferLength);

Duration Block 1 : 107 microsec.
Duration Available: 2 microsec.
Duration Block 2 : 2178 microsec.
It is a simple calculation from position and size. Nothing more.

My question: do you know of an async call to preload one block for really fast reading? I found a function for writing all data (async?) to a file system object:
bool FatFile::sync()

I found the function

cache_t* read(uint32_t lbn, uint8_t option);

in the class FatCache (file FatVolume.h). The description is:

Read a block into the cache.

Is it possible to use this function to simply tell the SD card to preload without blocking the Arduino?

I warned you earlier that read latency could be 2-3 ms.

The average speed for reading a 5 MB file is 385 KB/sec. The average read latency is 80 microseconds. Most reads have a latency of 13 microseconds which appears to be 2.46 MB/sec for 32 bytes. Read latency can be as great as 2737 microseconds which would be 11.7 KB/sec.

You must be prepared for occasional long read latency.

Is it possible to use this function to simply tell the SD card to preload without blocking the Arduino?

There is no SD command to preload data. Plus the MKRZero has very slow SPI and the entire 512 byte sector must be transferred.

I also warned:

Sometimes two 512-byte sectors need to be read: one from the FAT to find the next cluster in the file, plus the data sector.

SD cards are designed to be used with lots of host buffering in a multi-threaded environment. A read thread can read ahead to fill buffers. Also DMA helps so the read/write thread mostly sleeps.
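
The host-buffering idea can be sketched as a small ring buffer that is refilled in sector-sized chunks whenever there is time and drained in small pieces by the time-critical code. This is a single-threaded illustration under stated assumptions, not code from SdFat: ReadAheadBuffer is a hypothetical class, and a real version shared with an interrupt or thread would need proper synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fixed-size byte ring buffer for read-ahead.
template <size_t Capacity>
class ReadAheadBuffer {
 public:
  size_t size() const { return count_; }
  size_t freeSpace() const { return Capacity - count_; }

  // Append up to `len` bytes; returns the number actually stored.
  size_t push(const uint8_t* src, size_t len) {
    size_t n = len < freeSpace() ? len : freeSpace();
    for (size_t i = 0; i < n; ++i) {
      data_[(head_ + count_ + i) % Capacity] = src[i];
    }
    count_ += n;
    return n;
  }

  // Remove up to `len` bytes into `dst`; returns the number delivered.
  size_t pop(uint8_t* dst, size_t len) {
    size_t n = len < count_ ? len : count_;
    for (size_t i = 0; i < n; ++i) {
      dst[i] = data_[(head_ + i) % Capacity];
    }
    head_ = (head_ + n) % Capacity;
    count_ -= n;
    return n;
  }

 private:
  uint8_t data_[Capacity] = {};
  size_t head_ = 0;   // index of the oldest byte
  size_t count_ = 0;  // number of bytes currently stored
};
```

A low-priority task would push whole 512-byte sectors whenever freeSpace() allows it, so the slow sector read never happens at the moment the consumer pops its 20-50 bytes.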

I warned you earlier

I did not overlook that part and did not forget it. But I needed to go step by step to understand most of the problems and possibilities. I would like to solve it on the storage side.
If that is not possible, I have to go other ways, e.g. interrupts.

If there is no way to preload a block, I will say thank you again and look into other ways.

Greeting DrDooom...

Interrupts are one way. Another possibility is a RTOS.

I do complex high performance projects using ChibiOS/RT on STM32 boards. ChibiOS does a context switch in a fraction of a microsecond.

I do SD stuff in a low priority thread and stuff like Bluetooth and stepping motors in one or more high priority threads.

I started using RTOSs on micro-controllers in the 1980s for large physics experiments.

The modern approach at labs like CERN is to go back to the old simple Arduino style programs using soft CPUs.

Each hardware board has a large FPGA chip and implements as many soft CPUs as required.

Up to 8 CPU cores, with private code/data memory
Programmed in bare metal C, using standard GCC tool chain
Each core runs a single task in a tight loop.
No caches, no interrupts.

Just like when I started with the first micro-controllers in the 1970s.

.....this is incredible!

:o Oh, I did not see that you had written. I thought you were out! Thank you very much for the info. I had never heard of an RTOS.

Since 1988 I have only worked with "ordinary" computers like the Amiga, PC, etc. An OS sounds like the layer between hardware and program, so I would expect that the microcontroller can no longer be programmed in the conventional way (e.g. with the Arduino IDE).

ChibiOS is available in different versions (RT / NIL) for different requirements. Incredible.
Even if the thread has drifted from the original topic:
Can you run ChibiOS on Arduino?

If I am right, there is an RTOS specifically for Arduino: FreeRTOS

Apparently, you only have to include a few libraries to work with several "software" threads.

This could also totally change my current larger project: an on-board computer (not for cars) with many devices, sensors and one Bluetooth connection.

An RTOS is the perfect solution to the problem. I'll read up on it tomorrow.

Well, I've read about RTOSs. It would solve my problem if I could run two parallel threads relatively independently (without a trigger).

Then I tried FreeRTOS because it is available for Arduino. Unfortunately, the library is incompatible with my MKRZero, since it expects an AVR.

Then I tried the port FreeRTOS-Cortex-M0 (Arduino Zero, not MKRZero). When compiling, including the examples, I get the message that variables (time variables) in the helper file are missing.
I tried to change the variables to existing time variables, without success.

Then I tried the Scheduler library. With it you have to switch between the "tasks" with sleep and yield. That does not get me any further.

Then (again) I tried to minimize the time spent reading from the SD card.
I accessed the SdSpiCard through the SdFat object to read continuously.

SdFat sd;
SdSpiCard *sdSpiCard;
sdSpiCard = sd.card();
uint32_t bgnBlock, endBlock;
uint8_t  buffer[512];
file.contiguousRange(&bgnBlock, &endBlock);

512 bytes are transferred in 1.05 ms continuously.
This speeds up reading (476.19 kb/sec), but has the disadvantage that I can only read complete 512-byte blocks (the Arduino is blocked for ~1.05 ms) and not e.g. 50-byte blocks.

After all, I learned a lot about working with SD cards.

I will probably run into obstacles with interrupts as well, so I'll stop the project.

Thanks again.