SdFatEX - up to 10X faster writes

Can you use higher SD write performance? A factor of ten better on Due, and double on Uno.

Modern SD cards have very poor performance on most development boards since they are accessed as if they were simple 512-byte block devices.

I have been experimenting with a new high performance class, SdFatEX. This class uses extended multi-block transfers to get higher performance.

The down side is that a dedicated SPI bus is required since the card must remain selected to maximize the size of multi-block transfers.

This is not a problem on boards like the STM32 Maple Mini, about $5.00 on ebay, since it has two SPI controllers. It is supported by Arduino for STM32. The Maple Mini has 18 MHz SPI, but other STM32 boards have up to 50 MHz SPI and get a 10X improvement.

Here are some results for 512 byte transfers with the SdFat bench example. The card is a SanDisk 16GB Ultra MicroSD.

Due with standard SdFat:

FreeStack: 91587
Type is FAT32
Card size: 15.93 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SL16G
Version: 8.0
Serial number: 0X91203A25
Manufacturing date: 4/2014

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
340.35,92031,1149,1502
342.19,21583,1147,1494

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1697.68,1230,286,300
1698.26,599,287,299

Due SdFatEX:

FreeStack: 91579
Type is FAT32
Card size: 15.93 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SL16G
Version: 8.0
Serial number: 0X91203A25
Manufacturing date: 4/2014

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
4440.21,7298,110,113
4464.00,6885,110,112

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
4616.51,1230,108,109
4620.78,981,108,109

Maple Mini with SdFat:

FreeStack: 11799
Type is FAT32
Card size: 15.93 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SL16G
Version: 8.0
Serial number: 0X91203A25
Manufacturing date: 4/2014

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
310.83,77589,1273,1645
312.95,21471,1256,1634

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1164.07,872,425,438
1164.07,872,425,438

Maple Mini SdFatEX:

FreeStack: 11791
Type is FAT32
Card size: 15.93 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SL16G
Version: 8.0
Serial number: 0X91203A25
Manufacturing date: 4/2014

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1946.92,91744,247,260
2038.19,7978,247,249

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
2053.26,1511,247,248
2053.26,1325,247,248

Uno standard SdFat:

FreeStack: 555
Type is FAT32
Card size: 15.93 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SL16G
Version: 8.0
Serial number: 0X91203A25
Manufacturing date: 4/2014

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
243.59,77036,1740,2095
244.59,18120,1740,2086

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
507.63,2000,980,1002
507.74,1996,980,1002

Uno SdFatEX:

FreeStack: 550
Type is FAT32
Card size: 15.93 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SL16G
Version: 8.0
Serial number: 0X91203A25
Manufacturing date: 4/2014

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
677.83,75824,728,748
684.61,29980,728,741

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
641.89,2368,784,791
641.97,2368,784,791
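To put numbers on the write speedup, here is a small C++ sketch that divides the SdFatEX write speed by the standard SdFat write speed for each board. The values are the first write-test line from each run above. The struct and function names are only illustrative and are not part of SdFat.

```cpp
#include <cstdio>

// Write speeds in KB/sec, copied from the first line of each
// 512-byte write test above.
struct BenchPair {
  const char* board;
  double sdFatKBs;    // standard SdFat
  double sdFatExKBs;  // SdFatEX
};

constexpr BenchPair kWriteResults[] = {
  {"Due",        340.35, 4440.21},
  {"Maple Mini", 310.83, 1946.92},
  {"Uno",        243.59,  677.83},
};

// Ratio of SdFatEX write speed to standard SdFat write speed.
constexpr double speedup(const BenchPair& r) {
  return r.sdFatExKBs / r.sdFatKBs;
}

// Print one line per board with the write speedup factor.
inline void printSpeedups() {
  for (const BenchPair& r : kWriteResults) {
    std::printf("%-10s %.1fx\n", r.board, speedup(r));
  }
}
```

With these numbers the write speedup works out to roughly 13x on the Due, 6.3x on the Maple Mini and 2.8x on the Uno.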

Where would one find the library to download/implement? I checked GitHub, where I pulled your SdFat, and only found SdFat-beta.

thanks

I am testing now and it will be in the next SdFat-beta.

It's taking a bit longer than I planned since I decided to restructure SdFat to simplify configuration options.

The Maple Mini has 18 MHz SPI

FYI - the Maple Mini (and the Blue/Red Pills) has two SPIs: SPI1 @36MHz max and SPI2 @18MHz max. SPI1 @36MHz works fine with SdFat beta and CL10 cards. Here is the MMini @36MHz with SdFat beta and a SanDisk Ultra 16GB CL10 UHS-I:

File size 5 MB
Buffer size 8192 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
3073.26,8747,2532,2659
3052.61,9452,2580,2679
3052.61,10577,2536,2679
3048.88,12224,2537,2681
3013.94,14400,2538,2713
3043.31,15502,2537,2687

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
2997.67,3104,2724,2732
2995.88,3103,2688,2732
2995.88,3112,2687,2732
2997.67,3103,2687,2732
2995.88,3104,2683,2733
2995.88,3106,2686,2732

pito,

Yes, I pointed out in the original post that Maple Mini has two SPI controllers.

I found that not all STM32F103x8/B chips run with SPI1 at 36 MHz. Worse, some fail only occasionally. Check the datasheet: both SPI controllers are rated at 18 MHz max.

Yes, you get high performance with 8 KB writes, but SdFatEX gets very high performance with 512 byte or smaller writes.

Try 512 byte writes on Maple Mini at 36 MHz. A buffer pool with 8 KB buffers on a Maple Mini is not practical.

With the SD I used on Maple Mini at 18 MHz, 512 byte writes were 310 KB/sec SdFat and 2,000 KB/sec SdFatEX.

On a Nucleo STM32F411 SdFatEX gets 3,985 KB/sec with 50 byte writes and 5,279 KB/sec with 512 byte writes.

Even SDIO doesn't help small writes with SdFat or FatFS. You must do massive multi-block writes to take advantage of SDIO at 48 or 50 MHz.

Yes, the 512-byte writes/reads on the MM with SdFat are quite slow, even at 36MHz. An 8kB buffer on the MM is still feasible (it has 20kB of RAM available). But the SD card's write latencies are a much worse issue we have to cope with :) I did a naive exercise with the stm32f407 SDIO in the past and the best results were with 16kB and 32kB buffers. What is the main difference between SdFat and SdFatEX? The speedup is enormous.

Modern SD cards now have a lot of RAM buffering and even understand the FAT structure for a properly formatted card.

There is a large RAM buffer for a single data stream to the data section of the card. In very high end cards there are separate smaller RAM buffers for the two FAT areas and a directory block. (Note this is the programmer's model - who knows how it is really done.)

If you keep the card selected and access the card properly, you get the advantage of this buffering.

I sent an email to you earlier. For others who want to understand this:

You should read the requirements for the latest generation of SD cards. The SD spec has had requirements for max performance for some time but you could get away with not following it.

Look at section 4.13 of the simplified spec. Too bad I can’t give you the full spec. Be sure to look at “4.13.1.3 Write Performance”.

Note in table 4-56 that an AU (Allocation Unit) can be as large as 4MB and you should plan for an RU (Recording Unit) size of 512KB.

Newer high end SD cards have controllers that perform extremely well when you follow the spec, but a new $30 card may not perform as well as a five-year-old class 4 card for small transfers.

I suspect there is major flash wear when the $30 card is used in the current SdFat.

I have been experimenting with extreme access patterns and some cards appear to fail. I think it is because they exhaust their pool of erased flash. They come back to life if you let them rest with power on and apply clock but don’t access data.

My Maple Mini boards won't run reliably with SPI1 clock at 36 MHz but I found a cheap, $2.72 with free shipping, ebay STM32F103C8T6 board that will run at 36 MHz.

Here are the results for 512 byte write/read with SdFatEX using a 20 MB file.

File size 20 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
3779.96,11540,132,134

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
3249.88,1312,155,156

I ran the test a number of times and the write latency is never more than 13 ms.

Here is the block number and latency in usec for cases over 1 ms. SdFatEX uses a second cache block for FAT/directory entries on this board, so the long latencies happen in areas where this cache is written. This involves a write to each copy of the FAT and a read of the new FAT block. There are 128 FAT entries in a block and the cluster size is 64 blocks, so this happens at an interval of 128 x 64 = 8192 blocks.

For example, 16064 - 7872 = 8192.

Wear leveling and erasing flash can cause other long latencies.

block,usec
0,1052
7872,9400
7936,5490
16064,11540
16128,4775
24256,9819
24320,5403
32448,11380
32512,4754
32513,2246
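The 8192-block interval can be checked with a few lines of C++. This is only a sketch of the arithmetic described above, assuming FAT32 with 4-byte FAT entries; the function names are made up for illustration and are not SdFat functions.

```cpp
#include <cstddef>
#include <cstdint>

// FAT32 uses one 4-byte entry per cluster (assumption stated above).
constexpr std::uint32_t kFat32EntryBytes = 4;

// Number of FAT32 entries that fit in one block.
constexpr std::uint32_t fatEntriesPerBlock(std::uint32_t blockBytes) {
  return blockBytes / kFat32EntryBytes;
}

// Number of data blocks described by a single FAT block.
constexpr std::uint32_t dataBlocksPerFatBlock(std::uint32_t blockBytes,
                                              std::uint32_t blocksPerCluster) {
  return fatEntriesPerBlock(blockBytes) * blocksPerCluster;
}

// 512-byte blocks and 64-block clusters, as on the card used here.
static_assert(fatEntriesPerBlock(512) == 128, "128 FAT entries per block");
static_assert(dataBlocksPerFatBlock(512, 64) == 8192, "8192-block interval");

// Blocks from the latency list where the first long write of each
// group occurred.
constexpr std::uint32_t kSlowBlocks[] = {7872, 16064, 24256, 32448};

// True if every consecutive pair in `blocks` is exactly `interval` apart.
inline bool evenlySpaced(const std::uint32_t* blocks, std::size_t count,
                         std::uint32_t interval) {
  for (std::size_t i = 1; i < count; ++i) {
    if (blocks[i] - blocks[i - 1] != interval) return false;
  }
  return true;
}
```

Calling evenlySpaced(kSlowBlocks, 4, dataBlocksPerFatBlock(512, 64)) confirms that the long writes in the list are spaced one FAT block's worth of data apart.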

Here are results for an 18 MHz clock on SPI1.

File size 20 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
2039.13,12739,247,249

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
2041.00,1253,249,249

The list of latencies over 1 ms:

block,usec
0,1152
7872,10843
7936,6104
16064,12739
16128,7068
24256,10673
24320,6113
32448,11658
32512,5251

Pretty excited for this.

Moved to an Adafruit WICED feather to get a powerful chip (STM32F205, 120MHz, multiple hardware SPI) to enable fast datalogging (moved from 16MHz Adafruit Pro Trinket(s), my current go-to controller).

I want to get 500+ samples a second of 8 analog inputs (12-bit ADC, so 6 KB/sec; but 4x that would be great - 24 KB/sec). I'd also like to output some data to 4 OLEDs over SPI (clock of 10 MHz maximum), but first things first.
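As a rough check of those numbers, here is the arithmetic as a C++ sketch. It assumes the samples are packed at exactly 12 bits each; storing each sample in a 16-bit word would need a third more. The function names are only for illustration.

```cpp
// Bytes per second produced by `channels` inputs sampled `samplesPerSec`
// times per second at `bitsPerSample` bits each.
constexpr double bytesPerSecond(double samplesPerSec, int channels,
                                int bitsPerSample) {
  return samplesPerSec * channels * bitsPerSample / 8.0;
}

// Number of 512-byte SD blocks filled per second at a given data rate.
constexpr double blocksPerSecond(double bytesPerSec) {
  return bytesPerSec / 512.0;
}
```

bytesPerSecond(500, 8, 12) gives 6000 bytes/sec and bytesPerSecond(2000, 8, 12) gives 24000 bytes/sec, so even the 4x target fills fewer than 50 blocks per second.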

My other problem is finding a decent micro SD card, but I am checking a SanDisk Ultra in a second. My new Samsung EVO+ cards were pretty average/slow for latency.

I'd move away from Arduino, but I'm such a beginner that I just can't.

thanks for the work!

I have posted a version of SdFat-beta with the fast SdFatEX class.

Hi, does it also improve read performance? Thanks