SdFat for bigger RAMs

Hi, is there an SdFat variant for chips with bigger RAM, e.g. the 1284P (16 kB)? The extra buffering might increase throughput significantly.
P.

Have you tried fat16lib's SD software?
It worked on my Bobuino '1284 boards under Arduino 0022 with SD and uSD cards.
Haven't tried it under 1.0.1 yet.

Yes, of course I run SdFat on the 1284P and others. The question is whether SdFat can use its larger RAM for improved performance.
p.

Oh, I don't know. You'd have to PM fat16lib and ask him.
It might depend on what the SD/uSD card can accommodate too. If the cards only have 512-byte internal buffers, does it help to buffer more in the MCU's SRAM? I don't know.

I have thought about using a multi-block cache, but I don't think it would improve performance much for most applications. The problem is that copying to/from the cache costs so much on the AVR.

The best way to get better performance is to avoid the cache and use raw read/write of blocks with a contiguous file.
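
Roughly, the setup side can look like the old RawWrite example in SdFat; createContiguous, contiguousRange and sd.vwd() are the names from SdFat of that era and may differ in newer releases:

#include <SdFat.h>

SdFat sd;
SdFile file;
uint32_t bgnBlock, endBlock;   // first and last raw SD block of the file's data

void setup() {
  if (!sd.begin(SS, SPI_FULL_SPEED)) { while (1); }                  // init the card at full SPI speed
  // pre-allocate a contiguous file so its data is one unbroken run of 512-byte blocks
  if (!file.createContiguous(sd.vwd(), "RAW.BIN", 512UL * 1000UL)) { while (1); }
  // fetch the raw block range so the blocks can be read/written directly, bypassing the cache
  if (!file.contiguousRange(&bgnBlock, &endBlock)) { while (1); }
}

void loop() {}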

The other big variable is the SD card. Each card's performance varies with the access pattern. SD cards have occasional delays during writes, and the length of those delays depends on the timing and pattern of the writes.

Do you have some specific application in mind or is this just a general question?

It comes from an exercise I did with the F4 Discovery and SDIO, where the FatFS write speedup between 512-byte block transfers and, for example, 4 kB blocks is 1:7 (1:15 with 32 kB). So it might help with streaming ADC data to the card, for example. Of course, the SPI clock frequency could be the bottleneck here.
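
For reference, the comparison was essentially of this form (a minimal FatFS sketch; the three-argument f_mount is the newer FatFS call form, and the file name and sizes are just placeholders):

#include <stdint.h>
#include "ff.h"                                  // FatFS

static FATFS fs;
static uint8_t buf[4096];                        // try 512, 4096 and 32768 here and compare throughput

void write_bench(void) {
  FIL fil;
  UINT bw;
  f_mount(&fs, "", 1);                           // mount the default drive
  f_open(&fil, "bench.bin", FA_CREATE_ALWAYS | FA_WRITE);
  for (uint32_t total = 0; total < (1UL << 20); total += sizeof(buf)) {
    f_write(&fil, buf, sizeof(buf), &bw);        // larger chunks let FatFS issue longer multi-block transfers
  }
  f_close(&fil);
}
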
p

(Attachment: speed.jpg)

You don't need buffering to use the streaming multi-block mode. I do it in my binaryLogger on the 328. You just send a multi-block write command to the SD and then write a block each time a block buffer is full. Finally you send a write end command to the SD.
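
Continuing a RawWrite-style setup (sd, bgnBlock and a pre-allocated contiguous file), the sequence is roughly the sketch below; writeStart/writeData/writeStop are the old Sd2Card raw-write calls, and the ADC-side fill is only hinted at:

// rough sketch: stream BLOCK_COUNT blocks with one multi-block raw write
void logBlocks() {
  const uint32_t BLOCK_COUNT = 1000;             // blocks pre-allocated in the contiguous file
  static uint8_t block[512];                     // filled from the ADC/ISR side
  sd.card()->writeStart(bgnBlock, BLOCK_COUNT);  // single multi-block write command, pre-erasing BLOCK_COUNT blocks
  for (uint32_t i = 0; i < BLOCK_COUNT; i++) {
    // ... wait here until 'block' holds 512 bytes of fresh data ...
    sd.card()->writeData(block);                 // stream one 512-byte block straight to the card, no cache copy
  }
  sd.card()->writeStop();                        // end the multi-block write
}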

The binaryLogger is an example in fastLoggerBeta20110802.zip on Google Code. I will soon post another example that uses this mode to log 100,000 8-bit samples per second from the built-in AVR ADC.

The absolute minimum time to write a block, due to SPI speed alone, is about 520 µs. It takes longer in practice since you must fetch the data, check that the SPI data register is empty, and do loop control. The current SdFat block write function takes about 820 µs to write a block, or about 620 KB per second.
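
Spelling out the arithmetic: 512 bytes × 8 bits / 8 MHz = 512 µs just to clock the data out, and the start token, CRC and data-response bytes add a few more microseconds, hence the ~520 µs floor; at the measured ~820 µs per block, 512 bytes / 820 µs ≈ 0.62 MB/s.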

That is the maximum rate for streaming in multi-block raw write mode. It might be possible to improve on it a bit with some clever assembly code in the loop. Currently the loop is optimized to send two bytes per iteration.
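
The idea is roughly this (an illustrative sketch of the pattern on the AVR hardware SPI registers, not the actual SdFat code):

#include <avr/io.h>

// illustrative sketch only: push a 512-byte block out over the hardware SPI, two bytes per loop iteration
static void spiSendBlock(const uint8_t* buf) {
  for (uint16_t i = 0; i < 512; i += 2) {
    SPDR = buf[i];                       // start shifting the first byte of the pair
    uint8_t b = buf[i + 1];              // fetch the second byte while the SPI is busy
    while (!(SPSR & (1 << SPIF))) {}     // wait for the first byte to finish
    SPDR = b;                            // start the second byte
    while (!(SPSR & (1 << SPIF))) {}     // wait, then pay the loop overhead only once per pair
  }
}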

In a practical application like my fast ADC logger, which uses this mode, the main overhead is the ISR load for the ADC. At 100,000 samples per second that's two interrupts every 10 µs: one to clear the timer flag, which starts the next conversion, and a conversion-done interrupt to read the data. The SD write is a small part of the overhead. The write needs to be reliable, with no random delays, and streaming raw write provides that. On a Mega I use thirteen 512-byte buffers to increase reliability. This means a write can take as much as 65,000 µs and still not lose data.
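
The buffer math: at 100,000 one-byte samples per second a 512-byte block fills every 5.12 ms, and during a 65 ms write stall the ADC produces about 6,500 bytes, which just fits in the thirteen 512-byte (6,656-byte) buffers.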

For normal file operations, extra cache won't pay off on the Arduino. The AVR is at most a few percent as fast as the STM32F4 for data handling. SDIO is a 4-bit bus and much faster, so the Arduino is hopelessly outclassed there as well.

I did a test writing the same block in a tight loop 10,000 times, which makes a 5,120,000 byte file. It took 8223 ms, which is 623 KB/sec.

If you are using 4-bit SDIO at 25 MHz, your bit rate is 12.5 times as fast as the Arduino's 8 MHz SPI bus.

If rates scaled with bus bit rate, you would get 7.78 MB/sec on the STM32F4. Of course it is unreasonable to expect that kind of scaling, but the Arduino is not doing too badly.

The STM32F4 scores 501 in the EEMBC CoreMark benchmark.

AVR processors do about 0.54 CoreMark/MHz, so a 16 MHz Arduino would score about 8.64.

That's a factor of roughly 58 more powerful!!
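
For the record: 4 bits × 25 MHz = 100 Mbit/s versus 8 Mbit/s SPI gives the 12.5× bus ratio, 623 KB/s × 12.5 ≈ 7.8 MB/s, and 501 / (0.54 × 16) ≈ 58.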

On a Mega I use thirteen 512-byte buffers to increase reliability.
This means a write can take as much as 65,000 µs and still not lose data.

This may help with the 1284P as well (16 kB RAM).
p.

You can use as much memory as you want with the fast logger examples. It's just a constant. I also use the SdFat cache buffer, so you get one additional block buffer. That's why the total is 13.

#if defined(__AVR_ATmega1280__) || defined(__AVR_ATmega2560__)
// Mega - use a total of 13 512-byte buffers
const uint8_t BUFFER_BLOCK_COUNT = 12;
#else
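
For the 1284P a branch could be added ahead of the #else along these lines (the ATmega1284P macro is the standard avr-gcc define, but the count of 24 is only an illustration, not a value from the example; 24 × 512 bytes is 12 kB of the 16 kB SRAM, leaving room for the SdFat cache block, the stack and other variables):

#elif defined(__AVR_ATmega1284P__)
// 1284P - hypothetical: total of 25 512-byte buffers (24 here plus the SdFat cache block)
const uint8_t BUFFER_BLOCK_COUNT = 24;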