I'm working on a project where I have a large array of integers. Using a rotary encoder, I will scroll through those integers (in both directions), outputting one or two at a time on a 7-segment display.
The array is far too large to fit in SRAM (on the Mega), so I store it as a CSV on an SD card, load only a portion into RAM, and load more as needed as I scroll. My question has to do with the size of that portion. If it's small, I'll be accessing the SD card more frequently; if it's large, I won't have to load as often, but each load will take longer.
At the extreme, I could only load the element(s) that I'm currently displaying, and access the SD every time I increment up or down with the rotary encoder.
Does anyone have any experience/thoughts on how to optimize this aspect of my code?
jimLee:
Just get it working any way you can first. Deal with optimizing it after. You'll understand more and maybe you won't need to optimize it at all.
-jim lee
Of course testing will be necessary.
However, I will still need to choose a size for that portion of the array, and it's this size I'm unsure about. I can't avoid picking some number to start with; I'd just like some perspective so I don't set it entirely arbitrarily.
If those values are constant, or if you don't mind reprogramming the whole thing just to make a small change, then why not store them in the board's flash memory? Although, due to compiler limitations, no array of any kind can exceed 32 KB (32767 bytes) in size, so in that case splitting up the lookup table is the only option.
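For instance, a minimal sketch of that approach using PROGMEM (the table contents and the readEntry() helper are placeholders):

```cpp
#include <avr/pgmspace.h>

// Hypothetical lookup table kept in flash instead of SRAM.
// Each such array must stay under the 32767-byte limit.
const int16_t table[] PROGMEM = {
  42, 137, 255, 1023  // ...
};

// Flash isn't directly addressable like SRAM, so each
// element must be fetched explicitly at run time.
int16_t readEntry(uint16_t index) {
  return (int16_t)pgm_read_word(&table[index]);
}
```

On a Mega you have 256 KB of flash to play with, so even a split-up table can be far bigger than anything SRAM could hold.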
Nevertheless, continuing to your original question:
If your concern is about optimal performance on an SD card data access (read), then consider this when deciding a size for your cache:
The library already caches 512 bytes of data, so any read attempt within this boundary will be almost instantaneous (a microsecond at most). Reading outside this boundary (every 512th byte) triggers a cache reload, which takes on the order of a millisecond, since 512 bytes have to be clocked in serially (the exact time depends on the SPI clock frequency).
There's also another key factor inside the filesystem itself: something called the cluster (a.k.a. allocation unit) size. You can find it either by running the CardInfo example, or by creating a non-empty file no larger than 512 bytes and then checking its "size on disk" property in the file explorer of your PC. As with the previously mentioned cache, crossing a cluster boundary triggers another kind of reload; this time it's the slowest one because of how FAT filesystems work. This special process might take a few milliseconds in the worst case; probably not a big deal anyway, but on time-critical events, who knows...
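If you'd rather check the cluster size in code, the stock CardInfo example already does this; here's a trimmed-down version (the chipSelect pin is an assumption, adjust it to your wiring):

```cpp
#include <SPI.h>
#include <SD.h>

const uint8_t chipSelect = 53;  // hardware SS on a Mega; adjust to your wiring

Sd2Card card;
SdVolume volume;

void setup() {
  Serial.begin(9600);
  if (!card.init(SPI_HALF_SPEED, chipSelect) || !volume.init(card)) {
    Serial.println(F("SD init failed"));
    return;
  }
  // A cluster is a whole number of 512-byte blocks.
  Serial.print(F("Cluster size (bytes): "));
  Serial.println((uint32_t)volume.blocksPerCluster() * 512UL);
}

void loop() {}
```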
In summary: reading a byte from an SD card is nearly instantaneous most of the time, occasionally it takes a bit longer, and every so often it takes even longer than that (though not very often).
The first situation occurs when the data is already cached in RAM by the library. The second, when accessing data just outside the library's cache (it simply loads the next data block). The third, when accessing data outside the current filesystem-defined "cluster" (the library queries the FAT for the file's next cluster and then loads the corresponding data block).
Too technical, I know, but bear with me: this is exactly what you're dealing with when deciding that size. A sketch of the windowed-read idea follows.
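Here's a minimal sketch of such a sliding window, assuming you convert the CSV into a file of fixed-size 16-bit binary records (fixed records make seeking trivial; WINDOW, the window[] buffer, and loadWindow() are all names I made up):

```cpp
#include <SPI.h>
#include <SD.h>

const uint16_t WINDOW = 64;   // elements kept in RAM; the size in question
int16_t window[WINDOW];       // RAM slice of the big on-card array
uint32_t windowStart = 0;     // array index corresponding to window[0]

// Load WINDOW elements starting at element 'first' of the file.
bool loadWindow(File &f, uint32_t first) {
  // Fixed-size records: the byte offset is just index * record size.
  if (!f.seek(first * sizeof(int16_t))) return false;
  int got = f.read((uint8_t *)window, sizeof(window));
  windowStart = first;
  return got == (int)sizeof(window);
}
```

When the encoder scrolls past either end of window[], call loadWindow() again around the new position. The RAM cost is WINDOW * 2 bytes, so you can tune WINDOW against the 512-byte cache and the cluster boundaries described above; a window that's a multiple of 512 bytes (256 elements) would cross those boundaries as rarely as possible.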
jimLee:
So it's burning up 512 bytes of RAM from the paltry 2K we have? (For a standard UNO)
Unfortunately yes, but for a good reason.
SD cards are essentially (NAND) flash memory, not an actual EEPROM. Reading data in a random fashion is straightforward; writing it that way is not.
Due to how NAND flash memories write data (an erase-program cycle), the memory has to be divided into blocks called "pages". In an SD card, this "page" size is coincidentally the same as the standard disk sector size (and you guessed it: 512 bytes). NAND flash memories always have to write data in whole "pages", even if only a single byte has to be updated.
And since an erase-program cycle of a flash memory is relatively time-consuming, the library has to keep a cache at least one "page" big. It would be terribly inefficient (not to mention faster-wearing) to trigger this erase-program cycle for every byte you want to put or overwrite in the file.
So in a nutshell: the cache is that big mostly for the write operations; otherwise it could be smaller or even absent entirely (though still at a cost in performance, since data is still transferred serially).
The closest thing an SD card has on its own is a buffer rather than a cache. This buffer only kicks in after a "write command": it's there just to temporarily hold the data (one block/"page") while it's being received (remember, SPI is a serial bus), before it's actually written into the flash memory (i.e., before the erase-program cycle is initiated).
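As a practical consequence, written bytes only reach the card when a full 512-byte block has been filled or when you explicitly force it; a minimal sketch (the file name is a placeholder):

```cpp
#include <SPI.h>
#include <SD.h>

void logValue(int16_t value) {
  File f = SD.open("log.bin", FILE_WRITE);
  if (!f) return;
  // The bytes land in the library's 512-byte RAM cache, not on the card yet.
  f.write((const uint8_t *)&value, sizeof(value));
  // flush() forces the cached block onto the card (one erase-program cycle);
  // close() would flush as well. Flushing after every tiny write is exactly
  // the inefficiency the cache exists to avoid, so do it sparingly.
  f.flush();
  f.close();
}
```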