
Topic: Compare logged data on SD card to current reading - £25 for working answer  (Read 894 times)

wvmarle

That snippet won't work as you don't actually open any file.

Code:

File Nautilus = SD.open("Nautilus.txt"); // The file with your data in it.
uint32_t fileSize = Nautilus.size(); // The total file size.
Nautilus.seek(fileSize - 52);

Serial.print(Nautilus.size());
  //  Serial.println("---");


This will work a lot better.
Quality of answers is related to the quality of questions. Good questions will get good answers. Useless answers are a sign of a poor question.

cedarlakeinstruments

So, after working with GadgetCanyon on this for a while it turns out to be a deceptively complex project.

I originally said I'd do it because at first glance it seemed like a simple 10-minute project, and on a PC recording continuously, it would be. However, after asking a few questions about how it would be used, the complexity became evident.

If this were continuous data capture on a PC, it would be pretty much dead simple. You have a stream of data being logged every 10 seconds, so after 24 hours of logging (8,640 records) you can maintain a pointer that always trails 24 hours behind, read the record at that point in the stream, and compare it to the current reading. The memory usage on a modern PC would be trivial. In and out in 15 minutes!
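On a PC that trailing pointer is just index arithmetic. A minimal sketch in plain C++ (the names are mine, not from the actual project):

```cpp
// With continuous 10-second logging, the record from 24 hours ago is
// always exactly 8,640 records behind the newest one.
const long RECORDS_PER_DAY = 24L * 60L * 60L / 10L;   // = 8640

// Index of the record logged 24 hours before the record at currentIndex,
// or -1 while there isn't a full day of history yet.
long dayOldIndex(long currentIndex) {
    if (currentIndex < RECORDS_PER_DAY) return -1;
    return currentIndex - RECORDS_PER_DAY;
}
```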

But this is an Arduino: we can't keep a full day's data in RAM (8,640 records at ~26 bytes each is over 200 KB), and the file manipulation libraries are not as fully implemented as we'd like. The easy way to do this would be to follow the concept above: keep a pointer into the SD card data and index it forward as needed. Even on a PC this is a bit tricky when working with files, as opposed to memory streams. Due to how the FILE* methods work, I ended up having to open the file in write mode, save the data, close the file, and then re-open it in read mode, because of limitations on how seek() and the related API calls behave.
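A bare-bones cstdio illustration of that close-and-reopen pattern (the file name and record text are made up for the example):

```cpp
#include <cstdio>
#include <cstring>
#include <cstddef>

// Append one record, then reopen the file read-only to read it back.
// Mixing writes and arbitrary seeks in a single stream is where things
// get awkward, so the file is closed and reopened between the phases.
bool appendThenRead(const char* path, const char* record,
                    char* out, size_t outLen) {
    FILE* f = fopen(path, "a");          // write phase
    if (!f) return false;
    fputs(record, f);
    fclose(f);                           // flush and release before reading

    f = fopen(path, "r");                // read phase
    if (!f) return false;
    bool ok = fgets(out, (int)outLen, f) != nullptr;
    fclose(f);
    return ok;
}
```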

Even so, this doesn't meet all the requirements. One of the problems is that the logger can be turned on and off at times, so you're not guaranteed that there are 8,640 records between now and 24 hours ago. Therefore the only thing that will work properly, short of changing the logging to use Unix epoch time or getting involved in complex calendar calculations (what happens if yesterday was February 29?), is searching through the data. Easiest is searching from the beginning, but that is O(n) in the file size. So ideally we want to search backwards from the end, since then the maximum search distance is bounded regardless of file length.
I should probably have done that, since in retrospect it would have been simpler, but hindsight and all that...
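That backward scan is easy to sketch, assuming fixed-length records with monotonically increasing timestamps. A vector of parsed timestamps stands in for the file here; the names are illustrative:

```cpp
#include <vector>
#include <cstdint>

// Scan backwards from the end of the log until we reach the record whose
// timestamp is at or before (now - 24 h). Because at most one day of
// records can sit in that window, the scan stays bounded no matter how
// large the file grows.
long findDayOldBackwards(const std::vector<uint32_t>& stamps, uint32_t now) {
    long long target = (long long)now - 24LL * 3600LL;
    for (long i = (long)stamps.size() - 1; i >= 0; --i) {
        if ((long long)stamps[i] <= target) return i;  // first hit from the end
    }
    return -1;                                         // log younger than 24 h
}
```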
 

@GadgetCanyon: As I mentioned, I wasn't able to test it as much as I'd like, so let me know if you have any issues integrating the code I sent into your main sketch.
Electronics and firmware/software design and assistance. No project too small

wvmarle

In a spreadsheet application the date/time may be shown in a human-readable format in the cells, but the underlying storage is a plain number (Excel, for example, counts days since 1900), as that's easy for a computer to work with. It's just much better hidden from the user than when working with an Arduino - and that's part of the fun of working with such microcontrollers.

For your search problem, you can of course just start searching line by line from either end of the file until you get to the one you need, but it's not exactly efficient.

If there is indeed the possibility of a gap in the data, what will work a lot more efficiently is to seek straight to where the record from 8,640 records ago should be. If the logger has been running for the past 24 hours you'll see the exact time stamp you expect, and you're done searching. If not, take the difference between the time stamp you found and the one you expected, calculate how many records you are off, adjust your seek position, and try again. Within a few iterations you'll have the required record.
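A sketch of that guess-and-correct search in plain C++. The names are invented; a vector of timestamps stands in for fixed-length records addressed by seek position, and an iteration cap guards against ping-ponging back and forth across a gap:

```cpp
#include <vector>
#include <cstdint>
#include <cstddef>

// Seek to where the target record would sit if logging had never paused,
// read the timestamp actually found there, convert the time error into a
// record count, and jump again. Returns the index holding 'target', or -1
// if that timestamp is missing from the log.
long guessSeek(const std::vector<uint32_t>& stamps, uint32_t target,
               uint32_t interval) {
    long long guess = (long long)stamps.size() - 1
        - ((long long)stamps.back() - (long long)target) / (long long)interval;
    if (guess < 0) guess = 0;                    // gap larger than the history
    for (int tries = 0; tries < 64; ++tries) {   // cap the number of jumps
        if (guess < 0 || guess >= (long long)stamps.size()) return -1;
        long long diff = (long long)stamps[(size_t)guess] - (long long)target;
        if (diff == 0) return (long)guess;       // exact hit, done searching
        long long step = diff / (long long)interval;
        if (step == 0) return -1;                // off by less than one record
        guess -= step;                           // jump by the record error
    }
    return -1;                                   // gave up (pathological gaps)
}
```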

Then, as it's a continuous thing and you want to find yesterday's record every 10 seconds: keep that search position (store it in a global variable). Ten seconds later, simply read the next record; it's most likely the one you need. If it carries a much later time stamp than you expect, just keep the position and return an error or whatever you think appropriate. It simply means there's a gap in the data, and you will catch up. You will also have to allow for the difference not being an exact multiple of 10 seconds, of course.
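That bookkeeping fits in a few lines. A sketch (names are mine; the tolerance parameter absorbs timestamps that aren't exact multiples of the interval):

```cpp
#include <cstdlib>

// Once yesterday's record has been located there's no need to search again
// each cycle: keep the index of the last match and just look at the next
// record. If its timestamp is (roughly) the expected one, advance; if it's
// much later, we've hit a gap, so hold the position and signal the caller.
bool advancePointer(long& savedIndex, long long nextStamp,
                    long long expectedStamp, long long tolerance) {
    if (llabs(nextStamp - expectedStamp) <= tolerance) {
        ++savedIndex;       // the next record is the one we wanted
        return true;
    }
    return false;           // gap: stay put, caller reports "no data yet"
}
```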

I'm about to start a similar logging battle - but with SPIFFS (ESP8266 internal), and with monthly logging, so at least a lot fewer records. Still plenty of work to be done on them, though, as it has to handle data from up to 246 nodes :-)

cedarlakeinstruments

That's an interesting algorithm and it's certainly worth considering. In a case like this, though, unless there's a real need for the performance, I'd probably just go with a binary or even a linear search. Honestly, if it hadn't seemed so dead simple at first glance, I would probably have just done that and gotten it over with.
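For completeness: since the log is sorted by time, a binary search needs only ~log2(n) probes even over a huge file, and it makes no assumptions about record spacing. A sketch, again with a timestamp vector standing in for the file:

```cpp
#include <vector>
#include <cstdint>

// Classic binary search over monotonically increasing timestamps.
// Returns the index of the first record at or after 'target',
// or -1 if every record in the log is older than 'target'.
long firstAtOrAfter(const std::vector<uint32_t>& stamps, uint32_t target) {
    long lo = 0, hi = (long)stamps.size();   // half-open range [lo, hi)
    while (lo < hi) {
        long mid = lo + (hi - lo) / 2;
        if (stamps[mid] < target) lo = mid + 1;
        else hi = mid;
    }
    return lo < (long)stamps.size() ? lo : -1;
}
```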

That's the thing with one-off applications: since software dev time has such a high cost, it's usually worthwhile to just go with the simplest, brute-force approach that works.

wvmarle

You have to search 8,640 records at least - assuming you start from the end of the file. From the start of the file it gets worse, fast.

A quick search on SD card read speeds gives me about 6 µs per byte, which would suggest an SPI clock of over 1 MHz. At that speed those 8,640 lines of 26 characters each take 1.35 seconds to read, not including any access overhead. Other posts give a speed of 250 kB/s, which would come to 0.9 seconds.
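The arithmetic behind those two estimates, written out (the figures are the rough ones quoted above, not measurements):

```cpp
// Cost of one full linear scan of a day's records, using the two
// read-speed figures quoted above (6 µs/byte and 250 kB/s).
const long RECORDS = 8640;        // one day at one record per 10 s
const long RECORD_BYTES = 26;     // line length of the log format
const long TOTAL_BYTES = RECORDS * RECORD_BYTES;   // 224,640 bytes

double slowSeconds = TOTAL_BYTES * 6e-6;           // about 1.35 s
double fastSeconds = TOTAL_BYTES / 250000.0;       // about 0.90 s
```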

If this is done every 10 seconds, that'd mean you're spending 10% of the time just searching! That's acceptable only if it has to be done once (upon startup); after that, better to keep the pointer and just advance it line by line, pausing when there's a gap in the data. No need to search over and over again, as you know the next record you need is either the next one in the file, or not there.
