I'm working with an Arduino-based datalogger that samples three analog channels (from an accelerometer) and writes the values to a MicroSD card. I've been tasked with pushing the sampling rate as high as possible in order to capture impact events. I'm currently at around 1000 Hz, but I think it should be possible to sample somewhat faster than that.
At this point, my loop basically does three things: it calls analogRead() for the three channels, increments a counter, and writes the values to the memory card. Since each analogRead() call should take about 0.1 ms, I believe that the major bottleneck lies in storing the data. When I remove the logfile.println() statement, I end up with about 0.5 ms per sample, which is reasonable.
I have tried three strategies to optimize my program:
- Write all three values in separate print statements
int16_t accX = 0;
int16_t accY = 0;
int16_t accZ = 0;
uint16_t counter = 0;
while (counter < 20000) {
    counter++;
    accX = analogRead(A0) - 512;
    accY = analogRead(A1) - 512;
    accZ = analogRead(A2) - 512;
    logfile.print(accX);
    logfile.print(",");
    logfile.print(accY);
    logfile.print(",");
    logfile.println(accZ);
}
This one takes about 30 s for 20,000 samples, so 1.5 ms per sample.
- Calculate the square of the norm, so that I only have to store one value. I would then take the square root of this value when post-processing the data.
int16_t accX = 0;
int16_t accY = 0;
int16_t accZ = 0;
int16_t normsq = 0;  // too narrow for the sum of squares; kept as timed
uint16_t counter = 0;
while (counter < 20000) {
    counter++;
    accX = analogRead(A0) - 512;
    accY = analogRead(A1) - 512;
    accZ = analogRead(A2) - 512;
    normsq = accX*accX + accY*accY + accZ*accZ;
    logfile.println(normsq);
}
This one takes 22 s for 20,000 samples, so 1.1 ms per sample. The time it takes to calculate the norm seems to vary somewhat with the size of the values, and with whether or not the values are constant from one measurement to the next. I'm also aware that normsq needs a wider type than the int16_t in my example, such as long, because the sum of squares can exceed the int16_t range.
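To make the overflow concern concrete: with readings centered to the range -512..511, the worst-case squared norm is 3 × 512² = 786,432, far above the int16_t maximum of 32,767. The following is only a minimal sketch of a 32-bit version, not the code that was timed above. The helper name normSquared is hypothetical, and the input range is an assumption based on a 10-bit ADC.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not part of the timed sketch): squared norm of three
// centered readings, computed in 32-bit arithmetic.
// On an 8-bit AVR, int is 16 bits, so x * x would overflow without the cast.
int32_t normSquared(int16_t x, int16_t y, int16_t z) {
    // Assumption: 10-bit ADC readings shifted by 512, i.e. -512..511.
    assert(x >= -512 && x <= 511);
    assert(y >= -512 && y <= 511);
    assert(z >= -512 && z <= 511);
    return static_cast<int32_t>(x) * x
         + static_cast<int32_t>(y) * y
         + static_cast<int32_t>(z) * z;
}
```

The 32-bit multiplications are likely to cost more time on an 8-bit microcontroller than the 16-bit ones, so the 1.1 ms figure above would probably not carry over unchanged.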
- Use sprintf to format all three values into a char buffer, and then write that buffer in a single println() call.
int16_t accX = 0;
int16_t accY = 0;
int16_t accZ = 0;
uint16_t counter = 0;
char buf[15];
while (counter < 20000) {
    counter++;
    accX = analogRead(A0) - 512;
    accY = analogRead(A1) - 512;
    accZ = analogRead(A2) - 512;
    sprintf(buf, "%d,%d,%d", accX, accY, accZ);
    logfile.println(buf);
}
This strategy takes 21 s for 20,000 samples, so 1.05 ms/sample.
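One detail of the sprintf variant is that the 15-byte buffer has no slack: the longest possible line, "-512,-512,-512", is 14 characters plus the terminating NUL. A bounded variant using snprintf might look like the sketch below. The names formatSample and kLineBufSize are hypothetical, and the -512..511 value range is an assumption based on a 10-bit ADC.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Worst case "-512,-512,-512" is 14 characters plus the terminating NUL.
const size_t kLineBufSize = 15;

// Hypothetical helper: formats one sample as "x,y,z" into buf.
// Returns the number of characters written (excluding the NUL),
// or -1 if the text did not fit into the buffer.
int formatSample(char *buf, size_t size, int16_t x, int16_t y, int16_t z) {
    assert(buf != NULL && size > 0);
    const int n = snprintf(buf, size, "%d,%d,%d", x, y, z);
    if (n < 0 || static_cast<size_t>(n) >= size) {
        return -1;  // output was truncated or an encoding error occurred
    }
    return n;
}
```

Unlike sprintf, snprintf never writes past the given size, so an unexpectedly large value would produce a truncated line instead of corrupting memory next to the buffer.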
I'm not really used to embedded development and this level of optimization, but wouldn't it make sense to do bitwise concatenation, packing the three 16-bit values into one 48-bit variable, and then write that single variable? I would then be able to split its value into three binary strings in post-processing and retrieve my three 16-bit ints. I'm thinking that this might be faster than taking the route via sprintf.
Does this line of thinking make sense at all? In that case, how would I go about doing so in an efficient way? Is there a better way of optimizing the data storage?
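For concreteness, the packing idea could be sketched as below. There is no native 48-bit integer type, so this sketch stores the three values as a 6-byte record instead. The helper names packSample and unpackChannel are hypothetical, and the little-endian byte order is an arbitrary choice that the post-processing code would have to match.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

const size_t kRecordSize = 6;  // 3 channels x 2 bytes each

// Hypothetical helper: stores three int16_t samples as 6 bytes,
// low byte first (little-endian) for each channel.
void packSample(uint8_t *out, int16_t x, int16_t y, int16_t z) {
    assert(out != NULL);
    const int16_t values[3] = {x, y, z};
    for (int i = 0; i < 3; ++i) {
        const uint16_t u = static_cast<uint16_t>(values[i]);
        out[2 * i]     = static_cast<uint8_t>(u & 0xFF);
        out[2 * i + 1] = static_cast<uint8_t>(u >> 8);
    }
}

// Post-processing side: recovers one channel (0, 1 or 2) from a record.
int16_t unpackChannel(const uint8_t *in, int channel) {
    assert(in != NULL && channel >= 0 && channel < 3);
    const uint16_t u = static_cast<uint16_t>(
        in[2 * channel] | (in[2 * channel + 1] << 8));
    return static_cast<int16_t>(u);
}
```

Assuming logfile is a File from the Arduino SD library, a record filled this way could be written with logfile.write(record, kRecordSize). That is 6 bytes per sample instead of up to 16 bytes for a text line with its line ending, and it skips the number-to-text conversion, but whether it is actually faster would have to be measured. The resulting file is binary, so it can no longer be opened as CSV.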