Just joined and hope I'm posting this in the correct area/giving enough info etc! I'm completely self-taught with very little knowledge beyond using the simple commands...
I've simplified the question as I'm struggling to convey it clearly - appreciating this might also limit the potential helpfulness of answers...
Simplistically, I'm trying to record and display the mean, max, min and SD of a measured value. But this measured value needs to be recorded at the coordinate of two other inputs. These two other inputs change independently and constantly. i.e. I'd like an array of x against y and record a running average of z (ideally plus the other data) at the various x - y integer values. x and y can be from 1 to 5 so an array of 25 pieces of data.
I first thought I could 'simply' create 25 running averages using RunningAverage.h library, one for each x-y 'coordinate'.
Unfortunately I can't see an easy way to use this without it being very clumsy - very difficult to add data into and also, ultimately, to print the data to the serial port.
Is there any way I can set this up as an array instead?
Or are there any running average libraries that work for an array of data? I've tried searching the internet but the results only seem to come back as how to use an array to calculate a running average, not having an array of running averages!
I am not quite clear on what problem you are trying to solve, but would it help if the array could hold variables of different types, instead of all data items needing to be of the same type? Perhaps something like this dummy code:
dataLayout data[int x, int y, float runningAverage, int max, int min, float standardDeviation]
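Written out as real C++, that idea might look like the sketch below. This is a guess at an implementation, not an existing library: the field names echo the dummy declaration above, and Welford's incremental method is one common way to keep a running mean and SD without storing every sample.

```cpp
#include <math.h>  // for sqrt

// One record of mixed-type statistics per (x, y) cell; field names are illustrative
struct CellStats {
  long count = 0;
  float mean = 0;    // running mean, updated incrementally
  float m2 = 0;      // sum of squared deviations (Welford's method, for SD)
  float minZ = 0;
  float maxZ = 0;
};

CellStats grid[5][5];  // x and y run 1..5, stored at indices 0..4

// Fold a new z reading into the stats for cell (x, y); x and y are 1-based
void addSample(int x, int y, float z) {
  CellStats &c = grid[x - 1][y - 1];
  if (c.count == 0) { c.minZ = z; c.maxZ = z; }
  c.count++;
  float delta = z - c.mean;
  c.mean += delta / c.count;      // incremental mean update
  c.m2 += delta * (z - c.mean);   // Welford variance accumulator
  if (z < c.minZ) c.minZ = z;
  if (z > c.maxZ) c.maxZ = z;
}

// Sample standard deviation for a cell (0 until two samples exist)
float stdDev(int x, int y) {
  CellStats &c = grid[x - 1][y - 1];
  if (c.count < 2) return 0;
  return sqrt(c.m2 / (c.count - 1));
}
```

With this layout, printing the whole grid is just a double loop over `grid`, and each cell costs a fixed ~20 bytes regardless of how many samples it has seen.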
The Smoothing Example uses a simple moving-average array. (Actually, just the array pointer moves, not the array contents).
...I have an application that uses this, and every time I update the array I find & save the peak as well as the average. I have two arrays of 20 elements each, updated & re-calculated once per second.
Just keep in mind that all of this calculation will take time and that might limit how fast you can sample the data.
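For reference, the moving-average scheme in the Smoothing Example works roughly like this (a sketch, not the exact example code; the 20-element window matches the arrays mentioned above):

```cpp
const int numReadings = 20;        // size of the averaging window
float readings[numReadings] = {0}; // the window contents
int readIndex = 0;                 // the "pointer" that moves, not the data
float total = 0;                   // running sum of the window

// Overwrite the oldest reading with the newest and return the window average.
// Note it under-reads until the window has filled with real samples.
float smooth(float newReading) {
  total -= readings[readIndex];    // drop the oldest value from the sum
  readings[readIndex] = newReading;
  total += newReading;
  readIndex = (readIndex + 1) % numReadings;  // wrap the pointer around
  return total / numReadings;
}
```

Keeping a running `total` means each update is O(1): no re-summing of 20 elements per sample.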
I think I'm probably asking for something that can't be done with a standard library, and I don't think I'm giving a very good explanation either of what I'm looking to do!
Ignoring the max/min/sd so as not to confuse things further - what I'd like is a 2 dimensional array of data with each value being itself a rolling average of some data.
e.g. Array[x][y]
(hope that inserted screen shot works!)
Where RA1 through to RA25 are all rolling averages of the value that's been measured (z).
The data could be coming in like this:
x=1, y=1, z=13
x=4, y=5, z=12
x=3, y=3, z=13
x=3, y=4, z=11
x=1, y=1, z=15
(etc)
so the z value would be added to the rolling average for each x - y 'coordinate' if that makes sense? i.e. at x=1, y=1 there have been two z values of 13 and 15 so the rolling average for this 'cell' would be 14.
I think the closest I can see to a way to do this is a 3-dimensional array (?) i.e. Array[5][5][5]; where the data is manually 'put' into that 3rd dimension in a sequential way such that the rolling average can then be calculated without using a library.
So using the above data and x=1 y=1 example:
Array[1][1][1] would have the value of 13 and Array[1][1][2] would have the value of 15 so I'd then have to calculate the rolling average (somehow) at each 'cell' for e.g. Array[1][1][n] where 'n' is the sample size (in this instance 5).
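That hand-rolled approach is workable without any library. A sketch (index and variable names are my own; x and y are treated as 1-based, with sample slots wrapping around so the oldest value is overwritten):

```cpp
const int N = 5;               // samples kept per cell for the rolling average
float samples[5][5][N] = {0};  // last N z values for each (x, y) cell
int nextSlot[5][5] = {0};      // index of the slot the next sample overwrites
int filled[5][5] = {0};        // how many slots hold real data so far (max N)

// Store z for 1-based coordinates (x, y), overwriting the oldest sample
void storeSample(int x, int y, float z) {
  int i = x - 1, j = y - 1;
  samples[i][j][nextSlot[i][j]] = z;
  nextSlot[i][j] = (nextSlot[i][j] + 1) % N;
  if (filled[i][j] < N) filled[i][j]++;
}

// Average of however many samples the cell holds so far (0 if none yet)
float rollingAverage(int x, int y) {
  int i = x - 1, j = y - 1;
  if (filled[i][j] == 0) return 0;
  float sum = 0;
  for (int k = 0; k < filled[i][j]; k++) sum += samples[i][j][k];
  return sum / filled[i][j];
}
```

At 5 x 5 x 5 floats plus the bookkeeping arrays this is roughly 600 bytes, which still fits comfortably in an Uno's 2 KB of SRAM.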
It would be nice to do this on the Arduino, but it might be easier to simply save all the raw data into a CSV file and do it in Excel! Downside is, I'd have to take the data away to analyse it and not be able to see it 'live'.
A "struct" you can imagine as a collection of variables; this type we named Data.
We then declare a two-dimensional array of such collections, named myData.
Each array cell is then like a third dimension of the array, but the stored variables don't all have to be the same type - even bit-fields are allowed.
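The post being described isn't quoted here, but the shape presumably looks something like this (field names are my guesses):

```cpp
#include <stdint.h>

// A collection of mixed-type variables; one record per cell
struct Data {
  float runningAverage;
  unsigned int sampleCount;
  int maxZ;
  int minZ;
  uint8_t valid : 1;   // bit-fields are allowed too, as noted above
};

Data myData[5][5];     // the two-dimensional array of such collections
```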
If you run on a 32-bit platform you could use the std::vector class to make a circular buffer:
#include <vector>
#include <algorithm>  // for std::min

template <typename T>
class CircularArray {
  public:
    CircularArray(size_t maxSize) : maxCnt(maxSize), buffer(maxSize), front(0), rear(0), count(0) {}

    void push(const T& value) {
      if (count == maxCnt) {
        // Buffer is full, overwrite the oldest element
        front = (front + 1) % maxCnt;
      }
      buffer[rear] = value;
      rear = (rear + 1) % maxCnt;
      count = std::min(count + 1, maxCnt);
    }

    size_t currentSize() const {
      return count;
    }

    void print() const {
      if (count == 0) {
        Serial.print("Empty ");
      } else {
        size_t index = front;
        for (size_t i = 0; i < count; ++i) {
          Serial.print(buffer[index]);
          Serial.write(' ');
          index = (index + 1) % maxCnt;
        }
      }
    }

    double average() const {
      double result = 0;
      if (count == 0) return 0;  // arbitrary
      size_t index = front;
      for (size_t i = 0; i < count; ++i) {
        result += buffer[index];
        index = (index + 1) % maxCnt;
      }
      return result / count;
    }

  private:
    size_t maxCnt;
    std::vector<T> buffer;
    size_t front;
    size_t rear;
    size_t count;
};
Then you decide how deep your running average should be (say we keep the last 10 elements) and the type of data you keep in there (say double), and you create your 2D array of circular buffers:
constexpr size_t Rows = 5;
constexpr size_t Columns = 5;
constexpr size_t N = 10; // Maximum number of elements to keep for the rolling average
// Create a 2D array of CircularArray objects
CircularArray<double> circularArray2D[Rows][Columns] = {
{{N}, {N}, {N}, {N}, {N}},
{{N}, {N}, {N}, {N}, {N}},
{{N}, {N}, {N}, {N}, {N}},
{{N}, {N}, {N}, {N}, {N}},
{{N}, {N}, {N}, {N}, {N}}
};
Note that you could decide to have some elements in your 2D array keep more values than others (just pass a different size to the constructor).
Now when you want to save an incoming value into the array you use push():
circularArray2D[y][x].push(aValue);
and when you want to compute the average for a given (x, y) entry you call:
circularArray2D[y][x].average();
here is an example with one circular buffer
// (includes and CircularArray class as defined above)
constexpr size_t N = 5;  // Maximum number of elements to keep for the rolling average

CircularArray<double> circularArray(N);

void setup() {
  Serial.begin(115200);
  for (int i = 1; i <= 10; ++i) {
    circularArray.push(i);
    circularArray.print();
    Serial.printf(" => average = %f\n", circularArray.average());
  }
}

void loop() {}
Another means of smoothing incoming data is to use a leaky integrator.
Google all the theory you want, it comes down to
average = 0.9 * average + 0.1 * newReading;
So take most of the old average and add in a little of the new reading.
Use any two factors that add up to 1.0. Generally:
average = alpha * average + (1.0 - alpha) * newReading;
where alpha is set between 0.0 and 1.0.
Making alpha something like 0.7 means the old average hangs around less and new values are more important, so it would take fewer steps to converge on a new value coming in that was held constant.
So you don't need to store N values and take the average of them; just maintain the average(s) according to the leaky integrator equation.
Try it on your data. A graph of it can show you it smoothing and converging and so forth.
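Applied to the 5 x 5 grid, the leaky integrator needs only one float per cell. A sketch (the alpha value and the first-sample seeding are my choices, not part of the equation itself):

```cpp
const float alpha = 0.9;     // how much of the old average to keep
float avg[5][5] = {0};       // one smoothed z value per (x, y) cell
bool seeded[5][5] = {false}; // so the first sample isn't blended with 0

// Blend a new z reading into the running value for 1-based (x, y)
void updateCell(int x, int y, float z) {
  int i = x - 1, j = y - 1;
  if (!seeded[i][j]) {       // first sample: take it as-is
    avg[i][j] = z;
    seeded[i][j] = true;
    return;
  }
  avg[i][j] = alpha * avg[i][j] + (1.0f - alpha) * z;
}
```

That is 25 floats plus 25 flags, around 125 bytes, so it fits easily on the Uno, and each new sample costs a single multiply-add for its cell.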
I've just had a play with the "leaky integrator" method in Excel and that looks like the way forward! It doesn't give me the max/min/SD but those values are only to give me initial confidence that the system is 'under control' - I've used RS232 DataLogger in the past so planning to use this again to assist here.
I'd agree and say that if you are logging data then, in principle, log direct values where possible and leave any treatment such as averaging, smoothing or other processing etc. to an offline process. That way you can, if necessary, modify the processing algorithm and re-analyse any collected data.
Thanks but to be perfectly honest, I don't understand the example enough to implement it!
You've lost me with what a vector class is...
This is a good example of code where there seems to be a step-change in programming complexity that I can't get my understanding past. If it's in the Arduino Reference then I can generally work it out and/or search for examples on the internet. But there is so much stuff here that I don't recognise that it seems impenetrable to decipher! Is this where things become more based on C (or C++)?
I'm using a genuine Uno for this application (it's a good few years old so assuming it's an R3) and a cheap R3 for bench testing. I understand only the new R4 is 32-bit? But upgrading wouldn't be a problem...
Is the sampling rate for each of the (x,y) coordinates independent of each other - in other words, at any given time, does each (x,y) coordinate have a number of samples for the z value that is independent of the number of samples for every other (x,y) coordinate?
Seems that you might need a circular buffer for each possible (x,y) coordinate.
I'm measuring three variables (x, y, z) with z being able to potentially vary the most. Some x-y 'coordinates' will occur very commonly and others may not even practically get used. If I have confidence that the z value at each x-y location is doing what I expect, then I can simply use a rolling average or (looking better) a leaky integrator to keep track of these values. I can then make changes to an output to bring the z value to where I want it to be. But if the z value is all over the place and has a high SD, then I know my control algorithm isn't working! I don't mind a few spurious values (I could pick up on the max/min) but would hope to be seeing a tight 'group' around the mean.
To give the 'real world' application, this is to record manifold pressure ('z') for a turbo-diesel engine where 'x' is the engine RPM and 'y' is the engine load.