Sensor reading: Average or median or something else?

I want to read some sensors (DS18B20, BME280, wind vane with analog output) with an Arduino Nano.

To get a more accurate value it is better to read multiple times and then calculate the result. But which method is the best?
Average, median, or...?

That depends on what you want

Average would normally be taken as "add up the individual values and divide by the number of values". Is that what you want?

Using median in this context would seem to be meaningless

UKHeliBob:
Average would normally be taken as "add up the individual values and divide by the number of values"

Yeah I meant like that.

That is more formally known as the mean value and is easy to calculate

You might want to try Smoothing (taking a moving average). It doesn’t make the readings more “accurate” but it makes them more stable.

…You can adjust the timing or the number of readings. I have an application where I save a reading once every second and I calculate a 20-second moving average.
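For reference, a moving average like the one described above could be sketched roughly as follows. This is not the code from the Arduino Smoothing example; `MovingAverage` and its `add()` method are hypothetical names, and the window size `N` is whatever suits your timing (20 for a 20-second average at one reading per second).

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: moving average over the last N readings,
// stored in a small ring buffer.
template <size_t N>
class MovingAverage {
 public:
  // Store one reading and return the mean of the readings held so far
  // (at most the last N).
  float add(float reading) {
    if (count_ == N) {
      sum_ -= buf_[head_];  // buffer full: remove the oldest reading
    } else {
      ++count_;             // buffer still filling up
    }
    buf_[head_] = reading;
    sum_ += reading;
    head_ = (head_ + 1) % N;
    return sum_ / count_;
  }

 private:
  float buf_[N] = {};
  float sum_ = 0.0f;
  size_t head_ = 0;   // next slot to write (also the oldest once full)
  size_t count_ = 0;  // number of valid readings in the buffer
};
```

With `MovingAverage<20> avg;` you would call `avg.add(value)` once per reading. The running sum is kept in a `float`, so over a very long run it may pick up small rounding errors; re-summing the buffer now and then would avoid that.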

DVDdoug:
You might want to try Smoothing (taking a moving average). It doesn't make the readings more "accurate" but it makes them more stable.

…You can adjust the timing or the number of readings. I have an application where I save a reading once every second and I calculate a 20-second moving average.

I have read about that but my project is going to run from battery with a few minutes of sleep mode, then wake up and get the readings and so on.
If I understand, in this case I couldn't use smoothing.

tosoki_tibor:
I have read about that but my project is going to run from battery with a few minutes of sleep mode, then wake up and get the readings and so on.
If I understand, in this case I couldn't use smoothing.

I agree that a running average is not for you in this case.

You're gonna deal with two types of noise here: noise generated by the sensor and noise generated by wind fluctuations. I have no experience in measuring wind, but my intuition is that the latter will be so much more present that you don't care about the former.

With that established, you need to think about how long you will need to measure in order to get a representative sample. Think about the lowest-frequency fluctuation you want to filter out, and make sure you measure for at least a few periods. If the wind makes the vane oscillate at 1 Hz (= 1 period per second) and you want to average that out, measure for at least 5 seconds and take at least 50 samples in that time. If you then take an arithmetic mean, you will have a pretty representative value.

So in short: use an arithmetic mean, but consider timescales for determining how long and often you measure.
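As a rough sketch of the "50 samples over 5 seconds" example, the mean of one burst could be computed like this. The names `meanOf`, `SAMPLE_COUNT`, `WINDOW_MS`, `SAMPLE_INTERVAL_MS` and `VANE_PIN` are made up for illustration, and the numbers are just the ones from the example, not measured values.

```cpp
#include <stddef.h>
#include <stdint.h>

// Arithmetic mean of n raw readings. The 32-bit accumulator leaves plenty
// of headroom when summing 10-bit ADC values (0..1023).
float meanOf(const int16_t* samples, size_t n) {
  if (n == 0) return 0.0f;  // nothing measured: avoid dividing by zero
  int32_t sum = 0;
  for (size_t i = 0; i < n; ++i) sum += samples[i];
  return (float)sum / (float)n;
}

// Sampling plan from the example: 50 samples spread over 5 seconds.
const size_t SAMPLE_COUNT = 50;
const uint32_t WINDOW_MS = 5000;
const uint32_t SAMPLE_INTERVAL_MS = WINDOW_MS / SAMPLE_COUNT;  // 100 ms

// On the Arduino, one burst after wake-up might look like this
// (VANE_PIN is an assumed pin name):
//   int16_t buf[SAMPLE_COUNT];
//   for (size_t i = 0; i < SAMPLE_COUNT; ++i) {
//     buf[i] = analogRead(VANE_PIN);
//     delay(SAMPLE_INTERVAL_MS);
//   }
//   float vane = meanOf(buf, SAMPLE_COUNT);
```

The window length and sample count are the things to tune once you have looked at real data; the mean itself stays the same.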

Thanks, I have also read about another method: we sort the values, throw away ~10% of them from the beginning and ~10% from the end, then take the mean of the rest.
So if I have 10 values, I only take the mean of the "middle" 8.

How about this?
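For anyone reading along, this method (usually called a trimmed mean) could look roughly like the sketch below. `trimmedMean` and `sortAscending` are hypothetical names, and dropping `n / 10` values (rounded down) from each end is just one way to read "~10%".

```cpp
#include <stddef.h>
#include <stdint.h>

// In-place insertion sort; adequate for the small number of samples
// taken in one burst.
static void sortAscending(int16_t* v, size_t n) {
  for (size_t i = 1; i < n; ++i) {
    int16_t key = v[i];
    size_t j = i;
    while (j > 0 && v[j - 1] > key) {
      v[j] = v[j - 1];
      --j;
    }
    v[j] = key;
  }
}

// Trimmed mean: sorts the samples in place, drops n/10 values (rounded
// down) from each end, and returns the mean of what is left.
float trimmedMean(int16_t* samples, size_t n) {
  if (n == 0) return 0.0f;
  sortAscending(samples, n);
  size_t drop = n / 10;  // 10 samples -> drop 1 low and 1 high
  int32_t sum = 0;
  for (size_t i = drop; i < n - drop; ++i) sum += samples[i];
  return (float)sum / (float)(n - 2 * drop);
}
```

Note that with fewer than 10 samples nothing is dropped and this is just the plain mean.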

tosoki_tibor:
Thanks, I have also read about another method: we sort the values, throw away ~10% of them from the beginning and ~10% from the end, then take the mean of the rest.
So if I have 10 values, I only take the mean of the "middle" 8.

How about this?

That's very nice, but it changes nothing about the things I mentioned before. No amount of statistical trickery can fix a non-representative sample.

My first step would be to connect the sensor and record a continuous stream of data for, let's say, a minute. Then have a close look at those data and determine the smallest sample you can take while still getting a representative mean.

in this case I couldn't use smoothing.

Of course you can.

jremington:
Of course you can.

I disagree. A running average is nice when you want a continuous stream of data. OP's application involves waking up, taking a discrete sample and going back to sleep. Smoothing is not a desirable technique here.

Utter nonsense.

jremington:
Utter nonsense.

All right. Thank you for being helpful and respectful and all that.

@TimMJN: This is a technical forum, and if you want to be taken seriously, you are going to have to do a lot better than that.

What the processor does between making measurements is completely irrelevant to the data. If the time interval between data points is important, then measure at timed intervals, or timestamp the data.

@tosoki_tibor: The median filter is a way of dealing with large outliers in a collection of data and is useful if those outliers make simple averaging or smoothing impractical. It is slow, because each set to be filtered must be sorted. You will have to look at some representative data series to determine if such an approach is required.
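A minimal sketch of the median of one set of readings, to show where the sorting cost comes from. `medianOf` and `insertionSort` are hypothetical names, and the even-count case is handled by averaging the two middle values, which is one common convention.

```cpp
#include <stddef.h>
#include <stdint.h>

// In-place insertion sort. This is the slow part of a median filter:
// every set of readings has to be sorted before the middle is picked.
static void insertionSort(int16_t* v, size_t n) {
  for (size_t i = 1; i < n; ++i) {
    int16_t key = v[i];
    size_t j = i;
    while (j > 0 && v[j - 1] > key) {
      v[j] = v[j - 1];
      --j;
    }
    v[j] = key;
  }
}

// Median of n readings (sorts the array in place). For an even count the
// two middle values are averaged.
float medianOf(int16_t* samples, size_t n) {
  if (n == 0) return 0.0f;
  insertionSort(samples, n);
  if (n % 2 == 1) return (float)samples[n / 2];
  return ((float)samples[n / 2 - 1] + (float)samples[n / 2]) / 2.0f;
}
```

For readings such as 512, 20, 510, 1000, 511 the two outliers (20 and 1000) have no effect on the result, whereas they would pull a plain mean around. For a handful of samples per wake-up the sort is probably cheap compared with the time spent waiting between samples.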

For weather data, most people use simple averaging of some number of points collected at regular intervals. You need to experiment to determine what number of points and what interval make the most sense.

jremington:
@TimMJN: This is a technical forum, and if you want to be taken seriously, you are going to have to do a lot better than that.

You first make a few totally unconstructive comments and then finally come up with the exact same advice I was giving. This is a forum for helping people, and if you want to be a helpful member you're going to have to do a lot better than that.

Sometimes it's not about being technically correct, as that often confuses people even more. It's about telling them what is relevant for them. Of course you can still use smoothing even when the system is asleep in between measurements, but that is clearly not what OP is looking for. So why bother confusing them with it?

This discussion is getting off-topic, peace out.

@OP
Have you followed all the advice regarding the electrical integrity of the sensors: decoupling capacitors, resistors, cabling, etc.? Personally, I sometimes use shielded cables, connecting the shield to GND, to keep electrical noise from affecting the signals.

If the signal is shit no mathematical magic can cure the illness.