A 3D equivalent to a median

A median of a set of values is the middlemost value, when the values are sorted, right? A median might tell you what the typical value is, better than the average or mean of all values, if we have a few very extreme values, which affect the mean very much. If you want to know what a typical income of a person named Musk is. Take the median, not the mean.

So, I'm creating this colour recognition machine. The machine needs to recognise four cases, a red bead, a green bead, a yellow bead and no bead. I might extend the colour selection later. So far everything works great. I have a reference value for each case. I do a colour scan (using an RGB LED and an LDR). Checking a yellow object, I get the following values:

|478|463|38|
|548|505|24|
|530|513|42|
|537|522|49|
|489|474|36|
|549|551|62|
|530|507|33|
|520|504|40|
|521|502|39|
|524|496|41|
|527|519|44|

These are the R, G and B values read by my sensor. We see that R and G are high and B is low. That's yellow, for sure. But I'd like to use these values to define the reference RGB triplet for yellow. I could take the mean, which is just calculating a mean separately for R, for G and for B. But as an academic question, I thought a median value could be better. Some of these triplets could be a bit off. They should be excluded.

There might not be a 3D equivalent to a median in 1D. This has been discussed a lot on the net. I'd like to hear your thoughts. Where taking the mean as described above really works. It pretty much describes the center of gravity. So my own proposal for a 3D median could be as follows:

  1. Calculate the mean (center of gravity) of the set of triplets
  2. Search for the point that is farthest from the center and exclude it from the set
  3. If you have more than 2 points left (or 5, or 10%, or something else), jump to 1
  4. Your last calculated center is your 3D median

Think about a two-dimensional space. The equivalent of a median would be a line with half of the points on each side. If you think about it, you should see that for any set of points there is an infinite number of lines that put half the points on each side. Similarly, in 3-D a plane divides the space in half, and there are an infinite number of planes that divide any set of points in half.
(The same is true with one-D values, there are an infinite number of points that will have half the values on each side. But since the median is defined as being a member of the set of points you're limited to one or possibly two possible points).

Going back to the 2-D space, if you wanted to sort points into one of four possible colors what you would want is lines that divide the space into four regions, with each region corresponding to a color. If you could somehow define an "epicenter" of each color, for each color the region would be the area that is closer to that epicenter than the epicenter of any other color.

In three dimensions you would use planes to divide the space into regions.

You may find that you want to impose some additional rules, such as having a minimum distance to the epicenter, and if a point is on the boundary between two regions having a minimum distance from the boundary to be considered in that region.

That just kicks the problem down the road to a similar problem, how do you calculate the epicenter? You could stipulate it. You could do a simple mean. Or you could do a least-squares method to minimize the distance to all the points.

Try reducing the problem to 1D by transforming the RGB values into HSV or HSL color space, and just take the H (hue) values, as H represents the color perceived by the human eye.

C code is available, or use

Not sure why you think a median would be better. A mean is fine for numbers that follow a "normal" aka "Bell curve" distribution. The reason a mean does not work well for comparing salaries is because they don't follow a normal distribution.

However, the "center" of a set of points in N-dimensions is probably called the centroid. You can calculate the closeness of a point to a centroid by calculating the length of the vector to the centroid.

The only reason is that some measurements might be way off, due to some disturbance. I'd be happy with the centroid, as long as there are no bad measurements.

Looking at your data, most of the variation is from intensity changes (which you can correct for by normalizing), as well as some residual variability in the Red channel.

After normalizing the readings, the coefficients of variance are significantly reduced (dramatically so in the Green and Blue channels):

Normalized? Red     Green Blue  
No 4.2% 4.6% 23.4%
Yes 1.1% 0.1% 0.0%

Thus, a small amount of variance remains in the Red channel after normalizing, but the original variability in the Green and Blue channels appears to have been caused almost entirely by intensity variations.

In addition, after normalizing, there do not appear to be any extreme outliers that would skew your distribution

Therefore, I would suggest normalizing the measured triplets, and then using either the mean or the median as a reference value. Even for the Red channel, after normalization, there is less than a 0.15% difference between the mean value and the median value.

In that case, all this talk of means and medians is irrelevant. A simple heuristic rule should be enough. The blue channel may not be needed at all.

If you needed to distinguish a fake yellow bead that was a slightly different shade of yellow than the genuine beads, then, fair enough.

There's a heuristic rule for yellow.

Why? By your own admission, you don't need to do that, because the heuristic rule works.

To understand whether mean, median or mode is the best estimate of a typical value, you need to understand the distribution of the values. If the distribution is normal (Gaussian/bell-curve) then mean is the most accurate.

Then this is merely a theoretical problem.

I don't think the concept of normalizing is unambiguous here. Or perhaps my setup is too simple. I have a small chamber, 3D printed in black PLA. It contains the bead I want to examine. It's a wooden bead with a hole. A glossy colour. The chamber is covered with a black plate with the LED and the LDR. The LDR forms a voltage divider with a 10 kOhm resistor. This will give me readings from an analog pin between 120 and 970.
I have no idea what the range 120 to 970 actually means in terms of lux or lumen or anything. I only know that 120 is very dark and 970 is very bright.
My colour recognition works very nice, as long as I only have four cases. A red, a green or a yellow bead in the chamber. Or an empty chamber.
So, what I want to know (maybe for future projects) is, what would be a more scientific approach to this.
I did a kind of normalization by converting each reading to what the percentage was for each colour channel to the sum of all channels. But there I lost the info about brightness. If I'd include orange and brown beads, they would end up in the same value, if normalized this way.

I get that the basic concept of a median doesn't work in 2D or 3D (or any higher). But going back to 1D, we can get the median by picking away the highest and the lowest values. Then we continue untill we have one or two left. And that's the median. If we do the same for a set of points in 3D, it could go as follows (this is an extension of the OP algorithm):

  1. Calculate the centroid of the set of triplets
  2. Search for the point that is farthest from the centroid
  3. Search for the point that is farthest from the point found in step 2
  4. Delete these two points
  5. Repeat from step 1 until only 1 or 2 are left
  6. Your last calculated centroid is your 3D median

[edit/]

Here's a video showing this algo. Sorry for not adjusting x and y axes to equal scale. I have a random set of points, though I changed some x values to make the cloud a little biased to the left. I had an odd number of points, so picking them away in pairs, one point remained.

Using the mean works well for defining a reference color, but I see your point about outliers affecting the result. A possible approach for a '3D median' could be to take the median of each channel (R, G, B) separately. This might give a more robust reference without extreme values skewing it.

Hi @Johan_Ha ,

I cannot say whether the median is a helpful value for your application but for the discrimination of objects a perceptron might be a solution.

There is a lib available

https://github.com/arduino-libraries/Arduino_Perceptron

with an example for classification of two objects by colour...

It might be a start for trained classification incl. a rejection class.

Good luck
ec2021

But there's something fundamentally wrong in picking a median separately from each channel. The median for the R channel might be a value, which doesn't occur in a combination of the median of G and median of B. If we have a 1D array of an even number of points, I believe the median is defined as the average of the two middle most points. This in itself is problematic, if the average itself is a point far from any of the original points in the set.

Anyway, this seems like an overkill, since I have a working solution. But the academic question remains. A geometric median in higher dimensions might be the best generic equivalent to a median in 1D. I might return to that, if I ever increase the number of colours I need to recognise.

There's a library to map the triplets into HSV and HSL:

For a randomly oriented colored bead with a hole, a HSV/HSL might work well because the shadowy bit might stay constant in H space, as the darker bits mix with more or less of the image.

Then, one-dimensional means or medians on the H S or V axes wont change interact to change the color

Since LDRs are highly nonlinear, the numbers you have aren't simply related to light intensity, and the RBG color scheme you are using has no well defined meaning.

If you were instead to use a photodiode based color sensor like the TCS34725, then the RGB values would be proportional to light intensities in each of the R, G and B bands, and actually be useful.

I suspected something like that. But while the LDR doesn't have a linear response, it still responses with strictly decreasing resistance to increasing light intensity. So an LDR based colour detector could still be pretty accurate, as long as it would be calibrated. We just need more points in its skewed RGB space to minimize the error when interpolating between the points. In a perfectly linear RGB space we only need the 8 corners of the colour cube.

Not necessarily, if the spectrum of the illumination also changes. Typical CdS LDRs respond primarily to green light, but very weakly to blue or red.

CdS detectors are useful for roughly estimating outdoor light levels, but not much else.

Good to know. But it is still a strictly decreasing resistance to increasing light intensity, when only one colour is used. Of course, I don't adjust the light intensity of the RGB LED. I shed one colour at a time, with full intensity. The more of that particular colour there is in the object, the intenser the light is that reflects on the LDR, and the lower the resistance gets. This is the strictly decreasing function.

interesting, I imagined it mentally as a bubble that got smaller and smaller.
In 2D it would be a circle.

Any indication of the performance O(n2) ? or better?


from wikipedia: Geometric median

In geometry, the geometric median of a discrete point set in a Euclidean space is the point minimizing the sum of distances to the sample points.
This generalizes the median, which has the property of minimizing the sum of distances or absolute differences for one-dimensional data.
It is also known as the spatial median,[1] Euclidean minisum point,[1] Torricelli point, [2] or 1-median.
It provides a measure of central tendency in higher dimensions and it is a standard problem in facility location, i.e., locating a facility to minimize the cost of transportation.

This means (1) making a distance table of all points, (2) sum per colum or row and (3) take the minimum of that. ==> O(n^2)

I have no idea. Never bothered to learn to calculate those things. Time is not money in my projects :sunglasses:
As I see it, my algo gets rid of any bad readings first. Then it narrows the set towards the median candidate. To make it end in one single point instead of two, like the geometric median does, I could make the number of points an odd number by throwing away the first point without finding its pair. That is, if I start with an even number of points. After that I pick them away two and two.
As I see it, the geometric median might be affected by the farthest point, which might be an error reading. Think again about Musk. If Musk earns one billion a week or ten billion a week, it won't affect the median sallary of everyone named Musk. But it will affect the geometric median. So, the geometric median won't be the same as the true median in 1D. My algo results in the true median in 1D (without the latest add on with dealing with an even number of points). That's why I think this might be a better multidimensional median than the geometric median.
Then again, everyone picks whatever method they think suits the purpose. There's no mathematical truth hiding here somwhere, which would tell us what the traditional 1D median is in higher dimensions.

Not convinced, as the point will be farther away from all other points. Salary is one dimensional so all points get a new sum that is increased by same extra amount.
I made a note on my todo list to dive into this subject.

Think the GM has an undefined number of points, as one can easily construct examples (at least in 2D and 3D) that has far more GM's. E.g. imagine a sphere of points that are equidistant on the sphere. Then all of them have the same sum of distances and by definition they are all the minimum so all are the GM. Any symmetrical shape would have multiple GM's. Or imagine a sphere in a sphere, then the points of the inner sphere would be the GM.
That said, these are artificial constructed, not impossible but not very probable.
A nice property of those symmetrical shapes is that even a slight disturbance of the symmetry let the system "collapse" and a median is formed. Like the collapse of a quantum superposition into a defined state.