This is a bit tricky... I have taken great care to gather data from the Arduino in the form of time intervals between events. I then have to run it through 2 differentiations and a couple of other math operations, perhaps half a dozen manipulations total, before coming up with a data set to plot. By the time I've done all that, the final data set is invariably full of spiky staircase evil. I tried putting a simple 10-pt average on it and that calmed it down a little, but not enough to be acceptable. Excessive averaging will produce too much lag in the data and destroy detail, so I'd rather go for some sort of real-time statistical regression, or whatever you call it when you invoke Runge-Kutta, Gauss, Lagrange or other spooky language, to take the spikes away using a much smaller sample of data.
Any ideas as to what present-day conventional wisdom says about how one should filter such noisy data in a processor/time-efficient way that lends itself well to computation? Attached is a sample of what the data looks like, and this is after 10-pt averaging.
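For concreteness, the 10-pt average I tried is just a trailing boxcar. A minimal sketch of it in plain C++ (PC side; the function name is mine):

```cpp
#include <vector>
#include <cstddef>

// Trailing n-point moving average: each output is the mean of the most
// recent n samples. It smooths jitter but delays the curve by roughly
// (n-1)/2 samples, which is the lag problem described above.
std::vector<double> movingAverage(const std::vector<double>& x, int n) {
    std::vector<double> out;
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i];
        if (i + 1 >= static_cast<std::size_t>(n)) {
            out.push_back(sum / n);
            sum -= x[i + 1 - n];  // drop the sample leaving the window
        }
    }
    return out;
}
```

Each output is the mean of the last n inputs, so the curve lags the data by about (n-1)/2 samples; that lag is exactly the detail-destroying trade-off I'd like to avoid.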
Can you describe the data you are originally collecting, and the mathematical treatment of that data. It could be that any “clean up” could be started earlier in the chain.
One expression I'm used to is: "Shit in, shit out". In other words, bad signals into the Arduino give bad values out of it. I suggest a checkup of the sensors and their surroundings: decoupling capacitors, cabling, GND, etc.
6v6gt: I attached a picture showing the main data processing and the formulae involved. Basically I have a Nano generating ticks with a timer, a Mega measuring those ticks with Input Capture and sending those events to the PC over USB serial, and the PC performs all the physics calculations as shown. The torque is the final output, but the noise actually appears in the acceleration column: after the 2nd derivative, everything that follows (torque, horsepower) is just acc multiplied by a constant, which doesn't introduce any noise. Velocity looks good on a graph too, so really it's between vel and acc that things go haywire.
Dr D: The processes represent an encoder spinning up, so while the events do not occur at regular intervals, they do follow a pattern of sorts. They change in a continuous fashion and are not all over the place.
Idaho: Wow, that is a fantastic find! Now in my case I am not doing any filtering on the Arduino - all processing occurs on the PC. This is to save precious CPU time for other much-needed tasks. However, I can see other uses for this. For the current problem I'm really interested in something on the PC side (which should have major advantages owing to GHz and GBs of RAM) that can do something like that in real time without causing the filtered curve to lag too far behind the real data, and without significantly reducing the resolution of the data, as averaging would.
Railroader: I live by that mantra (GIGO in my case). That's why I've spent the last month or so toiling over the Arduino side (and many here on this forum have helped me!). I now believe the code is quite sublime and is producing near-perfect output. The problem is there's an inherent granularity in "ticks" because they are not continuous. When you do physics on discrete time deltas, it wreaks havoc on downstream derivatives like velocity and acceleration. Therefore considerable signal conditioning needs to happen on the ticks to make them continuous, even if the ticks themselves are being recorded perfectly.
Do we understand each other? I suggest hardware signalling, de-noising the sensors. Digitally outputting sensors do have noise, and decoupling capacitors attached from Vcc to GND can make a difference. The same goes for analog sensors. Maybe that would give nicer, less wild data for the army of math.
But all the data coming from the Arduino are integer timer counts and there is no noise in them. I have simulated this phenomenon by programming the signal generator to simply decrement the timer counter (540, 539, 538, 537... etc.) and the Mega's input capture is picking them up the same way (540, 539, 538, 537... etc.).
The problem is not with signal acquisition. The problem is that 540 ticks is 540 x 0.000064 s (64 us tick timing in my case) and 539 ticks is 64 us less than that. If an event occurs between those instants, the Arduino is incapable of presenting anything in between 539 and 540. Thus if my physical encoder is hovering between those two numbers, you just get a random oscillation between two whole numbers, and this produces near-infinities in the physics when the dt suddenly changes from one stable value to the next.
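To put numbers on that (my own illustration, using my 64 us tick): a single-tick flip between 540 and 539 reads as a genuine speed step, and differencing it again manufactures acceleration out of nothing.

```cpp
// One-tick quantization flip at 64 us/tick (prescaler 1024, 16 MHz AVR).
const double TICK = 64e-6;

// Pulse frequency implied by an integer tick count.
double freqFromTicks(long ticks) { return 1.0 / (ticks * TICK); }

// Apparent acceleration (Hz/s) when the measured period flips from
// ticksA to ticksB between two consecutive events, even though the
// true shaft speed may not have changed at all.
double apparentAccel(long ticksA, long ticksB) {
    double dv = freqFromTicks(ticksB) - freqFromTicks(ticksA);
    return dv / (ticksB * TICK);
}
```

apparentAccel(540, 539) comes out around 1.56 Hz/s of pure quantization artifact from zero real acceleration; shrinking the tick shrinks the artifact proportionally, but it never disappears.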
To clarify my previous post, what I truly believe will fix the problem is NOT noise cancelling, as I am producing the noise myself by nature of the granularity of the data. What I really need, in all likelihood, is some sort of continuous piecewise polynomial function that updates in real time on a small sample set of, say, 5-10 events. It looks at the last 5-10 ticks and produces a continuous, smooth algebraic polynomial with zero granularity; then as the points move along, the sample window shifts and the polynomial is recomputed, so you have this snake-like polynomial that keeps bending and twisting to adapt to the updated data. In this way you have no granularity and NO noise at all in the torque.
I have confirmed this to be true by doing all my math with algebra and calculus instead of doing it in the programming space with discrete data points. When I use my graphing calculator to graph polynomial derivatives (where my functions f(x) are an almost perfect curve fit to my actual data), the output torque curve naturally has no noise on it and is the shape I expect it to be. This simply tells me that doing discrete-data-point physics is the wrong way to do it. I have to take the data in that way but can't work on it that way. I need a coding solution to transform points into algebra, really.
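What I'm picturing, concretely, is a sliding least-squares quadratic: fit x(t) = a + b*t + c*t^2 over the last handful of samples, then do the calculus on the coefficients instead of on the raw points. A sketch (PC side, names mine; the 3x3 normal equations are solved by Cramer's rule):

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

struct Quad { double a, b, c; };

// Least-squares fit of x(t) = a + b*t + c*t^2 over a window of samples.
// Uneven spacing is fine. Velocity is then b + 2*c*t and acceleration is
// just 2*c: both come from algebra, not from finite differences of
// quantized points. (No check for a singular system; the window must
// contain at least 3 distinct t values.)
Quad fitQuadratic(const std::vector<double>& t, const std::vector<double>& x) {
    double s0=0, s1=0, s2=0, s3=0, s4=0, y0=0, y1=0, y2=0;
    for (std::size_t i = 0; i < t.size(); ++i) {
        double ti = t[i], ti2 = ti*ti;
        s0 += 1;  s1 += ti;  s2 += ti2;  s3 += ti2*ti;  s4 += ti2*ti2;
        y0 += x[i];  y1 += x[i]*ti;  y2 += x[i]*ti2;
    }
    // Solve [s0 s1 s2; s1 s2 s3; s2 s3 s4] * [a b c]' = [y0 y1 y2]'
    double D  = s0*(s2*s4 - s3*s3) - s1*(s1*s4 - s2*s3) + s2*(s1*s3 - s2*s2);
    double Da = y0*(s2*s4 - s3*s3) - s1*(y1*s4 - y2*s3) + s2*(y1*s3 - y2*s2);
    double Db = s0*(y1*s4 - y2*s3) - y0*(s1*s4 - s2*s3) + s2*(s1*y2 - s2*y1);
    double Dc = s0*(s2*y2 - s3*y1) - s1*(s1*y2 - s2*y1) + y0*(s1*s3 - s2*s2);
    return { Da/D, Db/D, Dc/D };
}
```

On a noiseless quadratic it recovers the coefficients exactly; on quantized data the window length trades smoothing against lag, and as the window slides the coefficients are recomputed, giving the snake-like polynomial described above.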
Smaj: Thanks. Indeed I need instructions on how to post pictures.
You gather data using an Arduino. That means that You take one reading from one sensor at time T0. Suppose You then, in the next line of code, register micros() as T1. Some microseconds have elapsed, T1 - T0. Reading several sensors does take time...
How do You approach that?
You are measuring your intervals in units of 64 microseconds? Why not measure in 16ths of a microsecond? Wouldn't the increased resolution help with the massive jitter in your data?
What speed is this motor running at during that data snapshot ?
How many pulses per revolution is the encoder delivering ?
It looks like, if the encoder is delivering 6 pulses per revolution, that the motor is running at about 300 RPM.
If that is the case, maybe you are simply collecting too much, too finely grained data. One revolution is approx. 200 ms. What about doing your velocity calculation every rotation and your acceleration calculation every 5 rotations, and, if necessary, doing a linear interpolation between the results?
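To sketch the idea (my own arithmetic, assuming your 6-hole disc): sum the six intervals of one complete turn, so any unevenness in hole spacing cancels and you get one clean velocity figure per revolution.

```cpp
#include <cmath>

// One velocity sample per full revolution of a 6-hole disc: the six
// pulse intervals are summed, so unevenness in hole spacing cancels out
// (every complete revolution covers exactly 2*pi radians).
double revVelocity(const double dt[6]) {
    double period = 0.0;
    for (int i = 0; i < 6; ++i) period += dt[i];
    return 2.0 * M_PI / period;   // rad/s averaged over one turn
}
```

At 300 RPM this still gives you five velocity figures per second, which may be plenty for plotting once interpolated.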
Where in your data is the recorded input capture register value? It looks like only derived values appear, since this would be an integer. Since your calculations appear to be to 5 decimal places, your low t value and especially your low delta-t may lose significant precision. Maybe change the units to increase precision.
Railroader: I'm not relying on CPU tasking to measure time. The CPU clock and all the timer clocks run independently of one another, so any code I execute will not interrupt the timer's pacing. The "Input Capture" unit, I've come to understand, is specifically designed to latch the timer value as quickly as possible so that there are no timing discrepancies between captures. You get the most accurate timing this way, and it doesn't care what's going on in the main loop because it's just chugging along at its own frequency in parallel with the rest of your code.
John: I may be able to drop the prescaler one level, but there is a concern that prevents me from doing this arbitrarily. The range of a 16-bit number is finite, so when you change the prescaler, you change the range you can detect. I have to pick up frequencies between 0 and 2000 Hz, so there are only so many prescalers that will allow me to do this cleanly. When the counter is too low (<100 in my experience), the interrupts happen so quickly that the counter's whole-number granularity becomes a serious problem for timing precision. Long story short, a detection prescaler of 1024 works nicely and I am considering going to 256, but there is a risk of the timer overflowing at that prescaler so I have to test it. If I don't run into that problem, it will definitely be the furthest I could go. You are correct that smaller is better, but even then the granularity is always there. In my Excel file you may have noticed I don't even use whole ticks. I generated those numbers with a Java program at maximum double precision, and still there aren't enough decimal places to prevent the noise from cropping up. At the end of the day you need algebra, not data points, to do these sorts of physics calcs, or you have to live with a mediocre plot that's either full of noise or over-smoothed. This is why I think a real-time interpolated polynomial algorithm (Lagrangian polynomial or similar) might be the way to go, but I'm going back 20 years in my memory to guess that. I figure people here would be a better authority than my failing memory on how to employ this sort of tactic.
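To put rough numbers on the prescaler trade-off (assuming the 16 MHz clock; the helper names are mine):

```cpp
#include <cmath>

// Prescaler arithmetic for a 16-bit timer clocked from 16 MHz:
// one tick = prescaler / F_CPU seconds, and the counter rolls over
// after 65536 ticks unless overflows are counted separately.
const double F_CPU_HZ = 16e6;

double tickSeconds(int prescaler)  { return prescaler / F_CPU_HZ; }
double rangeSeconds(int prescaler) { return 65536.0 * tickSeconds(prescaler); }
```

At 1024 the tick is 64 us and the unassisted range is about 4.19 s; at 256 it is 16 us and about 1.05 s. My slowest case (10 RPM x 6 pulses/rev = 1 pulse per second) needs a full second between events, so 256 leaves only about 5% headroom, which is exactly the overflow risk I mentioned.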
6v6gt: The industrial-sized encoder (the mechanical entity this experiment represents in real life) spins from 0 - 2000 RPM, and this is expected to be done habitually over the full range. There is a need to measure fairly low speeds as well as high ones, with my accepted minimum being 10 RPM and max 2000. The 6 holes were deemed necessary to permit accurate detection at lower speeds, and the data could in theory be discarded at high speeds if it were to cause too much noise (something I had considered). However, rather than trash useful data, I thought it might be preferable to rely on an algorithm that makes use of the data, perhaps statistically, to predict future values. At least this way the data is not destroyed, as is often the case with simple averaging. The other problem, of course, is lag. Cutting out 5 points here and 5 points there means the latency of my graphing would be somewhat poor at low speeds and would only be imperceptible (maybe) at high speeds.

The data you see in my snapshots isn't collected directly from the Arduino. I had done that earlier, but the step-wise nature of the data {300,300,300,300,300,299,300,299,299,299,... etc} was just awful when I ran it through the math. I realized I NEEDED to change them all to floats and try to interpolate the subtle changes over time so the math wouldn't give me crazy infinities. Just imagine this visually: you have a straight line at 300 and then a step to 299. The physics thinks you have an infinite acceleration because the slope of that step is infinite. In reality I don't get infinity, but I do get something like 50000 ft-lbs when it should be 300. The values you see were generated with a Java program at double precision, so I'm not sure I can add more decimal places, and I feel that would never completely make the problem go away, just make it slightly better.
Dr. D: I was wondering about the sawtooth phenomenon. I don't have a Gaussian point cloud like you might expect from random EMI noise or anything like that. This has structure and is probably mathematical. I suspect that with a correction factor you could precisely reverse the noise, but that only convinces me more that I'm creating the noise myself by trying to force digital information through analog transforms.
If you have a high prescaler because you are concerned about overflowing a 16-bit timer counter, you can look at this example of creating a 32-bit counter using the input capture register: https://www.gammon.com.au/timers Scroll way down to "Timing an interval using the input capture unit". In principle, you also count overflows of the 16-bit counter.
Gahhhrrrlic:
John: I may be able to drop the prescaler one level, but there is a concern that prevents me from doing this arbitrarily. The range of a 16-bit number is finite, so when you change the prescaler, you change the range you can detect.
The good news is that Timer1 has an overflow interrupt that you can use to easily extend the 16-bit timer count to 32 bits. With 4.2 billion counts to work with, you can easily measure tens of thousands of cycles per second down to one interval every 200+ seconds. See, for example, this demonstration I wrote of using the UNO's Input Capture Register to measure frequency and duty cycle:
// Measures the HIGH width, LOW width, frequency and duty-cycle of a pulse train
// on Arduino UNO Pin 8 (ICP1 pin).
// Note: Since this uses Timer1, Pin 9 and Pin 10 can't be used for
// analogWrite().

void setup()
{
  Serial.begin(115200);
  while (!Serial);

  // For testing, uncomment one of these lines and connect
  // Pin 3 or Pin 5 to Pin 8
  // analogWrite(3, 64); // 512.00, 1528.00, 2040.00, 25.10%, 490.20 Hz
  // analogWrite(5, 64); // 260.00, 764.00, 1024.00, 25.39%, 976.56 Hz

  noInterrupts(); // protected code
  // reset Timer 1
  TCCR1A = 0;
  TCCR1B = 0;
  TCNT1 = 0;
  TIMSK1 = 0;
  TIFR1 |= (1 << ICF1); // clear Input Capture Flag so we don't get a bogus interrupt
  TIFR1 |= (1 << TOV1); // clear Overflow Flag so we don't get a bogus interrupt
  TCCR1B = _BV(CS10) |  // start Timer 1, no prescaler
           _BV(ICES1); // Input Capture Edge Select (1=Rising, 0=Falling)
  TIMSK1 |= _BV(ICIE1); // Enable Timer 1 Input Capture Interrupt
  TIMSK1 |= _BV(TOIE1); // Enable Timer 1 Overflow Interrupt
  interrupts();
}

volatile uint32_t PulseHighTime = 0;
volatile uint32_t PulseLowTime = 0;
volatile uint16_t Overflows = 0;

ISR(TIMER1_OVF_vect)
{
  Overflows++;
}

ISR(TIMER1_CAPT_vect)
{
  static uint32_t firstRisingEdgeTime = 0;
  static uint32_t fallingEdgeTime = 0;
  static uint32_t secondRisingEdgeTime = 0;

  uint16_t overflows = Overflows;

  // If an overflow happened but has not been handled yet
  // and the timer count was close to zero, count the
  // overflow as part of this time.
  if ((TIFR1 & _BV(TOV1)) && (ICR1 < 1024))
    overflows++;

  if (PulseLowTime == 0)
  {
    if (TCCR1B & _BV(ICES1))
    {
      // Interrupted on Rising Edge
      if (firstRisingEdgeTime) // Already have the first rising edge...
      {
        // ... so this is the second rising edge, ending the low part
        // of the cycle.
        secondRisingEdgeTime = overflows;
        secondRisingEdgeTime = (secondRisingEdgeTime << 16) | ICR1;
        PulseLowTime = secondRisingEdgeTime - fallingEdgeTime;
        firstRisingEdgeTime = 0;
      }
      else
      {
        firstRisingEdgeTime = overflows;
        firstRisingEdgeTime = (firstRisingEdgeTime << 16) | ICR1;
        TCCR1B &= ~_BV(ICES1); // Switch to Falling Edge
      }
    }
    else
    {
      // Interrupted on Falling Edge
      fallingEdgeTime = overflows;
      fallingEdgeTime = (fallingEdgeTime << 16) | ICR1;
      TCCR1B |= _BV(ICES1); // Switch to Rising Edge
      PulseHighTime = fallingEdgeTime - firstRisingEdgeTime;
    }
  }
}

void loop()
{
  noInterrupts();
  uint32_t pulseHighTime = PulseHighTime;
  uint32_t pulseLowTime = PulseLowTime;
  interrupts();

  // If a sample has been measured
  if (pulseLowTime)
  {
    // Display the pulse length in microseconds
    Serial.print("High time (microseconds): ");
    Serial.println(pulseHighTime / 16.0, 2);
    Serial.print("Low time (microseconds): ");
    Serial.println(pulseLowTime / 16.0, 2);

    uint32_t cycleTime = pulseHighTime + pulseLowTime;
    Serial.print("Cycle time (microseconds): ");
    Serial.println(cycleTime / 16.0, 2);

    float dutyCycle = pulseHighTime / (float)cycleTime;
    Serial.print("Duty cycle (%): ");
    Serial.println(dutyCycle * 100.0, 2);

    float frequency = (float)F_CPU / cycleTime;
    Serial.print("Frequency (Hz): ");
    Serial.println(frequency, 2);
    Serial.println();

    delay(1000); // Slow down output

    // Request another sample
    noInterrupts();
    PulseLowTime = 0;
    interrupts();
  }
}
This is very useful, but I'm thinking I'll be unable to take advantage of it. If it requires two timers, that doubles the timer consumption on the board, and I may need all four of the 16-bit timers for different tasks. I'll have to see if I can get away with using two in a ganged setup like this, but it's great that you came up with this workaround. A 32-bit timer would make a huge difference for time-sensitive applications.
My reading has suggested that Bessel's interpolation formula could be used for polynomial interpolation, similar to a Lagrangian polynomial but without the curve going all funky at the end points, as Lagrange tends to do. I'm just looking for a PC implementation of the algorithm. If I can find a suitable interpolation model, I can do calculus on interpolated polynomials over a data set of as little as 3 points and differentiate to get torque. This would completely eliminate any noise issues.
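For the 3-point case the Lagrange form is compact enough to write out directly: differentiate the three basis polynomials once and evaluate wherever you like, with no finite-difference step involved (a sketch with my own names; uneven spacing is fine):

```cpp
#include <cmath>

// Derivative of the unique quadratic through (t0,x0), (t1,x1), (t2,x2),
// obtained by differentiating the three Lagrange basis polynomials.
// Evaluating near the middle point behaves best, which is the same
// reason the centred (Bessel/Stirling) arrangements are preferred.
double lagrangeDeriv(double t0, double x0, double t1, double x1,
                     double t2, double x2, double t) {
    double d0 = (2*t - t1 - t2) / ((t0 - t1) * (t0 - t2));
    double d1 = (2*t - t0 - t2) / ((t1 - t0) * (t1 - t2));
    double d2 = (2*t - t0 - t1) / ((t2 - t0) * (t2 - t1));
    return x0*d0 + x1*d1 + x2*d2;
}
```

Fed the points (0,0), (1,1), (2,4), i.e. x = t^2, it returns exactly 2 at t = 1. Bessel's formula passes the same polynomial through the same points, just rearranged for use near the centre of the table, which is why it behaves better mid-interval than Lagrange evaluated at the end points.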
Using a smaller prescaler for precision and counting overflows to increase the range does not involve a second timer. It involves a second interrupt vector.
In neither example of extending a 16-bit timer to 32 bits is a second timer used, that is, neither the JohnWasser version nor the NickGammon version. In both cases a single 32-bit unsigned long variable is used: the 16 low-order bits hold the timer counter and the 16 high-order bits hold the number of times the timer register has overflowed. This overflow count is maintained by an interrupt service routine triggered at the point the 16-bit timer register overflows.
It works in a similar way to micros(); however, the 32-bit value represents units of 62.5 nanoseconds.