Optimizing subtraction of a float from an unsigned long long

I need to subtract a float value from an unsigned long long variable.

The long long is a coarse timer that increments every millisecond, potentially forever. The float is a fine timer that interpolates between the coarse ticks. The final result is (coarse - fine), e.g., 123,456,789,012,345,678,901 - 0.0982. Ultimately, two of these results will be subtracted from each other to yield the time interval between them.

What's the most efficient way to perform that calculation (on a Mega 2560, if that matters)?

As an alternative, would it be more efficient to keep the coarse count in a shorter integer type and count rollovers in a separate variable? Then I'd need to test if time2 < time1, and if so use the rollover value to "unwrap" the count.

Thanks for any tips!

John

Why not count microseconds in a long int, and keep a count of the 1,000,000-microsecond rollovers (i.e., whole seconds) in another long int?

The "fine" timer is external to the Arduino and provides resolution in the 10s of picoseconds but with a maximum range of a few milliseconds. The coarse timer is an external clock at 1000 Hz derived from a very accurate frequency source. Together they are used to timestamp an external event.

I guess I could process the fine time as an integer number of picoseconds, but that still requires converting the float that comes out of the fine timer to an int, and scaling the values by nine places (picoseconds vs. milliseconds), leading to even bigger numbers. (In my example, "0.0982" was in milliseconds, or 98.2 microseconds. I should have added a couple more digits to make the resolution more clear.)

I hope this all makes sense...

e.g., 123,456,789,012,345,678,901 - 0.0982

It is simply not possible to perform this subtraction on an Arduino, or on most computers, for that matter.

Using a float variable as a "fine timer" is not a good idea and you should rethink your approach. It also makes no sense to use an Arduino to time events on the order of 10 picoseconds.

Our replies crossed... the timing is being done by external hardware that has the resolution/accuracy/stability to make such measurements. It sends several chunks of data to the Arduino via SPI, and an algorithm converts those values into a float result. I suppose the algorithm could be reworked to return integer picoseconds if that made a difference.

Why is the computation impossible? Because of the length of the counter variable? If so, would using a shorter counter with a separate rollover counter solve the problem? I can certainly do that.

There is a fundamental incompatibility between floating-point and integer math on any computer.

Floating point adds and subtracts require that the decimal places of the two numbers first be aligned, and the result is almost never exact -- there is almost always a loss of precision.

Example: try adding 1.0 and 1.0E-9 on an Arduino.

If you are timing in picoseconds, how could it possibly make sense to consider intervals that require long long integers to store milliseconds? That implies time measurements that span 28 orders of magnitude!

Floating point has a limited precision - see Floating-point arithmetic - Wikipedia and https://www.arduino.cc/en/Reference/Float (specifically the bit about 6-7 decimal digits of precision)

I think you're better off keeping them all as unsigned numbers. They can be as many digits as you want.
Here's a page Nick Gammon did on big numbers:

This measurement system may run for years between reboots. It's logging perhaps only a few measurements per hour, but uses picosecond resolution to measure devices over both short and long time scales. One of the applications is characterizing atomic clocks, measuring their deviation to parts in 10^14 over periods of days to months and hopefully years.

This is actually a replacement of an existing system that uses a 30 year old Hewlett Packard time interval counter. We're hoping to use modern hardware to match the existing performance with less size/cost/power.

We've tested a chip that provides the high resolution timer, and are now trying to integrate that into a complete system. The challenge is that the chip has a maximum measurement range of a few milliseconds. So the idea is to use the millisecond coarse clock to extend the measurement range. The precision counter interpolates between the millisecond ticks of the coarse counter.

Two additional points:

  1. The coarse clock is derived from a high quality frequency standard; we're not relying at all on the capability of the Arduino's clock.

  2. The granularity of the precision counter is around 50 ps and its standard deviation is 50-100 picoseconds. Its output data is formatted as nanoseconds to 3 decimal places (e.g., 1.023 equals 1023 picoseconds). If there is a loss of precision at the third decimal place, it won't be noticed in the noise.

Why would you want to use an Arduino (an extremely limited toy, intended for learning and experimentation) with equipment that has to run for years? The clock on the Arduino can't keep accurate time of day, for even a single day.

Its output data is formatted as nanoseconds to 3 decimal places (e.g., 1.023 equals 1023 picoseconds).

Multiply by 1000 and treat this as an integer. Add it to other integers.

Thanks for the multiply idea to yield integer picoseconds. Along with that, I'd also have to multiply the coarse counter by 1e9 to convert the millisecond ticks to picoseconds.

I have to figure out just when and where to do that. Incrementing the counter by 1e9 on each tick to record picoseconds rather than milliseconds would cause a rollover in just over 6 months -- the maximum unsigned long long value of ~1.8e19 picoseconds would equal ~1.8e7 seconds; there are ~3.2e7 seconds in a year.

(By the way, the coarse counter is formed by an external 1ms pulse train applied to an interrupt pin; the ISR increments the counter by 1 and returns. I'm not relying on the Arduino's clock for any of the timing.)

I'm hoping the code in the Arduino will be simple enough for high reliability -- basically poll for event trigger, grab coarse count value, get fine counter value via SPI, do some math, spit result out via USB, poll until next event. Data logging and more complex analysis are done in the host computer at the other end of the USB.

I appreciate the responses to my request; I know my problem sounds kind of weird but I'm pretty sure the overall design makes sense. It's getting all the implementation details right that's critical, and this is a great forum for that.

Why not just report both pieces of data and figure it out with post processing?

Can you post links to documentation for your two timers? Also, the code that you use that returns the float would be useful.

I understand what you want to do and it should be possible, but I need more detail about the interface to the timers.

Hi BigBobby --

I'm using a TI TDC7200 chip as the precise timer. Here is a link to its data sheet:

http://www.ti.com/lit/ds/symlink/tdc7200.pdf

I am driving the chip's START input with an external event, and its STOP input with a cycle of the 1ms coarse clock (there are some gating complexities that I'm ignoring here; those will be dealt with in hardware).

The chip returns several data registers that the Arduino pulls via SPI. A few calculations (including division) on those yield the START to STOP time interval. The SPI communication and calculation code will be stolen from the TI example code; I don't have that handy at the moment.

When I subtract that value from the counter value associated with the clock cycle sent to the STOP input, I end up with a high-resolution timestamp of the external event.

There's no data sheet for the coarse clock. It's just a 1 kHz pulse train derived from an external high-stability oscillator. Those pulses will go to one of the Arduino hardware interrupts, and the associated ISR will increment the coarse counter (the dreaded "unsigned long long" of my initial post) on each tick. I'm trying to keep the ISR as short as possible -- hopefully just "ms_counter++;" and return.

Thanks!

n8ur:
The chip returns several data registers that the Arduino pulls via SPI. A few calculations (including division) on those yield the START to STOP time interval. The SPI communication and calculation code will be stolen from the TI example code;

Ah, er, can't you just calculate the precise time interval in the same format as the coarse timer? I understand that will require writing some code vs. just copy & paste...

Thanks for posting that. As I suspected, the IC itself is already outputting integers.

You are using the device with Measurement Mode 2? If so, then you must have a function that performs the math on page 18. Could you post that function here? It should be possible to modify it so that it outputs the time as an integer in picoseconds.

I'd have to read the datasheet some more, but at first glance it doesn't seem to be designed to handle a free-run type of application. If so, then the overhead of starting and stopping the timer is going to complicate using it to get finer resolution between your ms ticks.

Hi BigBobby and JimEli --

Yes, I'm using Mode 2. The function's not written for the Arduino yet -- I'm trying to address the issues I can think of before I start writing, so the code I do write is correct.

Thanks to this conversation, I've been reading about floats and the Arduino's limitation to 4-byte values. When I implement the calculation, I'll use scaling to keep things in integers.

I still have the problem of scaling the coarse value without causing it to overflow every six months, but I'm noodling a couple of ideas to deal with that -- possibly just using another variable as an overflow counter.

Regarding the setup time, the measurement rate is relatively slow, usually 1 to 10 readings per second. So there should be ample time at the end of processing one event to set up for the next.

(The 1 kHz coarse rate was chosen because it's convenient to generate from the 10 MHz master clock, and fits neatly within the 6ms maximum time interval the chip can measure.* It's not related to measurement rate.)

Thanks again!

  * The data sheet says 8 ms maximum, but that's using an 8 MHz clock. My clock runs at 10 MHz, so the chip counters overflow sooner.

Well, here is code that does the mode 2 calculation with integer math:

#define PS_PER_SEC        (1e12)  // ps/s
#define CLOCK_FREQ        (1e8)   // Hz
#define CLOCK_PERIOD_PS   (uint32_t)(PS_PER_SEC/CLOCK_FREQ)  // ps
#define CALIBRATION2_PERIODS 10   // Can only be 2, 10, 20, or 40.

uint16_t CALIBRATION2 = 1;
uint16_t CALIBRATION1 = 2;
uint16_t TIME1 = 3;
uint16_t TIME2 = 4;
uint32_t CLOCK_COUNT1 = 5;

void mode2_calc()  // wrapped in a function so the fragment compiles
  {
    // Calculation from 8.4.2.2.1 of datasheet.

    // These registers are 23 bits, but in the example never got larger than 16 bits when CALIBRATION2_PERIODS = 10.
    CALIBRATION2 = 23133;
    CALIBRATION1 = 2315;

    // These registers are 23 bits, but in the example never got larger than 16 bits.
    // If I understand their example correctly, they should never be larger than that in your application either.
    TIME1 = 2147;
    TIME2 = 201;

    // This register is 23 bits, but in the example never got larger than 16 bits.
    // If I understand their example correctly, I think it will be >16 bits in your case.
    //  CLOCK_COUNT1 = 3814; // Datasheet said 39.855us results in 3814 count.
    CLOCK_COUNT1 = (uint32_t)(1.0F * 3814*1000000/39855 + 0.5); // If 39.855us results in 3814, then 1ms should result in proportionally more.

    uint32_t calc_time;
    uint16_t tempu16;
    uint32_t tempu32;

    // Perform calculation while measuring the calculation time.
    calc_time = micros();
    tempu16 = (TIME1 - TIME2); // since TIME1 will always be > TIME2, this is still 16 bit.
    tempu32 = tempu16 * (CLOCK_PERIOD_PS * (CALIBRATION2_PERIODS-1)); // After multiplying by the constants, you will now be a 32 bit number.
    tempu32 = (tempu32 + ((CALIBRATION2 - CALIBRATION1 + 1) >> 1)) / (CALIBRATION2 - CALIBRATION1); // This division sort of sucks, but since I assume these must be variables there's no way around it.
    tempu32 += CLOCK_COUNT1 * CLOCK_PERIOD_PS; // Add in another 32bit variable.  Given the limitations on inputs, these two 32 bits still won't overflow.
    calc_time = micros() - calc_time; // Calculate the time it took for function.

    LT_printf(F("calculated %lu in %lu us\r\n"), tempu32, calc_time);
  }

If you test it with CLOCK_COUNT1 = 3814, the output is:
"calculated 38148413 in 48 us"

Oddly, this doesn't match the number on page 18 of the datasheet, because TI did a calculation wrong. They said that normLSB = (1/8 MHz)/2313.11 = 5.40e-11. It doesn't; it equals 4.323e-12 (gotta love that TI). If you do the calculation on page 18 correctly, you get 38.148 us, which equals 38,148,413 ps.

If you use their example to estimate the CLOCK_COUNT1 in 1ms, it should be around 95697. Using that number as input, the code output is:
"calculated 956978413 in 48 us"

You could make this calculate faster if needed, but you'll likely need to start using assembly routines for the math.

I'm still worried that you might not be able to do what you're looking to do with this part. I attached a timing diagram to explain.

In order to combine your 1ms and high resolution timers you would ideally want them synced together. At worst, you'd need to understand the shift between the two.

When you start your timer for the first time, you have no way of knowing how long the communication took to start the timer (not to the picosecond anyway).

After 1 ms, when you read the values out of it and start it again, it will take even more time that you cannot estimate to the picosecond.

If the goal of using this timer is to measure with picosecond accuracy, you are going to have trouble achieving that when you can't relate it to your 1ms timer with the same accuracy.

BTW - what are your START and STOP signals for this timer IC?

Timing_Diagram.png

Hi Bobby -- Thanks so much for the code! That's way more assistance than I was expecting. I really appreciate it.

The external signals that I'm measuring are normally pulse-per-second ticks from electronic clocks of varying type and quality, though there are other applications as well. The complete system has two identical measurement channels, and measures two PPS signals from two independent and hopefully uncorrelated sources, one a device under test, and the other a reference (like a GPS, or a clock of higher quality than the DUT). By looking at changes in the time difference between them over time, we can determine the frequency offset, aging, stability, and other statistics of the DUT compared to the reference. The time interval between the two signals could be anywhere from a few hundred nanoseconds to a few hundred milliseconds.

Again, I'm NOT trying to do a measurement for each cycle of the coarse timer. It's not 1,000 measurements per second, but 1 (or maybe 10 for some configurations). If I could guarantee that the time interval between REF and DUT would be <6ms, I could just use the TDC chip by itself. But I can't -- all I can say is that the time difference is less than the measurement interval. So the coarse clock scheme is intended to extend the measurement range to meet that requirement.

At a high level, here's the program logic:

In setup(), use SPI to send configuration to the TDC7200s on both channels, and then send each the start command. Set up the coarse clock interrupt ISR, so the coarse counter increments on each interrupt.

Within an infinite loop, poll the various hardware pins for flags indicating measurement activity:

  1. The channel 0 TDC7200 waits patiently until it detects an incoming edge on its START pin. Then its measurement cycle starts automatically.

  2. The incoming edge also starts a 2-bit hardware counter clocked at 10 MHz to generate a pulse after a 300 nanosecond delay. That pulse opens a "gate" (probably a flip-flop, but I'm still working through this part of the design) that passes the next pulse of the coarse clock to the 7200 STOP pin, then closes. The circuit ensures that the selected 1 ms clock edge meets the minimum and maximum interval requirements of the 7200. The actual time between START and the gated clock edge could be anywhere from 300 nanoseconds to (2 ms + 300 ns) if we just miss an edge and have to wait for another. (I know there are some additional timing subtleties here, but I'm trying not to get too far into the weeds.)

  3. The gated clock pulse is also routed to an input pin on the Arduino. When the Arduino sees that it's high, the current contents of the coarse timer are copied into a variable (call it channel0_coarse for that channel). The pulse has a 50/50 duty cycle, so there's a reasonable time for the Arduino to catch the signal and copy the counter value before the next increment.

  4. The 7200 sets its interrupt pin when its measurement is complete, and when the Arduino catches that, it initiates SPI communication to grab the data. As soon as that transaction is complete, it sends the "start" message, arming the chip for the next measurement.

  5. The Arduino is also polling for the same set of events on channel 1, which is waiting for the other PPS source. Channel 1 is processed the same as channel 0 (into its own set of data variables, of course).

  6. When data from both channels is in hand, the Arduino calculates the timestamp for each channel by subtracting its '7200 time interval from its channel_coarse value, and outputs the channel0 and channel1 timestamps, and optionally their difference, via USB.

As long as the loop completes faster than the measurement rate (in this example, once per second), the 7200s will be armed and waiting when the next set of PPS signals arrive. The loop spends most of its time polling the hardware pins for activity, reading the data when needed, and only does calculations after data from both channels is in hand. By this time, both channels are already armed and ready for the next cycle.

Although the master 10 MHz clock and the 1 kHz clock are in fact synchronous, I can't see that that's a requirement for this to work. And of course the external PPS signals are asynchronous, because the whole point is to measure their arrival time.

Am I missing something? I hope I'm not just adding to the confusion. (And, thanks again.)