DS3234 is gaining 2 seconds per day

I am building a project which requires reasonable timekeeping over about 1 year without reference to an external accurate time source. The DS3234 seemed to fit the bill with approximately ±2 ppm accuracy, which was good enough for the application. I chose a DS3234 breakout board for ease of use.

I am now running the project in its final test phase, where it runs for a month at a time to confirm that it works and shows no evident bugs. I am seeing the DS3234 gain approximately 2 seconds per day, which is both disappointing and unacceptable, being an order of magnitude worse than expected.

Given the state of the project I am unwilling to seek an alternative and also I cannot see why the DS3234 is not doing its job as expected.

The project program receives an interrupt from the SQW pin of the DS3234 breakout board which has been programmed to produce a 1Hz square wave. It feeds the MEGA's pin2.

// set  interrupt 2 to interrupt on the falling edge of the 1 second pulses from the RTC
  attachInterrupt(2, RTC_interrupt, FALLING);

I do not attempt to read the DS3234 in the interrupt routine but rather set a flag and read it in loop(). My scope says the square wave looks good and I can flash a led controlled by software as I go through loop() driven by the same flag if I want. The interrupt works as expected.

Initially I was reading the time on a rising edge, but I have also tried the falling edge and the falling edge delayed by 200 ms using millis(). I tried this to test the hypothesis that reading the registers while they were being updated was interfering with the time update. Reading on the rising edge, the falling edge and the delayed falling edge all have the same effect: the time gains 2 seconds a day.

Here is my initialisation code:

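// set the RTC control register 0x0E (0x8E is its write address)
int RTC_init()
{
  // write 0x40 to the control register
  digitalWrite(cs, LOW);
  SPI.transfer(0x8E);
  SPI.transfer(0x40);
  digitalWrite(cs, HIGH);
  delay(10);
  return 0;   // the function is declared int, so it must return a value
}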

// update the RTC control/status register 0x0F (0x8F is its write address)
void RTC_control_status_update()
{
  // write 0x48 to the control/status register
  digitalWrite(cs, LOW);
  SPI.transfer(0x8F);
  SPI.transfer(0x48);
  digitalWrite(cs, HIGH);
  delay(10);
}

Please note that I am setting the control register 0x0E (0x8E in write mode) to 0x40, which does NOT set bit 5. Some examples I have seen set bit 5 on the basis that it enables the temperature compensation. I read the datasheet differently and assert that bit 5 triggers a manual temperature compensation operation and does not need to be set. See page 14 of the DS3234 data sheet identified 19-5339; Rev 3; 7/10 from Dallas/Maxim. I could be wrong, though, and if I am then I am happy to change the start-up value to 0x60.
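For reference, the two candidate values can be decoded against the control register bit layout. This is a sketch based on one reading of the DS3234 register map (the same layout as the DS3231 control register); the mask and helper names are chosen here for illustration and should be checked against the datasheet before relying on them.

```cpp
#include <cstdint>

// Bit masks for the DS3234 control register (0x0E, written via 0x8E),
// as read from the datasheet's register map. Verify before use.
constexpr uint8_t EOSC  = 0x80; // 1 = stop the oscillator when running on battery
constexpr uint8_t BBSQW = 0x40; // battery-backed square-wave enable
constexpr uint8_t CONV  = 0x20; // request one user-initiated temperature conversion
constexpr uint8_t RS2   = 0x10; // square-wave rate select, high bit
constexpr uint8_t RS1   = 0x08; // square-wave rate select, low bit
constexpr uint8_t INTCN = 0x04; // 0 = square wave on the INT/SQW pin, 1 = alarm interrupt
constexpr uint8_t A2IE  = 0x02; // alarm 2 interrupt enable
constexpr uint8_t A1IE  = 0x01; // alarm 1 interrupt enable

// True when the INT/SQW pin is configured to output the square wave.
constexpr bool squareWaveEnabled(uint8_t control) { return (control & INTCN) == 0; }

// True when the rate-select bits choose 1 Hz (RS2 = RS1 = 0).
constexpr bool isOneHertz(uint8_t control) { return (control & (RS2 | RS1)) == 0; }

// True when the value asks for a manual temperature conversion.
constexpr bool requestsConversion(uint8_t control) { return (control & CONV) != 0; }
```

Under this reading, 0x40 and 0x60 both select a 1 Hz square wave; the only difference is that 0x60 also requests a single conversion at start-up.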

I have just noticed that I am programming the temperature compensation bits in register 0x0F to a less frequent value than the default, and I will try this at the default. It's cool here at the moment and it could be a temperature effect that is not properly compensated.

The 0x60 is not the fix however as I have tried it and it still gives a gain of 2 seconds a day.

Note also that I have 2 breakout boards and they both behave the same way. Not all boards have been tried in all combinations, but I do not believe either is faulty. It is something I am doing. Finally, I am not playing about with the ageing registers, so that should not be it.

I am perplexed. Has anyone else seen this problem, and are there any ideas?

Regards, Fred.

It feeds the MEGA's pin2.

If you are referring to the ATMega328 or variation, the code that follows is wrong.

attachInterrupt(2, RTC_interrupt, FALLING);

The pin 2 interrupt is number 0, which is the first argument to the attachInterrupt() function.

If you are referring to the Arduino Mega, interrupt 2 is on pin 21.
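The mapping for the Arduino Mega can be written down as a small lookup. This is only an illustrative helper (the array and function names are made up here); on current Arduino cores, `attachInterrupt(digitalPinToInterrupt(pin), ...)` avoids the manual lookup altogether.

```cpp
// Arduino Mega: attachInterrupt() interrupt number -> board pin.
constexpr int kMegaInterruptPins[] = {2, 3, 21, 20, 19, 18};

// Returns the board pin for an interrupt number, or -1 if the number is out of range.
constexpr int megaPinForInterrupt(int interruptNumber) {
    return (interruptNumber >= 0 && interruptNumber < 6)
               ? kMegaInterruptPins[interruptNumber]
               : -1;
}
```

So `attachInterrupt(2, ...)` on a Mega listens on pin 21, while pin 2 itself is interrupt 0.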

Hi urbantigerau,

Maybe you have accidentally written a value to the crystal ageing offset register 0x10 (I know you say not) or, failing that, you could try writing something there to adjust the clock speed. I use the DS3231 and it's very accurate, at about 1 second per month. I also have a DS1307 that loses 7.5 seconds per day, but to overcome this, at midnight I read the time, add 8 seconds, wait 0.5 seconds and write the time. This now keeps overall time to within about 3 seconds per month, though it obviously drifts during the day. I could do the read/write every hour to reduce drift, but for a word clock it's good enough.

The RTC_control_status_update function turns on the 32kHz output. You didn't mention using that.

It might also help to see all your code, especially the loop() function.

Pete

urbantigerau:
I do not attempt to read the DS3234 in the interrupt routine but rather set a flag and read it in loop(). My scope says the square wave looks good and I can flash a led controlled by software as I go through loop() driven by the same flag if I want. The interrupt works as expected.

Can you clarify something here? Are you reading the clock per se? That is, asking it the time? Or are you just getting it to generate a pulse and picking that up in an interrupt?

Thanks to all of you who have taken the time to help me.

Paul S from Seattle is quite correct about the interrupt. When I wrote it I meant interrupt but said pin 2. Sorry for the confusion. As I noted, though, the interrupt is working. For interest, I also use interrupt 3 on pin 20 to detect mains failure. Pin 20 is fed from the power supply, which gives a pulse (suitably filtered and squared) every half cycle. That's every 10 ms, the mains frequency here being 50 Hz. In a kind of software monostable, the interrupt resets a counter to 12 ms, which is then counted down in loop() every ms. If this counter ever reaches zero the mains has stopped and I have detected the failure in at most 1 half cycle. This also works fine. I would have preferred to use 11 ms for the time but it false triggered sometimes, and 12 ms is completely reliable.
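The software monostable described above can be modelled roughly as follows. This is a host-side sketch, not the project code: the struct, its member names and the latching `failed` flag are assumptions made for illustration. In the real sketch `onHalfCycle()` would be the body of the interrupt 3 handler and `tickMs()` would be called once per millisecond from loop().

```cpp
#include <cstdint>

// Timeout reloaded on every mains half-cycle pulse, in milliseconds.
constexpr uint8_t kMainsTimeoutMs = 12;

struct MainsMonitor {
    volatile uint8_t remainingMs = kMainsTimeoutMs; // shared with the ISR
    bool failed = false;                            // latches once the mains is judged lost

    // Called from the half-cycle interrupt: retrigger the monostable.
    void onHalfCycle() { remainingMs = kMainsTimeoutMs; }

    // Called once per millisecond from loop(): count down towards zero.
    void tickMs() {
        if (remainingMs > 0) {
            remainingMs = remainingMs - 1;
            if (remainingMs == 0) {
                failed = true; // no pulse arrived within the timeout
            }
        }
    }
};
```

With pulses arriving every 10 ms the counter never gets below 2, so the flag stays clear; once the pulses stop, the flag is set on the twelfth tick.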

Thanks too to Riva from the UK. I sympathise with him on the DS1307. I used one of those in another clock (binary readout) which used the mains as its primary tick and the 1307 to hold the time up should the mains go down. The mains time was excellent and I updated the 1307 just after midnight daily to keep the two times in synch. I am pretty sure I am not altering the ageing registers but I will be checking that. I had considered using the ageing registers but that is a palliative. I am certain (and I have evidence) that it is something that I am doing.

Thanks also to Pete, who eagle-eyed that I had turned on the 32kHz chip output, which is a bit useless given that it does not come off the breakout board. I've fixed that but I don't expect it will alter the problem. I will post the code if I have to, Pete, but I will avoid it if I can. This isn't a 15-liner that you can just scan over to identify the issue. Loop() is the equivalent of a good thousand lines or so and implements a quite complex state machine (anyone else remember state machines??) to cover the user interface and the operation of the project. The state machine also allows recovery from a power failure, which brings me to another feature of the DS3234 that I am using. The project has a number of status flags it must remember should the mains fail, as it needs those to recover properly when the power returns. Cyclically (typically every 700 seconds) the project writes these parameters to the DS3234 RAM. When the power recovers the parameters are read back from the DS3234, placed into the right program variables and the state machine figures out what it has to do to recover. It may be that this is where I am going wrong and it is an avenue I will investigate.

Finally, Nick, the process is that the clock interrupts and the interrupt routine sets a boolean variable (a semaphore) to true and does nothing else. In loop() I check the boolean and if it is false I do nothing more with it. If it is true I call a routine to read the DS3234 time registers and update the associated program time variables. Then I clear the boolean to false so that nothing happens until the next interrupt. As far as I can see this is working.

It so happens that I have another project under development here which includes another DS3234 and a GPS receiver. I have modified the code on that project so that on start up, optionally under the control of a hardware switch connected to an input pin, the GPS time (only when the GPS has a valid lock) is loaded into the DS3234 and then the 2 run independently with both the GPS and DS3234 times displayed on its LCD. Once started and synchronised I changed the position of the switch and update of the DS3234 is inhibited so that another power up will not reload the DS3234 from the GPS.

The target here is to observe both the GPS and the DS3234 time together over a period. I did that yesterday and guess what? The GPS and DS3234 times are in lock step! This project uses code copied from the troublesome project but does NOT use the RAM of the DS3234. This evening (9:30PM here) I swapped the DS3234 in the troublesome device into the GPS project. So far lock step but too soon to say. If they remain at the same time this will definitively rule out a hardware problem with the DS3234. I will update with progress.

Further I have connected DS3234 to the MEGA removed from the troublesome project on a simple breadboard and am letting it run the simplest of code to read the time, delay 1000 and so on and display that via Serial.print to the PC display. If it stays on time then it is definitely a problem somewhere in my thousand odd lines of loop() code. When I find it I am sure it will be blindingly obvious - are they not always so?

I think I need to go back to my troublesome project and start disabling functions that affect the DS3234 until the DS3234 comes good. A bit like pulling boards out of HAL until the bad behaviour stops. Easier said than done as that state machine I talked of earlier will get upset.

When I have done a bit more I'll write back. Thanks to all, Fred.

urbantigerau:
(anyone else remember state machines??)

Yes.

http://www.gammon.com.au/serial

I will post the code if I have to Pete but I will avoid it if I can

What I would have looked for in the loop() code was somewhere that you are reading or writing a DS3234 register on every entry to the loop function. If you access any one or more of those registers very frequently it will alter the clock speed.

Pete

Hi All,

I have some further observations.

  1. All the DS3234 I own run in perfect lockstep with a GPS reference over a 24 hour period. I'll let one of them run continuously for a few days and see what happens. But it is my bet they will stay in lockstep for quite some time.

  2. I have been using my PC clock as a test reference. Maybe not so good an idea. I'll re-synch it with the internet time server before I complain about the DS3234 again. Last time I tried that it adjusted a whole minute! It makes a DS1307 look good. But it is a domestic PC after all. I should have known better.

  3. The DS3234 which has been running bare bones retains synch with the GPS clock as well as I can tell by eye. I'll run it as well to see what happens.

So far it confirms that the DS3234 seems as "dead on" as claimed and that I am doing something to mess it up. Question is: What?

I am going to take Pete's advice and examine the loop() code to find all of the DS3234 accesses. This should not be impossibly hard. The code is structured and disciplined in that the state machine is implemented in nested "switch case" not spaghetti code. It is not simple but also not impossible. Every access to the DS3234 is via a function call. Nothing is "in-line". It is easy to identify.

During first-time startup and power-up recovery there can be masses (worst case nearly 3600) of DS3234 RAM accesses (writes mostly), much greater in frequency than the 1 in 700 seconds when the unit is running routinely. If these cause the time to deviate I am truly deep in the weeds. I am reliant on this functionality to be able to survive a power-off event. I could use the 1280 EEPROM instead but it has a limited number of writes and the idea has no appeal. I could wear it out in relatively few starts.

I have 3 options for testing I guess. I can add to the bare bones implementation and write frequently to the RAM and see if it disturbs the time-keeping. This has the advantage of not requiring me to modify my highly optimised state machine and the changes are evident. It does, however, involve writing a swag of code for no other useful purpose. I have nothing else to do of course.

I will check the code of the problem child to see if I am accessing the DS3234 when I should not. An access LED (or two, one for time and one for RAM) on the DS3234 might be enlightening. If it is lit like a Christmas tree I have a problem. It should only light up once a second for time access.

Alternately I can remove parts of the functionality of the problem child and see which cures the problem. Whichever way it will take time as I have to wait 24 hours to see if the time has drifted.

I will post as I find something. Thanks for your support and suggestions.

Regards,
Fred.

Hi All

Here's an update.

The DS3234 in the GPS project remains in lockstep with the GPS. Dead on so far. If there was any hardware problem I'd have seen it by now with a gain of 2 seconds in 24 hours.

The DS3234 running in the bare bones state is in lockstep with the GPS and the PC clock, provided I synch the PC clock with an Internet time server once every 12 hours. I'll be decidedly annoyed if the whole problem is a phantom caused by me failing to remember that the PC clock can't be trusted and then trusting it when checking. However, I will seek the root cause and assume for the moment that there is a problem.

I have decided to work from the bare bones up and the next step is to enable the 1 Hz interrupt and use that to initiate the time read instead of cyclic millis(). That is in place just now and we will see what happens next.

Thanks,
Fred.

Here's a further update.

I have found that, during debugging, I had moved the function call that reads the clock OUTSIDE of the if statement driven by the DS3234 semaphore. As a result I am reading the time on every pass through loop(). I moved it to fix another problem without considering the consequences.

I have confirmed that calling the read time function every loop causes the DS3234 to gain time.
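The difference between the two placements of the read call can be illustrated with a small host-side model. Everything here is hypothetical scaffolding rather than the project code: `readTime()` merely counts calls where the real routine would read the DS3234 time registers over SPI, and `onSquareWaveEdge()` stands in for the 1 Hz interrupt routine.

```cpp
#include <cstdint>

struct RtcReader {
    volatile bool tickPending = false; // semaphore set by the 1 Hz interrupt
    uint32_t reads = 0;                // number of simulated RTC time reads

    // Body of the interrupt routine: set the flag and nothing else.
    void onSquareWaveEdge() { tickPending = true; }

    // Stand-in for the routine that reads the DS3234 time registers.
    void readTime() { ++reads; }

    // Intended structure: the read happens only when the flag is set.
    void loopGated() {
        if (tickPending) {
            readTime();
            tickPending = false;
        }
    }

    // Structure with the bug: the read has drifted outside the if,
    // so the RTC is accessed on every pass through loop().
    void loopUngated() {
        if (tickPending) {
            tickPending = false;
        }
        readTime();
    }
};
```

If loop() runs a thousand times per second, the gated version touches the RTC once per second and the ungated version a thousand times per second, which matches the kind of very frequent register access Pete warned about.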

Running normally the DS3234 maintains perfect time with GPS time.

I will edit the code, once I have determined the consequences, and determine if I have rectified the problem.

Thanks to all, Fred.

I'm glad you have found the cause of the time gain. Let's hope the fix does not break something else.

There's always atomic clocks if you need more accuracy! http://www.ebay.co.uk/itm/Supper-Low-JITTER-Clock-11-2896Mhz-Rubidium-Frequency-/180497234806

(Actually, the above is being targeted at audiophiles, as if quartz isn't accurate enough!)

Yeah I got one of those on eBay quite cheaply. Gets quite hot though, and uses a moderate amount of power. You would need to have strategies in place to keep it up if you did not want to lose much time over a year.

So here’s an update:

I have the DS3234 which was troublesome running beside another project where its DS3234 is resynched to GPS time once a day or at start-up once a GPS lock has been achieved. Both display the time on separate side by side LCDs and run 24/7. Visually the original clock and the GPS synched clock are identical over the time frames tested so far, about 10 days. Not long enough for a substantive comment on accuracy but enough to say I have cured my 2 seconds a day gallop. I’ll write wasted time off to a fit of stupidity and thanks to all of those who made useful suggestions.

So far I have not noticed that I have broken anything else. My projects never get past the beta test stage really and I often only build one. Testing to find what the project should do is not a problem. Testing for what it should or should not do when someone does something unexpected is a different matter. The set of unexpected behaviour is much larger than that of expected. So far so good.

I have yet another DS3234 and I will set it to the GPS time and let all 3 of them run displaying to yet another LCD. This test project is really simple - no interrupts, etc just a cyclic read with a delay of 1s using millis(). I’ll let all 3 of them run for as long as practical and see what happens and report the results.

One final point about the DS3234: I have found it to be a bit of a cranky device to access reliably. I have found that a 22pF capacitor between SS and ground improves access reliability massively. Without the cap some devices read only sometimes; with it all devices are OK. Has anyone else had this experience?

Regards,
Fred.