[ATmega328p] "uint32_t" value doesn't roll over and sticks at "0"

Hello.

[001 The issue]

uint32_t micro_seconds = ((m<<8) + t) * (64/clockCyclesPerMicrosecond());

The issue occurs when this value overflows after reaching around 4,200,000,000. I expected the value to wrap around to 0 and then keep incrementing (e.g. 1, 2, 3, 4), but instead it stays stuck at 0.

When I print "(m<<8) + t", it increments correctly even after the overflow.
Additionally, since "64/clockCyclesPerMicrosecond()" is a constant double value, I can't understand why the result is 0.

I'm looking for the cause of this phenomenon. A solution is welcome, but it is not the primary focus of my question.

[002 Environment]
MCU : ATmega328p
IDE : Microchip Studio
System clock : 16MHz -> 14.7456MHz.
Reason for changing system clock : To reduce serial communication errors.

Because the clock changed to 14.7456 MHz, values that were previously integers have become real numbers, which causes problems with the micros() and millis() functions from the Arduino library, so I'm editing that code.

[003 Source Code]

#define F_CPU 14745600UL

#include <avr/io.h>
#include <avr/interrupt.h>
#include <stdlib.h> // itoa

#define clockCyclesPerMicrosecond() ((double)F_CPU/1000000.0) // 14.7456
volatile uint32_t timer0_overflow_count = 0;

#define BAUDRATE 9600
void uart_write(int8_t data);
void uart_write_string(int8_t * data);
void uart_write_string(const char * data);
uint32_t micros();

int main(void)
{
	/*
	millis, micros()
	*/
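	TCCR0A = TCCR0B = TIMSK0 = 0;
	TCCR0A = 0; // Normal mode (no PWM, no CTC)
	TCCR0B = (1<<CS01) | (1<<CS00); // Timer0 clock = system clock / 64
	TIFR0 = (1<<TOV0); // clear a pending overflow flag (flags are cleared by writing 1)
	TIMSK0 |= (1<<TOIE0); // enable the Timer0 overflow interrupt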
	   
	/*
	Serial
	*/
	UBRR0 = (F_CPU/(BAUDRATE*16UL)) - 1;
	// UCSR0B |= (1<<TXEN0) | (1<<RXEN0) | (1<<RXCIE0);
	UCSR0B |= (1<<TXEN0);
	UCSR0C |= (1<<UCSZ00) | (1<<UCSZ01); // Async, 8-bit data, 1 Stop bit, No Parity
	
	/*
	LED
	*/
	DDRB |= (1<<PORTB0);
	PORTB &= ~(1<<PORTB0);
	
	DDRD &= ~( (1<<PORTD5) | (1<<PORTD6) | (1<<PORTD7) );
	PORTD |= (1<<PORTD5) | (1<<PORTD6) | (1<<PORTD7);
	
	DDRB &= ~( (1<<PORTB1) | (1<<PORTB2) | (1<<PORTB3) | (1<<PORTB4) | (1<<PORTB5) );
	PORTB |= (1<<PORTB1) | (1<<PORTB2) | (1<<PORTB3) | (1<<PORTB4) | (1<<PORTB5);
	
	DDRC &= ~( (1<<PORTC0) | (1<<PORTC1) | (1<<PORTC2) | (1<<PORTC3) | (1<<PORTC4) | (1<<PORTC5) );
	PORTC |= (1<<PORTC0) | (1<<PORTC1) | (1<<PORTC2) | (1<<PORTC3) | (1<<PORTC4) | (1<<PORTC5);
	
	sei();
	
	uint32_t previous_led_time = 0;
	while (1)
	{
		uint32_t now = 0;
		now = micros();
		
		if(now - previous_led_time >= 10000000)
		{
			if(PORTB & (1<<PORTB0))
			{
				PORTB &= ~(1<<PORTB0);
				previous_led_time = micros();
			}
			else
			{
				PORTB |= (1<<PORTB0);
				previous_led_time = micros();
			}
		}
	}
}

uint32_t micros()
{
	uint32_t m;
	uint8_t oldSREG = SREG;
	cli();
	m = timer0_overflow_count;
	uint8_t t = TCNT0;
	if((TIFR0 & (1<<TOV0)) && (t<255))
	{
		m++;
	}
	SREG = oldSREG;
	
	uint32_t micro_seconds = ((m<<8) + t) * (64/clockCyclesPerMicrosecond());
	
	char _buffer1[100];
	uart_write_string("micros : ");
	ultoa(micro_seconds, _buffer1, 10);
	uart_write_string(_buffer1);
	uart_write_string("   ");
	
	char _buffer2[100];
	uart_write_string("overflow count : ");
	ultoa(m, _buffer2, 10);
	uart_write_string(_buffer2);
	uart_write_string("   ");
	
	char _buffer3[100];
	uart_write_string("m<<8 + t : ");
	ultoa(((m<<8) + t), _buffer3, 10);
	uart_write_string(_buffer3);
	uart_write_string("\r\n");

	return micro_seconds;
}

void uart_write(int8_t data){
	while(!(UCSR0A & (1<<UDRE0)));
	UDR0 = data;
}

void uart_write_string(int8_t * data)
{
	while(*data != '\0')
	{
		uart_write(*data++);
	}
}

void uart_write_string(const char * data)
{
	while(*data != '\0')
	{
		uart_write(*data++);
	}
}

ISR(TIMER0_OVF_vect)
{
	timer0_overflow_count++;
}

After running the MCU for about 70 minutes, the micro_seconds value overflows after reaching around 4200000000 and becomes 0.

[004 Error Screenshots]


As shown in the screenshots, (m<<8) + t increments correctly, and (64 / clockCyclesPerMicrosecond()) is constant, so the result should not be 0, but it is.

This is not a one-off; it happens consistently. I repeated the test 4 times.

Before the 70-minute mark, the micros() function works fine.

[005 Experiment of uint32_t rollover]
Below is an experiment that checks uint32_t overflow behavior.

I confirmed that a uint32_t value keeps incrementing by 1 after it overflows. I could not find any case in which it stays at 0 after the overflow.

while (1) 
    {		
		uint32_t value_before_overflow = 4294967295;
		char _buffer1[15] = {' '};
		uart_write_string("value_before_overflow : ");
		ultoa(value_before_overflow, _buffer1, 10);
		uart_write_string(_buffer1);
		uart_write_string("\r\n");
		
		char _buffer2[15] = {' '};
		uart_write_string("value plus 1 : ");
		ultoa(value_before_overflow + 1, _buffer2, 10);
		uart_write_string(_buffer2);
		uart_write_string("\r\n");
		
		char _buffer3[15] = {' '};
		uart_write_string("value plus 2 : ");
		ultoa(value_before_overflow + 2, _buffer3, 10);
		uart_write_string(_buffer3);
		uart_write_string("\r\n");
		
		char _buffer4[15] = {' '};
		uart_write_string("value plus 3 : ");
		ultoa(value_before_overflow + 3, _buffer4, 10);
		uart_write_string(_buffer4);
		uart_write_string("\r\n");
		
		char _buffer5[15] = {' '};
		uart_write_string("value multipyling 2 : ");
		ultoa(value_before_overflow * 2, _buffer5, 10);
		uart_write_string(_buffer5);
		uart_write_string("\r\n");
		
		char _buffer6[15] = {' '};
		uart_write_string("value multipyling 4 : ");
		ultoa(value_before_overflow * 4, _buffer6, 10);
		uart_write_string(_buffer6);
		uart_write_string("\r\n");
		
		char _buffer7[15] = {' '};
		uart_write_string("value + 105 + multipyling 4 : ");
		uint8_t plus = 105;
		ultoa((value_before_overflow + plus) * 4, _buffer7, 10);
		uart_write_string(_buffer7);
		uart_write_string("\r\n");
		
		char _buffer8[15] = {' '};
		uart_write_string("((value << 8) + 105 * 4) : ");
		ultoa(((value_before_overflow<<8) + plus) * 4, _buffer8, 10);
		uart_write_string(_buffer8);
		uart_write_string("\r\n");
		
		char _buffer9[15] = {' '};
		uart_write_string("((value << 8) + 105 * 4.1) : ");
		ultoa(((value_before_overflow<<8) + plus) * 4.1, _buffer9, 10);
		uart_write_string(_buffer9);
		uart_write_string("\r\n");
		
		char _buffer10[15] = {' '};
		uart_write_string("((value << 8) + 105 * (64/13.8) : ");
		ultoa(((value_before_overflow<<8) + plus) * (64/13.8), _buffer10, 10);
		uart_write_string(_buffer10);
		uart_write_string("\r\n");
		
		char _buffer11[15] = {' '};
		uart_write_string("((value << 8) + 105 * (64/14.7456) : ");
		ultoa(((value_before_overflow<<8) + plus) * (64/14.7456), _buffer11, 10);
		uart_write_string(_buffer11);
		uart_write_string("\r\n");
		
		#define clockCyclesPerMicrosecond() ((double)F_CPU/1000000.0)
		char _buffer12[15] = {' '};
		uart_write_string("( ((value<<8) + 105) * (64 / clockCyclesPerMicroseconds()) ) : ");
		ultoa( ( ((value_before_overflow<<8) + plus) * (64 / clockCyclesPerMicrosecond()) ), _buffer12, 10);
		uart_write_string(_buffer12);
		uart_write_string("\r\n");
		
		value_before_overflow = 3865411;
		char _buffer13[15] = {' '};
		uart_write_string("((3865411 << 8) + 105) / 4.1 : ");
		ultoa( ( (value_before_overflow<<8) + plus ) / 4.1, _buffer13, 10);
		uart_write_string(_buffer13);
		uart_write_string("\r\n");
		
		value_before_overflow = 3865478;
		char _buffer14[15] = {' '};
		uart_write_string("((3865478 << 8) + 105) / 4.1 : ");
		ultoa( ( (value_before_overflow<<8) + plus ) / 4.1, _buffer14, 10);
		uart_write_string(_buffer14);
		uart_write_string("\r\n");
		
		value_before_overflow = 3866000;
		char _buffer15[15] = {' '};
		uart_write_string("((3866000 << 8) + 105) / (64 / clockCyclesPerMicroseconds() ) : ");
		ultoa( ( (value_before_overflow<<8) + plus ) / ( 64 / clockCyclesPerMicrosecond() ), _buffer15, 10);
		uart_write_string(_buffer15);
		uart_write_string("\r\n");
			
		// result = ((m<<8) + t) * (64/clockCyclesPerMicrosecond());
		
		/*
		value plus 1 : 0
		value plus 2 : 1
		value plus 3 : 2
		value multipyling 2 : 4294967294
		value multipyling 4 : 4294967292
		value + 105 + multipyling 4 : 416
		((value << 8) + 105 * 4) : 4294966692
		((value << 8) + 105 * 4.1) : 4294967295
		((value << 8) + 105 * (64/13.8) : 4294967295
		((value << 8) + 105 * (64/14.7456) : 4294967295
		( ((value<<8) + 105) * (64 / clockCyclesPerMicroseconds()) ) : 4294967295
		((3865411 << 8) + 105) / 4.1 : 241352528
		((3865478 << 8) + 105) / 4.1 : 241356704
		((3866000 << 8) + 105) / (64 / clockCyclesPerMicroseconds() ) : 228026000
		*/
		_delay_ms(3000);	
	}
}

Produce a simple example to see what is going on.

void setup() {
   Serial.begin(115200);
   uint32_t xyz = 1000000000UL;
	
   Serial.print("Example 1 : ");
   Serial.println((uint32_t)(xyz * 5));
   Serial.print("Example 2 : ");
   Serial.println((uint32_t)(xyz * 5.0));
   Serial.print("Example 3 : ");
   Serial.println((uint32_t)5000000000.0);
}

void loop() {
}

produces

Example 1 : 705032704
Example 2 : 4294967295
Example 3 : 4294967295

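Casting a floating-point value that is larger than the range of uint32_t results in UINT32_MAX here (strictly speaking, the C and C++ standards leave the result of such an out-of-range conversion undefined). It does not wrap like unsigned multiplication does.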
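
The two behaviours can be compared side by side off-target. The following is a minimal sketch, not code from this thread: `integer_product` and `wrap_to_u32` are hypothetical names, and `wrap_to_u32` assumes a non-negative input. It performs the modulo-2^32 reduction explicitly with fmod() before converting, so the conversion itself never sees an out-of-range value. On avr-gcc, where double is commonly a 32-bit type, the low-order digits of values this large would already be lost before the reduction.

```cpp
#include <math.h>
#include <stdint.h>

// Integer path: unsigned arithmetic is defined to wrap modulo 2^32.
uint32_t integer_product(void)
{
    uint32_t xyz = 1000000000UL;
    return xyz * 5;  // 5,000,000,000 reduced modulo 2^32
}

// Floating-point path (hypothetical helper): reduce modulo 2^32 first,
// then convert. Precondition: value >= 0.
uint32_t wrap_to_u32(double value)
{
    const double two_pow_32 = 4294967296.0;
    double reduced = fmod(value, two_pow_32);  // now 0 <= reduced < 2^32
    return (uint32_t)reduced;                  // always in range
}
```

With a 64-bit double, `wrap_to_u32(5000000000.0)` gives the same 705032704 that the integer multiplication in Example 1 produces.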

Thank you for your reply; it helped me understand the concepts.

However, [005 Experiment of uint32_t rollover] is a simple example I had already written.

In that example, I never see the value become 0; instead, it reaches UINT32_MAX.

As you mentioned, the value should reach UINT32_MAX, but my actual code prints 0, and I can't understand why this is happening.

Your example is anything but simple.

The original code relied on unsigned multiplication rolling over. Casting floats outside of the range of uint32_t does not roll over. End of story.
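
If rollover is what the calling code relies on, one option is to keep the whole calculation in integers. This is a minimal sketch, assuming the prescaler of 64 and F_CPU = 14745600 from the first post, so that one timer tick lasts 64,000,000 / 14,745,600 = 625/144 microseconds. `ticks_to_micros` is a hypothetical name, and the 64-bit intermediate costs extra cycles on an 8-bit AVR.

```cpp
#include <stdint.h>

// Convert Timer0 ticks (prescaler 64, 14.7456 MHz) to microseconds.
// ticks * 625 / 144 is computed in 64 bits so the product cannot overflow;
// the final conversion to uint32_t wraps modulo 2^32 by definition.
uint32_t ticks_to_micros(uint32_t ticks)
{
    uint64_t scaled = (uint64_t)ticks * 625u / 144u;
    return (uint32_t)scaled;
}
```

For the tick value used in the later test, (3866000 << 8) + 105 = 989696105, this returns 588715 rather than saturating. One caveat: the 32-bit tick count itself wraps after 2^32 ticks (roughly five hours at this clock), and the result jumps at that point, so a complete fix would also need a wider tick counter.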

I don't see a float in the expression. The division (64/clockCyclesPerMicrosecond()) is an integer division.

clockCyclesPerMicrosecond() is a macro.

F_CPU is a defined value passed at compile time, like -DF_CPU=240000000L, so it is a long that you divide by a long ➜ the result is a long, not a double.

When you do (64/clockCyclesPerMicrosecond()), 64 is an int that you divide by a long, so 64 is promoted to long and divided by clockCyclesPerMicrosecond(), and the result is a long, not a double.

There is no double in that story.


How did you change the clock to 14.7456 MHz, and did you modify the compiler's defined value of F_CPU that is passed by the IDE to 14745600?

Also, even if you did modify that, the macro would still be evaluated with integer maths, and 14745600 / 1000000 = 14, not 14.7456.
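
Whether the truncation described here applies depends on which definition of the macro is in effect. The sketch below puts the integer form next to the floating-point form from the first post; the constant names are hypothetical and the clock value is written out literally to keep the block self-contained.

```cpp
#include <stdint.h>

// Integer form: both operands are integers, so the quotient is truncated.
static const uint32_t kCyclesInteger = 14745600UL / 1000000UL;       // 14

// Floating-point form, as in the first post: the cast makes this a double.
static const double kCyclesDouble = (double)14745600UL / 1000000.0;  // 14.7456

// Microseconds per timer tick with a prescaler of 64.
static const uint32_t kTickIntegerMath = 64 / kCyclesInteger;        // 64 / 14 = 4
static const double kTickFloatingMath = 64 / kCyclesDouble;          // about 4.34
```

With the integer form the scale factor is exactly 4, which is why the cast to double in the poster's macro changes the result.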

Hello,

Thank you for your reply.
It helped me a lot in understanding the concept.

I now understand that rollover does not happen when casting floats outside the range of uint32_t.
However, I still can't figure out why my code prints 0.

Here's the example code I wrote for testing:

char _buffer16[100];
uint32_t fake_overflow_count = 3866000;
uint8_t fake_timer_count = 105;
uint32_t result = ((fake_overflow_count << 8) + fake_timer_count) * (64 / clockCyclesPerMicrosecond());
ultoa(result, _buffer16, 10);
uart_write_string("result : ");
uart_write_string(_buffer16);
uart_write_string("\r\n");

With this code, it prints UINT32_MAX instead of 0.
But my original code, which is basically the same, prints 0, not the maximum value.

This seems different from just "not rolling over," and I can't understand why it's happening.

No matter what I try (double values, different casts, or other approaches), I can't get my example code to print 0.
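
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: the inputs in the test are compile-time constants, so the compiler may evaluate the conversion itself, whereas micros() converts at run time through a library routine. Because an out-of-range floating-point to integer conversion is undefined in C and C++, the two paths are not required to agree. The sketch below reads the inputs through volatile variables so the arithmetic cannot be folded at compile time, and checks the range before converting. `Conversion`, `fits_in_u32`, and `scale_ticks` are hypothetical names.

```cpp
#include <stdint.h>

struct Conversion {
    bool in_range;   // false when the scaled value does not fit in uint32_t
    uint32_t value;  // only meaningful when in_range is true
};

// True when converting value to uint32_t is well defined.
bool fits_in_u32(double value)
{
    return value >= 0.0 && value < 4294967296.0;
}

Conversion scale_ticks(uint32_t overflow_count, uint8_t timer_count)
{
    // volatile reads keep the compiler from folding the arithmetic away
    volatile uint32_t m = overflow_count;
    volatile uint8_t t = timer_count;

    uint32_t ticks = (m << 8) + t;
    double scaled = ticks * (64 / 14.7456);

    Conversion result = { false, 0 };
    if (fits_in_u32(scaled)) {
        result.in_range = true;
        result.value = (uint32_t)scaled;  // safe: range was checked
    }
    return result;
}
```

Called with 3866000 and 105, the values from the test above, `scale_ticks` reports `in_range == false`, which marks the point where the original expression stops being well defined.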

Check the value of this: the byte is promoted to an int before the shift. Is this what you had in mind?

As mentioned before, there is no double or float in your stuff.

Thank you for your kind reply.

However, in my case, I'm dealing with a double value.

uint32_t micro_seconds = ((m << 8) + t) * (64 / clockCyclesPerMicrosecond());

Here, 64 / clockCyclesPerMicrosecond() results in a double, so the expression ((m << 8) + t) is implicitly converted to double during the calculation.

This behavior follows the compiler’s type promotion rules.
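
The types involved can be checked by the compiler instead of argued about. This is a minimal sketch, assuming a C++11 compiler; `same_type` is a hypothetical stand-in for std::is_same (written out because AVR toolchains often ship without the C++ standard library headers), and the macro repeats the definition from the first post with the clock value written out literally.

```cpp
#include <stdint.h>

// Hypothetical stand-in for std::is_same.
template <typename A, typename B> struct same_type { static const bool value = false; };
template <typename A> struct same_type<A, A> { static const bool value = true; };

#define clockCyclesPerMicrosecond() ((double)14745600UL / 1000000.0)

// int / double: the int operand is converted, the result is double.
constexpr bool kDivisionIsDouble =
    same_type<decltype(64 / clockCyclesPerMicrosecond()), double>::value;

// uint32_t * double: the integer operand is converted, the result is double.
constexpr bool kProductIsDouble =
    same_type<decltype((uint32_t)1 * (64 / clockCyclesPerMicrosecond())), double>::value;

// For comparison, int / unsigned long stays an integer type.
constexpr bool kIntegerDivisionIsUnsignedLong =
    same_type<decltype(64 / (14745600UL / 1000000UL)), unsigned long>::value;

static_assert(kDivisionIsDouble, "int / double should be double");
static_assert(kProductIsDouble, "uint32_t * double should be double");
static_assert(kIntegerDivisionIsUnsignedLong, "int / unsigned long should be unsigned long");
```

If any of these assumptions about the types were wrong, the file would fail to compile.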

Because I'm using Microchip Studio instead of the Arduino IDE, this is plain ATmega328P code, and F_CPU is 14745600UL.

No, the result has an unsigned long type.

Please explain how the rules give you a double when dividing an int by an unsigned long.

I debug-printed the value, and it turned out to be a double.

64 / clockCyclesPerMicrosecond() results in 4.34027770.

Refer to this.

"int / unsigned long" outputs double.

Please show the full code.

It seems that you don't fully understand a compiler's type promotion rules.
Please provide a quote that says dividing two integers gives a double.

char _buffer4[100];
uart_write_string(" 64 / clockCyclesPerMicrosecond() : ");
dtostrf(64/clockCyclesPerMicrosecond(),18,8,_buffer4);
uart_write_string(_buffer4);
uart_write_string("\r\n");

This outputs 4.34027770, because clockCyclesPerMicrosecond() returns 14.7456.

Refer to my full code in the first post. I'm casting it to double.

Okay, sorry.

"int / unsigned long" can't output a double.

But what I'm actually doing is "int / double".

You can refer to my full code in the first post.

clockCyclesPerMicrosecond() is a double.

I still can't figure out why the output value of micros() is 0.

Also, isn't it:

#define clockCyclesPerMicrosecond() ( F_CPU / 1000000L )

But you can count as you like.
Good luck

So maybe they have their own version of clockCyclesPerMicrosecond() that returns a floating-point value?

Any doc?