Why is division by 4096 not the same as ">>12"?

Hi,

I have programmed a discrete filter using divisions and multiplications of variables of the type long. I have tested it and all is fine.

But as long-divisions are costly and actually is done by a fixed number to the power of 2, I had the idea in optimizing the code by replacing the division by a right-shift operation.

By comparing the results differences of the two versions were detected. Add the end, I was nailing it down to one division, and could create a little test-sketch showing the unexpected behaviour.

See my code:

void setup() {
  Serial.begin(115200);
  delay(1000);
  int   m_a   =  3934;
  long  uScal = 64000;
  long  m_xShift = 0;
  long  m_xDiv   = 0;
  long  m_xNextShift, m_xNextDiv;
  int   delta;
  Serial.println("n\tm_xNextShift\tm_xNextDiv\tDelta");
  for(byte n = 0; n <= 5; n++) {
    m_xNextShift  = (((long)m_a * (m_xShift - uScal)) >> 12  ) + uScal; 
    m_xNextDiv    = (((long)m_a * (m_xDiv   - uScal)) / 4096L) + uScal;
    delta         = m_xNextShift - m_xNextDiv;
    Serial.print(n);            Serial.print("\t");
    Serial.print(m_xNextShift); Serial.print("\t\t");
    Serial.print(m_xNextDiv);   Serial.print("\t\t");
    Serial.println(delta);
    m_xShift = m_xNextShift;
    m_xDiv   = m_xNextDiv;
  }
}

void loop() {
  while(1);
}

Following result is obtained (Code is running on Uno R3 using IDE 1.8.19.):

n	m_xNextShift	m_xNextDiv	Delta
0	2531			2532		-1
1	4962			4964		-2
2	7296			7299		-3
3	9538			9542		-4
4	11692			11696		-4
5	13760			13765		-5

I have expected to be all Delta are zero, but there aren't.
Why is /4096 not the same as >>12 ? Providing deeper understanding is appreciated :slightly_smiling_face:

I recently went down the same path on a particular DSP. It appears division rounds, but shift truncates. Although I had colleagues who assured me that division and shift were identical, this is clearly not the case.

It turns out that right-shift on signed integers is actually "implementation defined" behaviour, as mentioned in this thread fast divide of signed number using SHIFT ?

Had a look into your provided link. And it has remained me on not replacing division by shift operation for negative integers. So I keep the division.

But, my example is about positive integers so the limitation should not apply in the provided example.

I forced proper rounding by, modifying lines like

    m_xNextDiv    = (((long)m_a * (m_xDiv   - uScal) + 2047) / 4096L) + uScal;

For both filter update equations with both 2047 and 2048. Difference get smaller for

    m_xNextShift  = (((long)m_a * (m_xShift - uScal) + 2048) >> 12  ) + uScal; 

but still remain. :thinking:

Try these.

  unsigned long  uScal = 64000;
  unsigned long  m_xShift = 0;
  unsigned long  m_xDiv   = 0;
  unsigned long  m_xNextShift, m_xNextDiv;

The compiler goes by the type, not the value, and you used signed integer types.

In C/C++, integer division always truncates toward zero. For unsigned integers, division by a power of 2 or use of the corresponding right shift operation give the same result.

This little test, performed on an Arduino Uno R3, suggests that code compiled by gcc obeys that rule.

void setup() {
  Serial.begin(115200);
  while(!Serial);
  Serial.println("starting");
  long m;
  for (int i=0; i<30000; i++) {
  m = random(100000000L);
  if ((m>>12) != (m/4096L))
    Serial.println((m>>12) - (m/4096L));
  }
  Serial.println("done");
}

void loop() {}

However, the behavior of the right shift operator is not so well defined and the result is likely to be different than division for negative signed integers.

Trying to use unsigned long is not really working because the difference

    m_xShift - uScal

or

    m_xDiv - uScal

will be negative. Despite, filter shall work with both positive and negative numbers.

Would it be more accurate to say that the direction of truncation rather than rounding versus truncation is the cause of the difference?

Here's a real MRE

void setup() {
  Serial.begin(115200);
  delay(1000);

  long q1, q2;

  long dividend = -251776000;

  q1  = dividend >> 12; 
  q2 = dividend / 4096L;

  float asFloat = dividend / 4096.0;

  Serial.print("        this as float = "); Serial.println(asFloat);

  Serial.print(q1); Serial.print("\t\t");
  Serial.print(q2);   Serial.print("\t\t");
}

void loop() {
  while(1);
}

a7

It will, but expect small errors when using right shift. In any case, filter designers usually substitute integer operations that are only approximations of the ideal (real) coefficients.

Well, I had to update the code taking care of negative, intermediate results to (also code of @alto777 added)

void setup() {
  Serial.begin(115200);
  delay(1000);
  int   m_a   =  3934;
  unsigned long  uScal = 64000;
  unsigned long  m_xShift = 0;
  unsigned long  m_xDiv   = 0;
  unsigned long  m_xNextShift, m_xNextDiv;
  int   delta;
  Serial.println("n\tm_xNextShift\tm_xNextDiv\tDelta");
  for(byte n = 0; n <= 5; n++) {
    m_xNextShift  = (unsigned long)((((long)m_a * ((long)m_xShift - uScal)) >> 12  )) + uScal; 
    m_xNextDiv    = (unsigned long)((((long)m_a * ((long)m_xDiv   - uScal)) / 4096L)) + uScal;
    delta         = m_xNextShift - m_xNextDiv;
    Serial.print(n);            Serial.print("\t");
    Serial.print(m_xNextShift); Serial.print("\t\t");
    Serial.print(m_xNextDiv);   Serial.print("\t\t");
    Serial.println(delta);
    m_xShift = m_xNextShift;
    m_xDiv   = m_xNextDiv;
  }
  Serial.println("--- alto777 ---");
  long q1, q2;
  long dividend = -251776000;
  q1  = dividend >> 12; 
  q2 = dividend / 4096L;
  float asFloat = dividend / 4096.0;
  Serial.print("        this as float = "); Serial.println(asFloat);
  Serial.print(q1); Serial.print("\t\t");
  Serial.println(q2);
}

Which gives

n	m_xNextShift	m_xNextDiv	Delta
0	1051107			1051107		0
1	1012066			1012066		0
2	974569			974569		0
3	938555			938555		0
4	903965			903965		0
5	870743			870743		0
--- alto777 ---
        this as float = -61468.75
-61469		-61468

So difference is zero, but totally wrong for a 1st order low pass filter ...

1 Like

I would suggest looking at the object code, it is probable that any optimization is minimal anyway.

I guess you are raising the point of "severity". Well, to say it straigt forward: there is none - using divisions is not killing my use case at all.

But I was suprised by the result, and so I like to understand the root cause because there might be an undetected misunderstanding of mine.

1 Like

Now idea on how to look at the object code. But what I have seen is that the total size of the code (the *.ino) is just differing by 10 bytes - if this is telling anything meaningful :thinking:

The print() functions uses division, so the code for performing division will remain regardless of whether you use it or not.

Upps? Where is division invoked in the print() function?

I recommend https://godbolt.org/

With the following example :

long test_fn (long m_a) 
{
  long  m_xNextDiv;

    m_xNextDiv    =  m_a / 4096L;
    //m_xNextDiv    =  m_a >> 12L;

    return m_xNextDiv;
}

You can see that the compiler does some sort of shift instead of calling a function to do the division. I selected the Arduino Uno (1.8.9) compiler. I think all modern compilers will detect division by a power of 2 and optimize it.

Assembler output:

__SP_H__ = 0x3e
__SP_L__ = 0x3d
__SREG__ = 0x3f
__tmp_reg__ = 0
__zero_reg__ = 1
test_fn(long):
        push r28
        push r29
        in r28,__SP_L__
        in r29,__SP_H__
        sbiw r28,8
        in __tmp_reg__,__SREG__
        cli
        out __SP_H__,r29
        out __SREG__,__tmp_reg__
        out __SP_L__,r28
.L__stack_usage = 10
        std Y+5,r22
        std Y+6,r23
        std Y+7,r24
        std Y+8,r25
        ldd r24,Y+5
        ldd r25,Y+6
        ldd r26,Y+7
        ldd r27,Y+8
        tst r27
        brge .L2
        subi r24,1
        sbci r25,-16
        sbci r26,-1
        sbci r27,-1
.L2:
        mov r0,r23
        ldi r23,12
        1:
        asr r27
        ror r26
        ror r25
        ror r24
        dec r23
        brne 1b
        mov r23,r0
        std Y+1,r24
        std Y+2,r25
        std Y+3,r26
        std Y+4,r27
        ldd r24,Y+1
        ldd r25,Y+2
        ldd r26,Y+3
        ldd r27,Y+4
        movw r22,r24
        movw r24,r26
adiw r28,8
        in __tmp_reg__,__SREG__
        cli
        out __SP_H__,r29
        out __SREG__,__tmp_reg__
        out __SP_L__,r28
        pop r29
        pop r28
        ret

Buried rather deeply in the Print.cpp file, this is one instance, most numbers are printed this way:

size_t Print::printNumber(unsigned long n, uint8_t base)
{
  char buf[8 * sizeof(long) + 1]; // Assumes 8-bit chars plus zero byte.
  char *str = &buf[sizeof(buf) - 1]; 

  *str = '\0';

  // prevent crash if called with base == 1
  if (base < 2) base = 10; 

  do {
    char c = n % base;
    n /= base;

    *--str = c < 10 ? c + '0' : c + 'A' - 10; 
  } while(n);

  return write(str);
}

I would have expected something like this, too. And have to admit that last time I was playing with asm-code was back in the times of the 6502 µProcessor; so I don't understand anything :grimacing:, sorry for that.

So values are the same, but printing is different? Well, shouldn't it be the same if the values are the same, and different when not? I think it is already on the value side, isn't?

I was pointing out that the compiled code includes the code needed to do division, regardless of whether your calculation uses right shifting or division, so the size of the compiled code will not change substantially.

Looking at the size of the .ino file is meaningless anyway, that is just a text file containing the source code. The program memory usage reported by the compiler is the size of the compiled code.

For negative integers, right shift often gives values one less than the equivalent division by power of 2. Try this, or any variation you like:


void setup() {
  Serial.begin(115200);
  while(!Serial);
  Serial.println( (-1)>>1);
  Serial.println( (-1)/2);
}

void loop() {}

For a digital filter using rational approximations for coefficients, the difference in most cases is probably negligible.