Fixed point slower than floating point?

(deleted)

test_math.ino (596 Bytes)

test_fixmath.ino (895 Bytes)

I don't think asm("") is preventing the optimizer from removing code.
Declare the result array to be volatile. That should keep the calculation intact.
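
A minimal sketch of that suggestion, written as plain C++ so it can also be compiled off-target. The names `kCount`, `fill_inputs` and `multiply_all` are made up for illustration and are not from the attached test files. Note that `volatile` on the result only forces the stores to happen; if the compiler can see constant inputs it may still precompute the products, so the inputs should come from somewhere it cannot predict (or be `volatile` as well).

```cpp
#include <cstddef>

constexpr std::size_t kCount = 100;

float a[kCount];
float b[kCount];

// volatile: every store to result[] must actually be performed,
// so the optimizer cannot throw the loop body away as dead code.
volatile float result[kCount];

// Hypothetical test data; any non-trivial values will do.
void fill_inputs() {
  for (std::size_t i = 0; i < kCount; i++) {
    a[i] = 1.5f + static_cast<float>(i);
    b[i] = 0.25f;
  }
}

// The loop to be timed. In a sketch, read micros() immediately
// before and after this call and subtract.
void multiply_all() {
  for (std::size_t i = 0; i < kCount; i++) {
    result[i] = a[i] * b[i];
  }
}
```

It is still worth checking the disassembly afterwards to confirm the multiply calls are really there.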

And for such a minuscule piece of code, use code tags.

Pete

el_supremo:
Declare the result array to be volatile. That should keep the calculation intact.

Or do something with it, like printing it to the Serial console (use Serial.write(); it'll show as garbage, but that doesn't matter). Do that after recording the time, of course, but it guarantees the calculation is still there :)

(deleted)

This has nothing to do with the thread’s question, but a note on this initializer:

float result[100] = {0};

Only the first element is given explicitly here; the remaining 99 are zero-initialized by the language, so the whole array starts at 0. A memset or a for/while loop isn't needed for that.

(deleted)

I think the FixedPoints library is just slower than the floating point emulation when it does multiplication.
If you try addition, the fixed point is much faster than the float.
Multiplying with floating point is actually quite easy: multiply the mantissas, add the exponents, and normalize the result. Addition is more complicated in floating point, because the operands have to be aligned to a common exponent first.
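
To make those steps concrete, here is a rough sketch of a software multiply for 32-bit IEEE 754 floats. It is not the avr-libc routine; `soft_float_mul` is a made-up name, and the sketch assumes both inputs are normal, finite and non-zero, that the result neither overflows nor underflows, and that truncation is acceptable instead of round-to-nearest.

```cpp
#include <cstdint>
#include <cstring>

// Simplified float multiply: sign, exponent and mantissa are handled separately.
// Assumes normal, finite, non-zero inputs and an in-range result.
float soft_float_mul(float x, float y) {
  std::uint32_t xb, yb;
  std::memcpy(&xb, &x, sizeof xb);
  std::memcpy(&yb, &y, sizeof yb);

  // Sign of the product is the XOR of the input signs.
  std::uint32_t sign = (xb ^ yb) & 0x80000000u;

  // Add the biased exponents and remove one bias (127).
  std::int32_t exp = static_cast<std::int32_t>((xb >> 23) & 0xFF) +
                     static_cast<std::int32_t>((yb >> 23) & 0xFF) - 127;

  // 24-bit mantissas: 23 stored bits plus the implicit leading 1.
  std::uint64_t mx = (xb & 0x007FFFFFu) | 0x00800000u;
  std::uint64_t my = (yb & 0x007FFFFFu) | 0x00800000u;

  // 24 x 24 -> up to 48 bits.
  std::uint64_t product = mx * my;

  // Normalize so that bit 23 is the leading 1 again (truncating low bits).
  if (product & (1ull << 47)) {
    product >>= 24;
    exp += 1;
  } else {
    product >>= 23;
  }

  std::uint32_t bits = sign | (static_cast<std::uint32_t>(exp) << 23) |
                       (static_cast<std::uint32_t>(product) & 0x007FFFFFu);
  float r;
  std::memcpy(&r, &bits, sizeof r);
  return r;
}
```

The only wide integer operation is the 24 x 24 bit mantissa product; everything else is a few adds, shifts and masks.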

Pete

(deleted)

Seba07:
...I found that SQ15x16 multiplications are slower than floats...

I assume that is 1 sign bit + 15 bits whole number + 16 bits fraction for a grand total of 32 bits. I also assume you are multiplying two 31 bit numbers.

Floats on the other hand have a 24 bit mantissa.

The fixed-point multiplication is simply doing more work. My guess is about 30% more.
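
For comparison, a generic 16.16 fixed-point multiply usually looks something like the sketch below. This is an illustration only, not the FixedPoints library's actual implementation; the names `q16_16`, `to_fixed` and `fixed_mul` are invented here.

```cpp
#include <cstdint>

// Hypothetical 16.16 format: a signed 32-bit integer with 16 fraction bits.
using q16_16 = std::int32_t;

// Convert a real value to the fixed-point representation (truncating).
constexpr q16_16 to_fixed(double v) {
  return static_cast<q16_16>(v * 65536.0);
}

// Multiply two 16.16 values: widen to 64 bits, multiply, then shift the
// 32 fraction bits of the raw product back down to 16.
// The right shift of a negative value is arithmetic on the usual compilers
// (guaranteed since C++20, implementation-defined before that).
q16_16 fixed_mul(q16_16 x, q16_16 y) {
  std::int64_t wide = static_cast<std::int64_t>(x) * static_cast<std::int64_t>(y);
  return static_cast<q16_16>(wide >> 16);
}
```

So the core of it is a 32 x 32 bit product with a 64-bit intermediate, which is wider than the 24 x 24 bit mantissa product a float multiply needs.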

It depends a lot on the data type, implementation, and compiler optimization. According to the linked sketch, unsigned integer multiplication takes 0.32us for 8-bit, 0.76us for 16-bit, 5.03us for 32-bit, and a whopping 21.57us for 64-bit operands. For 32-bit floats it is 6.09us, which is quite close to the 32-bit integers.

EDIT: Another take at it

(deleted)

Seba07:
But isn't the whole point of fixed point to be faster than floats if you don't have an FPU?

I've heard that too.

Increasing / ensuring precision is another reason to use fixed-point (31 bits versus 24 bits in your case).

Hello Seba07

Seba07:
Hi, I'm doing some tests with the FixedPoints library and I found that SQ15x16 multiplications are slower than floats, which I can't really believe. Both test files are attached.
Compiled using the Arduino IDE with -O2 on an Arduino Uno (ATmega328P): float takes ~10us per multiplication, fixed ~18us.

Thanks in advance.

Trying to compile your file I get:

Arduino: 1.8.4 (Windows 10), Board: "Arduino/Genuino Uno"

F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\arduino-builder -dump-prefs -logger=machine -hardware F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware -hardware C:\Users\Philippe10\AppData\Local\Arduino15\packages -tools F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\tools-builder -tools F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -tools C:\Users\Philippe10\AppData\Local\Arduino15\packages -built-in-libraries F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\libraries -libraries C:\Users\Philippe10\Documents\Arduino\libraries -fqbn=arduino:avr:uno -vid-pid=0X2A03_0X0043 -ide-version=10804 -build-path C:\Users\PHILIP~1\AppData\Local\Temp\arduino_build_578477 -warnings=all -build-cache C:\Users\PHILIP~1\AppData\Local\Temp\arduino_cache_864143 -prefs=build.warn_data_percentage=75 -prefs=runtime.tools.avr-gcc.path=F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -prefs=runtime.tools.arduinoOTA.path=F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -prefs=runtime.tools.avrdude.path=F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -verbose F:\DonnéesM6\Données_M6\Données_WD160D\Electronique\Arduino\Calcul_Math_Flottant\test_fixmath\test_fixmath.ino
F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\arduino-builder -compile -logger=machine -hardware F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware -hardware C:\Users\Philippe10\AppData\Local\Arduino15\packages -tools F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\tools-builder -tools F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -tools C:\Users\Philippe10\AppData\Local\Arduino15\packages -built-in-libraries F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\libraries -libraries C:\Users\Philippe10\Documents\Arduino\libraries -fqbn=arduino:avr:uno -vid-pid=0X2A03_0X0043 -ide-version=10804 -build-path C:\Users\PHILIP~1\AppData\Local\Temp\arduino_build_578477 -warnings=all -build-cache C:\Users\PHILIP~1\AppData\Local\Temp\arduino_cache_864143 -prefs=build.warn_data_percentage=75 -prefs=runtime.tools.avr-gcc.path=F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -prefs=runtime.tools.arduinoOTA.path=F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -prefs=runtime.tools.avrdude.path=F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr -verbose F:\DonnéesM6\Données_M6\Données_WD160D\Electronique\Arduino\Calcul_Math_Flottant\test_fixmath\test_fixmath.ino
Using board 'uno' from platform in folder: F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\arduino\avr
Using core 'arduino' from platform in folder: F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\arduino\avr
Detecting libraries used...
"F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr/bin/avr-g++" -c -g -O2 -w -std=gnu++11 -fpermissive -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -flto -w -x c++ -E -CC -mmcu=atmega328p -DF_CPU=16000000L -DARDUINO=10804 -DARDUINO_AVR_UNO -DARDUINO_ARCH_AVR "-IF:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\arduino\avr\cores\arduino" "-IF:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\arduino\avr\variants\standard" "C:\Users\PHILIP~1\AppData\Local\Temp\arduino_build_578477\sketch\test_fixmath.ino.cpp" -o "nul"
"F:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\tools\avr/bin/avr-g++" -c -g -O2 -w -std=gnu++11 -fpermissive -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -flto -w -x c++ -E -CC -mmcu=atmega328p -DF_CPU=16000000L -DARDUINO=10804 -DARDUINO_AVR_UNO -DARDUINO_ARCH_AVR "-IF:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\arduino\avr\cores\arduino" "-IF:\DonnéesM6\Programmes_M6\ArduinoMooc\Arduino\hardware\arduino\avr\variants\standard" "C:\Users\PHILIP~1\AppData\Local\Temp\arduino_build_578477\sketch\test_fixmath.ino.cpp" -o "C:\Users\PHILIP~1\AppData\Local\Temp\arduino_build_578477\preproc\ctags_target_for_gcc_minus_e.cpp"
F:\DonnéesM6\Données_M6\Données_WD160D\Electronique\Arduino\Calcul_Math_Flottant\test_fixmath\test_fixmath.ino:1:25: fatal error: FixedPoints.h: No such file or directory

compilation terminated.

exit status 1
Error compiling for board Arduino/Genuino Uno.


??

Regards,
bidouilleelec

That's because you haven't installed the FixedPoints library.

Pete

Hello el_supremo

el_supremo:
That's because you haven't installed the FixedPoints library.

Pete

No doubt! Many thanks.
Where is this library?

Regards,
bidouilleelec

I don’t think asm("") is preventing the optimizer from removing code.

It isn’t. Or at least, not the math code. Both programs, as posted, produce empty loops…

void loop() {
  time = micros();
 620:   0e 94 68 01     call    0x2d0   ; 0x2d0 <micros>
 624:   6b 01           movw    r12, r22
 626:   7c 01           movw    r14, r24
 628:   84 e6           ldi     r24, 0x64       ; 100
 62a:   90 e0           ldi     r25, 0x00       ; 0
  for (int i = 0; i < 100; i++) {
    result[i] = a[i] * b[i];
    asm("");
 62c:   01 97           sbiw    r24, 0x01       ; 1
  }
}

void loop() {
  time = micros();
  for (int i = 0; i < 100; i++) {
 62e:   f1 f7           brne    .-4             ; 0x62c <main+0x1a6>
    result[i] = a[i] * b[i];
    asm("");
  }
  time = micros() - time;
 630:   0e 94 68 01     call    0x2d0   ; 0x2d0 <micros>

isn’t the whole point of fixed point to be faster than floats if you don’t have an FPU?

That only works if you have hardware support for fixed point.
An AVR has “limited” HW support for INTEGER multiplication, and none for division. And no specific support for FixedPoint. So a fixed-point multiply with 32-bit numbers has essentially the same operations as a 32-bit integer multiply, plus a bunch of extra overhead to shift the results around, plus whatever inefficiencies this library wraps around everything. Whereas a floating-point multiply is only 24 bits, plus some very highly optimized avr-libc code to handle the float format…

Recent gcc is supposed to include native fixed point support; that might be slightly better. But I think 32-bit operations are likely to always be slower than 24-bit operations…
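
A rough way to see the 32-bit versus 24-bit difference: the ATmega's hardware multiply is 8 x 8 -> 16 bits, so wider products have to be assembled from byte-sized partial products. The sketch below does this schoolbook-style in portable C++. `mul8x8` is only a stand-in for the hardware instruction, and `mul32x32` is a made-up name, not how avr-gcc's runtime actually schedules the work.

```cpp
#include <cstdint>

// Stand-in for the AVR hardware multiply: 8 x 8 -> 16 bits.
std::uint16_t mul8x8(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(x) * y);
}

// Schoolbook 32 x 32 -> 64 bit multiply built from 8 x 8 partial products.
// Four bytes times four bytes means 16 calls to mul8x8.
std::uint64_t mul32x32(std::uint32_t x, std::uint32_t y) {
  std::uint64_t acc = 0;
  for (int i = 0; i < 4; i++) {
    for (int j = 0; j < 4; j++) {
      std::uint8_t xi = static_cast<std::uint8_t>(x >> (8 * i));
      std::uint8_t yj = static_cast<std::uint8_t>(y >> (8 * j));
      // Each partial product is weighted by the position of its two bytes.
      acc += static_cast<std::uint64_t>(mul8x8(xi, yj)) << (8 * (i + j));
    }
  }
  return acc;
}
```

Done the same way, a 24 x 24 bit mantissa product needs 3 x 3 = 9 byte multiplies instead of 16, before counting the extra carries and shifts. Real implementations can skip partial products they don't need, so treat the counts as a ballpark only.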

westfw:
It isn’t. Or at least, not the math code. Both programs, as posted, produce empty loops…

void loop() {
  time = micros();
 620:   0e 94 68 01     call    0x2d0   ; 0x2d0 <micros>
 624:   6b 01           movw    r12, r22
 626:   7c 01           movw    r14, r24
 628:   84 e6           ldi     r24, 0x64       ; 100
 62a:   90 e0           ldi     r25, 0x00       ; 0
  for (int i = 0; i < 100; i++) {
    result[i] = a[i] * b[i];
    asm("");
 62c:   01 97           sbiw    r24, 0x01       ; 1
  }
}

void loop() {
  time = micros();
  for (int i = 0; i < 100; i++) {
 62e:   f1 f7           brne    .-4             ; 0x62c <main+0x1a6>
    result[i] = a[i] * b[i];
    asm("");
  }
  time = micros() - time;
 630:   0e 94 68 01     call    0x2d0   ; 0x2d0 <micros>

That only works if you have hardware support for fixed point.
An AVR has "limited" HW support for INTEGER multiplication, and none for division. And no specific support for FixedPoint. So a fixed-point multiply with 32-bit numbers has essentially the same operations as a 32-bit integer multiply, plus a bunch of extra overhead to shift the results around, plus whatever inefficiencies this library wraps around everything. Whereas a floating-point multiply is only 24 bits, plus some very highly optimized avr-libc code to handle the float format...

Recent gcc is supposed to include native fixed point support; that might be slightly better. But I think 32-bit operations are likely to always be slower than 24-bit operations...

I completely agree.
My English isn't great, and I was wondering how to explain that.

Regards,
bidouilleelec

(deleted)

(deleted)