Topic: Poor GCC optimization

Nick Gammon


I have a Teensy 3.0 as well, but I'd like to have this code running nicely on the Uno for the sake of the microcontroller optimization challenge and exercise (: I haven't written asm for quite some time either, but maybe it's time to get a bit familiar with AVR asm.


I haven't done it myself but you could try altering your build process to use the later avr-gcc program.
http://www.gammon.com.au/electronics

westfw

As far as I understand, C has no concept of the idea that the result of an operation can be a different type than the operands (as in "an 8x8 multiply gives a 16bit result.")  So I think that...
Code:
res += (smp * vol) >> 8; 
has two possible interpretations:
1) smp and vol are 8 bits, and are promoted to 16bit quantities before the multiply, giving a 16bit result.  (this is what is happening, right?)
2) smp and vol are 8 bits, the result is 8 bits, shifting right by 8 bits gives zero, and the statement is a no-op.  (this is not what you want, is it?)

You wanted "smp and vol are 8 bits, the result of a multiply is 16bits, give me the high 8 of those 16bits" - C is not going to do that.

"mixed-size math" remains one of the areas where assembly language retains a big advantage over C.
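
For what it's worth, here is a minimal sketch of how standard C handles that expression. The integer promotions convert both 8-bit operands to int (16 bits on AVR) before the multiply, so interpretation 1 is the one the language specifies. The function name and the operand types (signed sample, unsigned volume) are assumptions for illustration; the thread never shows the actual declarations.

```c
#include <stdint.h>

/* Hypothetical helper, not from the original sketch. The types are
   assumed: a signed 8-bit sample and an unsigned 8-bit volume. */
int16_t scale_sample(int8_t smp, uint8_t vol)
{
    /* Both operands are promoted to int before the multiply, so the
       full product is kept even though the inputs are 8 bits wide. */
    int product = smp * vol;

    /* Right-shifting a negative int is implementation-defined in C;
       GCC performs an arithmetic (sign-preserving) shift. */
    return (int16_t)(product >> 8);
}
```

With this, `res += scale_sample(smp, vol);` behaves like the original statement. Whether the compiler then notices that only the high byte of the product is needed is an optimization question, not a language one.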

Coding Badly

Quote
I haven't done it myself but you could try altering your build process to use the later avr-gcc program.


It's easy to do and works very well.

JarkkoL

#18
Jul 28, 2013, 08:06 am Last Edit: Jul 28, 2013, 08:41 am by JarkkoL Reason: 1
Quote

I haven't done it myself but you could try altering your build process to use the later avr-gcc program.

Ah, that's a good idea. I didn't realize the GCC that comes with the Arduino download is from 2008.
Quote

It's easy to do and works very well.

Great, can you tell me how to update avr-gcc exactly?

Edit: Is the latest GCC 4.8.1 available for AVR & Windows that I could easily install?

Coding Badly


http://forum.arduino.cc/index.php?topic=51984.msg371307#msg371307
http://forum.arduino.cc/index.php?topic=37965.0

Atmel provides the latest compiler here...
http://www.atmel.com/tools/ATMELAVRTOOLCHAINFORWINDOWS.aspx

MichaelMeissner

#20
Jul 28, 2013, 01:44 pm Last Edit: Jul 28, 2013, 01:47 pm by MichaelMeissner Reason: 1

Quote
I would have to agree that the 'optimizer' is an optimistic name :)  The game is given away by the flag -Os.  In other words the gcc authors admit that they can deal with size (true), speed (doubtful) but not both at the same time.  Hand assembly can do both at the same time, but I gave that up 20 years ago.

Except in special cases, no compiler can optimize for both speed and size at the same time.  In addition to compiling for speed and compiling for size, a third thing people need to choose is whether the time spent doing the compilation is more important than either the time it takes the executable code to run or the size of the code.

And as mentioned in other posts, the GCC shipped with the AVR is based on a release from 5 years ago.

JarkkoL

I updated the compiler to the latest one from Atmel (4.7.2) and it does a much better job with optimization. For the multiplication it's now also using the "mulsu" instruction instead of 3 muls:
Code:
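      res+=(smp*vol)>>8;
    383e: 0f 2f        mov r16, r31      ; r16 = r31
    3840: 1e 2f        mov r17, r30      ; r17 = r30
    3842: 10 03        mulsu r17, r16    ; r1:r0 = signed r17 * unsigned r16
    3844: f0 01        movw r30, r0      ; r31:r30 = 16-bit product
    3846: 11 24        eor r1, r1        ; restore the zero register
    3848: ef 2f        mov r30, r31      ; low byte = high byte of product (>>8)
    384a: ff 0f        add r31, r31      ; carry = sign bit
    384c: ff 0b        sbc r31, r31      ; r31 = 0x00 or 0xFF (sign extension)
    384e: 2e 0f        add r18, r30      ; res (r19:r18) += result, low byte
    3850: 3f 1f        adc r19, r31      ; high byte with carry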

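It still seems a bit suboptimal though. For example, the first two mov instructions look avoidable: mulsu only accepts r16-r23 as operands, so it can't take r30 and r31 directly, but the compiler could have kept the values in that range to begin with.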

joe mcd

Quote
I updated the compiler to the latest one from Atmel (4.7.2)


How do you do that?

Coding Badly


Nick Gammon

That works?

http://forum.arduino.cc/index.php?topic=51984.msg371307#msg371307

That should be a sticky somewhere. I'll never remember that when I actually need to do it.
http://www.gammon.com.au/electronics

JarkkoL


Quote
That works?

Yes, except do that with Atmel AVR Toolchain to get the latest version. Latest WinAVR is from 2010 and has GCC 4.3.3

Coding Badly

Quote
That works?


It works and I just realized I have stopped performing step #8 so I suspect it can be removed.

Step #5 is only necessary if you want to preserve the old version (which I only do so I can support the Tiny Core).

A few folks claim avrdude from the Arduino download is necessary but I have not had any problems using the WinAVR / Atmel version.  Maybe it's necessary for Mac / Linux users.

Quote
That should be a sticky somewhere.


No doubt!

Quote
I'll never remember that when I actually need to do it.


In my case, I just keep copying / moving the "good" avr folder to wherever I need it.  I've done it so many freakin' times I can do it in my sleep (and may have).


For what it's worth, I have been using the WinAVR version for a few years and the Atmel version for a few months with no regrets.

Coding Badly


Oh, and WinAVR really should be hosted somewhere else.  Some days I have fits trying to get past that cursed Webring.

Coding Badly

Quote
Yes, except do that with Atmel AVR Toolchain to get the latest version. Latest WinAVR is from 2010 and has GCC 4.3.3


The only "downside" is registration is required to download.  However, registration provides access to samples!

bperrybap

#29
Aug 01, 2013, 02:16 am Last Edit: Aug 01, 2013, 02:18 am by bperrybap Reason: 1

Quote
A few folks claim avrdude from the Arduino download is necessary but I have not had any problems using the WinAVR / Atmel version.  Maybe it's necessary for Mac / Linux users.

I don't believe their claims. The newer avrdude works just fine for me.
I run avrdude that I build myself from the avrdude repository.
I use  linux, but I have also built it for other folks that still use Windows.
It works just fine on those two platforms.
I also patched avrdude  to get the AVR dragon to work with the IDE and the optiboot makefiles
to burn a bootloader.
(There are small patches that could be done in the IDE or the optiboot makefiles to work
around the avrdude issue, but I've not been able to get either of them to make the changes
so I just fixed it in avrdude where it really should be fixed anyway)

One thing to note about the newer avrdude is that it is MUCH faster at burning a bootloader because it knows how to skip over unused regions.
Example: 0.5 seconds vs 30 seconds on USBasp to burn a new bootloader.

With the old avrdude the entire flash gets burned even for a 512 byte bootloader.
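
To illustrate why skipping unused regions helps so much, here is a rough sketch of the idea. It is not avrdude's actual code; the function name and parameters are made up for the example. It assumes erased flash reads back as 0xFF, so a page that contains nothing but 0xFF bytes doesn't need to be written after a chip erase.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not taken from avrdude: count how many
   flash pages in an image contain at least one byte other than 0xFF. */
size_t count_pages_to_write(const uint8_t *image, size_t len, size_t page_size)
{
    size_t pages = 0;

    if (page_size == 0)
        return 0;

    for (size_t start = 0; start < len; start += page_size) {
        size_t end = start + page_size;
        if (end > len)
            end = len;

        /* A page needs programming only if some byte differs from the
           erased state. */
        for (size_t i = start; i < end; i++) {
            if (image[i] != 0xFF) {
                pages++;
                break;
            }
        }
    }
    return pages;
}
```

For a 32 KB image with 128-byte pages where only the top 512 bytes hold a bootloader, this counts 4 pages instead of all 256, which is in line with the kind of speedup described above.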

--- bill
