Topic: Poor GCC optimization

Nick Gammon

#15
Jul 27, 2013, 11:24 pm

Quote
I have a Teensy 3.0 as well, but I'd like to have this code running nicely on the Uno for the sake of the microcontroller optimization challenge and exercise (: I haven't written asm for quite some time either, but maybe it's time to get a bit familiar with AVR asm.

I haven't done it myself but you could try altering your build process to use the later avr-gcc program.
Please post technical questions on the forum, not by personal message. Thanks!

http://www.gammon.com.au/electronics

westfw

#16
Jul 28, 2013, 02:34 am
As far as I understand, C has no notion that the result of an operation can have a different type than its operands (as in "an 8x8 multiply gives a 16-bit result").  So I think that...
Code:
`res += (smp * vol) >> 8;`
has two possible interpretations:
1) smp and vol are 8 bits, and are promoted to 16-bit quantities before the multiply, giving a 16-bit result.  (This is what is happening, right?)
2) smp and vol are 8 bits, the result is 8 bits, shifting right by 8 bits gives zero, and the statement is a no-op.  (This is not what you want, is it?)

You wanted "smp and vol are 8 bits, the result of the multiply is 16 bits, give me the high 8 of those 16 bits" - C is not going to do that.
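
To make the two readings concrete, here is a minimal C sketch. The function names `scale_promoted` and `scale_truncated` are made up for illustration, and the operand types (a signed 8-bit sample and an unsigned 8-bit volume) are an assumption, not something stated in this post. Note also that right-shifting a negative value is implementation-defined in C; GCC performs an arithmetic shift.

```c
#include <stdint.h>

/* Reading 1: what C actually does. Both operands undergo integer promotion
 * to int (16 bits on AVR) before the multiply, so the full product survives
 * and the shift extracts its high byte. */
int8_t scale_promoted(int8_t smp, uint8_t vol)
{
    int product = smp * vol;        /* computed in int, not in 8 bits */
    return (int8_t)(product >> 8);  /* high byte of the 16-bit product */
}

/* Reading 2: what would happen if the product were truncated to 8 bits
 * before the shift. Shown only for contrast. */
uint8_t scale_truncated(int8_t smp, uint8_t vol)
{
    uint8_t low = (uint8_t)(smp * vol); /* keep only the low 8 bits */
    return (uint8_t)(low >> 8);         /* nothing is left: always 0 */
}
```

For example, `scale_promoted(100, 128)` returns 50 (12800 >> 8), while `scale_truncated` returns 0 for every input.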

"mixed-size math" remains one of the areas where assembly language retains a big advantage over C.

#17
Jul 28, 2013, 07:09 am
Quote
I haven't done it myself but you could try altering your build process to use the later avr-gcc program.

It's easy to do and works very well.

JarkkoL

#18
Jul 28, 2013, 08:06 am, Last Edit: Jul 28, 2013, 08:41 am by JarkkoL
Quote

I haven't done it myself but you could try altering your build process to use the later avr-gcc program.

Ah, that's a good idea. I didn't realize the GCC that comes with the Arduino download is from 2008.
Quote

It's easy to do and works very well.

Great, can you tell me exactly how to update avr-gcc?

Edit: Is the latest GCC (4.8.1) available for AVR on Windows in a form that I could easily install?

#19
Jul 28, 2013, 09:19 am

http://forum.arduino.cc/index.php?topic=51984.msg371307#msg371307
http://forum.arduino.cc/index.php?topic=37965.0

Atmel provides the latest compiler here...
http://www.atmel.com/tools/ATMELAVRTOOLCHAINFORWINDOWS.aspx

MichaelMeissner

#20
Jul 28, 2013, 01:44 pm, Last Edit: Jul 28, 2013, 01:47 pm by MichaelMeissner

Quote
I would have to agree that the 'optimizer' is an optimistic name.  The game is given away by the flag -Os.  In other words the gcc authors admit that they can deal with size (true), speed (doubtful) but not both at the same time.  Hand assembly can do both at the same time, but I gave that up 20 years ago.

Except in special cases, no compiler can optimize for both speed and size at the same time.  In addition to compiling for speed and compiling for size, a third thing people need to choose is whether the time spent doing the compilation is more important than either the time it takes the executable code to run or the size of the code.

And as mentioned in other posts, the avr-gcc shipped with the Arduino download is based on a release from 5 years ago.
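
As a sketch of how the speed/size choice can be made per function rather than per build: newer GCC releases have an `optimize` function attribute that overrides the command-line optimization level for a single function, so a build using -Os could still ask for speed on one hot routine. This is untested on the avr-gcc versions discussed in this thread, and the GCC documentation describes the attribute as intended for debugging rather than production code. The function `mix_channels` and the `HOT_PATH` macro are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* HOT_PATH is a made-up macro: it requests -O2 for one function on GCC
 * and expands to nothing on other compilers. */
#if defined(__GNUC__) && !defined(__clang__)
#define HOT_PATH __attribute__((optimize("O2")))
#else
#define HOT_PATH
#endif

/* Hypothetical inner loop: scale each sample by its volume and accumulate
 * the high byte of every product. The rest of the file would still be
 * built with whatever level the command line asks for, e.g. -Os. */
HOT_PATH
int16_t mix_channels(const int8_t *smp, const uint8_t *vol, uint8_t count)
{
    int16_t res = 0;
    for (uint8_t i = 0; i < count; ++i) {
        res += (int16_t)((smp[i] * vol[i]) >> 8);
    }
    return res;
}
```

The trade-off stays the same; the attribute only narrows where it is applied.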

JarkkoL

#21
Jul 31, 2013, 04:00 am
I updated the compiler to the latest one from Atmel (4.7.2) and it does a much better job with optimization. For the multiplication it now also uses the "mulsu" instruction instead of 3 muls:
Code:
`      res+=(smp*vol)>>8;
    383e: 0f 2f        mov r16, r31
    3840: 1e 2f        mov r17, r30
    3842: 10 03        mulsu r17, r16
    3844: f0 01        movw r30, r0
    3846: 11 24        eor r1, r1
    3848: ef 2f        mov r30, r31
    384a: ff 0f        add r31, r31
    384c: ff 0b        sbc r31, r31
    384e: 2e 0f        add r18, r30
    3850: 3f 1f        adc r19, r31`
It still seems a bit suboptimal though. For example, the first two mov instructions look avoidable: mulsu only accepts r16-r23 as operands, so had the compiler kept the values there instead of in r30 and r31, no copies would be needed.

joe mcd

#22
Jul 31, 2013, 06:44 pm
Quote
I updated the compiler to the latest one from Atmel (4.7.2)

How do you do that?

#23

Nick Gammon

#24
Jul 31, 2013, 11:02 pm
That works?

http://forum.arduino.cc/index.php?topic=51984.msg371307#msg371307

That should be a sticky somewhere. I'll never remember that when I actually need to do it.

JarkkoL

#25
Jul 31, 2013, 11:30 pm

Quote
That works?

Yes, except do that with the Atmel AVR Toolchain to get the latest version. The latest WinAVR is from 2010 and has GCC 4.3.3.
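
After swapping toolchains it can be worth confirming which compiler is actually building the code. A minimal sketch, assuming a GCC-compatible compiler: the predefined macros `__GNUC__`, `__GNUC_MINOR__` and `__GNUC_PATCHLEVEL__` carry the version. The names `MAKE_VERSION`, `compiler_version` and `compiler_at_least` are hypothetical helpers, not part of any toolchain.

```c
/* Pack a version triple into one comparable number, e.g. 4.7.2 -> 40702.
 * long is used because int is only 16 bits on AVR. */
#define MAKE_VERSION(major, minor, patch) \
    ((major) * 10000L + (minor) * 100L + (patch))

/* Version of the compiler building this file, or 0 if it does not
 * define the GCC version macros. */
long compiler_version(void)
{
#if defined(__GNUC__)
    return MAKE_VERSION(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#else
    return 0L;
#endif
}

/* Nonzero when the compiler is at least as new as the given release. */
int compiler_at_least(int major, int minor, int patch)
{
    return compiler_version() >= MAKE_VERSION(major, minor, patch);
}
```

With this, `compiler_at_least(4, 7, 2)` would distinguish the Atmel toolchain mentioned above from the 4.3.3 in WinAVR. GCC also predefines `__VERSION__` as a string, which can be printed directly.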

#26
Jul 31, 2013, 11:49 pm
Quote
That works?

It works and I just realized I have stopped performing step #8 so I suspect it can be removed.

Step #5 is only necessary if you want to preserve the old version (which I only do so I can support the Tiny Core).

A few folks claim avrdude from the Arduino download is necessary but I have not had any problems using the WinAVR / Atmel version.  Maybe it's necessary for Mac / Linux users.

Quote
That should be a sticky somewhere.

No doubt!

Quote
I'll never remember that when I actually need to do it.

In my case, I just keep copying / moving the "good" avr folder to wherever I need it.  I've done it so many freakin' times I can do it in my sleep (and may have).

For what it's worth, I have been using the WinAVR version for a few years and the Atmel version for a few months with no regrets.

#27
Jul 31, 2013, 11:50 pm

Oh, and WinAVR really should be hosted somewhere else.  Some days I have fits trying to get past that cursed Webring.

#28
Aug 01, 2013, 02:03 am
Quote
Yes, except do that with the Atmel AVR Toolchain to get the latest version. The latest WinAVR is from 2010 and has GCC 4.3.3.

bperrybap

#29
Aug 01, 2013, 02:16 am, Last Edit: Aug 01, 2013, 02:18 am by bperrybap

Quote
A few folks claim avrdude from the Arduino download is necessary but I have not had any problems using the WinAVR / Atmel version.  Maybe it's necessary for Mac / Linux users.

I don't believe those claims. The newer avrdude works just fine for me.
I run an avrdude that I build myself from the avrdude repository.
I use Linux, but I have also built it for other folks who still use Windows.
It works just fine on both platforms.
I also patched avrdude to get the AVR Dragon to work with the IDE and the optiboot makefiles. (There are small patches that could be made in the IDE or the optiboot makefiles to work around the avrdude issue, but I've not been able to get either project to make the changes, so I just fixed it in avrdude, where it really should be fixed anyway.)

The newer avrdude is also MUCH faster at burning a bootloader because it knows how to skip over unused regions.
Example:
0.5 seconds vs 30 seconds on USBasp to burn a new bootloader.

With the old avrdude the entire flash gets burned even for a 512 byte bootloader.
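
The idea behind skipping unused regions can be sketched in a few lines of C. This is not avrdude's actual code, just an illustration under two assumptions: erased AVR flash reads back as 0xFF, and the flash page is 128 bytes (as on the ATmega328P). The names `FLASH_PAGE_BYTES`, `page_is_blank` and `count_pages_to_write` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_BYTES 128u  /* assumed page size (ATmega328P) */

/* A page holding only 0xFF matches erased flash and needs no write. */
static bool page_is_blank(const uint8_t *page, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (page[i] != 0xFF) {
            return false;
        }
    }
    return true;
}

/* Count the pages a programmer would actually have to write after a
 * chip erase if it skips every blank page in the image. */
size_t count_pages_to_write(const uint8_t *image, size_t image_len)
{
    size_t pages = 0;
    for (size_t off = 0; off < image_len; off += FLASH_PAGE_BYTES) {
        size_t remaining = image_len - off;
        size_t len = remaining < FLASH_PAGE_BYTES ? remaining : FLASH_PAGE_BYTES;
        if (!page_is_blank(image + off, len)) {
            ++pages;
        }
    }
    return pages;
}
```

Under these assumptions, a 512 byte bootloader at the top of a 32 KB flash image touches only 4 of the 256 pages, which is consistent with the kind of speedup described above.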

--- bill
