Topic: Poor GCC optimization
Global Moderator

Quote
I have Teensy 3.0 as well but I like to have this code running nicely on Uno for the sake of the microcontroller optimization challenge and exercise (: I haven't been writing asm for quite some time either, but maybe it's time to get a bit familiar with AVR asm.

I haven't done it myself but you could try altering your build process to use the later avr-gcc program.
Logged

SF Bay Area (USA)

As far as I understand, C has no way to say that the result of an operation should be a different size than its operands (as in "an 8x8 multiply gives a 16-bit result").  So I think that...
Code:
res += (smp * vol) >> 8; 
has two possible interpretations:
1) smp and vol are 8 bits, and are promoted to int (16 bits on AVR) before the multiply, giving a 16-bit result.  (This is what is happening, right?)
2) smp and vol are 8 bits, the result is 8 bits, shifting right by 8 bits gives zero, and the statement is a no-op.  (This is not what you want, is it?)

You wanted "smp and vol are 8 bits, the result of a multiply is 16 bits, give me the high 8 of those 16 bits" - C has no direct way to say that.
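
To make the two readings concrete, here is a minimal sketch. The function names are made up for illustration, and the operand types (signed sample, unsigned volume) are an assumption, since the thread never shows the declarations. It also relies on GCC's arithmetic right shift for negative values, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Reading 1: both operands are widened before the multiply, so the
   full 16-bit product exists and ">> 8" selects its high byte.
   The explicit casts only spell out what integer promotion does anyway. */
static int8_t scale_sample(int8_t smp, uint8_t vol)
{
    int16_t product = (int16_t)smp * (int16_t)vol; /* -32640..32385 fits in 16 bits */
    return (int8_t)(product >> 8);                 /* arithmetic shift on GCC */
}

/* Reading 2, forced by hand: keep only 8 bits of the product.
   The high byte is lost, so nothing is left for ">> 8" to return. */
static uint8_t truncated_product(uint8_t a, uint8_t b)
{
    return (uint8_t)(a * b);
}
```

For example, scale_sample(100, 128) gives 50, while truncated_product(16, 16) is already 0 before any shift.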

"mixed-size math" remains one of the areas where assembly language retains a big advantage over C.
Logged

Global Moderator
Dallas

Quote
I haven't done it myself but you could try altering your build process to use the later avr-gcc program.

It's easy to do and works very well.
Logged

Montreal

Quote
I haven't done it myself but you could try altering your build process to use the later avr-gcc program.
Ah, that's a good idea. I didn't realize that the GCC that comes with the Arduino download is from 2008.
Quote
It's easy to do and works very well.
Great, can you tell me how to update avr-gcc exactly?

Edit: Is the latest GCC 4.8.1 available for AVR & Windows that I could easily install?
« Last Edit: July 28, 2013, 01:41:14 am by JarkkoL » Logged

Global Moderator
Dallas

http://forum.arduino.cc/index.php?topic=51984.msg371307#msg371307
http://forum.arduino.cc/index.php?topic=37965.0

Atmel provides the latest compiler here...
http://www.atmel.com/tools/ATMELAVRTOOLCHAINFORWINDOWS.aspx
Logged

Ayer, Massachusetts, USA

Quote
I would have to agree that the 'optimizer' is an optimistic name :)  The game is given away by the flag -Os.  In other words the gcc authors admit that they can deal with size (true), speed (doubtful) but not both at the same time.  Hand assembly can do both at the same time, but I gave that up 20 years ago.
Except in special cases, no compiler can optimize for both speed and size at the same time.  Besides choosing between compiling for speed and compiling for size, a third thing people need to decide is whether the time spent doing the compilation is more important than either the time it takes the executable code to run or the size of the code.
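
One way to mix the two goals in a single build is to leave the project at -Os and ask for speed on a single hot function. The sketch below uses GCC's optimize function attribute; mix_block and its parameters are hypothetical names, not code from this thread. GCC's own documentation is cautious about relying on this attribute, so treat it as something to experiment with rather than a guaranteed win.

```c
#include <stdint.h>

/* Hypothetical hot loop: request -O2 for this one function while the
   rest of the translation unit keeps the command-line setting (e.g. -Os). */
__attribute__((optimize("O2")))
int16_t mix_block(const int8_t *samples, uint8_t vol, uint8_t count)
{
    int16_t res = 0;
    for (uint8_t i = 0; i < count; ++i) {
        /* same expression as in the thread: high byte of smp * vol */
        res += ((int16_t)samples[i] * vol) >> 8;
    }
    return res;
}
```

Comparing the disassembly of the function with and without the attribute is the only reliable way to see whether it helps on a given compiler version.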

And as mentioned in other posts, the GCC shipped with Arduino is based on a release from 5 years ago.
« Last Edit: July 28, 2013, 06:47:17 am by MichaelMeissner » Logged

Montreal

I updated the compiler to the latest one from Atmel (4.7.2) and it does a much better job with optimization. For the multiplication it now also uses a single "mulsu" instruction instead of 3 muls:
Code:
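      res+=(smp*vol)>>8;
    383e: 0f 2f        mov r16, r31    ; copy one operand into r16
    3840: 1e 2f        mov r17, r30    ; copy the other operand into r17
    3842: 10 03        mulsu r17, r16  ; signed r17 * unsigned r16 -> r1:r0
    3844: f0 01        movw r30, r0    ; r31:r30 = 16-bit product
    3846: 11 24        eor r1, r1      ; restore the zero register (r1 = 0)
    3848: ef 2f        mov r30, r31    ; low byte = high byte of product (>> 8)
    384a: ff 0f        add r31, r31    ; move the sign bit into carry
    384c: ff 0b        sbc r31, r31    ; r31 = 0x00 or 0xFF (sign extension)
    384e: 2e 0f        add r18, r30    ; res low byte += result
    3850: 3f 1f        adc r19, r31    ; res high byte += sign extension + carry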
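It still seems a bit suboptimal though. For example, the first two mov instructions look avoidable. mulsu only accepts r16 to r23 as operands, so it cannot take r30 and r31 directly, but the compiler could have kept both values in that register range in the first place.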
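
If the extra moves matter, one option is to write the multiply by hand. This is only a sketch: mulsu_high is a made-up name, and the operand types are assumed. On AVR the "a" constraint asks avr-gcc for registers in r16 to r23, which is the range mulsu requires, and r1 has to be cleared afterwards because avr-gcc expects it to hold zero. On any other target the function falls back to plain C so it can still be compiled and tested.

```c
#include <stdint.h>

/* High byte of (signed smp) * (unsigned vol). */
static inline int8_t mulsu_high(int8_t smp, uint8_t vol)
{
#if defined(__AVR__)
    int8_t hi;
    __asm__ (
        "mulsu %1, %2 \n\t"   /* r1:r0 = signed %1 * unsigned %2 */
        "mov %0, r1   \n\t"   /* keep only the high byte */
        "clr r1       \n\t"   /* restore the zero register */
        : "=r" (hi)
        : "a" (smp), "a" (vol)  /* "a": r16..r23, as mulsu requires */
    );
    return hi;
#else
    /* Portable fallback with the same result (arithmetic shift on GCC). */
    return (int8_t)(((int16_t)smp * vol) >> 8);
#endif
}
```

Whether this beats the compiler's output depends on how well it can place smp and vol in r16 to r23 around the asm statement, so it is worth checking the disassembly again.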
Logged

0

Quote
I updated the compiler to the latest one from Atmel (4.7.2)

How do you do that?
Logged

Global Moderator
Dallas

Quote
How do you do that?

Reply #19
Logged

Global Moderator

That works?

http://forum.arduino.cc/index.php?topic=51984.msg371307#msg371307

That should be a sticky somewhere. I'll never remember that when I actually need to do it.
Logged

Montreal

Quote
That works?
Yes, except do that with the Atmel AVR Toolchain to get the latest version. The latest WinAVR is from 2010 and has GCC 4.3.3.
Logged

Global Moderator
Dallas

Quote
That works?

It works and I just realized I have stopped performing step #8 so I suspect it can be removed.

Step #5 is only necessary if you want to preserve the old version (which I only do so I can support the Tiny Core).

A few folks claim avrdude from the Arduino download is necessary but I have not had any problems using the WinAVR / Atmel version.  Maybe it's necessary for Mac / Linux users.

Quote
That should be a sticky somewhere.

No doubt!

Quote
I'll never remember that when I actually need to do it.

In my case, I just keep copying / moving the "good" avr folder to wherever I need it.  I've done it so many freakin' times I can do it in my sleep (and may have).


For what it's worth, I have been using the WinAVR version for a few years and the Atmel version for a few months with no regrets.
Logged

Global Moderator
Dallas

Oh, and WinAVR really should be hosted somewhere else.  Some days I have fits trying to get past that cursed Webring.
Logged

Global Moderator
Dallas

Quote
Yes, except do that with the Atmel AVR Toolchain to get the latest version. The latest WinAVR is from 2010 and has GCC 4.3.3.

The only "downside" is that registration is required to download.  However, registration provides access to samples!
Logged

Dallas, TX USA

Quote
A few folks claim avrdude from the Arduino download is necessary but I have not had any problems using the WinAVR / Atmel version.  Maybe it's necessary for Mac / Linux users.
I don't believe their claims. The newer avrdude works just fine for me.
I run an avrdude that I build myself from the avrdude repository. I use Linux, but I have also built it for other folks who still use Windows. It works just fine on both platforms.
I also patched avrdude to get the AVR Dragon to work with the IDE and the optiboot makefiles to burn a bootloader. (There are small patches that could be made in the IDE or the optiboot makefiles to work around the avrdude issue, but I've not been able to get either of them to make the changes, so I just fixed it in avrdude, where it really should be fixed anyway.)

One thing to note about the newer avrdude is that it is MUCH faster at burning a bootloader because it knows how to skip over unused regions. Example: 0.5 seconds vs 30 seconds on USBasp to burn a new bootloader.

With the old avrdude the entire flash gets burned even for a 512 byte bootloader.
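
The idea of skipping unused regions can be sketched as follows. This is not avrdude's actual code, just an illustration of why it helps: erased flash reads as 0xFF, so a page containing only 0xFF bytes does not need to be programmed. The names and the 128-byte page size (the ATmega328P value) are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 128u  /* assumed: ATmega328P flash page size in bytes */

/* Returns 1 if every byte of the page is 0xFF (same as erased flash). */
static int page_is_blank(const uint8_t *page)
{
    for (size_t i = 0; i < FLASH_PAGE_SIZE; ++i) {
        if (page[i] != 0xFF) {
            return 0;
        }
    }
    return 1;
}

/* Counts the pages that actually hold data and so need to be written. */
static size_t pages_to_write(const uint8_t *image, size_t image_size)
{
    size_t count = 0;
    for (size_t off = 0; off + FLASH_PAGE_SIZE <= image_size; off += FLASH_PAGE_SIZE) {
        if (!page_is_blank(image + off)) {
            ++count;
        }
    }
    return count;
}
```

With a 32 KB image that is blank except for a 512 byte bootloader at the top, this counts 4 pages instead of 256, which is the kind of gap behind the timing difference described above.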

--- bill
« Last Edit: July 31, 2013, 07:18:37 pm by bperrybap » Logged
