Addressing data from assembler

Hello,

[Sorry, I could not post this in "Syntax..." on the forum, I don't know why...]

I am optimizing my code with inline assembler, but I have some questions...

For example, I have to access the elements of a two-dimensional array one by one and refer to them as operands that get mapped to registers:

__asm__ __volatile__
(
    "mov r24,%[LEDS]" "\n\t"
 : [LEDS] "+r" (leds[0][0].value)
 :
 : "r24"   // r24 is overwritten here, so it is declared as a clobber
);

How do I address a whole array instead of having to assign one register per operand?

The problem is that if I use too many operands, the compiler generates a lot of code to load the data into registers (before my code) and a lot of code to store the modified registers back to memory (after my code).

So I thought I could do the addressing myself, for instance by storing the address of the array in the Z register and then walking through memory, reading and writing the required values. But how do I do that?

Also, the compiler generates a lot of pushes before my actual asm code (without any pops??? wtf?). Does anybody know why, and how to suppress this? I lose a lot of cycles for nothing, as the whole code runs in the "void loop(){...}" section.

ex:

void loop()
{

     400: 4f 92        push    r4
     402: 5f 92        push    r5
     404: 6f 92        push    r6
     406: 7f 92        push    r7
     408: 8f 92        push    r8
     40a: 9f 92        push    r9
     40c: af 92        push    r10
     40e: bf 92        push    r11
     410: cf 92        push    r12
     412: df 92        push    r13
     414: ef 92        push    r14
     416: ff 92        push    r15
     418: 0f 93        push    r16
     41a: 1f 93        push    r17
     41c: df 93        push    r29
     41e: cf 93        push    r28
     420: 0f 92        push    r0
     422: cd b7        in  r28, 0x3d   ; 61
     424: de b7        in  r29, 0x3e   ; 62
     426: c0 90 04 04     lds r12, 0x0404
     42a: 8c 2c        mov r8, r12
     42c: 99 24        eor r9, r9
     42e: 70 91 05 04     lds r23, 0x0405
     432: 87 2f        mov r24, r23
     434: 90 e0        ldi r25, 0x00   ; 0
     436: f0 90 36 01     lds r15, 0x0136
     43a: 2c 01        movw    r4, r24
     43c: 44 0c        add r4, r4
     43e: 55 1c        adc r5, r5
     440: 44 0c        add r4, r4
     442: 55 1c        adc r5, r5
     444: 48 0e        add r4, r24
     446: 59 1e        adc r5, r25
     448: 88 e7        ldi r24, 0x78   ; 120
     44a: 90 e0        ldi r25, 0x00   ; 0
     44c: 9c 01        movw    r18, r24
     44e: 82 9e        mul r8, r18
     450: c0 01        movw    r24, r0
     452: 83 9e        mul r8, r19
     454: 90 0d        add r25, r0
     456: 92 9e        mul r9, r18
     458: 90 0d        add r25, r0
     45a: 11 24        eor r1, r1
     45c: 48 0e        add r4, r24
     45e: 59 1e        adc r5, r25
     460: f2 01        movw    r30, r4
     462: ec 5c        subi    r30, 0xCC   ; 204
     464: fe 4f        sbci    r31, 0xFE   ; 254
     466: 43 81        ldd r20, Z+3    ; 0x03
     468: 54 81        ldd r21, Z+4    ; 0x04
     46a: 91 81        ldd r25, Z+1    ; 0x01
     46c: 30 91 06 04     lds r19, 0x0406
     470: 3a 01        movw    r6, r20
     472: b9 2e        mov r11, r25
     474: a3 2e        mov r10, r19
     476: 27 2f        mov r18, r23
     478: 26 95        lsr r18
     47a: 26 95        lsr r18
     47c: 26 95        lsr r18

}

plz help

thx

anybody ?

i am optimizing my code with inline assembler

Why?

The compiler optimizes your code too. What makes you think you can do a better job, Rumpelstiltskin?

rompelstilchen: ... I could not post this in "Syntax..." on the forum, I don't know why...

Because that part of the forum is read-only? Just a guess.

Coding Badly: "i am optimizing my code with inline assembler" Why?

to have you ask why

Nick Gammon: The compiler optimizes your code too. What makes you think you can do a better job, Rumpelstiltskin?

I know, but if you don't try... right?

By the way, I ended up with code 10 times faster.

Seeing the code in asm also helps you understand things that could be optimized.

Coding Badly: "i am optimizing my code with inline assembler" Why?

I need a fast refresh, and with C code the LED panel flickers; the Arduino is not that fast.

rompelstilchen: I know, but if you don't try... right? By the way, I ended up with code 10 times faster. Seeing the code in asm also helps you understand things that could be optimized.

Excellent answer. Looking at the generated code has always helped me to make more efficient programs. While the optimizations can be incredible, the compiler does not always generate the most efficient code.

I probably know less about AVR assembler than any other assembler in the world, but I'll try to help. The tail-end registers can be paired (R26:R27, R28:R29 and R30:R31). These pairs are treated as 16-bit pointer registers (X, Y and Z) that can point into SRAM. Maybe by setting one of these as a base pointer and using offsets, you can accomplish what you want. Like I said, I have written no AVR assembler in the past, but I've written tons of ARM7, PIC and mainframe assembler along with scads of other micros.
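To make the pointer-register idea concrete, here is a minimal sketch, not a definitive implementation. It passes the array's base address as a single operand with the avr-gcc "e" constraint (the compiler picks X, Y or Z) and walks the array with post-increment loads and stores, so no per-element operands are needed. The one-byte led_t struct, the array dimensions and the bump_all name are assumptions for illustration; the thread only shows leds[0][0].value. The "memory" clobber tells the compiler that the asm reads and writes the array behind its back. The #else branch is a plain C equivalent so the sketch also builds off-target.

```c
#include <stdint.h>

/* Hypothetical element type: the thread only shows leds[0][0].value,
   so a one-byte struct is assumed here. */
typedef struct { uint8_t value; } led_t;

#define ROWS 2
#define COLS 4

static led_t leds[ROWS][COLS];

/* Add 1 to `count` consecutive bytes starting at `base` (count 1..255). */
static void bump_all(led_t *base, uint8_t count)
{
    if (count == 0) return;           /* dec/brne would otherwise loop 256 times */
#ifdef __AVR__
    uint8_t tmp;
    __asm__ __volatile__(
        "1:"                      "\n\t"
        "ld   %[tmp], %a[ptr]"    "\n\t"  /* load byte through the pointer register */
        "inc  %[tmp]"             "\n\t"
        "st   %a[ptr]+, %[tmp]"   "\n\t"  /* store it back, then advance the pointer */
        "dec  %[cnt]"             "\n\t"
        "brne 1b"                 "\n\t"
        : [ptr] "+e" (base), [cnt] "+r" (count), [tmp] "=&r" (tmp)
        :
        : "memory"
    );
#else
    /* Portable equivalent of the loop above, for building on a PC. */
    uint8_t *p = (uint8_t *)base;
    while (count--) { *p = (uint8_t)(*p + 1); p++; }
#endif
}
```

A call such as bump_all(&leds[0][0], ROWS * COLS) would touch the whole array. For structs larger than one byte, displacement addressing (ldd/std) is an option, but it only exists for Y and Z, not X, so the "b" constraint would be the one to use in that case.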

Generating your own pro/epilogue code: http://ucexperiment.wordpress.com/2013/03/13/improving-the-interrupt-service-routine/

rompelstilchen: I need a fast refresh, and with C code the LED panel flickers; the Arduino is not that fast.

It's fast enough to generate VGA signals which don't flicker:

http://www.gammon.com.au/forum/?id=11608

I suggest you post your C code rather than trying to convert it all to assembler. By all means look at the generated assembler code, that's what I did. And then work out which lines of C code are generating more assembler code than you thought.

To do that, find the .elf file from your compile (turn on verbose compiling) and type this at a command window:

avr-objdump -S -z filename.elf
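As an illustration of the kind of C-level change such a listing can suggest: the disassembly posted at the top of the thread contains a multiply by 0x78 and a shift-and-add multiply by 5, which looks like the index arithmetic for a two-dimensional array of 5-byte elements. The sketch below is only a guess at that situation. The led_t layout, the dimensions and the cur_row/cur_col variables are all hypothetical, since the original C code was never posted. It contrasts indexing the array for every field with computing the element address once and reusing it.

```c
#include <stdint.h>

#define ROWS 4
#define COLS 24

/* Hypothetical 5-byte element; the real struct is not shown in the thread. */
typedef struct {
    uint8_t value;
    uint8_t red;
    uint8_t green;
    uint8_t blue;
    uint8_t flags;
} led_t;

static led_t leds[ROWS][COLS];
static volatile uint8_t cur_row, cur_col;   /* hypothetical runtime indices */

/* Indexes the array for every field: the offset may be recomputed each time. */
static uint16_t sum_indexed(void)
{
    return (uint16_t)(leds[cur_row][cur_col].red
                    + leds[cur_row][cur_col].green
                    + leds[cur_row][cur_col].blue);
}

/* Computes the element address once, then reads fields at fixed offsets. */
static uint16_t sum_cached(void)
{
    const led_t *p = &leds[cur_row][cur_col];
    return (uint16_t)(p->red + p->green + p->blue);
}
```

Whether the second form is actually faster depends on the compiler version and optimization level, so it is worth comparing both in the avr-objdump output before deciding.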