Go Down

Topic: What am doing wrong here with my class? (Read 1 time) previous topic - next topic

Nick Gammon

You can make attachments you know. Including .zip files. Click on the Additional Options button below.

Quote
I don't think you'd want to go to all the trouble to compile it. :-)


I've gone to the trouble of writing, compiling and testing the code above with a view to helping you.
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

scswift

Okay, if you really want to take a look at it, here's the code:
http://mightymicrocontroller.com/backups/Mightycode.zip

The board is a 1284p with an SD card on usart0, DAC and TLC5947s on SPI, and serial upload is done via USART1 using a special bootloader.  I use the avr-developers variant. 
(Crappy setup I know... too late to fix it though.)

The wavehc lib has been modified to use hardware SPI, and the sd lib that comes with it has been modfied to use usart0.

scswift

I've got a video of the thing in action too if you want to see what the code is doing.  This was with the old slow version of the LED SPI function though:
http://youtu.be/OtbZCntLV3Q

Nick Gammon


I just discovered that it was the NOP's which were the problem.  Somehow switching to the use of a namespace has changed the timing so 10 NOP's is no longer sufficient. 


I just tested the code with the NOPs. You are right, it's faster (84 uS). I was wrong about the SPI.transfer.

However looking at the generated code you still got 11 NOPs. The namespace didn't change that at all.
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

scswift

Quote
However looking at the generated code you still got 11 NOPs. The namespace didn't change that at all.


I don't understand.  The generated code for which version? 

With the code just using standard global arrays, 10 NOPs was sufficient.  When I added the namespace, 10 stopped working (no leds lit) but 11 works.

I tried adding an extra WAIT before and after the loop as well, but 10 still does not work.  Nor does 9.

Nick Gammon

This version (I believe in making small test cases):

Code: [Select]

#include <SPI.h>

#define NOP __asm__ __volatile__ ("nop\n");
#define WAIT NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP // 11 NOPs


namespace myLeds {
   
    const int LEDBYTES = 36;
    const int SCALEBYTES = 24;
   
    int modules;   // Number of LED modules connected.  Defined in config.txt.
 
    byte * data;    // LED data is stored as a string of 12 bit LED values.  There are 3 bytes for every two LEDs.  We store it this way because the data does not need to be altered when we are shifting it out to the TLC5947s.  Modifying the data with bit shifts would be very expensive and there is a lot of data to move.
    byte * scale;   // Scaling factor for LED brightness.  Used when say, you want to be able to adjust the brightness of the powercell without changing all the code that assumes the max desired brightness is 1.0.  Value stored here is scaling_factor*255, where scaling factor is 0..1.

    void set(const int which, const float a, const float b)
      {
      // whatever
      }
     
    float get(const int which)
      {
      return 42.0;
      }
     
}  // end of namespace myLeds

void setup ()
  {
  myLeds::modules = 2;

  myLeds::data =  (byte *) malloc (myLeds::modules * myLeds::LEDBYTES * sizeof(byte)); 
  myLeds::scale = (byte *) malloc (myLeds::modules * myLeds::SCALEBYTES * sizeof(byte));
 
  SPI.begin ();
  Serial.begin (115200);
 
  }  // end of setup
 
void loop ()
  {
  myLeds::set (1, 2, 3);

     
  byte *thisLED = &myLeds::data [myLeds::modules * myLeds::LEDBYTES];
  byte *lastLED = &myLeds::data [0];
     
  noInterrupts ();
  unsigned long startTime = micros();
 
   do
     {       
      SPDR = *--thisLED; WAIT;       
      } while (thisLED != lastLED); // thisLED is decremented one last time before we hit the end of the loop, so after byte 0 transfers, the loop exits.
 
  WAIT; // Wait for last byte to finish transfer.
  SPSR = SPSR & ~_BV(SPIF); // Clear transfer flag.
  unsigned long endTime = micros();

  interrupts ();
 
 
  Serial.println (endTime - startTime);
 
  delay (1000);
 
  }  // end of loop=


That uses your method of assigning to SPDR and doing a WAIT.

Generated code (machine cycles in brackets):

Code: [Select]

void loop ()
31c:   cf 92           push    r12   (2)
31e:   df 92           push    r13   (2)
320:   ef 92           push    r14   (2)
322:   ff 92           push    r15   (2)
324:   0f 93           push    r16   (2)
326:   1f 93           push    r17   (2)
328:   cf 93           push    r28   (2)
32a:   df 93           push    r29   (2)
  {
  myLeds::set (1, 2, 3);

     
  byte *thisLED = &myLeds::data [myLeds::modules * myLeds::LEDBYTES];
32c:   00 91 18 01     lds     r16, 0x0118   (2)
330:   10 91 19 01     lds     r17, 0x0119   (2)
334:   20 91 16 01     lds     r18, 0x0116   (2)
338:   30 91 17 01     lds     r19, 0x0117   (2)
33c:   84 e2           ldi     r24, 0x24       ; 36   (1)
33e:   90 e0           ldi     r25, 0x00       ; 0   (1)
340:   28 9f           mul     r18, r24   (2)
342:   e0 01           movw    r28, r0   (1)
344:   29 9f           mul     r18, r25   (2)
346:   d0 0d           add     r29, r0   (1)
348:   38 9f           mul     r19, r24   (2)
34a:   d0 0d           add     r29, r0   (1)
34c:   11 24           eor     r1, r1   (1)
34e:   c0 0f           add     r28, r16   (1)
350:   d1 1f           adc     r29, r17   (1)
  byte *lastLED = &myLeds::data [0];
     
  noInterrupts ();
352:   f8 94           cli   (1)
  unsigned long startTime = micros();
354:   0e 94 7e 02     call    0x4fc   ; 0x4fc <micros>   (4)
358:   6b 01           movw    r12, r22   (1)
35a:   7c 01           movw    r14, r24   (1)
 
   do
     {       
      SPDR = *--thisLED; WAIT;       
35c:   8a 91           ld      r24, -Y   (2)
35e:   8e bd           out     0x2e, r24       ; 46   (1)
360:   00 00           nop   (1)
362:   00 00           nop   (1)
364:   00 00           nop   (1)
366:   00 00           nop   (1)
368:   00 00           nop   (1)
36a:   00 00           nop   (1)
36c:   00 00           nop   (1)
36e:   00 00           nop   (1)
370:   00 00           nop   (1)
372:   00 00           nop   (1)
374:   00 00           nop   (1)
  byte *lastLED = &myLeds::data [0];
     
  noInterrupts ();
  unsigned long startTime = micros();
 
   do
376:   c0 17           cp      r28, r16   (1)
378:   d1 07           cpc     r29, r17   (1)
37a:   81 f7           brne    .-32            ; 0x35c <loop+0x40>   (1/2)
     {       
      SPDR = *--thisLED; WAIT;       
      } while (thisLED != lastLED); // thisLED is decremented one last time before we hit the end of the loop, so after byte 0 transfers, the loop exits.
 
  WAIT; // Wait for last byte to finish transfer.
37c:   00 00           nop   (1)
37e:   00 00           nop   (1)
380:   00 00           nop   (1)
382:   00 00           nop   (1)
384:   00 00           nop   (1)
386:   00 00           nop   (1)
388:   00 00           nop   (1)
38a:   00 00           nop   (1)
38c:   00 00           nop   (1)
38e:   00 00           nop   (1)
390:   00 00           nop   (1)
  SPSR = SPSR & ~_BV(SPIF); // Clear transfer flag.
392:   8d b5           in      r24, 0x2d       ; 45   (1)
394:   8f 77           andi    r24, 0x7F       ; 127   (1)
396:   8d bd           out     0x2d, r24       ; 45   (1)
  unsigned long endTime = micros();
398:   0e 94 7e 02     call    0x4fc   ; 0x4fc <micros>   (4)
39c:   ab 01           movw    r20, r22   (1)
39e:   bc 01           movw    r22, r24   (1)

  interrupts ();
3a0:   78 94           sei   (1)


Between addresses 35C and 37A you can see 18 cycles if the branch is taken.
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

Nick Gammon

If I take out the namespace it generates identical code. It's not the use of namespace. You must have changed something else.
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

scswift

#37
Nov 05, 2012, 06:27 am Last Edit: Nov 05, 2012, 06:30 am by scswift Reason: 1
I can't have changed anything else.  I mean what is there that I could change which would change how many nops are needed between bytes in that loop, by one?

Well, I guess I could have changed something... somewhere.. that resulted in that loop being optimized differently.  I will disassemble an older 10 nop version of the code and see how the loop differs.

scswift

Okay so here's the OLD code that I am running right now, and it's working with 10 nops:

Code: [Select]

00000624 <_Z10updateLEDsv>:
  static unsigned long nextLEDFrameTime = 0;   
 
  uint8_t *thisLED;
  uint8_t *lastLED;

  if (time >= nextLEDFrameTime) { // If it is time for an LED update... 
     624: 20 91 c4 01 lds r18, 0x01C4
     628: 30 91 c5 01 lds r19, 0x01C5
     62c: 40 91 c6 01 lds r20, 0x01C6
     630: 50 91 c7 01 lds r21, 0x01C7
     634: 80 91 a4 02 lds r24, 0x02A4
     638: 90 91 a5 02 lds r25, 0x02A5
     63c: a0 91 a6 02 lds r26, 0x02A6
     640: b0 91 a7 02 lds r27, 0x02A7
     644: 28 17        cp r18, r24
     646: 39 07        cpc r19, r25
     648: 4a 07        cpc r20, r26
     64a: 5b 07        cpc r21, r27
     64c: 08 f4        brcc .+2      ; 0x650 <_Z10updateLEDsv+0x2c>
     64e: 31 c0        rjmp .+98      ; 0x6b2 <_Z10updateLEDsv+0x8e>
    //unsigned int time = micros();
     
    thisLED = &LEDData[LEDMODULES*36]; // The first operation is to decrement this pointer, so the first time we write thisLED it will be array index LEDMODULES*36-1
    lastLED = &LEDData[0];
   
    cli(); // Halt interrupts.
     650: f8 94        cli
     652: ee e9        ldi r30, 0x9E ; 158
     654: f2 e0        ldi r31, 0x02 ; 2
       
    do {       
   
      SPDR = *--thisLED; WAIT;           
     656: 82 91        ld r24, -Z
     658: 8e bd        out 0x2e, r24 ; 46
     65a: 00 00        nop
     65c: 00 00        nop
     65e: 00 00        nop
     660: 00 00        nop
     662: 00 00        nop
     664: 00 00        nop
     666: 00 00        nop
     668: 00 00        nop
     66a: 00 00        nop
     66c: 00 00        nop
    thisLED = &LEDData[LEDMODULES*36]; // The first operation is to decrement this pointer, so the first time we write thisLED it will be array index LEDMODULES*36-1
    lastLED = &LEDData[0];
   
    cli(); // Halt interrupts.
       
    do {       
     66e: 82 e0        ldi r24, 0x02 ; 2
     670: e6 35        cpi r30, 0x56 ; 86
     672: f8 07        cpc r31, r24
     674: 81 f7        brne .-32      ; 0x656 <_Z10updateLEDsv+0x32>
      SPDR = *--thisLED; WAIT;           
      //while (!(SPSR & _BV(SPIF))); SPDR = *thisLED--; // Wait for transfer of byte over SPI bus to complete, then transfer *thisLED into SPDR register, and decrement address of *thisLED.
         
    } while (thisLED != lastLED); // thisLED is decremented one last time before we hit the end of the loop, so after byte 0 transfers, the loop exits.

    WAIT; // Wait for last byte to finish transfer.
     676: 00 00        nop
     678: 00 00        nop
     67a: 00 00        nop
     67c: 00 00        nop
     67e: 00 00        nop
     680: 00 00        nop
     682: 00 00        nop
     684: 00 00        nop
     686: 00 00        nop
     688: 00 00        nop
    SPSR = SPSR & ~_BV(SPIF); // Clear transfer flag.
     68a: 8d b5        in r24, 0x2d ; 45
     68c: 8f 77        andi r24, 0x7F ; 127
     68e: 8d bd        out 0x2d, r24 ; 45


And here is the new code that requires 11:

Code: [Select]

000007da <_Z10updateLEDsv>:

*/



void updateLEDs() {
     7da: af 92        push r10
     7dc: bf 92        push r11
     7de: cf 92        push r12
     7e0: df 92        push r13
     7e2: ef 92        push r14
     7e4: ff 92        push r15
     7e6: 0f 93        push r16
     7e8: 1f 93        push r17
  //uint8_t *lastLED;

  byte *thisLED;
  byte *lastLED;

  if (time >= nextLEDFrameTime) { // If it is time for an LED update... 
     7ea: 20 91 c2 01 lds r18, 0x01C2
     7ee: 30 91 c3 01 lds r19, 0x01C3
     7f2: 40 91 c4 01 lds r20, 0x01C4
     7f6: 50 91 c5 01 lds r21, 0x01C5
     7fa: 80 91 5e 02 lds r24, 0x025E
     7fe: 90 91 5f 02 lds r25, 0x025F
     802: a0 91 60 02 lds r26, 0x0260
     806: b0 91 61 02 lds r27, 0x0261
     80a: 28 17        cp r18, r24
     80c: 39 07        cpc r19, r25
     80e: 4a 07        cpc r20, r26
     810: 5b 07        cpc r21, r27
     812: 08 f4        brcc .+2      ; 0x816 <_Z10updateLEDsv+0x3c>
     814: 57 c0        rjmp .+174    ; 0x8c4 <_Z10updateLEDsv+0xea>
 
    unsigned int time = micros();
     816: 0e 94 0b 12 call 0x2416 ; 0x2416 <micros>
     81a: 5b 01        movw r10, r22
     81c: 6c 01        movw r12, r24
     
    thisLED = &led::data[led::modules*36]; // The first operation is to decrement this pointer, so the first time we write thisLED it will be array index LEDMODULES*36-1
     81e: 40 91 56 02 lds r20, 0x0256
     822: 50 91 57 02 lds r21, 0x0257
     826: 20 91 54 02 lds r18, 0x0254
     82a: 30 91 55 02 lds r19, 0x0255
     82e: 84 e2        ldi r24, 0x24 ; 36
     830: 90 e0        ldi r25, 0x00 ; 0
     832: 28 9f        mul r18, r24
     834: f0 01        movw r30, r0
     836: 29 9f        mul r18, r25
     838: f0 0d        add r31, r0
     83a: 38 9f        mul r19, r24
     83c: f0 0d        add r31, r0
     83e: 11 24        eor r1, r1
     840: e4 0f        add r30, r20
     842: f5 1f        adc r31, r21
    lastLED = &led::data[0];
   
    cli(); // Halt interrupts.
     844: f8 94        cli

    do {       
      SPDR = *--thisLED; WAIT;       
     846: 82 91        ld r24, -Z 2
     848: 8e bd        out 0x2e, r24 ; 46 1
     84a: 00 00        nop 1
     84c: 00 00        nop 1
     84e: 00 00        nop 1
     850: 00 00        nop 1
     852: 00 00        nop 1
     854: 00 00        nop 1
     856: 00 00        nop 1
     858: 00 00        nop 1
     85a: 00 00        nop 1
     85c: 00 00        nop 1
     85e: 00 00        nop 1
     860: e4 17        cp r30, r20 1
     862: f5 07        cpc r31, r21 1
     864: 81 f7        brne .-32      ; 0x846 <_Z10updateLEDsv+0x6c> 1/2
      SPDR = *--thisLED; WAIT;       
    } while (thisLED != lastLED); // thisLED is decremented one last time before we hit the end of the loop, so after byte 0 transfers, the loop exits.

    WAIT; // Wait for last byte to finish transfer.
     866: 00 00        nop 1
     868: 00 00        nop 1
     86a: 00 00        nop 1
     86c: 00 00        nop 1
     86e: 00 00        nop 1
     870: 00 00        nop 1
     872: 00 00        nop 1
     874: 00 00        nop 1
     876: 00 00        nop 1
     878: 00 00        nop 1
     87a: 00 00        nop 1
    SPSR = SPSR & ~_BV(SPIF); // Clear transfer flag.
     87c: 8d b5        in r24, 0x2d ; 45 1
     87e: 8f 77        andi r24, 0x7F ; 127 1
     880: 8d bd        out 0x2d, r24 ; 45 1


Hm, I see one difference right off the bat.  But I don't see how it could affect the loop.  In the original code LEDMODULES was a constant.  Now it is not.  So there seems to be a lot more code to set up the pointers. 

Nick Gammon

Can I merge these threads? It's really confusing having the same code discussed in both of them.
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

Nick Gammon

OK your 10 NOP one with cycle counts:

Code: [Select]

    do {       
   
      SPDR = *--thisLED; WAIT;           
     656:       82 91           ld      r24, -Z   (2)
     658:       8e bd           out     0x2e, r24       ; 46   (1)
     65a:       00 00           nop   (1)
     65c:       00 00           nop   (1)
     65e:       00 00           nop   (1)
     660:       00 00           nop   (1)
     662:       00 00           nop   (1)
     664:       00 00           nop   (1)
     666:       00 00           nop   (1)
     668:       00 00           nop   (1)
     66a:       00 00           nop   (1)
     66c:       00 00           nop   (1)
    thisLED = &LEDData[LEDMODULES*36]; // The first operation is to decrement this pointer, so the first time we write thisLED it will be array index LEDMODULES*36-1
    lastLED = &LEDData[0];
   
    cli(); // Halt interrupts.
       
    do {       
     66e:       82 e0           ldi     r24, 0x02       ; 2   (1)
     670:       e6 35           cpi     r30, 0x56       ; 86   (1)
     672:       f8 07           cpc     r31, r24   (1)
     674:       81 f7           brne    .-32            ; 0x656 <_Z10updateLEDsv+0x32>   (1/2)


18 when it takes the branch.

And the one with 11 NOPs:

Code: [Select]

    do {       
      SPDR = *--thisLED; WAIT;       
     846:       82 91           ld      r24, -Z                                                         2   (2)
     848:       8e bd           out     0x2e, r24       ; 46                                               1   (1)
     84a:       00 00           nop                                                                        1   (1)
     84c:       00 00           nop                                                                        1   (1)
     84e:       00 00           nop                                                                        1   (1)
     850:       00 00           nop                                                                        1   (1)
     852:       00 00           nop                                                                        1   (1)
     854:       00 00           nop                                                                        1   (1)
     856:       00 00           nop                                                                        1   (1)
     858:       00 00           nop                                                                        1   (1)
     85a:       00 00           nop                                                                        1   (1)
     85c:       00 00           nop                                                                        1   (1)
     85e:       00 00           nop                                                                        1   (1)
     860:       e4 17           cp      r30, r20                                                           1   (1)
     862:       f5 07           cpc     r31, r21                                                           1   (1)
     864:       81 f7           brne    .-32            ; 0x846 <_Z10updateLEDsv+0x6c>          1/2   (1/2)
      SPDR = *--thisLED; WAIT;       
    } while (thisLED != lastLED); // thisLED is decremented one last time before we hit the end of the loop, so after byte 0 transfers, the loop exits.


18 cycles again. So that totally agrees with what I observed with the logic analyzer in the other thread (or is it this thread?) You need 18 cycles.
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

scswift

I don't mind, though if you can move just some of the posts and leave the class thread stuff alone that would be best for people who have questions about that in the future.

Go Up