
Topic: Why is there a delay at the end of void Loop?

JBMetal

Thanks westfw, I have bookmarked that thread; it looks like excellent reading!  :)

I am, however, effectively looking for the fastest way to latch 595 registers. Any ideas there? The current implementation is in the code mentioned in the thread above. What you are seeing in the scope pictures is 8 bytes being sent via SPI, plus the associated latching.

I got to asking this question because these factors influence my readings and calculations.

liudr


Quote
All of the typical infinite loop constructs ("while (1)", "for (;;)", goto, etc.) end up producing a single branch instruction.
The delay "at the end of loop" in the original posting is the function return and call overhead.

See also: http://arduino.cc/forum/index.php/topic,4324.0.html for lots of discussion on generating the fastest possible square wave...



To corroborate this, I did a simple test: I printed out main.cpp and Blink after the Arduino IDE had processed them but before compiling:

http://arduino.cc/forum/index.php/topic,64615.msg475057.html#msg475057

There's nothing at the end of loop() or in main(), so it must be overhead. I expect several registers need to be changed (stack pointer, instruction pointer, etc.).

westfw

Code:

void loop() {
  LATCH_ON();
  a8:   28 9a           sbi     0x05, 0 ; 5
  LATCH_OFF();
  aa:   28 98           cbi     0x05, 0 ; 5
  LATCH_ON();
  ac:   28 9a           sbi     0x05, 0 ; 5
  LATCH_OFF();
  ae:   28 98           cbi     0x05, 0 ; 5
} // without
  b0:   08 95           ret

000000b2 <main>:
#include <WProgram.h>

int main(void)
{
        init();
  b2:   0e 94 a8 00     call    0x150   ; 0x150 <init>
        setup();
  b6:   0e 94 53 00     call    0xa6    ; 0xa6 <setup>
         for (;;)
                loop();
  ba:   0e 94 54 00     call    0xa8    ; 0xa8 <loop>
  be:   fd cf           rjmp    .-6             ; 0xba <main+0x8>

The LATCH_ON and LATCH_OFF all end up as single (2-cycle) instructions.  The end/resumption of loop is three instructions (return, jmp, call) and both return and call take 4 cycles.  So I'd expect the gap between the last bitset in the loop and the first one after the loop resumes to be about 5 times longer than the gap between consecutive bitsets inside the loop, which is just about what the scope trace shows.

I wouldn't call 10 cpu cycles a "delay"; when you optimize your code down to single instructions, you have to start being aware that EVERYTHING takes at least a little bit of time!
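The cycle arithmetic above can be written out as a small host-side sketch. It assumes the usual ATmega328P timings (2 cycles for sbi/cbi and rjmp, 4 cycles for call and ret) and, for the time conversion, a 16 MHz clock. The constant and function names are made up for illustration and are not part of the original code.

```cpp
#include <cassert>

// Assumed ATmega328P instruction timings, in CPU cycles.
constexpr int kBitWriteCycles = 2;  // sbi or cbi: one port bit set/cleared
constexpr int kRetCycles      = 4;  // ret at the end of loop()
constexpr int kRjmpCycles     = 2;  // rjmp back to the call in main()
constexpr int kCallCycles     = 4;  // call loop()

// Extra cycles spent leaving loop() and entering it again.
constexpr int loopOverheadCycles() {
    return kRetCycles + kRjmpCycles + kCallCycles;
}

// Converts a cycle count to nanoseconds for a CPU clock given in Hz.
inline long long cyclesToNs(int cycles, long long cpuHz) {
    assert(cpuHz > 0);
    return cycles * 1000000000LL / cpuHz;
}
```

With these assumed timings, loopOverheadCycles() comes to 10 cycles, five times the cost of a single sbi/cbi, and cyclesToNs(10, 16000000) gives 625 ns.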


Nick Gammon


Quote
To answer the original question, a simple "goto" will nearly always be faster than a "return from function" + "call to (same) function", but the latter won't get you despised by half the users who think that people who write "goto" in a C program should be drowned at birth.  :P


If you are trying to generate an exact square wave at an exact frequency, I suggest the 555 chip (or is it the 666 chip? I can never remember).

As for "despise", it's simply a case of using the right tool for the job. The goto statement has its uses, in possibly 0.01% of cases. In the example given:

Code:
start:
  LATCH_ON();
  LATCH_OFF();
  LATCH_ON();
  LATCH_OFF();
goto start;


... there is still going to be a slight discrepancy between the end of the first OFF and the start of the second ON, and the next one. The goto just makes it smaller (the extra instruction, whatever it does). The timer interrupts firing will also delay the code slightly. It will never be a perfect square wave.

Let hardware do it for you.



http://www.gammon.com.au/electronics

gerg



Quote
As for "despise", it's simply a case of using the right tool for the job. The goto statement has its uses, in possibly 0.01% of cases. In the example given:


Agreed. Goto is unduly demonized when, in fact, it's the author who should be on the receiving end of the ire. There is nothing wrong with goto in general. Having said that, it's very frequently abused and misused. It's the classic case of a poor carpenter blaming his tools.


Quote
Code:
start:
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
goto start;


... there is still going to be a slight discrepancy between the end of the first OFF and the start of the second ON, and the next one. The goto just makes it smaller (the extra instruction, whatever it does). The timer interrupts firing will also delay the code slightly. It will never be a perfect square wave.

Let hardware do it for you.


I completely agree with your comment. But I do want to offer that the error can be reduced further by unrolling the loop by hand. This is a little-used optimization technique. With the above, the error is 1 out of every 2 pulses. Not so good.

Code:
start:
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
 LATCH_ON();
 LATCH_OFF();
goto start;


So on and so on... unroll it until your error becomes acceptable, if possible. With the above, the error is now 1 out of 64 pulses. Still not great, but considerably better; being 32x more precise. It's the classic size vs. speed trade-off.
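The hand-unrolled listing can also be generated by the preprocessor instead of being pasted 64 times. What follows is only a sketch under assumptions: the REPEAT_n macros and unrolledPass() are hypothetical names, and LATCH_ON()/LATCH_OFF() are replaced by counters so the idea can be checked on a PC. On the board they would be the real single-instruction port writes.

```cpp
#include <cassert>

// Stand-ins for the thread's LATCH_ON()/LATCH_OFF(); here they only count.
static int g_onCount = 0;
static int g_offCount = 0;
#define LATCH_ON()  (++g_onCount)
#define LATCH_OFF() (++g_offCount)

// Each macro pastes its argument twice as often as the previous one.
#define REPEAT_2(x)  x x
#define REPEAT_4(x)  REPEAT_2(x) REPEAT_2(x)
#define REPEAT_8(x)  REPEAT_4(x) REPEAT_4(x)
#define REPEAT_16(x) REPEAT_8(x) REPEAT_8(x)
#define REPEAT_32(x) REPEAT_16(x) REPEAT_16(x)
#define REPEAT_64(x) REPEAT_32(x) REPEAT_32(x)

// One pass through the unrolled body: 64 on/off pairs with no branch between.
inline void unrolledPass() {
    REPEAT_64(LATCH_ON(); LATCH_OFF();)
}

// Fraction of pulses that sit next to the jump back to the top.
inline double stretchedFraction(int pulsesPerPass) {
    assert(pulsesPerPass > 0);
    return 1.0 / pulsesPerPass;
}
```

One call to unrolledPass() performs 64 ON/OFF pairs, and stretchedFraction(2) divided by stretchedFraction(64) is 32, which is where the "32x" figure comes from. The flash cost grows with the repeat count, so this is the same size vs. speed trade-off as the hand-written version.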
http://maniacalbits.blogspot.com

gerg


Quote
being 32x more precise.


Not really.  Attempting to generate a "precise" (however you are defining it) frequency with Arduino is a classic example of using the wrong tool.  It just isn't made for it. Arduino is good for a great many things, but being a precise frequency generator ain't one of them.


I think you misconstrued my intent. I was illustrating a way to minimize the error. I was not arguing the fitness of the Arduino as a precise square wave generator. I thought I was pretty clear on that, especially since I agreed with the comment stating that a timer, RTC, or whatever should be used rather than the Arduino itself. And to be clear, "yes, really." The error rate was diminished accordingly. Saying "not really" doesn't change that. I assume your "not really" was more a conflation of fitness and approach than commentary on the results.


JBMetal

Err .. yes ... as the 'starter' of the thread I need to point out that I am just trying to determine the 'somewhat theoretical' maximum speed at which an Arduino can update 8 shift registers with eight bytes.

My tests are based on sending 8 bytes over SPI together with the required latching. For those interested, I have achieved 68 kHz so far, or 4.3 MHz at the bit level .. faster if you count the latches also.
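As a quick check of those figures, here is a minimal sketch of the arithmetic. It assumes 8 bits per byte and an SPI clock that never idles between bytes. The 8 MHz ceiling used in the note below is the ATmega's fastest SPI master clock at 16 MHz (fosc/2), which is an assumption brought in here and not something measured in this thread. The function names are illustrative only.

```cpp
#include <cassert>

// Bits shifted out per full update of the register chain.
inline long bitsPerUpdate(int bytesPerUpdate) {
    assert(bytesPerUpdate > 0);
    return bytesPerUpdate * 8L;
}

// Bit rate implied by a measured update rate.
inline long bitRateHz(long updatesPerSecond, int bytesPerUpdate) {
    return updatesPerSecond * bitsPerUpdate(bytesPerUpdate);
}

// Upper bound on the update rate if the SPI clock ran without any gaps.
inline long maxUpdatesPerSecond(long spiClockHz, int bytesPerUpdate) {
    return spiClockHz / bitsPerUpdate(bytesPerUpdate);
}
```

bitRateHz(68000, 8) gives 4352000, which matches the quoted 4.3 MHz. Under the 8 MHz assumption, maxUpdatesPerSecond(8000000, 8) gives 125000, so latching and per-byte overhead account for the remaining gap.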

I was not aware that adding code to functions and to void loop() added overhead, and when I noticed these discrepancies I asked why, in this thread. I think the topic has been covered.

As stated, I need to remove as much incidental overhead as possible from the measurement so as to get closer to the theoretical maximum. That's all. No square wave generators and no goto vs. no goto arguments intended.
I believe the results speak for themselves in this regard.

Thank you all for your valued input. I find this all fascinating and learned a great deal in a very short time. Special thanks to gerg, who first got me started on this track with his inline compiling suggestions.  :)

PS: IMHO goto is just another code statement, it's all rock and roll to me  :P

Nick Gammon


Quote
My tests are based on sending 8 bytes over SPI together with the required latching. For those interested, I have achieved 68 kHz so far, or 4.3 MHz at the bit level .. faster if you count the latches also.


In that case I would use the SPI hardware built into the chip. My measurements here:

http://www.gammon.com.au/spi

... showed that I could achieve around 4 MHz at the bit level (0.125 µs per pulse).

Using the hardware means you can let interrupts do the work for you (on the receiving end at least). I think for the base chip (the Atmega) which is clocked at 16 MHz, getting pulses out at 4 MHz (where a pulse is an off and on sequence) is about the best you can do.
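The update sequence under discussion (shift the bytes out over hardware SPI, then pulse the latch once) can be sketched as below. This is an illustration under assumptions, not the OP's code: updateShiftRegisters and RecordingBus are hypothetical names, the bus is any object with a transfer() member (Arduino's SPI object has one), and the two latch callables stand in for LATCH_ON()/LATCH_OFF(). RecordingBus exists only so the logic can be exercised on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Shifts `count` bytes out through `bus`, then pulses the latch once so that
// all 595 outputs change together.
template <typename Bus, typename LatchOn, typename LatchOff>
void updateShiftRegisters(Bus& bus, const std::uint8_t* data, std::size_t count,
                          LatchOn latchOn, LatchOff latchOff) {
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        bus.transfer(data[i]);  // the SPI hardware clocks out 8 bits per call
    }
    latchOn();   // rising edge copies the shifted bits to the 595 outputs
    latchOff();  // return the latch line low, ready for the next update
}

// Host-side stand-in for the SPI peripheral: it just records what was sent.
struct RecordingBus {
    std::uint8_t sent[16] = {};
    std::size_t n = 0;
    std::uint8_t transfer(std::uint8_t b) {
        if (n < sizeof sent) sent[n++] = b;
        return 0;
    }
};
```

On a board the call would look something like updateShiftRegisters(SPI, frame, 8, latchOnFn, latchOffFn), where latchOnFn and latchOffFn are placeholders for small functions that wrap the port writes. Keeping the latch pulse outside the byte loop means it is paid once per update and not once per byte.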

FalconFour

Okay, with all the "goto is evil... from my books' point of view" nay-sayers in this thread, sort of "third-party bashing" goto with ye olde C teachings... I've got to come to its rescue.

There is not a DANG thing wrong with using goto, in my opinion. Personally, I've always avoided it because of all the negativity surrounding it, but I think that may be subject to change. Even with the most efficient C-style code blocks and structures, a goto would make the code much easier to follow and understand, and save thousands of compiled bytes at times. Sometimes you just need to "loop this if that plus that, always run this part of the loop, only do this part of the loop sometimes, and skip to the end of the loop if that", and a "break" just won't do. Often, the more complicated the loop's control, the more efficient the program can run - too many independent function calls and control statements, and you end up wasting lots of code and clock cycles. The "lowly goto" is an amazingly elegant way to "just GO TO" a part of the code, discarding all the conditions in the block... well, provided you do so within the construct of the language you use, like, don't "goto" outside the current function or something. There's LOTS of ways to abuse goto, which is probably what started the whole "evil goto" thing. I think the use of goto should be considered on the implementation, not the fact that it's present at all.

Because in the end, goto is indeed the single most absolutely efficient branching method possible: function calls push/pop registers to set up its environment; control statements perform validation. It's one instruction: jump to "here". :)
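For readers following the argument, here is a minimal sketch of the kind of control flow being described, where a single "break" cannot leave two nested loops. The grid search and all names in it are hypothetical and chosen only for illustration. The goto version is shown next to a goto-free version that returns from the function, so the two can be compared.

```cpp
constexpr int kRows = 4;
constexpr int kCols = 4;

// Returns the flat index of the first cell equal to `target`, or -1.
// Uses goto to leave both loops at once.
int findWithGoto(const int (&grid)[kRows][kCols], int target) {
    int index = -1;
    for (int r = 0; r < kRows; ++r) {
        for (int c = 0; c < kCols; ++c) {
            if (grid[r][c] == target) {
                index = r * kCols + c;
                goto done;  // a plain break would only leave the inner loop
            }
        }
    }
done:
    return index;
}

// Same search without goto: returning from the function leaves both loops.
int findWithReturn(const int (&grid)[kRows][kCols], int target) {
    for (int r = 0; r < kRows; ++r) {
        for (int c = 0; c < kCols; ++c) {
            if (grid[r][c] == target) {
                return r * kCols + c;
            }
        }
    }
    return -1;
}
```

Both functions give the same result for any input. Which one reads better, and whether a compiler produces different code for them, depends on the situation and would have to be checked case by case.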

AWOL

Quote
Because in the end, goto is indeed the single most absolutely efficient branching method possible:

I wonder if that's why compilers use it so much in "while", "do..while" and "for" loops?

Seriously, more often than not, if you have to use a goto, you haven't understood the structure of your problem correctly.
"Pete, it's a fool looks for logic in the chambers of the human heart." Ulysses Everett McGill.
Do not send technical questions via personal messaging - they will be ignored.

Nick Gammon


Quote
Even with the most efficient C-style code blocks and structures, a goto would make the code much easier to follow and understand, and save thousands of compiled bytes at times.


I'm sorry to burst your bubble, but that simply isn't true. Consider this sketch:

Code:
void setup () {}
void loop ()
{
while (true)
  {
  digitalWrite (5, HIGH);
  }
}


That uses a "while" loop.

Sketch size:

Code:
Binary sketch size: 724 bytes (of a 30720 byte maximum)

Generated code:

Code:
00000102 <loop>:
void loop ()
{
while (true)
  {
  digitalWrite (5, HIGH);
102: 85 e0        ldi r24, 0x05 ; 5
104: 61 e0        ldi r22, 0x01 ; 1
106: 0e 94 86 00 call 0x10c ; 0x10c <digitalWrite>
10a: fb cf        rjmp .-10      ; 0x102 <loop>

0000010c <digitalWrite>:
}
}


Now consider this:

Code:
void setup () {}

void loop ()
{
  foo:
  digitalWrite (5, HIGH);
  goto foo;
}


Sketch size:

Code:
Binary sketch size: 724 bytes (of a 30720 byte maximum)

Generated code:

Code:
00000102 <loop>:
void loop ()
{
  foo:
  digitalWrite (5, HIGH);
102: 85 e0        ldi r24, 0x05 ; 5
104: 61 e0        ldi r22, 0x01 ; 1
106: 0e 94 86 00 call 0x10c ; 0x10c <digitalWrite>
10a: fb cf        rjmp .-10      ; 0x102 <loop>

0000010c <digitalWrite>:
}
}


The code sizes are identical! The generated code is identical!

There is no saving of "thousands of bytes" using goto. None. Not a byte.

There is no speed improvement. The compiler has generated, in both cases, "rjmp   .-10 " - the same machine code instruction.

All the goto does is make the code harder to read. It possibly introduces subtle errors of logic, if you "goto" over stuff you shouldn't.

It is not the panacea for saving memory, saving time. It does none of that.

In the OP's code (and I am sure he realizes this) he could have changed:

Code:
void loop() {
  start:
  LATCH_ON();
  LATCH_OFF();
  LATCH_ON();
  LATCH_OFF();
goto start;
} // with goto


to:

Code:
void loop() {
  while (true)
  {
  LATCH_ON();
  LATCH_OFF();
  LATCH_ON();
  LATCH_OFF();
  }
} // without goto


The goto didn't save time. It was changing the code to omit the repeated function calls that saved time. But you can do that without using goto.

Quote
Because in the end, goto is indeed the single most absolutely efficient branching method possible: ...


No, that simply is not true. You can achieve the same thing with "do" and "while". Keep the code elegant and maintainable.


pwillard

#26
Jun 28, 2011, 01:35 pm
...and, as stated earlier in the thread... you can gain much efficiency by using Hardware SPI as built into the AVR chip.

http://softsolder.com/2009/07/18/arduino-hardware-assisted-spi-synchronous-serial-data-io/

Ed was able to get a "factor-of-15 speedup" with 595 latches.




gerg

As Nick points out, there is absolutely nothing magical about goto. I get the impression from some follow-up comments that people now believe goto should be commonly used. If I gave that impression, I apologize. Goto should absolutely not be commonly used. It should be used very sparingly. It's easy to create rats' nests of code that are extremely difficult to read and understand. This is a tale any old BASIC coder will be more than willing to share. It should not be viewed as a general-purpose flow control mechanism. It should be viewed as one of many in a developer's optimization bag of tricks. And, as is widely known, premature optimization is the root of all evil. So that should tell you, if you're readily reaching for goto, you're using it wrong.

In my many, many years of coding, I've used goto in C/C++ code less than a half dozen times. Once or twice more I would have used it except it was forbidden by the coding standards. In all cases where I've used it, it was in fairly complex code where goto was the only possible means to obtain the optimizations required while ensuring some facet of readability. Generally speaking, if you find you're using goto more frequently than once every couple of years, while coding on a daily basis, chances are very high you're using it wrong.

Now then, as I originally stated, goto has been demonized and is frequently forbidden. Such a response is almost as inappropriate as daily use. But just because goto is a legitimate flow control technique doesn't mean it should be used without considerable thought. Generally speaking, it should only be used as an optimization technique of last resort, and then generally only by experienced coders.

Again, as Nick pointed out, the simple use of goto, in and of itself, doesn't magically imbue optimizations. And it can easily be used for evil, which is exactly why it has such a bad reputation.


FalconFour

#28
Jun 29, 2011, 05:44 am
Quote
Consider this sketch

I've never seen anyone post something SO distorted, so misinterpreted, and so blown out of proportion than the absolute CRAP you just posted in there. Dude, get your head outta your rear - do you even recognize that I actually know what I'm talking about and may actually have the slightest semblance of understanding more than your little 8-byte "demo"?

I said it COULD save thousands of bytes. Did you write ANY sort of complicated loop structure there? No, you jumped to the same g*ddamn location as the loop. Did you have nested if/else/while/select statements in there? Would you be able to imagine a reason to? No? OK, then you can't test it. Did you even understand what I was referring to by "saving thousands of bytes"? Apparently not. I don't mean using goto in place of a loop is somehow going to magically save a bunch of code. Far the hell from it.

I'm just f'ing dumbfounded how you pulled out such a far-fetched, WAY freaking overblown reply to such an outlandish misreading of the very specific and accurate wording I used in my post. And then to have other members actually reading that long drawn-out BS of a reply and running with it like gospel as well.

I'm just leaving it at that right now, as there's nothing I could really reply to say other than reposting the exact thing I already wrote and hoping someone actually reads it this time. Good god, never thought I'd be so pissed off at some crapshoot reply by an Arduino board member.  :smiley-yell:

edit: Hmm, so I skimmed a few replies above and see just the piggy-backing on Nick's reply and "using that reply to read my post", so let me make this much bigger for your reading ability:
I don't mean using goto in place of a loop is somehow going to magically save a bunch of code. Far the hell from it.

There, now would you mind going back and reading what I actually wrote?



maniacbug

#29
Jun 29, 2011, 06:12 am



Dude, settle down.  No need for personal attacks.
