Topic: Maximum pin toggle speed
SF Bay Area (USA)

Hmm.  This was asked over on AVRFreaks, and it's FREQUENTLY a Frequently asked question about CPUs/etc, though I don't recall ever seeing it asked here.   Since I actually did the experiment, I'll post the answer anyway!

Code:
while (1) {
  digitalWrite(3, 1);
  digitalWrite(3, 0);
}
produces a 106.8kHz square wave on digital pin 3 in Arduino 0010, 0011, and 0012.  Though it would probably be foolish to count on exactly that speed; library functions are subject to change.

Code:
cli();
while (1) {
  PORTD |= 0x8;
  PORTD &= ~0x8;
}
on the same board runs at 2.667MHz.  (This does produce the minimal sbi/cbi/rjmp loop that you'd expect, BTW.)
(So that's about a 25x penalty for the Arduino library code, which sounds about right: the overhead of abstracting IO to "pin number" is pretty substantial. There's a subroutine call, a lookup table to get the port, another lookup table to get the bit, a third to check whether analogWrite is in use, and then less efficient instructions to access the port "indirectly".)
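
To make that overhead concrete, here is a rough, host-runnable C model of a table-driven pin write. It is only a sketch of the idea, not the actual Arduino core: `model_digital_write`, the `fake_port*` variables, `pwm_off_calls` and the three tables are made-up names, and the tables simply encode the usual mapping (pins 0-7 on port D, 8-13 on port B, PWM on pins 3, 5, 6, 9, 10 and 11).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware port registers, so this runs on a PC. */
static uint8_t fake_portb, fake_portd;

/* Counts how often the extra "turn PWM off" step would run. */
static int pwm_off_calls;

/* Lookup 1: which port a pin lives on (illustrative table). */
static uint8_t *const pin_port[14] = {
    &fake_portd, &fake_portd, &fake_portd, &fake_portd,
    &fake_portd, &fake_portd, &fake_portd, &fake_portd,
    &fake_portb, &fake_portb, &fake_portb, &fake_portb,
    &fake_portb, &fake_portb
};

/* Lookup 2: which bit within that port. */
static const uint8_t pin_mask[14] = {
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80,
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20
};

/* Lookup 3: whether a timer MIGHT be attached to the pin. */
static const uint8_t pin_has_timer[14] = {
    0, 0, 0, 1, 0, 1, 1, 0,
    0, 1, 1, 1, 0, 0
};

/* Table-driven write: three lookups plus an indirect read-modify-write. */
void model_digital_write(uint8_t pin, uint8_t value) {
    assert(pin < 14);
    uint8_t *port = pin_port[pin];
    uint8_t mask = pin_mask[pin];
    if (pin_has_timer[pin]) {
        pwm_off_calls++;            /* stands in for the PWM shut-off step */
    }
    if (value) {
        *port |= mask;
    } else {
        *port &= (uint8_t)~mask;
    }
}
```

In this model `model_digital_write(3, 1)` ends up setting bit 3 of `fake_portd`, the same effect as `PORTD |= 0x8`, but only after three table reads and a pointer dereference, and pin 3 takes the extra PWM branch that pin 4 skips.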

Logged

Connecticut, US

Speaking of the hefty digitalWrite() overhead, I noticed two things when I went poking into that code.

One, some pins are slower than others, because they have PWM timers that have to be disengaged.

Two, the "function" that turns off those timers has a really ugly chain of if/if/if/if statements that should really be a switch, or at least if/else if/else if/else statements.  I did a timing analysis like you, westfw, and found that the gcc compiler really does optimize both forms to the same code, since it's all forced inline and compares against static const data.  But it bugs me to see code that relies on the optimizer to rephrase so drastically to do the right thing. One accidental tweak or compiler update and boom, digitalWrite() could be twice as slow, because the compiler decides to implement what is written, not what might possibly be implied.
« Last Edit: December 26, 2008, 09:48:21 am by halley » Logged


That's interesting.  

re: 2.667MHz example.  Was that at 8MHz?  I have been operating under the assumption that a 16MHz ATmega did 16,000,000 instructions per second, which would be 5.333 million loops per second with only 3 instructions.

Also the output isn't exactly "square": you would need another SBI (or NOP) to hold the pin high for the same time as low, but it would be a bit slower if half on/half off was a requirement.
« Last Edit: December 26, 2008, 02:01:01 pm by dcb » Logged

London

I would have thought it would do 4 million loops per second at 16MHz. The while statement will be optimized to a relative jump, which is a two-clock instruction, so the whole loop should be 4 clock cycles long.
« Last Edit: December 26, 2008, 02:49:46 pm by mem » Logged

SF Bay Area (USA)

Quote
some pins are slower than others, because they have PWM timers that have to be disengaged.
Yes.  The code reads:
Code:
timer = digitalPinToTimer(pin);
  :
if (timer != NOT_ON_TIMER) turnOffPWM(timer);
I had assumed digitalPinToTimer() would be false if analogWrite wasn't active, but it's actually a ROM-based function that says which timer MIGHT be associated with the pin.  Pin 3, used in my example, IS a PWM output...

The examples are with a 16MHz MDC Bare Bones Board, as shown by the little frequency readout on my Tek TDS210 scope; you're right that that's not what I'd expect based on instruction timings.  This may require more investigation!
Logged

SF Bay Area (USA)

Ah.  SBI and CBI are 2-cycle instructions (I guess that makes sense, since they're read-modify-write), as is the rjmp, so the loop is 6 cycles and 16MHz / 6 = 2.667MHz is exactly as expected after all!
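
The arithmetic behind that can be written down in a couple of lines of C. This is just a sketch that runs on a PC; `loop_freq_hz` is a hypothetical helper name, and it assumes the loop produces exactly one high/low period per pass.

```c
#include <assert.h>
#include <stdint.h>

/* Output frequency in Hz when one full high/low period of the pin
   takes cycles_per_period CPU clocks. Integer division truncates. */
uint32_t loop_freq_hz(uint32_t f_cpu_hz, uint32_t cycles_per_period) {
    assert(cycles_per_period > 0);
    return f_cpu_hz / cycles_per_period;
}
```

With a 16MHz clock, `loop_freq_hz(16000000UL, 6)` covers the sbi/cbi/rjmp case (2 + 2 + 2 cycles), while passing 3 or 4 cycles gives the 5.333 million and 4 million loops per second guessed earlier in the thread.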

This implies that I can make a faster loop using OUT and pre-loaded registers...
Logged

SF Bay Area (USA)

Code:
while (1) {
  PORTD = ones;
  PORTD = zeros;
}
Gives me 4MHz (and noticeably not a "square wave".  Adding a nop makes it more square but drops the max freq to about 3.2MHz.  Adding two nops makes it very square, but back to 2.667MHz.)
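
Those three numbers fall out of the cycle counts. The sketch below runs on a PC and assumes each OUT takes 1 cycle, the rjmp takes 2, and the nops sit between the two writes (the post doesn't say where they were placed); `wave_t` and `out_loop_wave` are made-up names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t freq_hz;       /* output frequency, truncated to whole Hz */
    uint32_t duty_percent;  /* share of the period the pin is high */
} wave_t;

/* Model of: OUT ones; <nops>; OUT zeros; RJMP.
   The pin stays high for the nops plus the OUT that clears it,
   and low for the RJMP plus the OUT that sets it again. */
wave_t out_loop_wave(uint32_t f_cpu_hz, uint32_t nops) {
    uint32_t high = nops + 1;
    uint32_t low = 2 + 1;
    uint32_t period = high + low;
    wave_t w;
    assert(f_cpu_hz > 0);
    w.freq_hz = f_cpu_hz / period;
    w.duty_percent = (100 * high) / period;
    return w;
}
```

At 16MHz this gives a 4-cycle period with the pin high a quarter of the time for no nops, 5 cycles for one nop, and an even 3/3 split over 6 cycles for two.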

Using digitalWrite() on a non-PWM pin (4 instead of 3) runs about 148.4kHz instead of 106.8kHz:
Code:
while (1) {
  digitalWrite(4, 1);
  digitalWrite(4, 0);
}
Logged


Ok, it makes sense now :)

Here's a square wave version (resulting assembler confirmed). It should be 8 cycles per loop, which would be a 2MHz output on a 16MHz CPU (1MHz on an 8MHz CPU).

Code:
cli();
while (1) {
  PORTD |= 0x8;
  PORTD |= 0x8;
  PORTD &= ~0x8;
}

Is changing fuses allowed? :D  It seems you can also program the CKOUT fuse and get the system clock echoed on CLKO (digital 8), which could be useful, I reckon.  So that would be a 16MHz toggle speed, and it is conceivable that you could control it with an external gate with high precision.
Logged

SF Bay Area (USA)

You can usually pull the system clock off one of the oscillator pins (XTAL2 is the output of an inverter-based circuit), especially if you're willing to add a gate to square things off.  I guess it depends on whether the fuse bits are set for the "low power" oscillator or the "full swing" oscillator.  See Sections 7.2 through 7.4 of the ATmega168 data sheet.
Logged

France

You can save one cycle from
Code:
cli();
while (1) {
  PORTD |= 0x8;
  PORTD |= 0x8;
  PORTD &= ~0x8;
}

and get the higher square wave frequency with
Code:
cli();
while (1) {
  PORTD |= B1000;
  PORTD &= B11110111;
}

« Last Edit: June 13, 2009, 10:06:55 pm by selfonlypath » Logged


The newer devices (168/328, IIRC) have an instruction to toggle a pin - too bad I can't recall the name of it right now...  That would reduce it by one more instruction.

-j
Logged

Oxford (England)

I think the new 'toggle' functionality for pins works by writing a 1 to its PIN register.
e.g. to toggle bit 3 of port B, you can do:

PINB = B1000 ;

[yup - M88 datasheet, section 13.1 : "writing a logic one to a bit in the PINx Register, will result in a toggle in the corresponding bit in the Data Register"]

This does NOT work on Mega8's.

I think this means a 3-cycle loop is possible (1 cycle for the OUT instruction, 2 cycles for the RJMP).
That results in a 2.667MHz square signal (one toggle per loop, so a full period is 6 cycles).

In theory you can beat this by filling the memory space with that OUT instruction and letting the program counter roll over at the end of flash, but there are problems with this solution, despite a theoretical 8MHz output...
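
A small PC-side sketch of why toggling halves the numbers: each pass flips the pin once, so a full period needs two passes. `model_write_pin_reg` and `toggle_freq_hz` are hypothetical names, and the XOR only models the "write 1 to PINx toggles the PORTx bit" behaviour quoted from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Models the PINx write: every 1 bit written flips the matching bit
   of the port data register; 0 bits leave it alone. */
void model_write_pin_reg(uint8_t *port_data, uint8_t value) {
    assert(port_data != 0);
    *port_data ^= value;
}

/* Frequency of a pin that is toggled once every cycles_per_toggle clocks.
   Two toggles make one period, hence the factor of 2. */
uint32_t toggle_freq_hz(uint32_t f_cpu_hz, uint32_t cycles_per_toggle) {
    assert(cycles_per_toggle > 0);
    return f_cpu_hz / (2 * cycles_per_toggle);
}
```

At 16MHz, 3 cycles per toggle (OUT plus RJMP) gives the 2.667MHz figure, and 1 cycle per toggle (flash filled with OUT instructions) gives the theoretical 8MHz.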
Logged

France

Could someone tell me exactly how many cycles the while(true) itself uses? I mean the overhead of the surrounding loop in
Code:
cli();
while (true) {
  PORTD |= B1000;
  PORTD &= B11110111;
}
For example, each pass of the loop will take 2 cycles to execute both PORTD writes, but how many extra cycles does the while(true) itself take on each pass, whatever the instructions inside the loop?
« Last Edit: June 14, 2009, 01:36:21 pm by selfonlypath » Logged

London, England

That's an inexcusable loss of speed. The compiler should be converting the digitalWrite code into the same ASM that you get by doing it in raw C. This should be fixed in the next IDE if possible.
Logged

Oxford (England)

@selfonlypath:
Your code (or at least a variant using PORTB on my M8) compiles to:

sbi   0x18, 3
cbi   0x18, 3
rjmp  .-6

Each of these instructions takes 2 cycles, hence this is a 6-cycle loop.
Note that it's not uniform though. The output will be on for 2 cycles, and off for 4 cycles.
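
The on/off split can be checked with a tiny PC-side model that walks the loop and tallies cycles by pin state. It is only a sketch: `instr_t`, `duty_t` and `loop_duty` are made-up names, and it assumes the pin changes when the instruction that writes it finishes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PIN_KEEP, PIN_SET, PIN_CLEAR } effect_t;

typedef struct {
    uint8_t cycles;    /* how long the instruction takes */
    effect_t effect;   /* what it does to the pin when it completes */
} instr_t;

typedef struct {
    uint32_t high;     /* cycles per pass with the pin high */
    uint32_t low;      /* cycles per pass with the pin low */
} duty_t;

/* Runs the loop twice: the first pass only settles the pin state,
   the second pass is the one that gets counted. */
duty_t loop_duty(const instr_t *loop, size_t n) {
    duty_t d = {0, 0};
    int pin = 0;
    assert(loop != NULL && n > 0);
    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < n; i++) {
            if (pass == 1) {
                if (pin) {
                    d.high += loop[i].cycles;
                } else {
                    d.low += loop[i].cycles;
                }
            }
            if (loop[i].effect == PIN_SET) {
                pin = 1;
            } else if (loop[i].effect == PIN_CLEAR) {
                pin = 0;
            }
        }
    }
    return d;
}
```

Feeding it sbi/cbi/rjmp as `{2, PIN_SET}, {2, PIN_CLEAR}, {2, PIN_KEEP}` reproduces the 2-on/4-off split, and adding a second `{2, PIN_SET}` at the front models the 8-cycle version with the doubled sbi, which comes out at 4 on and 4 off.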
Logged
