
Topic: Arduino nano, but faster

james_III

Hello everyone

I thought this would be a question everyone asks, but it seems I was wrong again. I mainly use the Arduino UNO or NANO with 5V logic. I have realized that my projects often include time-sensitive things like steppers/PWM/servos/pulse encoders etc., and I think I would really benefit from a faster processor (or otherwise I have to learn how interrupts work).

There are so many alternatives, but I don't want to lose what I have learned, so are there any close-enough, plug-and-play but faster alternatives to the basic NANO?


Programming is not really my thing, but I keep wondering where GRBL gets all the time to execute all the steps correctly. It must be some sort of sorcery ;)

PaulRB

#1
Apr 29, 2018, 11:36 am Last Edit: Apr 29, 2018, 11:49 am by PaulRB
There are two ways to solve your problem. One is to get a faster processor. The other is to write more efficient code. This is a very common issue for the whole IT industry. Writing code takes a lot of time and costs a lot of money, even in countries that do not pay their programmers so well. Writing efficient code takes expert programmers, and they get paid more. Hardware is relatively cheap. Hence the decision is easily made, and programmers that are expert enough to write efficient code are a dying breed.

But in the Arduino world, people code as a hobby and don't attach such a high value to the time they spend coding.

Writing efficient code often means making use of features that are specific to a platform, such as AVR microcontrollers. It also means accessing the hardware directly rather than through an abstraction interface. For example, the Arduino function digitalWrite() is much slower than accessing the "PORTx" hardware register directly. But if you access the port directly, your code cannot easily be moved to another platform later, because the underlying hardware is different, whereas digitalWrite() will probably work without alteration.

If you want a faster version of the Nano, you will almost certainly have to make the move from 5V logic to 3.3V. There is lots of choice: Teensy 3.x, Maple Mini and esp8266, for example. For maximum Arduino compatibility, it may be wisest to choose one based on the microcontroller used in the newer Arduino models "Zero" and "M0", which is the SAMD21 chip. Both Adafruit and SparkFun sell Nano-like boards based on this chip. There are also SAMD21 boards on eBay branded "Wemos". But the brand name "Wemos" appears to have been stolen, because these boards do not appear on Wemos' official website. They are not cheaper than some of the Adafruit/SparkFun offerings, so I can't see any reason to recommend them.

james_III

Thanks for the quick reply.
This is only a hobby, so I might not go for direct access; the code seems to become so complicated to read, at least to me. It was good that you pointed out how slow digitalRead() and digitalWrite() are. I just had a light-bulb moment and figured out where I lose so much time in my current project, which was the reason I asked for more speed.

It seems my stepper pulsing subroutine was written by an idiot. Well, it's not that bad, but it should be split into three different subroutines. It reads all the user buttons on every half step, when only one button is actually required.

But this might be a suitable time to start checking out more powerful alternatives anyway, since there are more projects to come :)

avr_fred

In general, digitalRead() and digitalWrite() are notoriously slow. But, they provide a consistent interface to hardware regardless of processor type. This is one of the strengths of the Arduino platform but of course any strength is also a weakness.

You can speed up I/O routines significantly with direct hardware port reads and writes using the predefined DDRn, PORTn and PINn registers. The strength is speed; the weakness is the loss of processor/board portability.

https://www.arduino.cc/en/Reference/PortManipulation

Fortunately, the Uno and Nano use the same ATmega328 processor, so it's not an issue for those two boards and what you're doing now, but it would require a rewrite to use the code on a Mega (ATmega2560) or, even worse, a non-AVR part like a Due or Zero.
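
To make the portability point concrete, here is a rough sketch of the pin-to-port layout on the ATmega328 boards (Uno/Nano), written as plain C++ so it can be checked off-board. The `PinLocation` struct and `unoPinLocation()` function are names made up for this illustration and are not part of the Arduino core; the mapping itself (D0-D7 on PORTD, D8-D13 on PORTB, A0-A5 on PORTC) is the standard Uno/Nano layout.

```cpp
#include <stdint.h>

// Hypothetical helper type: which port a pin lives on and its bit mask.
struct PinLocation {
  char port;     // 'D', 'B' or 'C'; 0 if the pin has no digital port
  uint8_t mask;  // single-bit mask within that port
};

// Hypothetical helper: Uno/Nano (ATmega328) digital pin -> port and bit.
PinLocation unoPinLocation(uint8_t pin) {
  if (pin <= 7)  return { 'D', (uint8_t)(1u << pin) };         // D0..D7  -> PORTD
  if (pin <= 13) return { 'B', (uint8_t)(1u << (pin - 8)) };   // D8..D13 -> PORTB
  if (pin <= 19) return { 'C', (uint8_t)(1u << (pin - 14)) };  // A0..A5  -> PORTC
  return { 0, 0 };                                             // not a digital pin
}
```

So on these boards digitalWrite(13, HIGH) corresponds to PORTB |= 0x20. On a Mega the same Arduino pin number sits on a different bit, which is why hard-coded register names do not carry over.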

PaulRB

#4
Apr 29, 2018, 03:21 pm Last Edit: Apr 29, 2018, 04:21 pm by PaulRB
There is a compromise solution to speeding up code that uses digitalWrite().

Code: [Select]

// portOutputRegister() returns a pointer to the port register (8-bit on AVR)
volatile uint8_t *myPort = portOutputRegister(digitalPinToPort(myPin));
uint8_t myPinBit = digitalPinToBitMask(myPin);

*myPort |= myPinBit;  // equivalent to digitalWrite(myPin, HIGH)
*myPort &= ~myPinBit; // equivalent to digitalWrite(myPin, LOW)


This approach should give some protection when moving to a different platform (chip). But I wonder how much? I might do some experimenting. I would hope it would work on atmega328, atmega2560 as a minimum. Would it also work on ATtiny45/85? On SAMD21? On esp8266?

Even if these functions/macros are available, you would need to be careful about the data types of myPort and myPinBit because although byte might be fine on avr chips, I suspect they will need to be something larger on 32-bit platforms.

EDIT: I found out that for the Arduino core for esp8266, there are the following definitions in Arduino.h:
Code: [Select]

#define digitalPinToPort(pin)       (0)
#define digitalPinToBitMask(pin)    (1UL << (pin))
#define digitalPinToTimer(pin)      (0)
#define portOutputRegister(port)    ((volatile uint32_t*) GPO)
#define portInputRegister(port)     ((volatile uint32_t*) GPI)
#define portModeRegister(port)      ((volatile uint32_t*) GPE)

So that looks hopeful. I guess from the "1UL" that myPinBit would need to be uint32_t and myPort would need to be uint32_t*. Those types would be less efficient on avr, so some compiler directives might be needed to define the appropriate types for the chip, unless they too already exist...
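
The compiler-directive idea could look roughly like this. It is only a sketch: `port_reg_t`, `port_mask_t`, `fastHigh()` and `fastLow()` are hypothetical names, and the assumption that every non-AVR core uses 32-bit port registers is just that, an assumption that would need checking for each core.

```cpp
#include <stdint.h>

// Pick register and mask widths per platform (hypothetical type names).
#if defined(__AVR__)
typedef uint8_t port_reg_t;    // AVR port registers are 8 bits wide
typedef uint8_t port_mask_t;
#else
typedef uint32_t port_reg_t;   // assumed 32-bit elsewhere (e.g. esp8266)
typedef uint32_t port_mask_t;
#endif

// Hypothetical helpers: set or clear the masked bit(s) through the pointer.
static inline void fastHigh(volatile port_reg_t *port, port_mask_t mask) {
  *port |= mask;
}

static inline void fastLow(volatile port_reg_t *port, port_mask_t mask) {
  *port &= ~mask;
}
```

In a sketch, the pointer would come from portOutputRegister(digitalPinToPort(pin)) and the mask from digitalPinToBitMask(pin), both looked up once in setup().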

westfw

Unfortunately, the SAMD processors are not as much faster than an AVR as you might hope.
While the clock rate is higher, the "special instructions" for writing to pins are gone, and it takes the ARM three or four instructions to do what would have been a single instruction on AVR. I did some experiments recently and found digitalWrite() to be about 3x faster than on an AVR (310kHz max toggle speed). The best all-out-effort pin toggle speed was also about 3x that of an AVR (12MHz), but it's a bit less "general" in some sense. You'd have to carefully craft and benchmark specific code cases. (For example, the 12MHz toggle code on ARM uses two registers, while the 4MHz AVR code uses none. And there are only (sort-of) 8 registers on ARM CM0.)

https://forums.adafruit.com/viewtopic.php?f=57&t=133497#p668317

PaulRB

#6
May 05, 2018, 06:43 pm Last Edit: May 05, 2018, 06:53 pm by PaulRB
Got around to doing some testing.

My test code:
Code: [Select]
#define PIN 2

uint8_t myPort;
uint8_t myPinBit;

void setup() {
  pinMode(PIN, OUTPUT);
  Serial.begin(115200);
  myPort = portOutputRegister(digitalPinToPort(PIN));
  myPinBit = digitalPinToBitMask(PIN);
}

void loop() {
  unsigned long startTime = micros();
  for (long i = 0; i < 100000UL; i++) {
    digitalWrite(PIN, HIGH);
    digitalWrite(PIN, LOW);
  }
  float digWriteSpeed = (micros() - startTime)/200000.00;
  Serial.print("digitalWrite() ");
  Serial.print(digWriteSpeed);
 
  startTime = micros();
  for (long i = 0; i < 100000UL; i++) {
    myPort |= myPinBit;
    myPort &= ~myPinBit;
  }
  float portManipSpeed = (micros() - startTime)/200000.00;
  Serial.print("us, Port Manipulation ");
  Serial.print(portManipSpeed);
  Serial.print("us which is ");
  Serial.print(digWriteSpeed/portManipSpeed);
  Serial.println(" times faster");
 
}


Arduino Nano 3 (atmega328 @ 16MHz):
digitalWrite() 3.59us, Port Manipulation 0.28us which is 12.65 times faster

Arduino Pro Micro (atmega32u4 @ 16MHz):
digitalWrite() 3.67us, Port Manipulation 0.28us which is 12.89 times faster




PaulRB

#7
May 05, 2018, 06:57 pm Last Edit: May 05, 2018, 07:23 pm by PaulRB
For esp, I had to make a couple of changes to the sketch. I knew I would need to use 32 bit rather than 8, but upon compiling, I realised I also had to use pointers.

Code: [Select]
#define PIN D2

volatile uint32_t *myPort;
uint32_t myPinBit;
...
    *myPort |= myPinBit;
    *myPort &= ~myPinBit;


Wemos Mini (esp8266 @80MHz):
digitalWrite() 0.46us, Port Manipulation 0.28us which is 1.62 times faster

Wemos Mini (esp8266 @160MHz):
digitalWrite() 0.23us, Port Manipulation 0.21us which is 1.11 times faster

Strangely, at 160MHz, the port manipulation time did not halve as I expected. I increased the loops to 1,000,000, but got the same result.
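
One way to look at these numbers is to convert them into CPU clock cycles per operation (time in microseconds multiplied by the clock in MHz). The helper name below is made up for this illustration, and the figures still include the loop overhead of the test sketch.

```cpp
// Hypothetical helper: measured time per operation (us) at a given clock (MHz)
// converted to CPU clock cycles.
double cyclesPerOp(double microseconds, double clockMHz) {
  return microseconds * clockMHz;
}

// Port manipulation results from the tests above:
//   Nano,    16 MHz:  cyclesPerOp(0.28, 16.0)   -> about 4.5 cycles
//   esp8266, 80 MHz:  cyclesPerOp(0.28, 80.0)   -> about 22 cycles
//   esp8266, 160 MHz: cyclesPerOp(0.21, 160.0)  -> about 34 cycles
```

The cycle count on the esp8266 going up when the clock doubles would be consistent with the GPIO write being limited by something other than the CPU clock, though that is a guess rather than something measured here.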


PaulRB

#8
May 05, 2018, 07:21 pm Last Edit: May 05, 2018, 07:25 pm by PaulRB
I then went back to the Nano to see if the pointer version worked OK:
Code: [Select]
#define PIN 2

uint8_t *myPort;
uint8_t myPinBit;

...

    *myPort |= myPinBit;
    *myPort &= ~myPinBit;


It did work, and the result was:
digitalWrite() 3.59us, Port Manipulation 0.41us which is 8.76 times faster

So using the pointer slowed down the direct port manipulation, but it works with and without using a pointer.

Using the "volatile" keyword slowed it down a little further:
digitalWrite() 3.59us, Port Manipulation 0.54us which is 6.70 times faster
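
Worth keeping in mind when comparing these timings: on AVR, portOutputRegister() returns a `volatile uint8_t *`. With a plain `uint8_t myPort`, the `|=` and `&=` lines only change that variable in RAM and never touch the port register, so the pointer versions are the ones that actually drive the pin. A minimal sketch of the pointer form, with `pulsePin()` as a made-up helper name:

```cpp
#include <stdint.h>

// Hypothetical helper: emit one HIGH-then-LOW pulse on the masked bit.
// 'volatile' tells the compiler that both writes must really happen, in order;
// without it the optimizer would be allowed to merge them away.
static inline void pulsePin(volatile uint8_t *port, uint8_t mask) {
  *port |= mask;    // bit goes HIGH
  *port &= ~mask;   // bit goes LOW again, other bits untouched
}
```

On the Nano this would be called as pulsePin(portOutputRegister(digitalPinToPort(PIN)), digitalPinToBitMask(PIN)), ideally with both values looked up once in setup() rather than on every pulse.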


james_III

Thanks, this seems to be the way to go for now. And now comes the big BUT: would it be possible to write an example that uses more than just one port? There are probably many hobbyists like me: monkey see, monkey do. With this example I can directly use one port, but how do I define things if I use more than one? As always, I googled this, and Google chose not to give me an answer; it just keeps saying "these are not the direct access codes you are looking for".

PaulRB

#10
May 10, 2018, 07:00 am Last Edit: May 10, 2018, 07:03 am by PaulRB
Well, you say "this seems to be the way to go", but my experiments showed me that using these functions doesn't give as smooth an upgrade path to faster processors as I had hoped. And as westfw pointed out, they won't necessarily give your code a big boost when you move to a faster processor. And indeed I found that a single port manipulation took 0.28us on a 16MHz Nano and 0.28us on an 80MHz esp.

So to answer your question, I'll assume you will be sticking with avr processors for now, because I don't have a good answer otherwise.

Even on different processors from the avr family, Arduino Pin X is not necessarily the same bit on the same port. Worse still, Arduino Pins X & Y might be on the same port on one avr processor but on different ports on another avr processor.

To deal with those situations, your code needs to use the functions portOutputRegister(), digitalPinToPort() and digitalPinToBitMask() for each Arduino pin it uses, and store the results in separate variables. For example:
Code: [Select]

#define CLK 2
#define DATA 3
#define LATCH 4

// portOutputRegister() returns a pointer to the port register on AVR
volatile uint8_t *clkPort, *dataPort, *latchPort;
uint8_t clkBit, dataBit, latchBit;

void setup() {
  clkPort = portOutputRegister(digitalPinToPort(CLK));
  clkBit = digitalPinToBitMask(CLK);
  dataPort = portOutputRegister(digitalPinToPort(DATA));
  dataBit = digitalPinToBitMask(DATA);
  latchPort = portOutputRegister(digitalPinToPort(LATCH));
  latchBit = digitalPinToBitMask(LATCH);
}

and so on.
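
To keep the port/bit pairs together as the number of pins grows, they could be bundled into a small struct. This is only a sketch for AVR-style 8-bit ports: `FastPin` and `shiftOutFast()` are hypothetical names, the clock/latch behaviour assumes a shift register that samples data on the rising clock edge, and on real hardware each `FastPin` would be filled in from portOutputRegister(digitalPinToPort(pin)) and digitalPinToBitMask(pin) in setup().

```cpp
#include <stdint.h>

// Hypothetical bundle of the two values needed per pin.
struct FastPin {
  volatile uint8_t *port;  // output register the pin lives on
  uint8_t mask;            // bit mask of the pin within that register

  void high() const { *port |= mask; }
  void low() const  { *port &= ~mask; }
};

// Hypothetical example: clock one byte out MSB-first, then pulse the latch.
void shiftOutFast(const FastPin &clk, const FastPin &data,
                  const FastPin &latch, uint8_t value) {
  for (uint8_t bit = 0x80; bit != 0; bit >>= 1) {
    if (value & bit) data.high(); else data.low();
    clk.high();   // rising edge: receiver is assumed to sample the data line here
    clk.low();
  }
  latch.high();   // transfer the shifted bits to the outputs
  latch.low();
}
```

Because each pin carries its own port pointer, the same code keeps working whether the three pins share one port or end up on different ports on another avr board.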

james_III

Thanks, this clears up how to do it for multiple pins. I had missed the Nano vs. ESP result entirely, weird. But in the Nano case, faster port manipulation frees up time to do other things. I wish I had known this when I was experimenting with radio communication on those cheap 355MHz radios together with stepper motor control. Well, it was a disaster and a big mess ^2 :)

polymorph

Very interesting comparisons.
