Delay minimum 100ns

Hi everybody,

Well, I have a problem with my Arduino Due: I would like a delay with a minimum of 100 ns. My code is very simple: an interrupt on pin 2, after which I pulse pin 3.

My code works, but my minimum delay is 880 ns (due to the interrupt latency, maybe?) and I can only increase it in 1 us steps.
Compared with the PIC16LF1455 I used before the Arduino, I get the same minimum delay, but there I can increase the delay in 100 ns steps, and that uC runs at 48 MHz with the PLL, compared to 84 MHz for the Arduino Due...

Why do I need such fast timing? Because I'm working with high voltage, between 1.5 MV and 5 MV. At my work we simulate lightning, and the duration of an impulse is 3 us on average and up to 2500 us (it depends on the type of test). Sometimes we have to send a signal before reaching the maximum voltage if it is a self-firing, so the time is on average 0.7ns...

I'm sorry if my explanation is not good, my English is not great. :confused:

So, does anyone have an idea? Maybe I'm missing a setting?
I will try with another PIC from Microchip.

Thanks in advance for your help. :smiley:

inline void digitalWriteDirect(int pin, boolean val){
  if(val) g_APinDescription[pin].pPort -> PIO_SODR = g_APinDescription[pin].ulPin;
  else    g_APinDescription[pin].pPort -> PIO_CODR = g_APinDescription[pin].ulPin;
}

void setup() {
  pinMode(3,OUTPUT);
  pinMode(2,INPUT);
  attachInterrupt(2, pin_ISR, RISING);
}

void loop() {
 
}

void pin_ISR() {
  delayMicroseconds(1);
  digitalWriteDirect(3, HIGH);
  digitalWriteDirect(3, LOW);
}

The Due can execute most instructions in a single clock cycle, so the smallest delay you can add is 1/84 MHz = 11.9 ns. So, that's your resolution. You can delay by that little with something like

asm volatile("nop");

Now, you want to delay longer than that but potentially shorter than 1us. The delayMicroseconds function is defined like this:

static inline void delayMicroseconds(uint32_t) __attribute__((always_inline, unused));
static inline void delayMicroseconds(uint32_t usec){
    /*
     * Based on Paul Stoffregen's implementation
     * for Teensy 3.0 (http://www.pjrc.com/)
     */
    if (usec == 0) return;
    uint32_t n = usec * (VARIANT_MCK / 3000000);
    asm volatile(
        "L_%=_delayMicroseconds:"       "\n\t"
        "subs   %0, #1"                 "\n\t"
        "bne    L_%=_delayMicroseconds" "\n"
        : "+r" (n) :
    );
}

You could create a delayNanoseconds function that does the same basic thing. As you can see above, the number of loops is calculated as microseconds * (84 MHz / 3 MHz), which is microseconds * 28. So it takes 28 loops to create a 1 us delay, which means each loop takes 35.7 nanoseconds (3 clock cycles). Thus, you can use the same basic loop to get that resolution.

static inline void delayNanoseconds(uint32_t) __attribute__((always_inline, unused));
static inline void delayNanoseconds(uint32_t nsec){
    /*
     * Based on Paul Stoffregen's implementation
     * for Teensy 3.0 (http://www.pjrc.com/)
     */
    if (nsec == 0) return;
    uint32_t n = (nsec * 1000) / 35714;
    if (n == 0) return;  // under one loop pass (about 36 ns): subs on 0 would wrap around
    asm volatile(
        "L_%=_delayNanos:"       "\n\t"
        "subs   %0, #1"                 "\n\t"
        "bne    L_%=_delayNanos" "\n"
        : "+r" (n) :
    );
}

Keep in mind that calculating the proper number of loops takes a few instructions, so it adds to the delay time. Multiplications and divisions are very fast on this processor, so I think the added overhead will be on the order of about 100 ns, but keep that in mind. You might need to ask for a touch less delay than you really need in order to compensate. Also, I removed the use of VARIANT_MCK, so the code now assumes you're using an 84 MHz Due. The more math operations you add, the more overhead it'll have. We're talking about nanoseconds here, so I didn't want to add any more crap than necessary. Finally, multiplying by 1000 limits the upper end of the delay: 2^32 / 1000 = about 4.29 million nanoseconds (roughly 4.3 milliseconds). If you need a longer delay you'll have to use delayMicroseconds.
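To make the resolution and the upper limit concrete, here is a small sketch of the arithmetic behind delayNanoseconds. It assumes the 84 MHz clock and 3 cycles per loop pass discussed above; the names loopsForNanoseconds, quantizedNanoseconds and kMaxNanoseconds are made up for illustration and are not part of the Arduino core.

```cpp
#include <stdint.h>

// Assumption: 84 MHz master clock and 3 cycles per subs/bne pass,
// so one pass lasts 3 / 84 MHz = 35.714 ns (35714 ps).
constexpr uint32_t kPicosPerLoop = 35714;

// Largest request for which nsec * 1000 still fits in 32 bits.
constexpr uint32_t kMaxNanoseconds = UINT32_MAX / 1000;

// Number of loop passes the delayNanoseconds formula produces.
constexpr uint32_t loopsForNanoseconds(uint32_t nsec) {
    return (nsec * 1000) / kPicosPerLoop;
}

// Time spent in the loop itself, rounded down to whole nanoseconds
// (call and conversion overhead are not included).
constexpr uint32_t quantizedNanoseconds(uint32_t nsec) {
    return (loopsForNanoseconds(nsec) * kPicosPerLoop) / 1000;
}
```

With these numbers a request of 200 ns turns into 5 passes, so roughly 178 ns of loop time before overhead, and any request below 36 ns turns into 0 passes. A count of 0 should be skipped rather than handed to the countdown loop, because subtracting 1 from 0 wraps around to a very long delay.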

Hi AdderD,

Big thanks for your explanation. Now I understand why I don't get the same behaviour as with the PIC from Microchip. I will combine your nanosecond function with the microsecond one, since I sometimes need to go up to 2000 microseconds. It will be a great help for me! :slight_smile:

But what about the interrupt? It takes a little less than 1 microsecond to start. Is there a way to make it faster?
You mention "VARIANT_MCK", and I saw where it is used in the microsecond function, but what is it? I'm sorry, it's my first time with an Arduino; I'm used to PICs.

Thanks again AdderD :slight_smile:

The latency is mostly due to the interrupt process with attachInterrupt :slight_smile:

If you look at Table 45-46, I/O Characteristics, page 1413, you will see that there is a group of I/O pins which supports a higher output frequency: PA3 (pin A6) and PA15 (pin 24). See the pinout diagram.

I suggest that you choose one of these instead of pin 3 for outputting a digital signal.

Anyway, you can try the 2 sketches below. The first one uses attachInterrupt, which may be a bit too slow because this function tests all pins. Therefore attachInterrupt is faster with an input pin that has a high I/O number (e.g. PC28 will be faster than PC2).

The second sketch is rather radical, since loop() only reads the input pin, and I added an inner loop to avoid serialEventRun, which runs in the background of loop() and takes some time.

I am absolutely certain that the second sketch will be fast enough, even too fast, and you will have to add a delay to avoid toggling the output pin several times.

Maybe a pull-down resistor on the input pin would help to reject noise.

You could enable the glitch filter on the input pin if necessary (see PIO_IFER).
The glitch filter can reject a glitch with a duration of less than 1/2 Master Clock period (MCK = 84 MHz, 1 clock cycle = 11.9 ns, so glitches of less than 6 ns), and the debouncing filter can reject a pulse of less than 1/2 period of a programmable divided slow clock.

First sketch:

void setup() {
 
pinMode(3, OUTPUT);
pinMode(2, INPUT);
attachInterrupt(2, pin_ISR, RISING);

}

void loop() {
 
}

void pin_ISR() {
  
  PIOC->PIO_ODSR ^= PIO_ODSR_P28;  // low to high
  PIOC->PIO_ODSR ^= PIO_ODSR_P28;  // high to low
     
}

Second sketch:

void setup() {

  pinMode(3, OUTPUT);  // pin3 = PC28
  pinMode(2, INPUT);   // pin2 = PB25
}


void loop() {
  
  PIOC->PIO_CODR = PIO_CODR_P28;  // Be sure to start with a low level on the output pin (pin 3 = PC28)
  while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0);

    // Toggle pin 3
    PIOC->PIO_ODSR ^= PIO_ODSR_P28;  // low to high

    PIOC->PIO_ODSR ^= PIO_ODSR_P28;  // high to low
    // Add a delay if necessary to wait for the end of the thunder 
    // and avoid toggling once more ??
  }
}

Hi ard_newbie,

Oh I see!! :slight_smile:

I'll try with another pin and with your code. I prefer an interrupt, though maybe that is only in my head, haha! But if for each interrupt the uC checks each port, I understand now why it takes so much time to start...
I have the same problem with the PIC, so now I see! Thanks a lot :slight_smile:

Thanks again for your help, for me it's done, I have my answers now :slight_smile:

If you are stuck with an interrupt, note that attachInterrupt has received a major improvement (from antodom), and most of that improvement is now included in the latest Due board software versions, although not all of it!

As you will see in this thread, reply #106:
https://forum.arduino.cc/index.php?topic=144446.105
antodom used:

register uint32_t isr; and register uint8_t leading_zeros; in the PIOx_Handler() routine. But the version finally included did not keep the "register" keyword, which hints to the compiler to keep the variables in a CPU register, for faster processing.

I think you can just add your own library under a different name, Winterrupts2.h, with an attachInterrupt2() that uses the fastest version….

if you need something that fast, you may need to code directly to the registers to get exactly what you want and avoid extra overhead.

and never use delay or print in an ISR

Hi ard_newbie,

Thanks for your help :slight_smile:, I will try the fastest version. I didn't know about the major improvement!

Thanks joeblogs too, I will look into coding directly to the registers :slight_smile:. You said never use delay or print in an ISR, but why? Maybe it is a stupid question :grinning:

There are a variety of reasons not to use delay functions within an ISR. Here's a partial list:

  1. The way the Arduino core is coded nothing can interrupt an interrupt handler. This locks out all other interrupts while you're delaying in the handler. (note, the processor could support nested interrupts but the Arduino core isn't coded to support that)

  2. The delay() function (the one that delays milliseconds) requires an interrupt to update the milliseconds counter. If you're in an interrupt handler then issue #1 above means that interrupt will never fire and your delay becomes infinite. This isn't a problem for delayMicroseconds or delayNanoseconds. But, don't use those either unless you have a very good reason to. It'd probably be OK to delay for a finite number of nanoseconds but much more than a couple of microseconds is asking for trouble.

As for printing - it takes a long time potentially and also tends to involve interrupts. Since no interrupts will fire this can cause a lock up. The bottom line is this: don't do anything more in an interrupt than you absolutely must do. The best idea is to set a flag and poll it elsewhere. Sometimes you do more than this. Communications interrupts tend to send or receive right in the interrupt but they still try to grab data and go as quickly as possible.
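The "set a flag and poll it elsewhere" advice can be sketched as follows. This is a minimal illustration, not code from the thread: the names triggerPending, onTrigger and takeTrigger are hypothetical, and the handler is assumed to be registered with attachInterrupt in the same way as pin_ISR in the first post.

```cpp
// Shared between the interrupt handler and the main loop, so it is
// declared volatile to make the compiler reload it on every access.
volatile bool triggerPending = false;

// Interrupt handler: record the event and return immediately.
// No delay, no printing, no other work.
void onTrigger() {
    triggerPending = true;
}

// Called from loop(): returns true once for each recorded event.
bool takeTrigger() {
    if (!triggerPending) return false;
    triggerPending = false;
    return true;
}
```

In a sketch, loop() would call takeTrigger() and do the slow work (delays, Serial output) only when it returns true. Note that polling a flag adds its own latency, so for sub-microsecond reactions like the one in this thread it is a pattern for housekeeping work rather than for the timing-critical pulse itself.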

Ohh, I see, I didn't know. So an interrupt is not the right approach for my application. That's too bad, but I will put my code in the loop only. I get a minimum delay of 280 ns with the code in the loop and writing directly to the registers; that's fast enough for me =).

Thanks AdderD for your explanation =D, I learned many things, haha. Thanks for everything!

hey GCombault,
interrupts are the only way to get the results you want, BUT if you're talking ns and consecutive interrupts, it doesn't leave much time for anything else to happen, which may make it seem like your system has locked up.

what AdderD said about printing is right: use it for debugging, but don't use it in an interrupt, and don't use delay. If you want a delay, start another interrupt within the first one, but the speeds you are talking about may be a bit too quick. ard_newbie has had some good replies about "interrupt priority"; search for those.

you may want to see if you can do what you want in a slightly different way.

have a look at the TC (Timer Counter) section and the PIO (Parallel Input/Output Controller) section
http://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-11057-32-bit-Cortex-M3-Microcontroller-SAM3X-SAM3A_Datasheet.pdf

Hi joeblogs,

I don't need consecutive interrupts, but the interrupt is too slow and I need a delay in the interrupt; if I use a timer it will be much too slow... So I'm using a while loop in my main loop, and it works very well. Big thanks for the explanation :slight_smile: and of course for your help!

Now I have an issue with my delay. Everything is working and the delay in microseconds or in nanoseconds is reliable, but I have to put my values in directly as constants:

    while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayMicroseconds(1);
    delayNanoseconds(200);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    //while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }

But I would like to use two int variables, like this:

    delayMicroseconds(delay_value_us_int);
    delayNanoseconds(delay_value_ns);

But when I use these variables, I sometimes get an extra latency of 500 nanoseconds. Is there a way to read the value directly from a register?

Here is my code :smiley: .

int delay_value_us_int=1;
int delay_value_ns=200;

inline void digitalWriteDirect(int pin, boolean val){
  if(val) g_APinDescription[pin].pPort -> PIO_SODR = g_APinDescription[pin].ulPin;
  else    g_APinDescription[pin].pPort -> PIO_CODR = g_APinDescription[pin].ulPin;
}
static inline void delayNanoseconds(uint32_t) __attribute__((always_inline, unused));
static inline void delayNanoseconds(uint32_t nsec){
    /*
     * Based on Paul Stoffregen's implementation
     * for Teensy 3.0 (http://www.pjrc.com/)
     */
    if (nsec == 0) return;
    uint32_t n = (nsec * 1000) / 35714;
    if (n == 0) return;  // under one loop pass: subs on 0 would wrap around
    asm volatile(
        "L_%=_delayNanos:"       "\n\t"
        "subs   %0, #1"                 "\n\t"
        "bne    L_%=_delayNanos" "\n"
        : "+r" (n) :
    );
}
void setup() {
  pinMode(11,OUTPUT);
  pinMode(2,INPUT);
 
}

void loop() {
   //PIOB->PIO_CODR = PIO_CODR_P25;  // Be sure to start with a low level for output pin
  while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayMicroseconds(delay_value_us_int);
    delayNanoseconds(delay_value_ns);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    //while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }
}

Thanks in advance :smiley:

I did some tests, and this is what I saw. With this code everything is OK:

    while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayMicroseconds(1);
    delayNanoseconds(200);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    //while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }

And it is exactly the same if my code is like this:

    while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayMicroseconds(1);
    delayNanoseconds(delay_value_ns);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    //while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }

Or

    while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayMicroseconds(delay_value_us_int);
    delayNanoseconds(200);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    //while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }

I have no latency.

But if I use both variables, I get a latency of around 500 ns... So I tried this, with my delay in nanoseconds before my delay in microseconds:

    while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayNanoseconds(delay_value_ns);
    delayMicroseconds(delay_value_us_int);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    //while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }

And then my latency drops to around 160 ns.

Does someone have an explanation? :smiley:

Big thanks!

The latency may be due to a rollover somewhere that is not correctly taken into account by delayMicroseconds().
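Another possible contributor, offered here as an assumption and not something measured in this thread, is that with variable arguments the compiler can no longer fold the microsecond and nanosecond conversions at compile time, so the multiply and divide run after the trigger edge. A minimal sketch of one way around that is to convert both values into a single loop count while idle and only spin after the edge. The names loopsForDelay and spinLoops are hypothetical, and the constants assume an 84 MHz Due.

```cpp
#include <stdint.h>

// Assumption: 84 MHz Due, 3 cycles per loop pass (28 passes per microsecond).
constexpr uint32_t kLoopsPerMicrosecond = 28;
constexpr uint32_t kPicosPerLoop = 35714;

// Convert a (microseconds, nanoseconds) pair into one loop count.
// Intended to be called while idle, before waiting for the trigger.
constexpr uint32_t loopsForDelay(uint32_t usec, uint32_t nsec) {
    return usec * kLoopsPerMicrosecond + (nsec * 1000) / kPicosPerLoop;
}

// Busy-wait for the given number of passes; a count of 0 returns at once.
static inline void spinLoops(uint32_t n) {
    if (n == 0) return;
#if defined(__thumb__)
    // Same countdown loop as delayMicroseconds, with its own label.
    asm volatile(
        "L_%=_spin:"        "\n\t"
        "subs   %0, #1"     "\n\t"
        "bne    L_%=_spin"  "\n"
        : "+r" (n) : : "cc"
    );
#else
    // Host fallback so the sketch compiles off-target; timing is meaningless here.
    for (volatile uint32_t i = n; i != 0; i = i - 1) {}
#endif
}
```

In the polling loop this would mean computing `uint32_t n = loopsForDelay(delay_value_us_int, delay_value_ns);` before the `while` that waits for pin 2, and calling `spinLoops(n);` right after it. Whether this removes the extra latency would still need to be confirmed on the oscilloscope.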

To get a precise blocking delay, the easiest way is using nop(s). A single nop equals a tick which equals 11.9 ns. Note that if you add a bunch of them, the core will add some wait states.

The assembler macro below should give you a 1200 ns blocking delay:

__asm__ __volatile__(
  ".macro NOPX      P1          \n\t"
  ".rept &P1                    \n\t"
  "NOP                          \n\t"
  ".endr                        \n\t"   // End of Repeat
  ".endm                        \n\t"   // End of macro
);
void setup() {

  __asm__ __volatile__("NOPX 82");  
 
  // 101 ticks = 1202 ns 
}

void loop() {
}
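For choosing the NOP count, the tick arithmetic can be written down as a small helper. This is only a sketch of the 84 MHz conversion; ticksToNanoseconds and nanosecondsToTicks are illustrative names, not Arduino functions.

```cpp
#include <stdint.h>

// Assumption: 84 MHz master clock, so one tick is 1000/84 ns = 11.9 ns.
constexpr uint32_t kClockMHz = 84;

// Duration of a number of clock ticks, rounded down to whole nanoseconds.
constexpr uint32_t ticksToNanoseconds(uint32_t ticks) {
    return (ticks * 1000) / kClockMHz;
}

// Ticks needed to cover at least the requested time (rounded up).
constexpr uint32_t nanosecondsToTicks(uint32_t nsec) {
    return (nsec * kClockMHz + 999) / 1000;
}
```

By this arithmetic 101 ticks is 1202 ns, which matches the comment in the sketch, while 82 NOPs at one tick each would nominally be 976 ns. The gap is presumably the wait states mentioned above, so the real duration of a given NOP count is best checked on an oscilloscope.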

Hi ard_newbie,

Thanks for your reply, and you're right, it works. I had totally forgotten that AdderD gave me a function in nanoseconds that goes up to 4.3 ms, so it is exactly what I want. It works too, and my precision is around +/- 35 ns on the oscilloscope.

Thanks for everything, but... yes, I know, I'm having a lot of difficulties, and I'm out of ideas: I have a problem with the variable I pass to my nanoseconds function. I put my code here and I will explain what I want =D.

int val1=0 ;
int val2=0 ;
int val4=0 ;
int val8=0 ;
uint32_t value_buffer=0;
uint32_t delay_value_100ns;

inline void digitalWriteDirect(int pin, boolean val){
  if(val) g_APinDescription[pin].pPort -> PIO_SODR = g_APinDescription[pin].ulPin;
  else    g_APinDescription[pin].pPort -> PIO_CODR = g_APinDescription[pin].ulPin;
}
static inline void delayNanoseconds(uint32_t) __attribute__((always_inline, unused));
static inline void delayNanoseconds(uint32_t nsec){
    /*
     * Based on Paul Stoffregen's implementation
     * for Teensy 3.0 (http://www.pjrc.com/)
     */
    if (nsec == 0) return;
    uint32_t n = (nsec * 1000) / 35714;
    if (n == 0) return;  // under one loop pass: subs on 0 would wrap around
    asm volatile(
        "L_%=_delayNanos:"       "\n\t"
        "subs   %0, #1"                 "\n\t"
        "bne    L_%=_delayNanos" "\n"
        : "+r" (n) :
    );
}

void setup() {
  pinMode(11,OUTPUT);
  pinMode(10,OUTPUT);
  pinMode(9,OUTPUT);
  pinMode(2,INPUT);
  pinMode(5,INPUT);
 
  //0.1
  pinMode(28,INPUT);
  pinMode(26,INPUT);
  pinMode(24,INPUT);
  pinMode(22,INPUT);
  attachInterrupt(22, pin_ISR, CHANGE);
  
  
}

void loop() {
   //PIOB->PIO_CODR = PIO_CODR_P25;  // Be sure to start with a low level for output pin
  while (true) {
    // wait until pin 2 is high
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 3
    delayNanoseconds(delay_value_100ns);

    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }
}

int read_bcd(int value1, int value2, int value3, int value4){
  //Read thumbwheel switches BCD, if val1 = 1 and val2,val3 and val4 = 0 so the value is 1
  if((val1==0)&&(val2==0)&&(val4==0)&&(val8==0)){
    value_buffer=0;
    Test=0;
  }
  else if((val1==1)&&(val2==0)&&(val4==0)&&(val8==0)){
    value_buffer=1;
    Test=1;
  }
  else if((val1==0)&&(val2==1)&&(val4==0)&&(val8==0)){
    value_buffer=2;
    Test=2;
  }
  else if((val1==1)&&(val2==1)&&(val4==0)&&(val8==0)){
    value_buffer=3;
  }
  else if((val1==0)&&(val2==0)&&(val4==1)&&(val8==0)){
    value_buffer=4;
  }
  else if((val1==1)&&(val2==0)&&(val4==1)&&(val8==0)){
    value_buffer=5;
  }
  else if((val1==0)&&(val2==1)&&(val4==1)&&(val8==0)){
    value_buffer=6;
  }
  else if((val1==1)&&(val2==1)&&(val4==1)&&(val8==0)){
    value_buffer=7;
  }
  else if((val1==0)&&(val2==0)&&(val4==0)&&(val8==1)){
    value_buffer=8;
  }
  else if((val1==1)&&(val2==0)&&(val4==0)&&(val8==1)){
    value_buffer=9;
  }
  return value_buffer;
}

void pin_ISR() {
  
val1 = digitalRead(23);
val2 = digitalRead(25);
val4 = digitalRead(27);
val8 = digitalRead(29);

delay_value_100ns = read_bcd(val1,val2,val4,val8)*1000;

if(delay_value_us_Prec == 2000){
  digitalWrite(10,HIGH);
  digitalWrite(9,LOW);
}
else if(delay_value_us_Prec == 1000){
  
  digitalWrite(10,LOW);
  digitalWrite(9,HIGH);
}
else{
    digitalWrite(10,LOW);
  digitalWrite(9,LOW);
}

}

So, my delay works correctly. Now I want to select my delay with a thumbwheel switch, so I'm reading pins 23, 25, 27 and 29. I added a variable "Test" which helps me to debug, and my read_bcd function works fine.

But my value in "delay_value_100ns" doesn't refresh in my main loop, or so I think, because nothing changes... unless I restart the Arduino...

Thanks if anyone could help me. =D
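As an aside on the thumbwheel decoding above, the ten-branch if/else chain can be collapsed into a weighted sum of the four lines. The sketch below is one way to do it; rawNibble and decodeBcdDigit are hypothetical names, and it assumes a HIGH line means a 1 bit, as in read_bcd.

```cpp
#include <stdint.h>

// Combine the four switch lines (weights 1, 2, 4, 8) into a 4-bit value.
constexpr uint32_t rawNibble(bool b1, bool b2, bool b4, bool b8) {
    return (b1 ? 1u : 0u) | (b2 ? 2u : 0u) | (b4 ? 4u : 0u) | (b8 ? 8u : 0u);
}

// Decode one BCD digit. Codes 10..15 are not valid BCD; clamping them
// to 9 is an arbitrary choice made for this sketch.
constexpr uint32_t decodeBcdDigit(bool b1, bool b2, bool b4, bool b8) {
    return rawNibble(b1, b2, b4, b8) > 9u ? 9u : rawNibble(b1, b2, b4, b8);
}
```

It would be called with the pin readings in weight order, for example `decodeBcdDigit(digitalRead(23), digitalRead(25), digitalRead(27), digitalRead(29))`. Unlike read_bcd, it does not depend on a global buffer, so an invalid code cannot silently return the previous digit.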

Your global variables valx, which are updated in an ISR, should be declared as volatile boolean valx; otherwise they are not properly updated.

OK, so I deleted all the valx variables, so I should have no problem, but no =/. I found a workaround, and I think it is not the best.

void loop() {

   //PIOB->PIO_CODR = PIO_CODR_P25;  // Be sure to start with a low level for output pin
  while (true) {
    // wait until pin 2 is high
    analogWrite(9,delay_value_us);
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) == 0); //wait input become high
    // Toggle pin 11
    delayNanoseconds(delay_value_us);
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // low to high
    PIOD->PIO_ODSR ^= PIO_ODSR_P7;  // high to low
    while ((PIOB->PIO_PDSR & PIO_PDSR_P25) > 0); //wait input become low
  }
}


int read_bcd(boolean value1, boolean value2, boolean value3, boolean value4){
  //Read BCD wheel, if val1 = 1 and val2,val3 and val4 = 0 so the value is 1
  if((value1==LOW)&&(value2==LOW)&&(value3==LOW)&&(value4==LOW)){
    value_buffer=0;
  }
  else if((value1==HIGH)&&(value2==LOW)&&(value3==LOW)&&(value4==LOW)){
    value_buffer=1;
  }
  else if((value1==LOW)&&(value2==HIGH)&&(value3==LOW)&&(value4==LOW)){
    value_buffer=2;
  }
  else if((value1==HIGH)&&(value2==HIGH)&&(value3==LOW)&&(value4==LOW)){
    value_buffer=3;
  }
  else if((value1==LOW)&&(value2==LOW)&&(value3==HIGH)&&(value4==LOW)){
    value_buffer=4;
  }
  else if((value1==HIGH)&&(value2==LOW)&&(value3==HIGH)&&(value4==LOW)){
    value_buffer=5;
  }
  else if((value1==LOW)&&(value2==HIGH)&&(value3==HIGH)&&(value4==LOW)){
    value_buffer=6;
  }
  else if((value1==HIGH)&&(value2==HIGH)&&(value3==HIGH)&&(value4==LOW)){
    value_buffer=7;
  }
  else if((value1==LOW)&&(value2==LOW)&&(value3==LOW)&&(value4==HIGH)){
    value_buffer=8;
  }
  else if((value1==HIGH)&&(value2==LOW)&&(value3==LOW)&&(value4==HIGH)){
    value_buffer=9;
  }
  return value_buffer;
}

void pin_ISR() {
delay_value_us_Prec = read_bcd(digitalRead(47),digitalRead(49),digitalRead(51), digitalRead(53))*1000000;
delay_value_us_Prec = delay_value_us_Prec+read_bcd(digitalRead(39),digitalRead(41),digitalRead(43),digitalRead(45))*100000;
delay_value_us_Prec = delay_value_us_Prec+read_bcd(digitalRead(31),digitalRead(33),digitalRead(35),digitalRead(37))*10000;
delay_value_us_Prec = delay_value_us_Prec+read_bcd(digitalRead(23),digitalRead(25),digitalRead(27),digitalRead(29))*1000;
delay_value_ns = read_bcd(digitalRead(22),digitalRead(24),digitalRead(26),digitalRead(28))*100;

delay_value_us= delay_value_us_Prec+delay_value_ns;
}

If I use analogWrite, my value in delay_value_us gets updated in my loop; if I use a simple assignment it does not work.
Of course with analogWrite it works, but with a latency of around 400 ns, compared to 250 ns without analogWrite.

Thanks for your help ard_newbie. If nobody has an idea, I will use analogWrite.

Thanks in advance!
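One thing worth checking in the sketch above is the range of the combined value. The five digits are summed into a single nanosecond count and passed to delayNanoseconds, whose nsec * 1000 step overflows above roughly 4.29 ms. The helper below illustrates the arithmetic; totalNanoseconds, fitsNanosecondDelay and kMaxNanosecondArg are hypothetical names, and the digit weights are the ones used in pin_ISR.

```cpp
#include <stdint.h>

// Largest value delayNanoseconds can take before nsec * 1000 overflows.
constexpr uint32_t kMaxNanosecondArg = UINT32_MAX / 1000;

// Total delay in nanoseconds from five thumbwheel digits, using the
// weights from pin_ISR: 1 ms, 100 us, 10 us, 1 us and 100 ns.
constexpr uint32_t totalNanoseconds(uint32_t ms, uint32_t us100, uint32_t us10,
                                    uint32_t us1, uint32_t ns100) {
    return ms * 1000000u + us100 * 100000u + us10 * 10000u
         + us1 * 1000u + ns100 * 100u;
}

// True when the total can be handed to delayNanoseconds in one call.
constexpr bool fitsNanosecondDelay(uint32_t nsec) {
    return nsec <= kMaxNanosecondArg;
}
```

A setting of 2000.5 us fits comfortably, but anything above about 4.29 ms does not. For such settings the whole-microsecond part would have to go through delayMicroseconds, as suggested earlier in the thread for longer delays.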

I wrote: "declare the global variables used inside an ISR as volatile".

Before your setup() write:

volatile boolean val1, val2, val3, val4;
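The same applies to the delay value itself, since it is written in the ISR and read in the polling loop. A likely explanation for the analogWrite effect described above is that, without volatile, the compiler is free to read the variable once before the `while (true)` loop, and an opaque function call such as analogWrite happens to force a reload. The sketch below shows the pattern with hypothetical names (delayValueNs, publishDelay, currentDelay).

```cpp
#include <stdint.h>

// Written by the ISR and read by the main loop: volatile makes the
// compiler reload it on every read instead of caching it in a register.
volatile uint32_t delayValueNs = 0;

// Body of the interrupt handler: publish the freshly decoded value.
void publishDelay(uint32_t nsec) {
    delayValueNs = nsec;
}

// Called once per pass of the polling loop, before waiting for the trigger.
uint32_t currentDelay() {
    return delayValueNs;
}
```

Reading the value into a local variable once per pass, before the wait on pin 2, keeps the extra memory access out of the time-critical path after the trigger edge.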