Arduino Due 14MS/s 8CH logic analyzer sketch

In some threads I posted that the Arduino Due is able to read a 32-channel port (the Due has 4 of them) within 3 clock ticks (that is 1000/84*3=35.7ns(!) between measurements). This is true if the results fit into free CPU registers (which is all that is needed for measuring the speed of light). Unfortunately my statement that 3 clock cycles are needed per read when filling a huge array was incorrect: the compiler optimized away the actual stores into the array because the array was never used.

But a single 32-bit read can be stored into an array within 6(!) clock cycles. The statement

*a++ = p->PIO_PDSR

which reads 32 bits from the port pointed to by p and stores the result into an array (only the low byte here, because the array is uint8_t, hence strb), gets compiled to these two assembler instructions

ldr r1, [r4, #60]
strb r1, [r3, #_]

where “_” is a different constant index into the array.
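
The timing figures above follow directly from the Due's 84MHz clock. Here is a minimal host-side C++ sketch of that arithmetic (not part of the Arduino sketch; the names kClockHz, nsPerRead and samplesPerSecond are made up for illustration):

```cpp
#include <cassert>

// Arduino Due core clock in Hz (84 MHz, as used throughout this post).
const double kClockHz = 84e6;

// Time between two port reads in nanoseconds, given the cycles spent per read.
double nsPerRead(unsigned cycles) {
    assert(cycles > 0);
    return cycles * 1e9 / kClockHz;
}

// Resulting sample rate in samples per second.
double samplesPerSecond(unsigned cycles) {
    assert(cycles > 0);
    return kClockHz / cycles;
}
```

nsPerRead(3) gives the 35.7ns of the register-only case, nsPerRead(6) gives about 71.4ns per stored sample, and samplesPerSecond(6) is the 14MS/s from the title.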

It turns out that the Arduino compiler has real problems compiling tens of thousands of identical statements; compile time becomes very high. The macro “T” in the sketch generates 10 copies of the statement passed in, so that

T(T(T(*a++ = p->PIO_PDSR)));

creates 1000 statements “*a++ = p->PIO_PDSR” one after the other, taking 1000x6=6000 clock ticks. 14 iterations of this block give 84000 clock ticks or 1ms, so a for loop over such blocks keeps the compile time manageable.
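
To check on a PC that three nested T() calls really expand to 1000 copies, the macro can be tried with a counter standing in for the port read. This is only a sketch; the function name countNestedExpansions is hypothetical and not from the Arduino sketch:

```cpp
#include <cassert>

// Same macro as in the sketch: paste the statement ten times.
#define T(st) st; st; st; st; st; st; st; st; st; st

// Run a counter increment through three nested T() calls.
// The counter is only a stand-in for "*a++ = p->PIO_PDSR".
int countNestedExpansions() {
    int n = 0;
    T(T(T(++n)));   // 10 * 10 * 10 copies of "++n"
    return n;
}
```

The inner T() is expanded first, so each outer level multiplies the statement count by 10.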

I tested the sketch with a 4.8MHz PWM signal generated by a Raspberry Pi Zero with these gpio commands:

gpio -g mode 18 pwm
gpio -g pwm-ms
gpio -g pwmc 2
gpio -g pwmr 2
gpio -g pwm 18 1

I verified that it is really 4.8MHz with my 100MHz logic analyzer:

576044 / 0.1200 = 4800367Hz.
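
The 4.8MHz can also be cross-checked by hand. The sketch below assumes that the Pi Zero's PWM runs from the usual 19.2MHz oscillator and that in mark-space mode the output frequency is base clock / pwmc / pwmr; it also reads the 576044 above as the number of periods counted in 0.12s. All names are made up for illustration:

```cpp
#include <cassert>

// Assumption: the Pi Zero's PWM is fed from the 19.2 MHz oscillator.
const double kPiPwmBaseHz = 19.2e6;

// Expected mark-space PWM frequency for a clock divisor (pwmc) and range (pwmr).
double expectedPwmHz(unsigned divisor, unsigned range) {
    assert(divisor > 0 && range > 0);
    return kPiPwmBaseHz / divisor / range;
}

// Frequency obtained by counting periods over a time window.
double measuredHz(unsigned long periods, double seconds) {
    assert(seconds > 0.0);
    return periods / seconds;
}
```

expectedPwmHz(2, 2) gives 4.8MHz for "pwmc 2" and "pwmr 2", and measuredHz(576044, 0.12) reproduces the analyzer figure.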

I connected Pi Zero GPIO 18 to channels 0, 2, 4 and 6 of Arduino Due port D (D25, D27, D14, D29).
Here you can see a run of the sketch; the reported ranges for those 4 channels enclose 4.8MHz:

channel=0 falling=28829 kHz=4804.83/4793.71
channel=1 falling=0 kHz=0.00/0.00
channel=2 falling=28830 kHz=4805.00/4793.87
channel=3 falling=0 kHz=0.00/0.00
channel=4 falling=28829 kHz=4804.83/4793.71
channel=5 falling=0 kHz=0.00/0.00
channel=6 falling=28829 kHz=4804.83/4793.71
channel=7 falling=0 kHz=0.00/0.00

Because of the for loop there are some clock cycles of overhead between the 1000-statement blocks, in total 1170 clock cycles or 13.93us for 6ms or 84000 port measurements. This sketch takes only the lowest 8 channels of port D, which allows storing 6ms in an 84000-byte array. When taking all 32 channels, only 1.5ms could be captured in the same amount of memory.

The two kHz values per channel are an upper and a lower estimate. Ignoring the overhead assumes too short a capture time and therefore overshoots the real frequency a bit; adding the overhead underestimates the frequency a bit because falling edges that occur during the overhead time are not recorded. Accuracy is quite good, with an error of less than 0.2% as shown.

The values in the array can be used to draw curves on an LCD connected to the Arduino Due, or can be sent back to the laptop/desktop the Due is connected to.
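
The evaluation described above can be sketched on the host side as follows. This is an illustration under my own naming (countFallingEdges and kHzFromEdges are not from the Arduino sketch), using a std::vector in place of the capture array:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count 1 -> 0 transitions on one channel (bit) of a captured sample array.
unsigned countFallingEdges(const std::vector<uint8_t>& samples, unsigned channel) {
    assert(channel < 8);
    const uint8_t mask = static_cast<uint8_t>(1u << channel);
    unsigned falling = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if ((samples[i - 1] & mask) && !(samples[i] & mask)) ++falling;
    }
    return falling;
}

// Frequency in kHz from an edge count and the capture time in 84 MHz clock ticks.
double kHzFromEdges(unsigned edges, uint32_t ticks) {
    assert(ticks > 0);
    return 84000.0 * edges / ticks;   // 84000 ticks per millisecond
}
```

With the 28829 falling edges reported for channel 0, kHzFromEdges(28829, 504000) gives the upper value 4804.83 (6ms without overhead) and kHzFromEdges(28829, 504000 + 1170) gives the lower value 4793.71.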


The 60-line sketch is attached as well:

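/*
   14MS/s 8CH logic analyzer (D.0[D25]-D.7[D11])
     overhead for 84000 measurements (6ms): 
     13.93us or 1170 clock cycles
*/
#define ms 6
uint8_t A[1 + 14000*ms];

void setup() { 
  #define T(st) st; st; st; st; st; st; st; st; st; st

  int D[8]   = {25, 26, 27, 28, 14, 15, 29, 11};
  int f[8]   = {0,0,0,0,0,0,0,0};

  uint8_t *a = &A[0];
  Pio *p     = digitalPinToPort(D[0]);  // port D

  uint32_t i,j,t0,t1,ovhd,norm;

  Serial.begin(57600);  // assumed: baud rate is not shown in the post

  for(i=0; i<8; ++i)  { pinMode(D[i], INPUT); }

  *a++ = p->PIO_PDSR;   // sample 0, taken before the timed loop

// *a++ = p->PIO_PDSR --> 
// ldr r1, [r4, #60]   strb r1, [r3, #_] (6 clock cycles)
  t0 = SysTick->VAL;    // assumed: SysTick counts down from 83999 once per ms
  for(i=1; i<=14*ms; ++i)  { T(T(T(*a++ = p->PIO_PDSR))); }
  t1 = SysTick->VAL;    // assumed, see t0

  // nominal time is a whole number of ms, so the elapsed ticks
  // modulo 84000 are the loop overhead
  norm = 84000*ms;
  ovhd = (((t0<t1)?84000+t0:t0)-t1);

  // count falling edges per channel
  for(j=0,i=1; i<=14000*ms; ++i)
    for(j=0; j<8; ++j)
      if ( (A[i-1] & (1<<j)) > (A[i] & (1<<j)) )
        ++f[j];

  // print upper/lower frequency estimate (without/with overhead);
  // output layout taken from the run shown above
  for(j=0; j<8; ++j) {
    Serial.print("channel=");
    Serial.print(j);
    Serial.print(" falling=");
    Serial.print(f[j]);
    Serial.print(" kHz=");
    Serial.print(84000.0*f[j]/norm);
    Serial.print("/");
    Serial.println(84000.0*f[j]/(norm+ovhd));
  }
}

void loop() { }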

P.S: Adding these lines

  for(j=0; j<8; ++j)
    for(i=0; i<=168; ++i)
      Serial.print( (A[i]&(1<<j)) ? "-" : "_");  

lets you “see” the first 169 samples recorded; at 6 clock cycles per sample these cover about 12μs (168x6=1008 clock cycles).
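
The same kind of text trace can be produced off the board once the samples have been transferred. A minimal host-side sketch, where renderChannel is a made-up helper and not part of the Arduino sketch:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Render one channel of the capture as text: '-' for high, '_' for low.
std::string renderChannel(const std::vector<uint8_t>& samples, unsigned channel) {
    assert(channel < 8);
    std::string line;
    line.reserve(samples.size());
    for (uint8_t s : samples) {
        line += (s & (1u << channel)) ? '-' : '_';
    }
    return line;
}
```

Calling it once per channel and printing each result on its own line gives one trace row per channel.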


sketch_aug30a.ino (1.29 KB)