Topic: Arduino Due vs. Intel 2.8GHz single core

HermannSW

Apr 05, 2016, 02:26 am (Last Edit: Apr 05, 2016, 02:39 am by HermannSW)
In the forum thread "Arduino Due vs. Nano performance" I compared the performance of the Arduino Due and the Arduino Nano for a simple piece of code (incrementing a volatile variable). The runtime factors were 9 for a 16-bit variable and 18 for a 32-bit variable.

Recently I remembered (unrelated to Arduino) the "minimal magic 3x3 prime square".

I found it back in 1982 with a Sinclair ZX81 program; it had already been found by R. Ondrejka in 1979:
Code:
```
 47| 29|101|
113| 59|  5|
 17| 89| 71|
```
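To double-check the square by hand or on a desktop machine, a small checker along the following lines could be used. It is only a sketch and not part of the original program: it reads the nine values as three rows of three, and the names `is_prime` and `magic_prime_sum` are made up for illustration.

```cpp
// Trial-division primality test; adequate for the small entries of the square.
bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Returns the common line sum if all rows, columns and both diagonals agree
// and every entry is prime; returns -1 otherwise.
int magic_prime_sum(const int s[3][3]) {
    const int target = s[0][0] + s[0][1] + s[0][2];
    for (int i = 0; i < 3; ++i) {
        if (s[i][0] + s[i][1] + s[i][2] != target) return -1;  // row i
        if (s[0][i] + s[1][i] + s[2][i] != target) return -1;  // column i
        for (int j = 0; j < 3; ++j)
            if (!is_prime(s[i][j])) return -1;
    }
    if (s[0][0] + s[1][1] + s[2][2] != target) return -1;  // main diagonal
    if (s[0][2] + s[1][1] + s[2][0] != target) return -1;  // anti-diagonal
    return target;
}
```

Called with the rows {47, 29, 101}, {113, 59, 5} and {17, 89, 71}, it returns 177, which is three times the center value 59.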

An optimized C program computed the minimal solution in 6μs on a single Intel 2.8GHz core.

The Sinclair ZX81 CPU had a 3.25MHz clock, i.e. 0.31μs per clock cycle, so 6μs corresponds to fewer than 20 ZX81 clock cycles.

Today I wondered how long the optimized search program would take on an Arduino Due.

The posted C program used a 64-bit type, but I had a 32-bit version with the same performance. I just ported that code, which was not difficult. In fact, measuring microseconds is much easier on Arduino than on Linux, although formatted printing is worse. This is the Serial Monitor output generated by the Arduino Due:
Code:
```
 47| 29|101|
113| 59|  5|
 17| 89| 71|

548us
```
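For comparison with the `micros()` calls used on the Arduino, one way to take a microsecond reading on a Linux host is the standard `std::chrono::steady_clock`. This is only a sketch of that approach, not the timing code of the original C program; the helper name `elapsed_us` is hypothetical.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in whole microseconds.
template <typename F>
long long elapsed_us(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

For a run that lasts only a few microseconds, a single reading is coarse, so one would probably repeat the search many times inside `work` and divide the result by the repetition count.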

So for this non-trivial code, the single Intel 2.8GHz core is less than 92x faster than the Arduino Due (548μs vs. 6μs)!
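The arithmetic behind that factor can be sketched as below. The helper names are invented for illustration, and the 84 MHz figure used in the note after the code is the Due's nominal CPU clock, which is an assumption brought in from outside this post.

```cpp
// Ratio of two runtimes given in the same unit.
double speed_ratio(double slower_us, double faster_us) {
    return slower_us / faster_us;
}

// Clock cycles elapsed during `runtime_us` microseconds at `clock_mhz` MHz
// (1 MHz is exactly one cycle per microsecond).
double cycles(double clock_mhz, double runtime_us) {
    return clock_mhz * runtime_us;
}
```

`speed_ratio(548, 6)` is about 91.3, hence "less than 92x". Assuming 84 MHz for the Due, `cycles(84, 548)` gives 46032 cycles, against `cycles(2800, 6)` = 16800 cycles on the Intel core, so most of the gap comes from the clock rate rather than from the work done per cycle.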

The Arduino preprocessor could not deal with a multiline #define, so I had to put the forall_odd_primes_less_than() macro on a single line. This is the complete code (also attached).
Code:
```
uint32_t B[]={0x35145105,0x4510414,0x11411040,0x45144001};

#define Prime(i) ((B[(i)>>5] & (0x80000000UL >> ((i)%32))) != 0)

#define forall_odd_primes_less_than(p, m, block)  for((p)=3; (p)<(m); (p)+=2) if (Prime((p))) block

void setup() {
  Serial.begin(9600);

  uint8_t p,a,b,c,d,i;

  unsigned long t0 = micros();

  forall_odd_primes_less_than(p, 64,
    forall_odd_primes_less_than(a, p,
      if Prime(2*p-a)
      {
        forall_odd_primes_less_than(b, p,
          if ( (b!=a) && Prime(2*p-b) )
          {
            c= 3*p - (a+b);
            if ( (c<2*p) && (2*p-c!=a) && (2*p-c!=b) && Prime(c) && Prime(2*p-c) )
            {
              if (2*a+b>2*p)
              {
                d = 2*a + b - 2*p;   // 3*p - (3*p-(a+b)) - (2*p-a)
                if ( (d!=a) && (d!=b) && (d!=2*p-c) && Prime(d) && Prime(2*p-d) )
                {
                  unsigned long t1 = micros();
                  print3(a); print3(b); print3(c); Serial.println();
                  print3(2*p-d); print3(p); print3(d); Serial.println();
                  print3(2*p-c); print3(2*p-b); print3(2*p-a); Serial.println();
                  Serial.println();
                  Serial.print(t1-t0);
                  Serial.println("us");
                }
              }
            }
          }
        )
      }
    )
  )
}

void loop() {}

void print3(int n) {
  if (n<10) Serial.print("  ");
  else if (n<100) Serial.print(" ");
  Serial.print(n);
  Serial.print("|");
}
```
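The `Prime(i)` macro reads bit `i` of the table `B`, counting from the most significant bit of each 32-bit word, so the four words cover the numbers 0 to 127. A host-side check that the table really marks exactly the primes below 128 might look like the following sketch. The function names are hypothetical and trial division is used only as an independent reference.

```cpp
#include <cstdint>

// Same table as in the sketch: bit i, counted from the most significant
// bit of B[i / 32], is set when i is prime.
static const uint32_t B[] = {0x35145105, 0x4510414, 0x11411040, 0x45144001};

// Function form of the sketch's Prime(i) macro; valid for i < 128.
bool bitmap_prime(unsigned i) {
    return (B[i >> 5] & (0x80000000UL >> (i % 32))) != 0;
}

// Independent reference: plain trial division.
bool trial_prime(unsigned n) {
    if (n < 2) return false;
    for (unsigned d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Returns the first index below 128 where the table and trial division
// disagree, or -1 if they agree everywhere.
int first_mismatch() {
    for (unsigned i = 0; i < 128; ++i)
        if (bitmap_prime(i) != trial_prime(i)) return static_cast<int>(i);
    return -1;
}
```

Since the search only uses p < 64, every lookup such as `Prime(2*p-a)` stays below 128 and therefore inside the table.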

Hermann.
https://forum.arduino.cc/index.php?topic=462107.msg3236016#msg3236016
http://stamm-wilbrandt.de/en/Raspberry_camera.html

HermannSW

#1
Apr 05, 2016, 09:35 am
I just realized that the Due compilation was done with -Os. I changed that to -O3 in "~/.arduino15/packages/arduino/hardware/sam/1.6.4/platform.txt":
Code:
```
$ diff platform.txt.orig platform.txt
22c22
< compiler.c.flags=-c -g -Os {compiler.warning_flags} -ffunction-sections -fdata-sections -nostdlib --param max-inline-insns-single=500 -Dprintf=iprintf -MMD
---
> compiler.c.flags=-c -g -O3 {compiler.warning_flags} -ffunction-sections -fdata-sections -nostdlib --param max-inline-insns-single=500 -Dprintf=iprintf -MMD
24c24
< compiler.c.elf.flags=-Os -Wl,--gc-sections
---
> compiler.c.elf.flags=-O3 -Wl,--gc-sections
27c27
< compiler.cpp.flags=-c -g -Os {compiler.warning_flags} -ffunction-sections -fdata-sections -nostdlib -fno-threadsafe-statics --param max-inline-insns-single=500 -fno-rtti -fno-exceptions -Dprintf=iprintf -MMD
---
> compiler.cpp.flags=-c -g -O3 {compiler.warning_flags} -ffunction-sections -fdata-sections -nostdlib -fno-threadsafe-statics --param max-inline-insns-single=500 -fno-rtti -fno-exceptions -Dprintf=iprintf -MMD
$
```

Sketch size increased from 11,624 bytes to 13,548 bytes, but that is still only 2% of the 512KB of flash available on the Due.

Runtime decreased from 548μs to 494μs, so now 83 x T_Intel_2.8GHz > T_Arduino_Due:
Code:
```
 47| 29|101|
113| 59|  5|
 17| 89| 71|

494us
```

Hermann.
