A common use of Arduinos is electronic games. To test such games, it makes sense to replace user inputs with random data using the built-in random function. This works pretty well with Arduino UNO and Arduino MEGA2560. Surprisingly, the latter is 20 percent faster, which I don't fully understand, but it won't give me sleepless nights.
When I switched the project to UNO R4 Minima or WiFi, at first it looked like I had bricked the MCU. Only after a very long wait was some data printed on the Serial monitor. So I wrote an MCVE to narrow things down. (I had to make use of the calculated random data, otherwise the compiler would optimize it out and remove all the random calls.)
As I have been working with the R3 for nearly 20 years and with the R4 for nearly two years, I am more than surprised; I am really shocked.
Can a product be considered “compatible” if this feature is 1000 times slower on R4 than on R3? I could easily use code written by Donald E. Knuth, but built-in functions should work.
This is my MCVE, tested with various Arduinos (originals and clones) using both IDEs (1.x and 2.x) and various versions of Windows.
It looks like whoever wrote the library with the random function should repeat his homework twice.
/*
  "Minimal, complete and verifiable example" (MCVE)
  R3: 9 milliseconds
  Minima: 9295 milliseconds
  WiFi: 9581 milliseconds
*/
void setup() {
  Serial.begin(115200);
  delay(2000);
  while (!Serial);
  delay(2000);
  Serial.println(__FILE__);
  const int N = 100;
  int x[N];
  // begin:
  long t1 = millis();
  for (int i = 0; i < N; i++)
    x[i] = random(7);
  long t2 = millis();
  // end
  Serial.println("add");
  long s = 0;
  for (int i = 0; i < N; i++)
    s = s + x[i];
  Serial.println(s);
  Serial.println(t2 - t1);
}
void loop() {}
Just to be sure it's not an optimisation issue, what does this code print on the R4 and R3?
/*
  "Minimal, complete and verifiable example" (MCVE)
  R3: 9 milliseconds
  Minima: 9295 milliseconds
  WiFi: 9581 milliseconds
*/
void setup() {
  Serial.begin(115200);
  delay(2000);
  while (!Serial);
  delay(2000);
  Serial.println(__FILE__);
  const int N = 100;
  volatile int x[N];
  // begin:
  volatile unsigned long t1 = millis();
  for (int i = 0; i < N; i++)
    x[i] = random(7);
  volatile unsigned long t2 = millis();
  // end
  Serial.println("add");
  long s = 0;
  for (int i = 0; i < N; i++)
    s = s + x[i];
  Serial.println(s);
  Serial.println(t2 - t1);
}
void loop() {}
This library uses a call to random() to fill a 32-bit buffer.
It has functions to get n bits from this buffer.
If the buffer has fewer than n bits left, it is refilled.
You could use a similar trick to get your random values.
In your case, extract 3 random bits; if they are in the range 0..6 use them, else discard them and extract new bits.
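The bit-buffer idea above can be sketched roughly as follows. This is only an illustration, not the code of the library mentioned: `slowRandom32()`, `getBits()` and `random0to6()` are hypothetical names, and the slow source is replaced by a small xorshift generator so the sketch runs anywhere. On a real board you would put the expensive call inside `slowRandom32()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an expensive 32-bit random source.
// A xorshift32 generator is used here only so the sketch is runnable
// off-target; it is NOT the hardware generator discussed in this thread.
static uint32_t slowSourceState = 0x12345678u;
static unsigned long slowSourceCalls = 0;  // counts how often we pay the cost

uint32_t slowRandom32() {
  ++slowSourceCalls;
  slowSourceState ^= slowSourceState << 13;
  slowSourceState ^= slowSourceState >> 17;
  slowSourceState ^= slowSourceState << 5;
  return slowSourceState;
}

static uint32_t bitBuffer = 0;  // unused random bits
static uint8_t bitsLeft = 0;    // how many bits in bitBuffer are still unused

// Return n random bits (1..31). If fewer than n bits remain, the buffer is
// refilled; leftover bits are simply discarded to keep the sketch short.
uint32_t getBits(uint8_t n) {
  assert(n >= 1 && n <= 31);
  if (bitsLeft < n) {
    bitBuffer = slowRandom32();
    bitsLeft = 32;
  }
  uint32_t result = bitBuffer & (((uint32_t)1 << n) - 1u);
  bitBuffer >>= n;
  bitsLeft -= n;
  return result;
}

// Uniform value in 0..6: take 3 bits (0..7) and retry whenever we get 7.
int random0to6() {
  uint32_t v;
  do {
    v = getBits(3);
  } while (v > 6);
  return (int)v;
}
```

With this layout one 32-bit word serves ten 3-bit draws, so the expensive source is called far less often than once per result. The retry on 7 keeps the distribution uniform, at the cost of an occasional extra draw.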
Seems likely. It may be generating bits from some internal breakdown noise hardware and doing a lot of processing to ensure it meets cryptographic specifications.
Often those are used just to seed a faster algorithmic random number generator.
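As a rough sketch of that pattern: read the slow source once, then let a cheap algorithmic generator do the rest. Everything here is an assumption for illustration. `slowEntropy32()` is a hypothetical placeholder that returns a fixed value so the code runs off-target, and `FastRng` is a plain xorshift32, which is fine for games and tests but not for anything cryptographic.

```cpp
#include <cstdint>

// Hypothetical slow entropy source. On real hardware this would read the
// true random number generator; here it returns a fixed placeholder value.
static unsigned long entropyCalls = 0;

uint32_t slowEntropy32() {
  ++entropyCalls;
  return 0xC0FFEE01u;
}

// Small xorshift32 generator, seeded once from the slow source.
class FastRng {
 public:
  // xorshift32 must not start from 0, so a zero seed is replaced by 1.
  explicit FastRng(uint32_t seed) : state_(seed != 0 ? seed : 1u) {}

  uint32_t next() {
    state_ ^= state_ << 13;
    state_ ^= state_ >> 17;
    state_ ^= state_ << 5;
    return state_;
  }

  // Value in [0, bound). Uses a plain modulo, so there is a slight bias
  // for bounds that do not divide 2^32. Returns 0 for bound == 0.
  uint32_t below(uint32_t bound) {
    if (bound == 0) return 0;
    return next() % bound;
  }

 private:
  uint32_t state_;
};
```

Typical use would be `FastRng rng(slowEntropy32());` once at startup, followed by `rng.below(7)` wherever a value in 0..6 is needed.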
You can see the hardware/true random number generator being used for random() in cores/arduino/WMath.cpp on the main branch of the arduino/ArduinoCore-renesas repository on GitHub. I agree the behaviour/feature should be documented. I had a play with this to see if it's rate limited. It's not; it's just slow all the time. My R4 Minima takes 115ms and my R4 WiFi is mysteriously quicker at 98ms. I wonder if this varies per chip based on some sort of entropy checking...
You can disable the use of the TRNG by calling randomSeed with a value. An alternative is to call random() without a value but I'm not sure how pukka that's considered in Arduino world.
Not so mysterious: the bits of WiFi, field strength etc. can give extra entropy, assuming it is harvested. However, I would have expected more performance gain. Typical WiFi is beyond 1 Mbit/second, so harvesting the bits could be faster.
I'd assumed the hardware just used an electrical source of noise. If there's some entropy gathering from multiple sources then it would involve chatting to the ESP32-S3 which does WiFi.
I'm not using WiFi, and I'd hope the radio isn't enabled in that case for Arduino programs.
Apparently almost ALL of this time is spent in HW_SCE_McuSpecificInit(), which doesn't remember that it has already been initialized! (sigh. Cursed vendor libraries!)
With that repeated initialization skipped, the time goes down to 20us. That's still about 5x slower than the PRNG algorithm, but it's certainly better!
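For anyone wondering what "remembering that it has already been initialized" looks like in code, here is a minimal sketch of a once-only guard. It is not the actual patch and does not use the vendor API: `expensiveHardwareInit()`, `readHardwareWord()` and `guardedRandomWord()` are hypothetical stand-ins, and whether the real hardware ever needs a re-init is something I am not assuming either way.

```cpp
#include <cstdint>

static unsigned long initCalls = 0;  // how many times the slow init ran

// Hypothetical stand-in for a slow vendor initialization routine.
void expensiveHardwareInit() {
  ++initCalls;
}

// Hypothetical stand-in for reading one word from the generator once it
// has been initialized. It just steps a counter so the sketch is runnable.
uint32_t readHardwareWord() {
  static uint32_t counter = 0;
  counter += 0x9E3779B9u;
  return counter;
}

// Runs the expensive initialization on the first call only, then reads.
uint32_t guardedRandomWord() {
  static bool initialized = false;
  if (!initialized) {
    expensiveHardwareInit();
    initialized = true;
  }
  return readHardwareWord();
}
```

The flag costs one branch per call, which is negligible next to an initialization that dominates the run time.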
In order to make relevant information available to any who are interested in this subject, I'll share a link to the formal report @westfw submitted to the "Arduino UNO R4 Boards" platform developers:
I submitted a bug against the Renesas FSP tree as well.
Their "Init" function should be smarter than that, especially since they don't document enough to let user code check the status of the SCE.
Good catch!
In your test sketch you could add a delay(100) to be sure the Serial has flushed.
Interrupts can affect timing measurements (on AVR they certainly do).
That’s what Serial.flush(); is for. Using a delay might get you to wait for longer than needed or not enough depending on what was in the outgoing buffer.
There is already a 2-second delay() after each set of output (actually, at the beginning of the loop, but it'll still happen between the output and the next set of timing...)