I am working on a security protocol implementation on Arduino boards. I have implemented the protocol to operate on 64-bit, 128-bit, and 256-bit variables (e.g., nonces, hash results, …). I have tested it on the Arduino Mega (8-bit AVR, 16 MHz, 8 KB SRAM) and the Arduino Due (32-bit ARM, 84 MHz, 96 KB SRAM). In terms of correctness, the protocol runs and terminates as expected. However, I have noticed some inconsistencies:
The code that uses larger variables (128-bit or 256-bit) compiles to a smaller binary (around 17 KB) than the code that operates on 64-bit variables (around 31 KB) on the Arduino Mega. The same is observed on the Due, but the difference is smaller (19 KB for 128-bit, 19 KB for 256-bit, and 20 KB for 64-bit).
As a consequence of the previous point, the 128-bit and 256-bit versions run faster than the 64-bit version.
The code that uses 128-bit variables, which runs perfectly on the Arduino Mega, gets stuck at the very beginning when run on the Arduino Due.
I have left the compiler optimizer at its default setting (i.e., option -Os). If I use another optimization option (e.g., -O1, -O2, or -O3), the versions with larger variables still compile to smaller binaries, whereas the 64-bit code compiles to a much larger one (around 190 KB).
Does anybody have an idea of what is going on? I cannot come up with an interpretation of these results, particularly the execution times. Any insight into the issues I pointed out above would be greatly appreciated.
I have attached a plot of the execution time of the different phases of the protocol for the different variable sizes (the solid bars are for a serial communication rate of 190 Kbps, and the diagonally hatched bars are for a serial communication rate of 250 Kbps). Regardless of the communication rate, you can see the inconsistency: increasing the variable size makes the protocol faster.