Question re: phase lag on energy measurements

Constantin:
The oscilloscope suggests that the curves are pretty good. However, one thing I noticed is that the usual approach of using sum(V^2) followed by a sqrt operation to determine Vrms is fraught with trouble when you start with a 16-bit ADC sampling at 1.4 ksps: each square can need up to 32 bits, so the running sum overflows a 32-bit variable after only a handful of full-scale samples. Even the Teensy 3 (a 32-bit platform) seems to require a big-number library to make it work, because the Arduino 1.0.x environment seems to keep the variable sizes of the 8-bit platforms (i.e. a long on a 32-bit ARM is still 32 bits, not 128 bits).

Do you really need to use 16 bits to get the desired accuracy? I would have thought 10 to 12 bits would be more than enough. The square then fits in 19 to 23 bits (bearing in mind that the square can be unsigned), so you can sum at least 512 samples using 32-bit maths.
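
To make the bit budget concrete, here is a minimal sketch in C, not anyone's actual firmware. It assumes the samples have already had their DC offset removed and are signed 12-bit counts (magnitude at most 2048), so each square is at most 2^22 and up to 1023 of them fit in an unsigned 32-bit sum. The names isqrt32, vrms_counts and VRMS_MAX_SAMPLES are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical limit: 1023 * 2048^2 = 4290772992, which is below 2^32. */
#define VRMS_MAX_SAMPLES 1023u

/* Integer square root (floor) of a 32-bit value, bit-by-bit method. */
static uint32_t isqrt32(uint32_t x)
{
    uint32_t result = 0;
    uint32_t bit = (uint32_t)1 << 30;   /* highest power of four in 32 bits */

    while (bit > x)
        bit >>= 2;
    while (bit != 0) {
        if (x >= result + bit) {
            x -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;
    }
    return result;
}

/*
 * RMS of n offset-removed samples, in ADC counts.
 * Assumes |samples[i]| <= 2048 (signed 12-bit); larger inputs can overflow.
 * Returns 0 if n is 0 or exceeds VRMS_MAX_SAMPLES.
 */
static uint32_t vrms_counts(const int16_t *samples, uint16_t n)
{
    uint32_t sum_sq = 0;
    uint16_t i;

    if (n == 0 || n > VRMS_MAX_SAMPLES)
        return 0;
    for (i = 0; i < n; i++) {
        int32_t s = samples[i];
        sum_sq += (uint32_t)(s * s);    /* square is non-negative, <= 2^22 */
    }
    return isqrt32(sum_sq / n);         /* sqrt of the mean square */
}
```

With 16-bit samples the same loop would need a 64-bit accumulator (uint64_t), since each square alone can take 31 to 32 bits.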