Offline
Newbie
Karma: 0
Posts: 3
|
 |
« on: January 05, 2012, 04:28:07 pm » |
Function store_char() (in HardwareSerial.cpp), called from USART interrupt, use % division, which is mutch slower (208 ticks, as i measure), than & operation. If divider is power of 2, why not use & ?
For example:
#define SERIAL_BUFFER_SIZE 64 .....
int i = (unsigned int)(buffer->head + 1) % SERIAL_BUFFER_SIZE; are equal to: int i = (unsigned int)(buffer->head + 1) & (SERIAL_BUFFER_SIZE - 1);
|
|
|
|
|
Logged
|
|
|
|
|
0
Offline
Faraday Member
Karma: 12
Posts: 2857
ruggedcircuits.com
|
 |
« Reply #1 on: January 05, 2012, 04:31:54 pm » |
The GCC compiler is pretty smart and it should convert the % to & automatically during its optimization phase. Try it: I bet the code generated will be the same size with both approaches (as long as the modulus is a power of 2). Knowing this, it's always better to write clear(er) code than to prematurely optimize it. -- The Gadget Shield: accelerometer, RGB LED, IR transmit/receive, speaker, microphone, light sensor, potentiometer, pushbuttons
|
|
|
|
|
Logged
|
|
|
|
|
Seattle, WA
Offline
God Member
Karma: 4
Posts: 673
|
 |
« Reply #2 on: January 05, 2012, 04:36:44 pm » |
If divider is power of 2, why not use & ?
Just way less readable and maintainable, in a pretty significant way. Only worth this kind of optimization in cases where serial performance was the bottleneck.
|
|
|
|
|
Logged
|
|
|
|
|
Seattle, WA
Offline
God Member
Karma: 4
Posts: 673
|
 |
« Reply #3 on: January 05, 2012, 04:43:03 pm » |
The GCC compiler is pretty smart and it should convert the % to & automatically during its optimization phase. Try it: I bet the code generated will be the same size with both approaches (as long as the modulus is a power of 2). Ah yes, this is right, too. int main(void) { SPDR = SPDR % 64; }
00000080 <main>: 80: 8e b5 in r24, 0x2e ; 46 82: 8f 73 andi r24, 0x3F ; 63 84: 8e bd out 0x2e, r24 ; 46
|
|
|
|
« Last Edit: January 05, 2012, 04:46:31 pm by maniacbug »
|
Logged
|
|
|
|
|
Offline
Newbie
Karma: 0
Posts: 3
|
 |
« Reply #4 on: January 05, 2012, 09:16:33 pm » |
Hm... Very strange. Yes, your code do not generate any divisions.
But when a compile simple sketch: void setup() { Serial.begin(9600); } void loop() {}
and do (on Windows): c:\BIN\Arduino\arduino-1.0\hardware\tools\avr\bin\avr-objdump.exe -d test123.cpp.elf
i get: ..... 00000162 <__vector_19> 162: 1f 92 push r1 164: 0f 92 push r0 ....... 1bc: 01 96 adiw r24, 0x01 ; 1 1be: 60 e4 ldi r22, 0x40 ; 64 1c0: 70 e0 ldi r23, 0x00 ; 0 1c2: 0e 94 34 03 call 0x668 ; 0x668 <__divmodhi4> ....
Maybe, this code is from another place... I don't know assembler wery well...
|
|
|
|
|
Logged
|
|
|
|
|
Offline
Newbie
Karma: 0
Posts: 3
|
 |
« Reply #5 on: January 05, 2012, 09:37:13 pm » |
If divider is power of 2, why not use & ?
Only worth this kind of optimization in cases where serial performance was the bottleneck. I try to do simple audio recorder from the arduino uno (just for fun and training). I programmed Timer2 for generate interrupts at 16KHz (every 62.5uS) and set ADC prescaler to 64 (so, ADC conversion must take 52uS). But sometymes, ADC don't have to complete previous conversion, when the next interrupt occurs. I think, it is because other interrupts brings a time jitter to time, when Timer2 interrupt handler raise... Thank you, for your quick response!
|
|
|
|
« Last Edit: January 05, 2012, 09:50:35 pm by llevinson »
|
Logged
|
|
|
|
|
Global Moderator
Dallas
Online
Shannon Member
Karma: 120
Posts: 10183
|
 |
« Reply #6 on: January 05, 2012, 10:14:49 pm » |
Maybe, this code is from another place... I don't know assembler wery well... You know it well enough. That's a problem.
|
|
|
|
|
Logged
|
|
|
|
|
Global Moderator
Dallas
Online
Shannon Member
Karma: 120
Posts: 10183
|
 |
« Reply #7 on: January 05, 2012, 10:28:42 pm » |
• Open HardwareSerial.cpp
• At or near line 206, change this... tx_buffer.tail = (tx_buffer.tail + 1) % SERIAL_BUFFER_SIZE; ...to this... tx_buffer.tail = (unsigned int)(tx_buffer.tail + 1) % SERIAL_BUFFER_SIZE;
• At or near line 230, change this... tx_buffer1.tail = (tx_buffer1.tail + 1) % SERIAL_BUFFER_SIZE; ...to this... tx_buffer1.tail = (unsigned int)(tx_buffer1.tail + 1) % SERIAL_BUFFER_SIZE;
• At or near line 247, change this... tx_buffer2.tail = (tx_buffer2.tail + 1) % SERIAL_BUFFER_SIZE; ...to this... tx_buffer2.tail = (unsigned int)(tx_buffer2.tail + 1) % SERIAL_BUFFER_SIZE;
• At or near line 264, change this... tx_buffer3.tail = (tx_buffer3.tail + 1) % SERIAL_BUFFER_SIZE; ...to this... tx_buffer3.tail = (unsigned int)(tx_buffer3.tail + 1) % SERIAL_BUFFER_SIZE;
• At or near line 385, change this... int i = (_tx_buffer->head + 1) % SERIAL_BUFFER_SIZE; ...to this... int i = (unsigned int)(_tx_buffer->head + 1) % SERIAL_BUFFER_SIZE;
• Save and close HardwareSerial.cpp
Basically, put "(unsigned int)" in front of every expression that includes "% SERIAL_BUFFER_SIZE".
|
|
|
|
|
Logged
|
|
|
|
|
Global Moderator
Dallas
Online
Shannon Member
Karma: 120
Posts: 10183
|
 |
« Reply #8 on: January 05, 2012, 10:44:17 pm » |
And, as a nice side-effect, the entire ISR is much smaller.
|
|
|
|
|
Logged
|
|
|
|
|
Seattle, WA
Offline
God Member
Karma: 4
Posts: 673
|
 |
« Reply #9 on: January 06, 2012, 12:00:40 am » |
Sounds like someone needs to submit a patch 
|
|
|
|
|
Logged
|
|
|
|
|
Global Moderator
Dallas
Online
Shannon Member
Karma: 120
Posts: 10183
|
 |
« Reply #10 on: January 06, 2012, 12:20:57 am » |
|
|
|
|
|
Logged
|
|
|
|
|
Seattle, WA
Offline
God Member
Karma: 4
Posts: 673
|
 |
« Reply #11 on: January 06, 2012, 01:01:49 am » |
Nice. Although if you actually attach a patch file, I think it increases the odds of it getting in...
|
|
|
|
|
Logged
|
|
|
|
|
SF Bay Area (USA)
Offline
Faraday Member
Karma: 78
Posts: 5454
Strongly opinionated, but not official!
|
 |
« Reply #12 on: January 06, 2012, 01:35:00 am » |
patch attached. I actually had this almost ready based on the discussions over in: http://arduino.cc/forum/index.php/topic,85207.0.html
|
|
|
|
|
Logged
|
|
|
|
|
Dallas, TX USA
Offline
Edison Member
Karma: 25
Posts: 1617
|
 |
« Reply #13 on: January 06, 2012, 03:08:01 am » |
And "while you are in there", why not modify the code to use 8 bit variables for head and tail? It saves hundreds of bytes of code to do this. It is one of the first things I change on every single Arduino release. And if you are worried about somebody making SERIAL_BUFFER_SIZE larger than 256 then that can also be handled compile time with a #if to change back to using ints. Example: struct ring_buffer { unsigned char buffer[SERIAL_BUFFER_SIZE]; #if SERIAL_BUFFER_SIZE > 256 volatile int head; volatile int tail; #else volatile uint8_t head; volatile uint8_t tail; #endif
};
But in the bigger picture, larger than 8 bit indexes won't really work anyway because the code that uses it does not properly deal with atomicity of the 2 byte value when doing things like compares. So it might as well be an 8 bit value anyway since 16 bit (multi byte) values will have issues when the value is larger than 255. --- billl
|
|
|
|
|
Logged
|
|
|
|
|
Offline
Full Member
Karma: 0
Posts: 167
|
 |
« Reply #14 on: January 06, 2012, 04:04:37 am » |
Interesting, this is just a regression in1.0, I checked my 0022 HardwareSerial and all the buffer size lines are properly cast.
|
|
|
|
|
Logged
|
|
|
|
|
|