Topic: divmod10() : a fast replacement for /10 and %10 (unsigned)

Paul Stoffregen


Quote
think divmod10() should become part of the core seeing these and the above numbers.


I'm planning to release it in the next version of Teensyduino.

Stimmer, is Arduino Print LGPL ok with you?  Would you like your name to appear, other than "Stimmer" and link to this forum topic?  You deserve credit.... just let me know what should be in the comments?

Coding Badly

Quote
Stimmer, is Arduino Print LGPL ok with you?  Would you like your name to appear, other than "Stimmer" and link to this forum topic?


Credit is not enough.  Open source licenses require a copyright claim by Stimmer.

Paul Stoffregen

Ok then.  I'll hold off publishing anything until Stimmer replies.

I'm currently working on a Mac-related USB problem, so it's not like there's any hurry.  But it would be nice to publish this code to a wider audience, if that's allowed?

Jantje

@Paul
;) Shouldn't you be working on a space-related issue with the Teensy uploader?
Best regards
Jantje
Do not PM me a question unless you are prepared to pay for consultancy.
Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -

drjiohnsmith

Agreed, these should be standard libs of the core,
but the Arduino team won't do it.

stimmer



Quote
think divmod10() should become part of the core seeing these and the above numbers.


I'm planning to release it in the next version of Teensyduino.

Stimmer, is Arduino Print LGPL ok with you?  Would you like your name to appear, other than "Stimmer" and link to this forum topic?  You deserve credit.... just let me know what should be in the comments?



Yes that's OK. Consider what I wrote to be in the public domain so anyone can use it under any license. Just put something like "this section of code based on public domain code by Stimmer, see post at forum.arduino.cc/whatever-the-url-is ".
Due VGA library - http://arduino.cc/forum/index.php/topic,150517.0.html

robtillaart

Quote
agree these should be standard libs of the core,
but the arduino team wont,


I'm not so sure about the latter.
The code is not core-lib ready yet. For the Arduino team, portability of the core libs is a very important issue. The divmod10() function should be wrapped in #ifdefs that select either the C implementation (portable to the Due?) or the assembly implementation (328 only?).

So the fun part is over, now the real work starts ;)

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Paul Stoffregen

Quote

So the fun part is over, now the real work starts


It ought to be pretty simple to surround this code with #ifdef __AVR__, or perhaps a check for an AVR architecture with hardware multiply.  In fact, the Print.cpp I posted has #ifdef checks to use either the original C code, Stimmer's optimization, or your version of the Hacker's Delight algorithm.  Yes, it requires some careful attention, but this part really isn't very difficult.
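
A minimal sketch of the kind of compile-time selection described here. The macro DIVMOD10_USE_SHIFT_ADD and the names divmod10_plain and divmod10_shift are assumptions made for illustration, not names taken from the posted Print.cpp. The shift-and-add body follows the Hacker's Delight divide-by-10 approach discussed in this thread; Stimmer's hand-written AVR assembly is deliberately not reproduced.

```cpp
#include <stdint.h>

// Plain C++ fallback: lets the compiler choose its own division routine.
static inline void divmod10_plain(uint32_t in, uint32_t &div, uint32_t &mod)
{
  div = in / 10;
  mod = in - div * 10;
}

// Shift-and-add version in the style of Hacker's Delight (no divide, no multiply).
static inline void divmod10_shift(uint32_t in, uint32_t &div, uint32_t &mod)
{
  uint32_t q = (in >> 1) + (in >> 2);       // q ~= in * 0.75
  q = q + (q >> 4);                         // refine towards in * 0.8
  q = q + (q >> 8);
  q = q + (q >> 16);
  q = q >> 3;                               // q ~= in / 10, possibly one too small
  uint32_t r = in - ((q << 3) + (q << 1));  // r = in - q * 10
  if (r > 9) { q++; r -= 10; }              // correct the estimate
  div = q;
  mod = r;
}

// Hypothetical switch: AVR has no hardware divider, so it gains the most.
static inline void divmod10(uint32_t in, uint32_t &div, uint32_t &mod)
{
#if defined(__AVR__) || defined(DIVMOD10_USE_SHIFT_ADD)
  divmod10_shift(in, div, mod);
#else
  divmod10_plain(in, div, mod);
#endif
}
```

Keeping both bodies behind one divmod10() wrapper means callers such as Print.cpp would not need their own #ifdefs, and the two versions can be compared against each other on any platform.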

Whether the Arduino Team will accept this for the Print.cpp that ships for all Arduino boards is a good question.  Historically, they've been pretty uninterested in speed optimizations.

I definitely do plan to publish this in the next version of Teensyduino, likely within the next 6 weeks.  That is, of course, if Coding Badly is satisfied with the licensing.  ... Thanks Stimmer!  :D  You definitely did the heavy lifting with an amazing optimization!


Coding Badly


robtillaart

#99
Jul 11, 2013, 10:31 pm Last Edit: Jul 11, 2013, 10:36 pm by robtillaart
Update on the printFloat version using divmod10() (follows on from - http://forum.arduino.cc/index.php?topic=167414.msg1295109#msg1295109 - )

A new version that prints the fractional part (the remainder) of a float, limited to 10 decimals:
- timing is slightly shorter
- code is more straightforward than the previous one

Insert in Print.cpp  ==> size_t Print::printFloat()
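Code:

 // Print the decimal point, but only if there are digits beyond
 if (digits > 0)
 {
   n += print(".");

   // Scale the fractional part to an integer: t = 10^digits.
   // Note: 10^10 does not fit in a uint32_t, so t is only exact for digits <= 9.
   uint32_t t = 1;
   for (uint8_t i=0; i<digits; i++) t = ((t<<2) + t)<<1;  // t *= 10
   uint32_t rem = remainder * t;
   char z[11]="0000000000";  // max 10 decimals; pre-filled so leading zeros are free
   z[digits]='\0';
 
   // Peel off digits from the right, one divmod10() call per digit
   uint32_t d = 0;
   uint8_t m = 0;
   while (rem >= 10)         // >= so that rem == 10 is split into '1' and '0'
   {
     divmod10(rem, d, m);
     z[--digits] = m + '0';
     rem = d;
   }
   z[--digits] = rem + '0';  // last digit
   n += print(z);
 }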


testing
10737.4179
1.0182
107.37
Time=1104      << original Print.cpp: 2144; almost 50% off (for this particular test)
done


Test sketch
Code:
unsigned long start = 0;
unsigned long stop = 0;
volatile unsigned long q;

void setup()
{

 Serial.begin(115200);
 Serial.println("testing");
 
 byte backup = TIMSK0;   // remember the Timer0 interrupt mask (used by millis)
 
 TCCR1A = 0;             // Timer1 in normal mode
 TCCR1B = 4;             // CS12 set: 1:256 prescaler
 TIMSK1 |= _BV(TOIE1);   // enable the Timer1 overflow interrupt
 
 Serial.flush();         // wait for serial buffer to clear
 TIMSK0 = 0;             // disable millis
 TCNT1 = 0;              // start counting from zero
 
 Serial.println(10737.41824, 4);
 Serial.println(1.01819584, 4);
 Serial.println(107.37, 2);
 
 stop = TCNT1;           // elapsed Timer1 ticks
 TIMSK0 = backup;        // re-enable millis
 stop*=16;               // 16 us per timer tick with a 1:256 prescaler at 16 MHz
 Serial.print("Time=");
 Serial.println(stop);
 Serial.println("done");
 TIMSK1 &= ~_BV(TOIE1);  // disable the Timer1 overflow interrupt again
}

void loop()
{
}

ISR(TIMER1_OVF_vect) {
 Serial.println("I Overflowed!"); // just to make sure we can tell if this happens
}
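
The tick-to-microsecond step in the sketch can be written out as a small calculation. This is a sketch assuming a 16 MHz clock, which is what the 16 us per tick figure at a 1:256 prescaler implies; ticksToMicros and maxMicros16 are hypothetical helper names.

```cpp
#include <stdint.h>

// Hypothetical helper: microseconds represented by `ticks` of a timer
// clocked at cpuHz / prescaler. One tick lasts prescaler / cpuHz seconds.
static uint32_t ticksToMicros(uint32_t ticks, uint32_t cpuHz, uint32_t prescaler)
{
  // Multiply before dividing (in 64 bits) to avoid losing precision.
  return (uint32_t)(((uint64_t)ticks * prescaler * 1000000ULL) / cpuHz);
}

// Longest interval a 16-bit timer can measure before it overflows.
static uint32_t maxMicros16(uint32_t cpuHz, uint32_t prescaler)
{
  return ticksToMicros(65536UL, cpuHz, prescaler);
}
```

With these assumptions, ticksToMicros(63, 16000000UL, 256) gives 1008. Every time reported by the sketch is a whole number of 16 us ticks, so differences of one or two ticks between runs are at the limit of what this setup can resolve.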

robtillaart

Update:
replace n += print(".");   with    n += print('.');

Time=1072  << original 2144; that is 50% off

Coding Badly


Does this make any difference...
Code:
n += write('.');

robtillaart



Quote
Does this make any difference...
Code:
n += write('.');

Not measurable with the test sketch, but removing some "stacked calls" and using write() where possible may add up.
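
To make the "stacked calls" point concrete, here is a simplified model that counts how many member functions are entered for each variant. MiniPrint is a hypothetical stand-in whose method names mirror print() and write(); it is not the Arduino core source, and the real call depth may differ.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Simplified stand-in for a Print-like class (assumption, not library code).
struct MiniPrint
{
  unsigned calls = 0;  // counts every member function entered

  // Lowest level: emit one byte (here it only counts).
  size_t write(uint8_t) { calls++; return 1; }

  // Buffer write: forwards each byte to the single-byte write().
  size_t write(const uint8_t *buf, size_t len)
  {
    calls++;
    size_t n = 0;
    while (len--) n += write(*buf++);
    return n;
  }

  // String print: measures the length, then forwards to the buffer write.
  size_t print(const char *s)
  {
    calls++;
    return write((const uint8_t *)s, strlen(s));
  }

  // Character print: forwards straight to the single-byte write().
  size_t print(char c) { calls++; return write((uint8_t)c); }
};
```

In this model print(".") enters three functions, print('.') enters two, and write('.') enters one, which is the kind of saving that is too small to show up in a single test run but can add up across many calls.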

robtillaart

#103
Jul 21, 2013, 04:30 pm Last Edit: Jul 25, 2013, 12:34 pm by robtillaart
Update on the printFloat version using divmod10(). For the code see - http://forum.arduino.cc/index.php?topic=167414.msg1312959#msg1312959 -

Incorporated Stimmer's ASM version, divmod10_asm, into the floating point test. - http://forum.arduino.cc/index.php?topic=167414.msg1293679#msg1293679 -
It strips off another 3% for printing floats.

output:
testing
10737.4179
1.0182
107.37
Time=1008  << original 2144 is 53% off
done

Roughly 1 millisecond for 19 digits is a bit over 50 us per digit (instead of more than 100 with the original Print.cpp).

update: printFloat continues in its own thread here - http://forum.arduino.cc/index.php?topic=179111.0 -

Paul Stoffregen

I published a Teensyduino release candidate today, which includes Stimmer's optimization.

http://forum.pjrc.com/threads/24393-Teensyduino-1-17-Release-Candidate-1-Available

Thanks to everyone who worked and contributed to this awesome speedup.  Soon it'll be in widespread use on Teensy 2.0 and Teensy++ 2.0 boards.  :)
