I've now gone for code size optimisation; here are a couple of ideas.

    if ( notation == DEF ) {
      if ( (number > 8388608.0) || (number < 0.000001) ) {
        // as we head into SCI / ENG world add digits for output
        notation = SCI;
        digits += 6;
      }
    }

I've done this because this is the value at which we have to move to E notation in order to print only accurate numbers, i.e. 8388609 actually prints as 8388610 :-(

This still shows the number wrong, but printing it in E form lets the user 'have a guess' that the number might not be exactly as shown.
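One possible explanation for the 8388609 case, sketched under an assumption: the rounding code isn't shown in this post, so the snippet assumes the print routine rounds by adding 0.5 and truncating. On the UNO, double is a 32-bit float, and from 2^23 = 8388608 upwards neighbouring floats are 1.0 apart, so the added 0.5 is a tie that gets rounded to the even neighbour. The function name roundedForPrint is made up for the illustration.

```cpp
#include <cstdint>

// Assumption: the print routine rounds by adding 0.5 and then truncating.
// Returns the integer part that would be printed for a 32-bit float input.
uint32_t roundedForPrint(float number) {
    // volatile forces the sum to be stored as a real 32-bit float
    volatile float rounded = number + 0.5f;
    return static_cast<uint32_t>(rounded);
}
```

With this sketch, 8388607 and 8388608 come back unchanged, while 8388609 comes back as 8388610, which matches the misprint described above.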

I've enum'd the DEF, SCI and ENG values, and by passing these we get a free compiler sanity check on the values. Not perfect, but useful IMO. (It also saves having to take a local copy in Enotation.)

I've also changed printNumber to take a third argument (the number of digits we must print). A simple

    #define NO_LEADING_ZERO 0

allows sensible-looking code for existing uses of the function. (For your example program the compiler only has to reload r16 twice with the extra arg being passed, plus one extra time for when we use it to get leading zeros.)
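A minimal sketch of what the enum buys, not the actual library code: only the names DEF, SCI and ENG come from this post, while the type name Notation and the function chooseNotation are made up for the illustration. The body is the range check quoted earlier in the post, wrapped in a free function.

```cpp
#include <cstdint>

// Hypothetical enum type; only DEF, SCI and ENG are names from the post.
enum Notation { DEF, SCI, ENG };

// The range check from earlier in the post, as a free function.
// Taking a Notation rather than an int lets the compiler reject stray values.
Notation chooseNotation(Notation notation, float number, uint8_t &digits) {
    if (notation == DEF) {
        if ((number > 8388608.0f) || (number < 0.000001f)) {
            // heading into SCI / ENG territory: ask for more output digits
            notation = SCI;
            digits += 6;
        }
    }
    return notation;
}
```

In C++ a call such as chooseNotation(1, x, d) fails to compile because an int does not convert implicitly to an enum, which is the "free" check mentioned above. It is not perfect, since an explicit cast still gets through.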

    // Extract the integer part of the number and print it, watching how many digits we have
    uint32_t int_part = number;
    uint8_t tmp = prn_cnt;
    prn_cnt += printNumber(int_part, DEC, NO_LEADING_ZERO);

    // If we would print too many digits, trim them; this also saves time on the decimal half.
    uint8_t digits_available = 7 - ( prn_cnt - tmp );
    if ( digits > digits_available ) digits = digits_available;

    if (digits > 0) {
      prn_cnt += write('.');
      double remainder = number - int_part;
      // make an unsigned long of the decimal part, of a fixed length, i.e. with leading zeros!
      uint32_t rem = remainder * remMult[digits - 1];
      prn_cnt += printNumber(rem, DEC, digits);
    }

The above is the partial code that now handles printing of the float; the faster divmod10_asm is handled only once, in the printNumber routine.
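To try the digit budgeting off the board, here is a rough desktop sketch of the same idea. Assumptions: remMult is taken to be a table of powers of ten (its contents aren't shown in this post), snprintf stands in for printNumber and write, the input is non-negative and below 8388608, and the name formatFixed is made up.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Assumed contents of remMult: powers of ten, indexed by digits - 1.
static const uint32_t remMult[] = {10UL, 100UL, 1000UL, 10000UL, 100000UL, 1000000UL};

// Formats a non-negative value below 8388608 using at most seven digits in total.
std::string formatFixed(float number, uint8_t digits) {
    char buf[16];
    uint32_t int_part = static_cast<uint32_t>(number);
    int printed = std::snprintf(buf, sizeof(buf), "%lu",
                                static_cast<unsigned long>(int_part));
    std::string out(buf);

    // Clamp the fractional digits so the total never exceeds seven.
    uint8_t digits_available = static_cast<uint8_t>(7 - printed);
    if (digits > digits_available) digits = digits_available;

    if (digits > 0) {
        out += '.';
        float remainder = number - static_cast<float>(int_part);
        // Scale the fraction up to an integer of exactly `digits` digits.
        uint32_t rem = static_cast<uint32_t>(remainder * remMult[digits - 1]);
        // Zero-pad so that 0.0625 comes out as "0625" rather than "625".
        std::snprintf(buf, sizeof(buf), "%0*lu", static_cast<int>(digits),
                      static_cast<unsigned long>(rem));
        out += buf;
    }
    return out;
}
```

For example, formatFixed(10.0625f, 4) gives "10.0625", while formatFixed(12345.0625f, 4) is cut to "12345.06" because five of the seven digits are already used by the integer part. Like the fragment above, the sketch truncates rather than rounds.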

    size_t Print::printNumber(uint32_t num, uint8_t base, uint8_t leading_zeros) {
      char buf[33];
      char *str = &buf[sizeof(buf) - 1];
      *str = '\0';
      uint8_t mod, tmp;
      int8_t extra_digits = leading_zeros;

      do {
    #ifdef USE_STIMMER_OPTIMIZATION
        if ( base == DEC )
        {
          divmod10_asm32(num, mod, tmp);
          *--str = mod + '0';
          extra_digits -= 1;
        }
        else
    #endif
        {
          *--str = '0' + num % base;
          if ( *str > '9' ) *str += 7;
          num /= base;
          extra_digits -= 1;
        }
      } while (num);

      for ( ; extra_digits > 0; extra_digits-- ) *--str = '0';
      return write(str);
    }
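For checking the leading-zero behaviour without a board, a portable sketch of the generic branch follows. It is not the library routine: it returns a std::string instead of calling write(), leaves out the divmod10_asm32 path, and adds a bounds check on the padding loop. The name printNumberSketch is made up.

```cpp
#include <cstdint>
#include <string>

#define NO_LEADING_ZERO 0

// Portable stand-in for printNumber: builds the digits right to left,
// then pads on the left until `leading_zeros` digits have been produced.
std::string printNumberSketch(uint32_t num, uint8_t base, uint8_t leading_zeros) {
    char buf[33];                          // 32 binary digits plus the terminator
    char *str = &buf[sizeof(buf) - 1];
    *str = '\0';
    int8_t extra_digits = static_cast<int8_t>(leading_zeros);

    do {
        char c = static_cast<char>('0' + num % base);
        if (c > '9') c += 7;               // jump from ':' to 'A' for bases above 10
        *--str = c;
        num /= base;
        extra_digits -= 1;
    } while (num);

    // Pad with zeros, never writing before the start of the buffer.
    for (; extra_digits > 0 && str > buf; extra_digits--) *--str = '0';

    return std::string(str);
}
```

So printNumberSketch(625, 10, 4) yields "0625", and printNumberSketch(255, 16, NO_LEADING_ZERO) yields "FF".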

I'm happy with not having optimised versions of print for HEX, OCT and BIN, and this shrinks down the added code nicely.

So it's slightly slower than the version you've posted, but with a code shrink. Numbers printed before the decimal point are always right, and rounding is cut off after a total of seven digits have been printed (ignoring sign, exponent etc.).

Note: previously I've changed a few vars from int down to uint8_t, to allow the compiler to use a single register; base is one example.

Also, you need to use smaller vars where possible, i.e. using int instead of (u)int8_t means lots of extra code for checking and using them (e.g. exponent++).

That's another example: exponent++ and exponent--, quite often on gcc 4.3.2, will extend (u)int8_t vars to 16 bits and thus waste time / CPU cycles. exponent -= 1; is faster and smaller because the compiler produces code for a single register rather than a pair.

Oops, a few edits for spelling and clearer reading, and here are my times on an UNO:
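A small sketch of the language rule that sits behind this, for anyone wondering where the 16 bits come from: in C and C++ a uint8_t operand is promoted to int before any arithmetic, and int is 16 bits on the AVR. Whether a given compiler version then narrows the result back to a single register is up to its optimiser, so the snippet only shows the promotion itself, not what gcc 4.3.2 generates. The function names are made up.

```cpp
#include <cstddef>
#include <cstdint>

// Size in bytes of the type that `exponent + 1` is actually computed in.
std::size_t promotedSize() {
    uint8_t exponent = 0;
    return sizeof(exponent + 1);   // the operand is promoted to int before the add
}

// The form suggested above; the result is converted back to 8 bits on assignment.
uint8_t decrement(uint8_t exponent) {
    exponent -= 1;
    return exponent;
}
```

On any platform promotedSize() equals sizeof(int), which is 2 on the UNO. Comparing the avr-objdump listings for the ++ and -= 1 forms is the way to see what a particular compiler does with it.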

    10737.41
    1.0182
    107.37
    Time=448
    per char incl .\r\n : 17.23 <----- corrected in the code for only printing 26 not 28 chars.
    done