How to safely and reasonably convert a float or double to string or char array?

alirezasafdari:
Hi

I have been using dtostrf() in a relatively big project and some unusual things were happening. Long story short, I did not take "The caller is responsible for providing sufficient storage in s" as seriously as I should have. I thought that if I did not provide enough width, it would automatically know not to write into the rest of the memory.

Someone in another forum suggested using a 100-byte array to be safe, but I do not think that is a reasonable solution on an AVR/Arduino with limited memory.

So sad that the Arduino IDE developers refuse to let users do something like this:

int main (void)
{
    init();
    Serial.begin (115200);

    STDIO.open (Serial);

    double a = 123.45678;
    double b = -0.0045;
    double c = 1.2345678;

    fprintf (stdout, "The variable 'a' has a value of %7.3f\n", a);
    fprintf (stdout, "The variable 'b' has a value of %7.3f\n", b);
    fprintf (stdout, "The variable 'c' has a value of %7.3f\n", c);

    while (1);
}

Output from this code:

The variable 'a' has a value of 123.457
The variable 'b' has a value of  -0.004
The variable 'c' has a value of   1.235

See? No "dtostrf" baloney.

krupski:
So sad that the Arduino IDE developers refuse to let users do something like this:

Is it really the IDE that stops you? I thought it was GCC that was the issue there. And they don't refuse to let you. You did it. It's open source, you can do anything you want with it.

Delta_G:
Is it really the IDE that stops you? I thought it was GCC that was the issue there. And they don't refuse to let you. You did it. It's open source, you can do anything you want with it.

GCC does everything just fine. All the IDE does is build GCC command strings and pass them to the compiler. Any limitations in programming an Arduino are due to the IDE, NOT GCC.

To use ordinary printf and floating point, all that's necessary is to "hack" the IDE to link libprintf_flt.a into the compiled binary instead of libprintf_min.a.

Or, better yet, add a checkbox option in Preferences to choose which one to link (floating point when you need it, the non-floating point one when you don't - saves a few bytes of flash).

The "sketch" I show above was compiled by a plain old Arduino 1.0.5 IDE with a few minor differences:

(1) I turned on the floating point option checkbox (custom option in my IDE).
(2) I connected the serial port to the standard IO streams to allow using fprintf (the STDIO.open() call).
(3) I used "int main (void)" instead of that "setup/loop" nonsense (although it would work the same with setup/loop).

See? I didn't need "dtostrf". I didn't need to worry about buffer overflows. I didn't need a dozen "Serial.print()" calls just to print a few lines, I didn't need to play games checking string lengths and inserting padding to get the numbers to line up right.

@westfw That could be a solution, but honestly, scientific notation may not be the best way to present data: it is harder to read and understand when the numbers are small most of the time. So I am still going to work on a solution that performs this safely. Please read the rest of my post after the replies to each individual, because I have a question that you can probably answer best.

@krupski I am not really familiar with IDE manipulations and most of the things you mentioned in that regard. I think the method you proposed has a lot of overhead in run time. A "safe, civilized, reasonable" dtostrf is really easy to do, and consider also that we may not always need to send things to the serial port (although I know sprintf could be used if some tweaks are made to the IDE).

Now back to question:

  1. So is the conclusion that the description of dtostrf is wrong, or that dtostrf is implemented wrong? If so, could those of you with a higher star rating and reputation voice this so that it can get fixed? Or, if I should report it myself, please let me know the best way to do it.

  2. I was checking the implementation of dtostrf from this (thanks to westfw) and noticed an interesting function that does most of the job and is probably written in the most efficient way. That function (or macro-looking function) is __ftoa_engine.
    Good documentation can be found here. Before finding this document I noticed that __ftoa_engine cannot be used in the Arduino IDE, and I could not even include ftoa_engine.h. How can I get access to this file in Arduino? Should it not be part of the IDE?

  3. The reference above for avr-libc 1.8.0 provides a dtostrf implementation that follows exactly the same documentation as the one used in Arduino, but it has been implemented correctly, which means the user can set the precision: "prec determines the number of digits after the decimal sign". What do you think about this implementation?

alirezasafdari:
@krupski I am not really familiar with IDE manipulations and most of the things you mentioned in that regard. I think the method you proposed has a lot of overhead in run time.

I'm confused. You are "not really familiar" with how all this works, yet you assert that using the compiler properly creates a "lot of overhead in run time".

Now, I know that a lot of people feel that enabling floating point support uses a lot of extra flash memory. And, indeed it does use more flash, but not THAT much. I have yet to come anywhere close to filling all the flash.

Another thing which most people don't realize is that "floating point support" entails TWO DIFFERENT pieces of code that are linked into the program at compile time.

They are:

libprintf_flt.a
libscanf_flt.a

The first one supports all of the printf related functions. The second one supports scanf, sscanf, etc... (functions that are rarely used).

Both of these utilize extra flash (and a bit of ram as well) space. The second one (scanf) uses the bulk of resources and is rarely if ever used.

The printf one, however, uses very little memory and therefore there is NO REASON to not use it. You can freely enable one, the other or both - doesn't matter.

If you want to try it without hacking anything, do this:
Go to your "arduino_dir/hardware/tools/avr/avr/lib/" directory.

Rename the file "libprintf_min.a" to "libprintf_min.a.backup".
Copy the file "libprintf_flt.a" to "libprintf_min.a".

Now, write a small sketch like this:

void setup (void)
{
    char buffer [64];
    double value = 123.45678;
    Serial.begin (9600); // or whatever baud rate you want
    sprintf (buffer, "You should see a number here --> %7.3f\r\n", value);
    Serial.print (buffer);
}

void loop (void)
{
    /* nothing */
}

Do you see what we did here?

We replaced the "minimum" printf code with the "floating point" printf code. Since you won't have printf enabled, you can test it by using sprintf to a buffer, then printing the buffer.

If it works, you should see this: "You should see a number here --> 123.457"
If not, you will get this: "You should see a number here -->       ?"

To un-do the "hack", just rename those two files above back to what they were.

In fact, FIRST try the sketch above before doing the mod and see how much ram and flash is used, then do the mod and compare resource usage. You will see that it's a trivial amount.

Good luck

Sorry, krupski. I just stated what I had read and did not test it myself. I will give it a shot now. Just one question: do the new files use the heap? I am going to test it anyway, but I just wanted to know.

It is trivial to fix or work around the limitation of dtostrf.

  1. Use snprintf() with floating point extensions enabled.

I think that is "far from trivial" by Arduino standards, unfortunately.

is it concluded that the description for dtostrf is wrong or dtostrf is implemented wrong?

I think it matches the (avr-libc documentation) description. "width" was documented as "minimum width" and precision is "number of places after decimal." There does not seem to be a "maximum width" parameter, so numbers like "99e20" are a problem.
The floating point format (on AVR) is limited to values <10^39, so a number will never take more than sign + 39 digits + '.' + precision bytes, I think...

It's worth remembering why printf() and floatingPoint printf() are not the defaults. a printf() based implementation of "hello world" is about 1k bigger than the same with Serial.print(), and using the floating point version of printf() expands the difference to about 4kbytes, WHETHER OR NOT YOU USE THE FLOATING POINT FEATURES. Those were big numbers back when a "large" AVR only had 4k to 8k of code space.
(OTOH, the difference in size between using the floating point printf vs using the Arduino Serial.print(float,prec) is only a couple hundred bytes.)
I'm a little disappointed that the Arduino team keeps extending Serial.print() instead of embracing industry standards.
OTOH, I'm also a bit disappointed that dtostrf() isn't just a wrapper for __ftoa_engine() or something; it's not clear to me how they're different. And this (in dtoa_prf.c) seems a bit worrisome:

   unsigned char buf[9];
       :
    ndigs = prec < 60 ? prec + 1 : 60;
    exp = __ftoa_engine (val, (char *)buf, 7, ndigs);

Looks to me like they just told the engine that their 9byte buffer had room for 60 digits...
And ... It looks like the ftoa_engine output is ... really weird.

One solution is to check every float before calling dtostrf, but as I have mentioned before, that would take a lot of time, since you have to make sure each number is within the positive and negative boundary. That is two float comparisons every time dtostrf is called.

Actually, floating point comparisons are very cheap. MUCH cheaper than the first division that WILL happen if you actually do the conversion. (in fact, most FP formats are cleverly designed so that a floating point comparison can happen nearly as quickly as a fixed point comparison (even without FP hardware.) Except for NaN and other "features." That's why you see things like "excess128" instead of two's complement used for the exponent.)
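
To make that concrete, here is a small sketch in portable C++. It assumes the 32-bit IEEE 754 layout that avr-gcc uses for float, and the names floatBits and inRange are made up for this example. For non-negative values the biased exponent sits above the mantissa, so the raw bit patterns sort in the same order as the numbers themselves, which is why a comparison costs about as much as an integer compare.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw 32-bit pattern of a float (assumes IEEE 754 binary32).
std::uint32_t floatBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);   // well-defined way to reinterpret the bytes
    return bits;
}

// Two-comparison guard to run before a conversion such as dtostrf().
// NaN fails both comparisons, so it is rejected as well.
bool inRange(float v, float lo, float hi)
{
    return v >= lo && v <= hi;
}
```

A guard like inRange(value, 0.0f, 9.9f) in front of dtostrf() is therefore cheap next to the conversion itself.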

There is a limit on how big your float can be. The documentation says it's -3.4028235E+38. I can confirm that if you try to print out a number bigger than that (like -3.4028236E+38), it turns it into "INF" or "-INF".

So the solution to your problem is to supply a buffer that is 43 characters or larger. Because yeah, if you only supply 42, it could crash.
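
A rough way to check that figure is to add up the worst case for the [-]d.ddd format. The helper below is hypothetical (worstCaseSize is not an avr-libc function) and assumes dtostrf() emits exactly prec fractional digits and pads to at least |width| characters, as the documentation describes.

```cpp
#include <cstddef>

// Hypothetical helper: upper bound on the bytes a "[-]d.ddd" conversion of a
// finite 32-bit float can need, including the terminating NUL.
// Assumes exactly `prec` fractional digits and padding up to |width|.
std::size_t worstCaseSize(int width, unsigned prec)
{
    std::size_t chars = 1 + 39;          // sign + 39 integer digits (|value| < 1e39)
    if (prec > 0) {
        chars += 1 + prec;               // '.' plus the fractional digits
    }
    std::size_t padded = static_cast<std::size_t>(width < 0 ? -width : width);
    if (padded > chars) {
        chars = padded;                  // the field width wins if it is larger
    }
    return chars + 1;                    // terminating NUL
}
```

With one digit after the decimal point this gives 43 bytes; every extra digit of precision, or a wider field, adds to it.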

alirezasafdari:
Sorry, krupski. I just stated what I had read and did not test it myself. I will give it a shot now. Just one question: do the new files use the heap? I am going to test it anyway, but I just wanted to know.

They are not "new" files, you are simply tricking GCC into linking in the floating point code when it's actually trying to link in the "minimal" (non-floating point) code.

As far as "heap", I don't really know or care how GCC allocates memory. I only care that I have ENOUGH.

That's why I suggested to build the test program first, then see how much flash and ram it uses, then do the file rename trick to use the floating point code and build the exact same test program again and compare flash and ram usage.

If you like how it works and you feel the little extra memory usage is OK, then stick with it. If not, simply rename the files back to original and all is back to the way it was before.

westfw:
It's worth remembering why printf() and floatingPoint printf() are not the defaults. a printf() based implementation of "hello world" is about 1k bigger than the same with Serial.print(), and using the floating point version of printf() expands the difference to about 4kbytes, WHETHER OR NOT YOU USE THE FLOATING POINT FEATURES. Those were big numbers back when a "large" AVR only had 4k to 8k of code space.

Did you see my post a few places up?

There are TWO different floating point support files... one for printf and related, another for scanf and related.

There is no need to link in BOTH of them. You can use one, the other or both. And, since scanf and related are rarely used, one can link in ONLY the printf floating point code and save quite a bit of space.

That's why I have the option to choose one, the other or both in my Preferences.

prefs.jpg

@krupski

I have tried your trick of renaming the files and it did not work. I think it was simple enough that I did not make a mistake, but again I could be wrong. (I am using the latest Arduino.)

However, since I did a small test, I decided to share the results (not sure if they are meaningful, but the work is done).

For Dtostrf:

#include "MemoryFree.h"

void setup() 
{
 // Sample free RAM before anything else is set up
 int availableMemory;
 availableMemory = freeMemory();

 Serial.begin(115200);
 char buffer[60] = {};   // dtostrf() expects a char buffer
 float inputNumber = 1357.125;
 uint32_t start, end;

 Serial.println(availableMemory);

 // Time only the conversion itself
 start = micros();
 dtostrf(inputNumber, 0, 3, buffer);
 end = micros();

 Serial.println(end - start);

 // Dump the whole buffer, including the unused zero bytes
 Serial.write(buffer, 60);

}
// the loop function runs over and over again forever
void loop() 
{
 // nothing to do
}

Result:

Using dtostrf:
Flash space = 3790
Global variable = 190
Free memory at the beginning = 7937
Micros taken = 120
Output is: 1357.125 [00] X 52

for sprintf:

#include "MemoryFree.h"

void setup() 
{
 // Sample free RAM before anything else is set up
 int availableMemory;
 availableMemory = freeMemory();

 Serial.begin(115200);
 char buffer[60] = {};   // sprintf() expects a char buffer
 float inputNumber = 1357.125;
 uint32_t start, end;

 Serial.println(availableMemory);

 // Time only the conversion itself
 start = micros();
 sprintf (buffer, "%7.3f\r", inputNumber);
 end = micros();

 Serial.println(end - start);

 // Dump the whole buffer, including the unused zero bytes
 Serial.write(buffer, 60);

}

// the loop function runs over and over again forever
void loop() 
{
 // nothing to do
}

results:

Without modifying the IDE:
Flash space = 3766
Global variable = 196
Free memory at the beginning = 7931
Micros taken = 80
Output is: [20] [20] [20] [20] [20] [20][3F = '?'][0D = '\r'] [00] X 52

With modifying the IDE:
Flash space = 3766
Global variable = 196
Free memory at the beginning = 7931
Micros taken = 80
Output is: [20] [20] [20] [20] [20] [20][3F = '?'][0D = '\r'] [00] X 52

@westfw
I am not sure about the avr-libc documentation, but the other dtostrf implementations I found let you set the number of digits after the sign. This technically solves all the problems. I am not sure what they do if the number cannot fit in the given number of digits, but they definitely have a way to know.

The __ftoa_engine() precision is different from the one in the current Arduino dtostrf. The precision gives the maximum number of digits coming out. That is why they put a 7 there, to indicate they want only 7 digits to be printed into a 9-byte buffer. Not sure about the return value, though. The implementation is odd, with zero comments, which makes it tougher.

krupski:
Rename the file "libprintf_min.a" to "libprintf_min.a.backup".
Copy the file "libprintf_flt.a" to "libprintf_min.a".

libprintf_min.a is not the default printf, but an even more stripped down version.
The default version is in libc.a.

oqibidipo:
libprintf_min.a is not the default printf, but an even more stripped down version.
The default version is in libc.a.

Here is the exact piece of code that I have in "Compiler.java" to do the job:

///////////////// this provides the floating point option /////////////////

if (Preferences.getBoolean ("build.printf_floating_point")) {
    baseCommandLinker.add ("-Wl,-u,vfprintf,-lprintf_flt");
}

if (Preferences.getBoolean ("build.scanf_floating_point")) {
    baseCommandLinker.add ("-Wl,-u,vfscanf,-lscanf_flt");
}

///////////////// this provides the floating point option /////////////////

...and this code provides the check boxes in "Preferences.java" to select or deselect either one:

// [ ] printf Floating point support
useFFloatingPointBox = new JCheckBox (("  Enable (f)printf floating point"));
useFFloatingPointBox.setToolTipText (getToolTip ("Enable floating point support for printf"));
pane.add (useFFloatingPointBox);
d = useFFloatingPointBox.getPreferredSize();
useFFloatingPointBox.setBounds (left, top, d.width + 10, d.height);
right = Math.max (right, left + d.width);
top += d.height + GUI_BETWEEN;

// [ ] scanf Floating point support
useSFloatingPointBox = new JCheckBox (("  Enable (f)scanf floating point"));
useSFloatingPointBox.setToolTipText (getToolTip ("Enable floating point support for scanf"));
pane.add (useSFloatingPointBox);
d = useSFloatingPointBox.getPreferredSize();
useSFloatingPointBox.setBounds (left, top, d.width + 10, d.height);
right = Math.max (right, left + d.width);
top += d.height + GUI_BETWEEN;

And finally this code reads and writes the preferences in "Preferences.txt":

// in "applyFrame()"
setBoolean ("build.printf_floating_point", useFFloatingPointBox.isSelected());
setBoolean ("build.scanf_floating_point", useSFloatingPointBox.isSelected());

// in showFrame()
useFFloatingPointBox.setSelected (getBoolean ("build.printf_floating_point"));
useSFloatingPointBox.setSelected (getBoolean ("build.scanf_floating_point"));

I know why the Arduino IDE developers haven't included such a useful option... it's SO complex!

@Jimmus
You actually need more than 43 if precision is set to more than zero! You would need (precision + 43). Precision is an unsigned 8-bit value, which can go all the way up to 255. Also take note that copying from that array to the actual array would take some time too.

@krupski first comment (Nov 11, 2017, 07:41 am )
Well, the problem with the heap is that it cannot be measured, but I think it is unlikely that they use the heap.

@krupski (Nov 11, 2017, 08:04 am)
I tried but failed, as you can see in my other comment.

@krupski (Today at 12:06 am)
Have you written a guide on where to place these parts of your code, so I can give it a shot?

Now on my progress in making a safe dtostrf:

  1. I was not sure what the function should look like, so I started looking for functions developed by others. I soon noticed that dtostrf is rarely used and that the alternative name usually used is ftoa. So I looked up a few ftoa implementations and all of them were safe. They may not produce results as clean as our dtostrf, but they were safe.

  2. I asked this question on AVR Freaks and was told to just implement it myself. So I started, but problems came up one after another. The initial step was finding the files I need.
    So here is a list:
    ftoa_engine.h
    [avr-libc] Contents of /trunk/avr-libc/common/ftoa_engine.h

ftoa_engine.S
http://svn.savannah.gnu.org/viewvc/avr-libc/trunk/avr-libc/libc/stdlib/ftoa_engine.S?revision=2191&view=markup

macros.inc
http://svn.savannah.gnu.org/viewvc/avr-libc/trunk/avr-libc/common/macros.inc?revision=2542&view=markup

sectionname.h
http://svn.savannah.gnu.org/viewvc/avr-libc/trunk/avr-libc/common/sectionname.h?revision=2166&view=markup

#include <avr/io.h>
I could not find this one though. I thought this is supposed to be the easiest one.

So I started a new project and the code did not compile, but not because something was missing. The IDE could not understand the .S (assembly) file and the .inc file. Then I remembered that the Arduino IDE and assembly do not go well together.

I also thought of just putting the assembly part in a function using "asm volatile( // code );". After finishing, I noticed the code does not compile, mainly because it cannot connect the function's arguments to the variables in the assembly code (even though I used the same names), which kind of makes sense.
So, is there anyone here who knows how to connect the function arguments to assembly code? (We may also need to relate the #defines to the assembly code; I am not sure if this is done currently.)
Or does anyone know how we can trick the Arduino IDE into using assembly code?
Or does anyone know a way to use the "__ftoa_engine" function in the Arduino IDE? The Arduino dtostrf uses it, but users cannot use it in the IDE environment.

My assembly function was too long, so I attached it.

Direction from here:
While I wait for someone who knows the solution to number 2, I will start making an equivalent of __ftoa_engine in C++ for the worst-case scenario. Then I will make a dtostrf that is safe. Any suggestions are appreciated.

floatEngineTest.ino (11.2 KB)

alirezasafdari:
Any suggestions are appreciated.

I'm watching this thread, my interest is just academic.

I have to say though, I'm trying to understand why it is you would need this. I admit that I really couldn't understand from your explanation above.

Just to make things clearer after BulldogLowell said that my explanation was not clear, I will try one more time to express what I meant. I think most of the uncertainty is about part 2, so I will try to explain it in more detail.

In the AVR Freaks post (here) they asked me to implement a dtostrf myself. I started doing it, making full use of the available resources. In the current dtostrf most of the work is done by a function named "__ftoa_engine". This function is related to all the files in my list in the previous post. Unlike most libraries we see for Arduino, this function has its declaration in a .h file and its implementation in a .S file. A .S file is like a hybrid assembly + C/C++ file: there are parts in C/C++, which are defines and things like that, and then the implementation of the function is in assembly. The other two files also contain some things you would not find in a typical C++ file, but I could live with that if everything worked out of the box (copy-pasting code).
When I realized the files were an issue, I thought of making one function with the assembly code in it and trying to make it independent. So I made the file attached to my previous post, which holds the whole ~500 lines of assembly using "asm volatile( // code );". I read that this is the only way to use assembly in the Arduino IDE. So I put my whole code inside the construct shown above (a lot of manual changes were needed to make the format compatible, since this feature accepts one line of assembly per pair of double quotes). After finishing all the editing, I realized that the code does not compile, because the assembly uses variable names that are the arguments passed to the function, and these cannot be connected to the function's arguments. If you open my attached file you can see the following names in the assembly code that do not get connected to the function's arguments: "val, buf, prec, maxdgs". Therefore I need someone to answer any of those 3 questions if I want to move forward with this approach.

I hope this clarified the process and the progress and everything else :smiley:

alirezasafdari:
I hope this clarified the process and the progress and everything else :smiley:

No, I understand your approach, I just don't understand (except for purely academic reasons) why you would need to do this.

@BulldogLowell I have a lot of free time on my hands, so let me explain one more time why there is a problem, because I think you are not the only one who does not understand it, and also to be 100% sure that I am not the one who is wrong.

We call the current dtostrf. This function does not know how big the buffer we want it to write into is. The buffer could be larger than, equal to or smaller than what the number needs, and dtostrf has no clue which. If it is larger or equal, we are safe. If it is smaller, that is bad news: the current dtostrf writes the number into memory without knowing that the buffer is too small. So the data in memory gets corrupted, since dtostrf does not stop writing when the small buffer has been used up.

One might say that we know roughly how big our float is, so we can make sure we have enough room. That is correct, but the problem with a float is that it can become very large if you miss a few special cases. Let's say you expect your number to be between 0 and 10. Also let's assume you have set the precision (the way it works in the current dtostrf) to 1. So the minimum will be "0.0" and the maximum will be "9.9". You want to be as efficient as you can, so you allocate an array of 3 bytes. (I am ignoring the terminating null here to keep the example short; dtostrf does write one, so a real buffer needs one more byte.) You also set the width to 3.

if the number is 0 you will get [0][.][0] (we are all good)
if the number is 5.3 you will get [5][.][3] (we are all good)
if the number is 9.9 you will get [9][.][9] (we are all good)

Now suppose your variable, for some reason you did not predict, ends up outside that range. Let's see what happens:

if the number is -0.1 you will get [-][0][.][1] (we changed some other data byte to '1')
if the number is 10.0 you will get [1][0][.][0] (we changed some other data byte to '0')

or in some very ugly case:

if the number is 10000.0 you will get [1][0][0][0][0][.][0] (we changed some other data bytes to '0' and one byte to '.')

So this is the problem. One solution is to use a huge buffer, but a float variable can hold a very large value too, so you will not be able to find a reasonable size. Another way is to check that the number is within a certain range. However, you have to do this every time, and on top of that you have to take the sign and the precision (precision as it is in the current dtostrf) into account. So, as you can see, none of these solutions is neat.
If we had a dtostrf that knew the size of the buffer, everything would be solved. dtostrf is technically aware of everything it is doing. It knows exactly how many bytes are being used and so on, so why not just tell dtostrf how many bytes it can use? Then we can sleep in peace at night, not worrying about a float ending up out of range and corrupting the whole memory.
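
One way to do such a check without any printf support is to count the characters before converting. The sketch below is only an illustration in portable C++: estimateLength is a made-up name, it is meant for finite values only, and near a rounding boundary it may report one character more than the conversion actually writes, which errs on the safe side. It has not been checked against the avr-libc implementation.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: estimates how many characters (without the
// terminating NUL) the "[-]d.ddd" form of a finite float needs.
// Uses only comparisons, additions and multiplications/divisions by ten.
std::size_t estimateLength(float val, unsigned char prec)
{
    std::size_t n = 0;
    if (std::signbit(val)) {        // leading '-' (also catches -0.0)
        n += 1;
        val = -val;
    }

    // Add half a unit of the last printed digit, so that for example
    // 9.96 with prec = 1 is counted as "10.0" rather than "9.9".
    float half = 0.5f;
    for (unsigned char i = 0; i < prec; ++i) {
        half /= 10.0f;
    }
    val += half;

    // Count integer digits. A finite 32-bit float never needs more than 39,
    // and the cap also keeps the loop finite for infinity.
    std::size_t digits = 1;
    for (float limit = 10.0f; digits < 39 && val >= limit; limit *= 10.0f) {
        digits += 1;
    }
    n += digits;

    if (prec > 0) {
        n += 1u + prec;             // '.' plus the fractional digits
    }
    return n;
}
```

A caller would compare estimateLength(value, prec) + 1 against the buffer size and skip the conversion when it does not fit.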

One more look at dtostrf

char* dtostrf ( double __val, signed char __width, unsigned char __prec, char * __s)
Conversion is done in the format "[-]d.ddd". The minimum field width of the output string (including the possible '.' and the possible sign for negative values) is given in width, and prec determines the number of digits after the decimal sign. width is signed value, negative for left adjustment.
The dtostrf() function returns the pointer to the converted string s.

As implemented in Arduino:
__prec: determines the number of digits after the decimal point
What would make it safe, although not as practical:
__prec: determines the number of digits after the sign

My suggestion for dtostrf:

bool dtostrf (double __val, signed char __width, unsigned char __prec, char * __s, unsigned char __maxSize)
__val: same as current dtostrf
__width: same as current arduino dtostrf
__prec: same as current arduino dtostrf
__s: same as current arduino dtostrf
__maxSize: The number of bytes allocated for this number
Returns true if the conversion was successful and __maxSize was large enough; returns false if the size was not enough.

Correction to my suggested dtostrf

char* dtostrf (double __val, signed char __width, unsigned char __prec, char * __s, unsigned char __maxSize, bool * __result)
__val: same as current dtostrf
__width: same as current arduino dtostrf
__prec: same as current arduino dtostrf
__s: same as current arduino dtostrf
__maxSize: The number of bytes allocated for this number
__result: set to true if the conversion was successful and __maxSize was large enough, and to false if the size was not enough.

Returns the pointer to the converted string s. (This is used for sprintf and keeps it compatible with what we have.)
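
As a minimal sketch of that interface, assuming snprintf() with floating-point support is available (on AVR that means linking the floating-point printf library discussed earlier in the thread): the name dtostrfBounded is invented here, and the status goes through a pointer because a plain bool parameter cannot carry a result back to the caller. It is not the avr-libc implementation, only a desktop-testable stand-in with the same calling shape.

```cpp
#include <cstdio>

// Sketch of the proposed interface; the name dtostrfBounded is hypothetical.
// Same parameters as dtostrf(), plus the buffer size and an optional status.
// Returns s in all cases so it can still be used inline like dtostrf().
char *dtostrfBounded(double val, signed char width, unsigned char prec,
                     char *s, unsigned char maxSize, bool *ok)
{
    bool fits = false;
    if (s != nullptr && maxSize > 0) {
        // snprintf never writes more than maxSize bytes and returns the
        // length the full output needs (without the NUL).
        // A negative width means left adjustment, as with dtostrf().
        int needed = std::snprintf(s, maxSize, "%*.*f",
                                   static_cast<int>(width),
                                   static_cast<int>(prec), val);
        fits = needed >= 0 && needed < static_cast<int>(maxSize);
        if (!fits) {
            s[0] = '\0';  // hand back an empty string, not a cut-off number
        }
    }
    if (ok != nullptr) {
        *ok = fits;
    }
    return s;
}
```

With a 4-byte buffer, width 3 and precision 1, a value of 5.3 fits, while -0.1 and 10000.0 are rejected and leave an empty string instead of overwriting neighbouring memory.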

alirezasafdari:
Correction to my suggested dtostrf
Returns the pointer to the converted string s. (This is used for sprintf and keeps it compatible with what we have.)

You are making SUCH a big deal out of a VERY trivial thing.

And, any version of "dtostrf" does not need to be used with "sprintf". It already returns the formatted string (in a buffer of sufficient size that you have to supply - I hope you know).

As far as a "guide" being available to mod your IDE... all the information is in the post. I said which java source files are changed and what the changes are.

If I need to go into more detail than that (for example, how to re-compile the whole IDE) then I suggest not even trying.