
Topic: String/sprintf alternative specifically for Arduino.

pYro_65

May 15, 2013, 07:02 pm Last Edit: May 22, 2013, 03:42 pm by pYro_65 Reason: 1
G'day fellow Arduino people.

I have recently put together a class that inherits from the Arduino core 'Print' class. It implements a memory target that provides all the standard print functions.

I got encouraged by its possibilities and decided to flesh it out for sharing. It is now a nice alternative to 'Strings', as it allows easy concatenation of data without a malloc/free pair for every little operation. It can also replace basic usage of sprintf.

When used alongside libraries such as Serial, it adds little overhead, as the print functionality is likely already used and included.

All this said, I have not really tested it heavily, and it probably needs tweaks for certain scenarios.
Any ideas are welcome also.

Here is a list of functionality.
Quote
  • template< typename T > GString( const T *u_DataPtr );
       Constructor: pass in a valid pointer to any memory.
     
     
  • operator char*( void );
       Conversion operator ( GString to char* ), also allows accessing string elements directly using '[]' operators.
     
     
  • template< typename T > GString &operator +=( const T &t );
       Allows use of the '+=' operator on anything the Print class accepts.

     
  • void clear( void );            
       Resets the string to empty.
       
     
  • void clear( bool b_Empty );
       Clears any previous data if requested, then resets.
     
     
  • template< typename T > GString &concat( const T &t );
       Provides chainable print methods.
       
     
  • template< typename T > GString &concat( const T &t, const int i );
       Provides chainable print methods utilising the second parameter.
       
     
  • size_t count( void ) const;        
       Returns the number of characters printed into the string.
       
     
  • void end( void );            
       Adds a terminating null. Buffer will still increase after ending a string.
     
     
  • size_t find( const char &c_Character );  
       Find a character within the string. Its index is returned or -1.

     
  • pft printf( const char *format, ... );
       Prints formatted text to the buffer.

     
  • void repeat( const char &c_Character, unsigned char u_Count );
       Prints c_Character into the string repeating u_Count times.
             
     
  • void toLower( void );          
       Converts the string into its lower case equivalent.
       
     
  • void toUpper( void );          
       Converts the string into its upper case equivalent.

     
  • void translate( const char &c_StartLow, const char &c_StartHigh, const char &c_EndLow );
       Converts a range of characters to a different part of the ASCII table.


The library code is at the bottom, below is its usage.

Note: all examples produce output resembling the following:
Quote
This is a float test: 123.456
Analog pin 0 value: 439


Code: [Select]
//Buffer to store data.
char TestArr[ 128 ];

//Create string, passing in buffer pointer.
GString g_Test( TestArr );

//Use like Serial
g_Test.print( "This is a float test: " );
g_Test.println( 123.456, 3 );
g_Test.print( "Analog pin 0 value: " );
g_Test.print( analogRead( A0 ), DEC );  


The library also supports += operators:
Code: [Select]
GString g_Test( TestArr );

g_Test += "This is a float test: ";
g_Test += 123.456f;
g_Test += "\r\nAnalog pin 0 value: ";
g_Test += analogRead( A0 );


I have also added a function 'concat' for chained printing ( notice anonymous GString rather than variable ):
Code: [Select]
GString( TestArr )
  .concat( "This is a float test: " )
  .concat( 123.456, 3 )
  .concat( "\r\nAnalog pin 0 value: " )
  .concat( analogRead( A0 ), DEC );

And the GString class can be printed itself.
Code: [Select]
Serial.print( g_Test );

EDIT:
As the print functionality does not deal with nulls, you have to take care when using the original pointer.
Either clear the buffer before use, or make end() the last call on the GString before using the pointer.
The built-in conversions are safe: a null is added, but it will be overwritten if more print operations take place.
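
The pitfall is easy to reproduce off-target. The sketch below is plain desktop C++, not the GString source; `MemWriter` and `lengthSeenThroughPointer` are hypothetical names that only mimic printing into a caller-owned buffer that starts out dirty.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a memory printer: it appends bytes to a
// caller-owned buffer and never writes a terminating null on its own.
class MemWriter {
public:
    explicit MemWriter(char *buffer) : start_(buffer), cursor_(buffer) {}

    // Append a C string without its terminator.
    MemWriter &print(const char *text) {
        const std::size_t length = std::strlen(text);
        std::memcpy(cursor_, text, length);
        cursor_ += length;
        return *this;
    }

    // Terminate the string in place. The cursor does not move, so a later
    // print() overwrites the null again.
    void end() { *cursor_ = '\0'; }

    std::size_t count() const {
        return static_cast<std::size_t>(cursor_ - start_);
    }

private:
    char *start_;
    char *cursor_;
};

// Fill a buffer with junk (terminated at the very end so strlen stays
// defined), print "abc" into it, and report the C-string length that code
// holding only the raw pointer would see.
std::size_t lengthSeenThroughPointer(bool callEnd) {
    char buffer[16];
    std::memset(buffer, '#', sizeof buffer - 1);
    buffer[sizeof buffer - 1] = '\0';

    MemWriter writer(buffer);
    writer.print("abc");
    if (callEnd) writer.end();
    return std::strlen(buffer);
}
```

Without end() the raw pointer still sees the old junk after "abc"; with end() it sees exactly the three printed characters.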

The library:
Due to message length limits, the code is now an attachment to this post, an external link will be available soon.

Enjoy :)

EDIT: new code update, file located in this post: http://forum.arduino.cc/index.php?topic=166540.msg1247875#msg1247875.

Google Code download link: download GString.h

robtillaart

Thanks for sharing,

Can it also be used on an lcd screen?
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

pYro_65


Quote
Can it also be used on an lcd screen?


If you want to concatenate a string and then pass the finished result to the LCD, you can. A variable of type 'GString' can be cast to a 'char*', or you can use the buffer pointer originally passed to the class.

pYro_65

I forgot to mention, as it converts to a char*, you can also use array indexing with it:

Code: [Select]

GString g_Test( TestArr );

g_Test.print( "DEC: " ); //Wrong title.
g_Test.print( 0xABCD, HEX );

//Replace 'DEC' with 'HEX'
g_Test[ 0 ] = 'H';
g_Test[ 1 ] = 'E';
g_Test[ 2 ] = 'X';

Serial.print( g_Test );

robtillaart

Quote
If you want to concatenate a string and then pass the finished result to the LCD, you can. A variable of type 'GString' can be cast to a 'char*', or you can use the buffer pointer originally passed to the class.

That means I can also use it for concatenating HTML strings over a socket etc.

+1 in usefulness

pYro_65

#5
May 15, 2013, 08:46 pm Last Edit: May 21, 2013, 05:09 pm by pYro_65 Reason: 1
Cheers, I'm hoping there are lots of little uses for it. I might add some extra string based functions like right & left trim, etc.

I have updated the class to be safer now, if you convert the class to a char*, a null is added.

An 'end' function has been added which does the same thing: it adds a null. 'end' is mainly useful when you use the original pointer and the buffer contents may be dirty beforehand.

These additions make it safe to use a buffer without clearing the contents first.

robtillaart

Quote
I might add some extra string based functions like right & left trim, etc.

check the String class interface I would say ;)

- if you implement the .mid(begin, end)   then .right(n) = .mid(size-n, size);  .left(n) = .mid(0, n)

some weirdo string class methods from the past
- .fox()  => fills with  "The big brown fox jumps quick over the lazy dog";
- .alphabet(bool case) => fills with abcd etc
- .repeat(number, char) => fills with "========" or "----------------" depending on char.
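
The .mid/.left/.right relationship above can be sketched in plain desktop C++. This is only an illustration using std::string, not part of GString; `strMid`, `strLeft` and `strRight` are hypothetical names, and the half-open range [begin, end) is an assumption about what .mid(begin, end) would mean.

```cpp
#include <cstddef>
#include <string>

// Return the characters in the half-open range [begin, end).
// Out-of-range indices are clamped instead of throwing.
std::string strMid(const std::string &text, std::size_t begin, std::size_t end) {
    const std::size_t size = text.size();
    if (end > size) end = size;
    if (begin > end) begin = end;
    return text.substr(begin, end - begin);
}

// left(n) = mid(0, n)
std::string strLeft(const std::string &text, std::size_t n) {
    return strMid(text, 0, n);
}

// right(n) = mid(size - n, size), guarding against n > size.
std::string strRight(const std::string &text, std::size_t n) {
    const std::size_t size = text.size();
    return strMid(text, n > size ? 0 : size - n, size);
}
```

With only one real routine to maintain, the left and right helpers cost almost nothing extra.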

pYro_65


Quote
- if you implement the .mid(begin, end)   then .right(n) = .mid(size-n, size);  .left(n) = .mid(0, n)

some weirdo string class methods from the past
- .fox()  => fills with  "The big brown fox jumps quick over the lazy dog";
- .alphabet(bool case) => fills with abcd etc
- .repeat(number, char) => fills with "========" or "----------------" depending on char.


Some good starting points, thanks. I have implemented a few, but I'll post the changes later on once I have the set finished.

I have started to do some basic testing, which I was discussing elsewhere, so to keep everything together I'll post that information here.

For a simple concatenation of a string and integer,
sprintf had an overhead of 1758 bytes, the Print class' overhead was only 816 bytes.

Below is a sketch using the code posted here.
Tests 1 to 3 should compile to the same size; test 0 is sprintf.
What I didn't know about Arduino's sprintf is that floating point numbers are not supported, whereas the Print class handles floats fine.

Code: [Select]
#include <GENX.h>

//Modify this
#define TEST_NUMBER 0

void setup(){
 
  char buffer[ 64 ];
 
  Serial.begin( 115200 );
 
  #if TEST_NUMBER == 0
 
    sprintf( buffer, "sprintf integer copy test: %d", analogRead( 0 ) );
    Serial.print( buffer );
   
  #elif TEST_NUMBER == 1
 
    GString g( buffer );
    g += "sprintf integer copy test: ";
    g += analogRead( 0 );
    Serial.print( g );
 
  #elif TEST_NUMBER == 2
 
    GString g( buffer );
    g.print( "sprintf integer copy test: " );
    g.print( analogRead( 0 ) );
    Serial.print( g ); 
 
  #elif TEST_NUMBER == 3
 
    Serial.print( GString( buffer ).concat( "sprintf integer copy test: " ).concat( analogRead( 0 ) ) ); 
 
  #endif
}

void loop(){}
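
On the float point: fixed-decimal output does not need printf's float machinery at all. The function below is a desktop C++ sketch of the general round-then-peel-digits approach, similar in spirit to what a Print-style float routine does; `formatFixed` is a hypothetical name, not the Arduino core code, and it assumes the integer part fits in an unsigned long.

```cpp
#include <string>

// Format value with a fixed number of decimals using only integer
// conversions and basic floating point arithmetic.
std::string formatFixed(double value, unsigned digits) {
    std::string out;
    if (value < 0.0) {
        out += '-';
        value = -value;
    }

    // Round to the requested precision: add half of the last kept digit.
    double rounding = 0.5;
    for (unsigned i = 0; i < digits; ++i) rounding /= 10.0;
    value += rounding;

    // Integer part (assumed to fit in an unsigned long).
    const unsigned long whole = static_cast<unsigned long>(value);
    double remainder = value - static_cast<double>(whole);
    out += std::to_string(whole);

    // Fractional digits, peeled off one at a time.
    if (digits > 0) out += '.';
    for (unsigned i = 0; i < digits; ++i) {
        remainder *= 10.0;
        const unsigned digit = static_cast<unsigned>(remainder);
        out += static_cast<char>('0' + digit);
        remainder -= digit;
    }
    return out;
}
```

For example, formatFixed( 123.456, 3 ) yields the same text as the print( 123.456, 3 ) call in the first post.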

pYro_65

#8
May 16, 2013, 12:27 pm Last Edit: May 16, 2013, 12:49 pm by pYro_65 Reason: 1
I have updated my code and reply #1 at the top.

Some string functions have been added, these include: repeat, translate, toupper/lower, find.

The translate function is strange, apart from providing the toupper/lower functions it can switch numbers and letters:
Code: [Select]

char arr[ 12 ];
float x = 123.456f;
   
GString g( arr );
g.print( x, 3 );

//Contains '123.456'

g.translate( '0', '9', 'a' );

//Contains 'bcd.efg'
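
For anyone wondering how one routine can cover both cases, here is a desktop C++ sketch of the idea. It is an assumption about the behaviour based on the description above, not the library source, and it works on std::string for brevity.

```cpp
#include <string>

// Shift every character in [startLow, startHigh] so that startLow maps to
// endLow; characters outside the range are left alone.
std::string translate(std::string text, char startLow, char startHigh, char endLow) {
    for (char &c : text) {
        if (c >= startLow && c <= startHigh) {
            c = static_cast<char>(c - startLow + endLow);
        }
    }
    return text;
}

// The case conversions are just two fixed translations.
std::string toUpper(const std::string &text) { return translate(text, 'a', 'z', 'A'); }
std::string toLower(const std::string &text) { return translate(text, 'A', 'Z', 'a'); }
```

Here translate( "123.456", '0', '9', 'a' ) gives "bcd.efg", matching the example above, because '.' falls outside the '0' to '9' range and is untouched.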


EDITED

robtillaart

Quote
What I didn't know about Arduino's sprintf is floating point numbers are not supported, whereas the print class handles floats fine.

+1 again

I have an update for the print class here - http://forum.arduino.cc/index.php?topic=166041.0 - to handle Scientific notation for larger

you might check it with GString()

pYro_65

Yeah, I'll give it a test. As it's part of the Print library, it should fit in seamlessly.

Cheers.

obiwanjacobi

#11
May 19, 2013, 09:38 pm Last Edit: May 19, 2013, 09:45 pm by obiwanjacobi Reason: 1
How do you know that you are not writing past your array bounds?
From what I can see the GString class does not do any bounds checking.
I also do not get the use of template parameter T. Could you not have used void* and cast that to uint8_t? And why not use char? Each template method generates extra code that has little/no? use.

I also do not like the Print class but I can understand why you would want to stick with that. I rewrote the Print class to a template class (TextWriter). That class implements only the knowledge on how to convert a value to text, nothing else. The Print class does that also but my TextWriter does not take up any permanent RAM bytes (Print class has virtuals/v-table and an int member) at the cost of generating more code (being a template class).

Writing a string class, you would have to give considerable thought to how you want to manage the memory it takes to store those strings, especially with concatenation of strings. I would make a distinction between operations that work on the string itself (like toUpper) and functions that produce new strings. The latter type I would make static to drive the point home that you are creating a new string, which needs new memory. I would make the methods robust enough that you could also use one parameter as the result (copy into itself).
It would also be nice to encapsulate the PROGMEM strings.

Something like:

Code: [Select]
String(char* buffer, unsigned int capacity);
unsigned int getCapacity();
unsigned int getLength();
char GetCharAt(int index);
bool SetCharAt(int index, char value);
bool InsertCharAt(int index, char value);
operator const char*();

// base can be same as result - provided capacity is enough
static bool Concat(String& base, String& tail, String& result);
static bool Concat(String& base, String& mid, String& tail, String& result);

static String& Empty();
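
To make the capacity idea concrete, here is a small desktop C++ sketch of a writer that is told its capacity and truncates instead of overrunning. `BoundedWriter` and `boundedDemo` are hypothetical names for illustration only; this is neither GString nor a full version of the interface listed above.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical bounded writer: it knows the capacity of the caller-owned
// buffer and always keeps one byte free for the terminating null.
class BoundedWriter {
public:
    BoundedWriter(char *buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity), length_(0) {
        if (capacity_ > 0) buffer_[0] = '\0';
    }

    // Append as much of text as fits; returns the number of characters stored.
    std::size_t print(const char *text) {
        if (capacity_ == 0) return 0;
        const std::size_t room = capacity_ - 1 - length_;
        std::size_t wanted = std::strlen(text);
        if (wanted > room) wanted = room;
        std::memcpy(buffer_ + length_, text, wanted);
        length_ += wanted;
        buffer_[length_] = '\0';
        return wanted;
    }

    std::size_t getLength() const { return length_; }
    std::size_t getCapacity() const { return capacity_; }
    const char *c_str() const { return buffer_; }

private:
    char *buffer_;
    std::size_t capacity_;
    std::size_t length_;
};

// Print two pieces into an 8-byte buffer and return what survived.
std::string boundedDemo() {
    char buffer[8];
    BoundedWriter writer(buffer, sizeof buffer);
    writer.print("Analog");
    writer.print(" pin");  // only one more character fits
    return std::string(writer.c_str());
}
```

The price is one extra size member per object and a comparison per write, which is the trade-off being discussed in this thread.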


I would keep the class very lean (like you have done). I would perhaps make a different class to find things in strings (StringReader?). FindFirst, FindLast, IsDigit, SubString (could go both ways) and the likes.

Hope it helps.

pYro_65

#12
May 19, 2013, 10:08 pm Last Edit: May 19, 2013, 10:11 pm by pYro_65 Reason: 1
Quote
I also do not get the use of template parameter T. Could you not have used void* and cast that to uint8_t? And why not use char? Each template method generates extra code that has little/no? use.


There is a good reason for it.
You would need to pass semantic information and use it in switch / if block to determine the appropriate cast so the best overload can be selected, then all the print variants would need to be included, this becomes very costly ( and not to mention strict aliasing rules ).

Contrary to what you may first think, my templates generate no extra code due to being inline ( the trick is the constant references ). The two snippets below are equivalent, no function calls are generated for concat but rather straight through to the appropriate print method.

Code: [Select]
 char buffer[ 64 ];
 GString g( buffer );
 g.concat( 44 );
//----------------------------------
 char buffer[ 64 ];
 GString g( buffer );
 g.print( 44 );
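
A desktop C++ sketch of the forwarding idea, for readers who want to try it outside the Arduino IDE. `Printer` and `chainDemo` are hypothetical stand-ins, not the GString source; the point is only that the template picks the matching print() overload at compile time and is small enough for a compiler to inline.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a Print-derived class with a few print() overloads.
class Printer {
public:
    void print(const char *text) { out_ += text; }
    void print(char c) { out_ += c; }
    void print(int value) {
        char digits[16];
        std::snprintf(digits, sizeof digits, "%d", value);
        out_ += digits;
    }

    // One template covers every overload: overload resolution on print()
    // happens at compile time for the deduced T, so no run-time type switch
    // is needed. Returning *this makes the calls chainable.
    template <typename T>
    Printer &concat(const T &value) {
        print(value);
        return *this;
    }

    const std::string &str() const { return out_; }

private:
    std::string out_;
};

// Chain a string literal, a char and two ints through the same template.
std::string chainDemo() {
    Printer p;
    p.concat("pin ").concat('A').concat(0).concat(": ").concat(439);
    return p.str();
}
```

Whether the wrapper really vanishes depends on the compiler and optimisation level, so it is worth checking the generated size as done earlier in this thread.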


Quote
From what I can see the GString class does not do any bounds checking.


The onus for that is on the owner of the buffer, just like with normal string routines. GString is not quite a string; I originally wanted to call it MemoryPrinter or something similar.

The notion is that one can print to memory sequentially as if it were serial, SD, or an LCD, etc... Having the ability to access the data already printed is merely a bonus.

pgm functionality is not far off, I have a large update I'll hopefully have tomorrow.

The Print class, like you say, is heavy on things like virtual calls, but as we can see, many libraries already use this functionality. By using it for string support rather than cstrings or sprintf, you avoid compiling multiple algorithms for the same job. This may be far more efficient code-wise overall, and only costs an extra pointer dereference through the v-table.

Thank you for your response, check back soon for the new stuff.

pYro_65

#13
May 21, 2013, 05:03 pm Last Edit: May 22, 2013, 02:55 pm by pYro_65 Reason: 1
Awesome new update people!

I have just painstakingly finished a version of my library which now includes a printf function.
This is a big step, as now you can use code like:
Code: [Select]
g.printf( "Some numbers: %d, %ld, %010d", 1977, 650000, 1977 );

And produce:
Quote
Some numbers: 1977, 650000, 0000001977


Before I get into too many details, there are a few major advantages to using my code.


  • It can be used to create a sprintf function ( see below ).

  • GString compiles smaller than sprintf.

  • GString supports more features than the standard sprintf.

  • If the Arduino team likes my idea, printf can be moved into the 'Print' class where it belongs. This would give all classes deriving from 'Print' the ability to print formatted text directly to their output.
    This list includes Serial, Ethernet, WiFi, SD, LiquidCrystal, Wire, SoftwareSerial, Stream and more!



I have added an overload for sprintf in the class file, so all you have to do is include the file into your sketch.
Here is a breakdown of my sprintf function.

Quote from: GString.h sprintf

sprintf function.
 Prints formatted text to a buffer.
 
 Formatting options use the following syntax:
   %[flags][width][length]specifier
   
 Flags:
   -:  Left-justify within the given field width; Right justification is the default.
   0:  When padding is specified, zeros are used instead of spaces.
   
 Width:
   (number):  Minimum number of characters to be printed.
              If the value to be printed is shorter than this number,
              the result is padded with blank spaces.
              The value is not truncated even if the result is larger.

   *:  The width is not specified in the format string,
       but as an additional integer value argument preceding
       the argument that has to be formatted.

 Length:
   l:  d, i use long instead of int. u, x use unsigned long instead of unsigned int.

 Specifier:
   s:  String ( null terminated ).
   d:  Signed decimal integer ( 32bits max ).
   i:  Same as 'd'.
   u:  Unsigned decimal integer ( 32bits max ).
   f:  Decimal floating point number.
   x:  Unsigned hexadecimal integer ( 32bits max ).
   c:  Character.
   %:  Escape character for printing '%'
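
The flag and width rules above boil down to a small padding step applied after the number has been converted to text. Below is a desktop C++ sketch of that step only; `padField` is a hypothetical helper, not taken from GString.h, and letting '-' override '0' follows standard printf behaviour, which is an assumption here.

```cpp
#include <cstddef>
#include <string>

// Pad an already-converted value to at least `width` characters.
// leftJustify corresponds to the '-' flag and zeroPad to the '0' flag.
// The value is never truncated, and '-' takes priority over '0'.
std::string padField(const std::string &digits, std::size_t width,
                     bool leftJustify, bool zeroPad) {
    if (digits.size() >= width) return digits;
    const std::size_t fill = width - digits.size();

    if (leftJustify) return digits + std::string(fill, ' ');
    if (!zeroPad) return std::string(fill, ' ') + digits;

    // Zero padding goes between a leading minus sign and the digits.
    if (digits[0] == '-') {
        return "-" + std::string(fill, '0') + digits.substr(1);
    }
    return std::string(fill, '0') + digits;
}
```

So "%010d" with 1977 corresponds to padField( "1977", 10, false, true ), and the '*' width simply means the 10 comes from an argument instead of the format string.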


As you can see it supports floating point data.
Below is a test you can use to see the difference.

Code: [Select]
//Uncomment the line below to use the improved sprintf
//#include <GString.h>

void setup(){
 char buffer[ 64 ];
 Serial.begin(115200);
 Serial.println( "=================================================" );
 Serial.println( "This is a test using standard print functionality\r\n" );
 Serial.println( 123456 );
 Serial.println( 123.456 );
 Serial.println( 1193046, HEX );
 Serial.println( "=================================================" );
 Serial.println( "This is a test using sprintf functionality\r\n" );
 sprintf ( buffer, "Characters: %c %c \n", 'a', 65 );
 Serial.print( buffer );  
 sprintf ( buffer, "Decimals: %d %ld\n", 1977, 650000 );
 Serial.print( buffer );
 sprintf ( buffer, "floats: %f\n", 3.1416f );
 Serial.print( buffer );
 sprintf ( buffer, "Preceding with blanks: %10d \n", 1977 );
 Serial.print( buffer );  
 sprintf ( buffer, "Preceding with zeros: %010d \n", 1977 );
 Serial.print( buffer );    
 sprintf ( buffer, "Some different radices: DEC: %ld HEX:%lx \n", 650000, 650000 );
 Serial.print( buffer );
}

void loop(){}


On a mega the code is 574 bytes smaller, on an Uno 594 bytes less.
This leaves more room for things like a precision option or even the nice improvements under way by robtillaart here; these would allow different notations ( sprintf 'e' specifier ). And these might even still be smaller.

If you like these improvements please leave a message and tell your friends :P, it would be great to have something of my own added to the Arduino core.

Note: You have to be logged in to download the files unfortunately.

The file is best viewed in an editor like WordPad or Notepad++, as the IDE doesn't handle tabs well.

THERE IS A NEW VERSION, CHECK MY POST BELOW!

pYro_65

#14
May 22, 2013, 02:59 pm Last Edit: May 26, 2013, 01:08 pm by pYro_65 Reason: 1
Had an interesting find. It appears the version of sprintf I'm modifying is the same one used by atmel, which is included with the wifi shield code, and most probably inside libc.a ( AVR Arduinos ).

You can find the resources I used here: http://www.menie.org/georges/embedded/#printf

So using the Arduino framework has proved quite powerful as I now have a system that slots right in and won't break anyone's code ( fingers crossed ).

I have now added a new parameter to the width option, it is the standard '*' which allows specifying the width in another parameter, rather than in the format string.

There are other custom features I can put in if interest arises, like progmem strings and/or data. Also I have put switches ( defines ) in that can disable the extra features, allowing you to use an Arduino standard equivalent at a smaller price.

I'm going to modify the print library now, putting the printf functionality there, this will allow the extensions I mentioned in my previous post.

Google code download link.

There is now a patch released for the Print library with all of the functionality built in. LINK

If you use a Due, however, this is the file you need, as the patch is AVR only ( it will be compatible soon ). It is out of date now, as the patch has more functionality. So if you like this version and want the full feature set, leave a post and I'll put a new version together.
