String/sprintf alternative specifically for Arduino.

G'day fellow Arduino people.

I have recently put together a class which inherits from the Arduino core 'Print' class. It implements a memory target which provides all the standard print functions.

I got encouraged by its possibilities and decided to flesh it out for sharing. Now it's a nice alternative to 'Strings', as it allows easy concatenation of data without a malloc/free pair for every little operation. It can also replace basic usage of sprintf.

When using libraries such as Serial, it has little overhead as the print functionality may already be used and included.

All this said, I have not really tested it heavily, and it probably needs tweaks for certain scenarios.
Any ideas are welcome also.

Here is a list of functionality.

  • template< typename T > GString( const T *u_DataPtr );
    Constructor: pass in a valid pointer to any memory.

  • operator char*( void );
    Conversion operator ( GString to char* ). This also allows accessing string elements directly using the '[]' operator.

  • template< typename T > GString &operator +=( const T &t );
    Allows use of the '+=' operator on anything the Print class accepts.

  • void clear( void );
    Resets the string to empty.

  • void clear( bool b_Empty );
    Clears any previous data if requested, then resets.

  • template< typename T > GString &concat( const T &t );
    Provides chainable print methods.

  • template< typename T > GString &concat( const T &t, const int i );
    Provides chainable print methods utilising the second parameter.

  • size_t count( void ) const;
    Returns the number of characters printed into the string.

  • void end( void );
    Adds a terminating null. Buffer will still increase after ending a string.

  • size_t find( const char &c_Character );
    Finds a character within the string. Returns its index, or -1 if it is not found.

  • pft printf( const char *format, ... );
    Prints formatted text to the buffer.

  • void repeat( const char &c_Character, unsigned char u_Count );
    Prints c_Character into the string repeating u_Count times.

  • void toLower( void );
    Converts the string into its lower case equivalent.

  • void toUpper( void );
    Converts the string into its upper case equivalent.

  • void translate( const char &c_StartLow, const char &c_StartHigh, const char &c_EndLow );
    Converts a range of characters to a different part of the ASCII table.

The library code is at the bottom, below is its usage.

Note: all examples produce output resembling the following:

This is a float test: 123.456
Analog pin 0 value: 439

//Buffer to store data.

char TestArr[ 128 ];

//Create string, passing in buffer pointer.
GString g_Test( TestArr );

//Use like Serial
g_Test.print( "This is a float test: " );
g_Test.println( 123.456, 3 );
g_Test.print( "Analog pin 0 value: " );
g_Test.print( analogRead( A0 ), DEC );




The library also supports += operators:


GString g_Test( TestArr );

g_Test += "This is a float test: ";
g_Test += 123.456f;
g_Test += "\r\nAnalog pin 0 value: ";
g_Test += analogRead( A0 );




I have also added a function 'concat' for chained printing ( notice anonymous GString rather than variable ):


GString( TestArr ).concat( "This is a float test: " ).concat( 123.456, 3 ).concat( "\r\nAnalog pin 0 value: " ).concat( analogRead( A0 ), DEC );




And the GString class can be printed itself.


Serial.print( g_Test );

**EDIT**:
As the print functionality does not deal with nulls, you have to take care when using the original pointer.
Either clear the buffer before first use, or call end() as the last call to the GString before using the pointer.
The built-in conversions are safe: a null is added, but it will be overwritten if more print operations take place.
[Update message.](http://forum.arduino.cc/index.php?topic=166540.msg1241168#msg1241168)
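
To make the null handling above concrete, here is a minimal sketch of the idea, assuming a hypothetical `MemoryPrinter` class. It is a stand-in written without the Arduino core, not the real GString, and it does no bounds checking, so the caller must size the buffer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a Print-style memory target; not the real GString.
class MemoryPrinter {
public:
    explicit MemoryPrinter(char *buffer) : start_(buffer), cursor_(buffer) {}

    // Append raw characters. Like the print functions, this writes no null.
    std::size_t write(const char *text) {
        std::size_t length = std::strlen(text);
        std::memcpy(cursor_, text, length);
        cursor_ += length;
        return length;
    }

    // Terminate the string without advancing, so later writes overwrite the null.
    void end() { *cursor_ = '\0'; }

    // Number of characters written so far.
    std::size_t count() const { return static_cast<std::size_t>(cursor_ - start_); }

    // Converting to char* terminates first, which is what makes it safe to use.
    operator char *() {
        end();
        return start_;
    }

private:
    char *start_;   // beginning of the caller-owned buffer
    char *cursor_;  // next position to write
};
```

With a dirty buffer, reading the original pointer straight after `write()` would run into whatever was there before; calling `end()` or going through the conversion operator avoids that.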
**The library**:
**Due to message length limits, the code is now an attachment to this post; an external link will be available soon.**

Enjoy :)
**EDIT**: new code update, file located in this post: http://forum.arduino.cc/index.php?topic=166540.msg1247875#msg1247875.

[Google code download link.](https://code.google.com/p/arduino-extensions/downloads/list), download GString.h

GString.h (5.38 KB)

Thanks for sharing,

Can it also be used on an lcd screen?

If you want to concatenate a string and then pass the finished result to the LCD, you can. You can cast a variable of type 'GString' to a 'char*'. You can also use the buffer pointer passed to the class originally.

I forgot to mention, as it converts to a char*, you can also use array indexing with it:

GString g_Test( TestArr );

g_Test.print( "DEC: " ); //Wrong title.
g_Test.print( 0xABCD, HEX );

//Replace 'DEC' with 'HEX'
g_Test[ 0 ] = 'H';
g_Test[ 1 ] = 'E';
g_Test[ 2 ] = 'X';

Serial.print( g_Test );

If you want to concatenate a string and then pass the finished result to the LCD, you can. You can cast a variable of type 'GString' to a 'char*'. You can also use the buffer pointer passed to the class originally.

That means I can also use it for concatenating HTML strings over a socket etc.

+1 in usefulness

Cheers, I'm hoping there are lots of little uses for it. I might add some extra string based functions like right & left trim, etc.

I have updated the class to be safer now, if you convert the class to a char*, a null is added.

An 'end' function has been added which does the same thing: it adds a null. 'end' is most useful when you read through the original pointer and the buffer contents may have been dirty before use.

These additions make it safe to use a buffer without clearing the contents first.

I might add some extra string based functions like right & left trim, etc.

check the String class interface I would say :wink:

  • if you implement the .mid(begin, end) then .right(n) = .mid(size-n, size); .left(n) = .mid(0, n)

some weirdo string class methods from the past

  • .fox() => fills with "The big brown fox jumps quick over the lazy dog";
  • .alphabet(bool case) => fills with abcd etc
  • .repeat(number, char) => fills with "========" or "----------------" depending on char.

Some good starting points, thanks. I have implemented a few, but I'll post the changes later on once I have the set finished.

I have started to do some basic testing, which I was discussing elsewhere, so to keep everything together I'll post that information here.

For a simple concatenation of a string and integer,
sprintf had an overhead of 1758 bytes, while the Print class' overhead was only 816 bytes.

Below is a sketch using the code posted here.
Tests 1 to 3 should compile to the same size; test 0 is sprintf.
What I didn't know about Arduino's sprintf is that floating point numbers are not supported, whereas the Print class handles floats fine.

#include <GENX.h>

//Modify this
#define TEST_NUMBER 0

void setup(){
  
  char buffer[ 64 ];
  
  Serial.begin( 115200 );
  
  #if TEST_NUMBER == 0
  
    sprintf( buffer, "sprintf integer copy test: %d", analogRead( 0 ) );
    Serial.print( buffer );
    
  #elif TEST_NUMBER == 1
  
    GString g( buffer );
    g += "sprintf integer copy test: ";
    g += analogRead( 0 );
    Serial.print( g );
  
  #elif TEST_NUMBER == 2
  
    GString g( buffer );
    g.print( "sprintf integer copy test: " );
    g.print( analogRead( 0 ) );
    Serial.print( g );  
  
  #elif TEST_NUMBER == 3
  
    Serial.print( GString( buffer ).concat( "sprintf integer copy test: " ).concat( analogRead( 0 ) ) );  
  
  #endif
}

void loop(){}

I have updated my code and reply #1 at the top.

Some string functions have been added. These include: repeat, translate, toUpper/toLower, find.

The translate function is a strange one. Apart from providing the toUpper/toLower functions, it can switch numbers and letters:

char arr[ 12 ];
float x = 123.456f;
    
GString g( arr );
g.print( x, 3 );

//Contains '123.456'

g.translate( '0', '9', 'a' );

//Contains 'bcd.efg'

EDITED

What I didn't know about Arduino's sprintf is that floating point numbers are not supported, whereas the Print class handles floats fine.

+1 again

I have an update for the print class here - Proposed update for the printFloat code of print.cpp - Libraries - Arduino Forum - to handle Scientific notation for larger

you might check it with GString()

Yeah, I'll give it a test. As it's part of the print library it should fit in seamlessly.

Cheers.

How do you know that you are not writing past your array bounds?
From what I can see the GString class does not do any bounds checking.
I also do not get the use of template parameter T. Could you not have used void* and cast that to uint8_t? And why not use char? Each template method generates extra code that has little/no? use.

I also do not like the Print class but I can understand why you would want to stick with that. I rewrote the Print class to a template class (TextWriter). That class implements only the knowledge on how to convert a value to text, nothing else. The Print class does that also but my TextWriter does not take up any permanent RAM bytes (Print class has virtuals/v-table and an int member) at the cost of generating more code (being a template class).

Writing a string class, you would have to give considerable thought to how you would want to manage the memory it takes to store those strings. Especially with concatenation of string. I would make a distinction between operations that work on the string itself (like toUpper) and functions that produce new strings. The latter type I would make static to drive the point home that you are creating a new string - that needs new memory. I would make the methods robust enough so you could use one parameter as the result also (copy into itself). I would
It would also be nice to encapsulate the PROGMEM strings.

Something like:

String(char* buffer, unsigned int capacity);
unsigned int getCapacity();
unsigned int getLength();
char GetCharAt(int index);
bool SetCharAt(int index, char value);
bool InsertCharAt(int index, char value);
operator const char*() const;

// base can be same as result - provided capacity is enough
static bool Concat(String& base, String& tail, String& result);
static bool Concat(String& base, String& mid, String& tail, String& result);

static String& Empty();

I would keep the class very lean (like you have done). I would perhaps make a different class to find things in strings (StringReader?). FindFirst, FindLast, IsDigit, SubString (could go both ways) and the likes.

Hope it helps.

I also do not get the use of template parameter T. Could you not have used void* and cast that to uint8_t? And why not use char? Each template method generates extra code that has little/no? use.

There is a good reason for it.
With a void* you would need to pass semantic information along and use it in a switch / if block to determine the appropriate cast, so that the best overload can be selected. All the print variants would then need to be included, which becomes very costly ( not to mention the strict aliasing rules ).

Contrary to what you may first think, my templates generate no extra code due to being inline ( the trick is the constant references ). The two snippets below are equivalent, no function calls are generated for concat but rather straight through to the appropriate print method.

  char buffer[ 64 ];
  GString g( buffer );
  g.concat( 44 );
//----------------------------------
  char buffer[ 64 ];
  GString g( buffer );
  g.print( 44 );

From what I can see the GString class does not do any bounds checking.

The onus is on the owner of the buffer, just like with the normal string routines. GString is not quite a string; I originally wanted to call it MemoryPrinter or something similar.

The notion is that one can print to memory sequentially as if it were serial, SD, or an LCD, etc... Having the ability to access the data already printed is merely a bonus.

pgm functionality is not far off, I have a large update I'll hopefully have tomorrow.

The Print class, as you say, is heavy on things like virtual calls, but as we can see many libraries already use this functionality. Using it for string support instead of C strings or sprintf avoids compiling multiple algorithms for the same job. Overall this may be far more efficient code-wise, at the cost of only an extra pointer dereference through the v-table.

Thank you for your response, check back soon for the new stuff.

Awesome new update people!

I have just painstakingly finished a version of my library which now includes a printf function.
This is a big step, as now you can use code like:

g.printf( "Some numbers: %d, %ld, %010d", 1977, 650000, 1977 );

And produce:

Some numbers: 1977, 650000, 0000001977

Before I get into too many details, there are a few major advantages to using my code.

  • It can be used to create a sprintf function ( see below ).
  • GString compiles smaller than sprintf.
  • GString supports more features than the standard sprintf.
  • If the Arduino team likes my idea, printf can be moved into the 'Print' class where it belongs. This would give all classes deriving from 'Print' the ability to print formatted text directly to their output.
    This list includes Serial, Ethernet, WiFi, SD, LiquidCrystal, Wire, SoftwareSerial, Stream and more!
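
To illustrate the last point, here is a minimal sketch of a printf living in a Print-like base class. It is not the approach used in GString.h: this version simply formats with the host's `vsnprintf` into a small stack buffer and pushes the result through `write()`. `PrintLike` and `StringTarget` are hypothetical names, and output longer than the scratch buffer is truncated.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for Arduino's Print: derived classes supply write().
class PrintLike {
public:
    virtual ~PrintLike() {}
    virtual std::size_t write(char c) = 0;

    // Formats into a stack buffer, then sends each character to write().
    // Returns the number of characters actually written.
    int printf(const char *format, ...) {
        char scratch[64];
        va_list args;
        va_start(args, format);
        int needed = std::vsnprintf(scratch, sizeof scratch, format, args);
        va_end(args);
        if (needed < 0) return needed;  // formatting error
        const int limit = static_cast<int>(sizeof scratch) - 1;
        int written = 0;
        for (; written < needed && written < limit; ++written) {
            write(scratch[written]);
        }
        return written;
    }
};

// Example target that collects output in a std::string, for illustration only.
class StringTarget : public PrintLike {
public:
    std::size_t write(char c) override {
        text.push_back(c);
        return 1;
    }
    std::string text;
};
```

Any class that implements `write()` picks up `printf` for free, which is the whole appeal of putting it in the base class.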

I have added an overload for sprintf in the class file, so all you have to do is include the file into your sketch.
Here is a breakdown of my sprintf function.

As you can see it supports floating point data.
Below is a test you can use to see the difference.

//Uncomment the line below to use the improved sprintf
//#include <GString.h>

void setup(){
  char buffer[ 64 ];
  Serial.begin(115200);
  Serial.println( "=================================================" );
  Serial.println( "This is a test using standard print functionality\r\n" );
  Serial.println( 123456 );
  Serial.println( 123.456 );
  Serial.println( 1193046, HEX );
  Serial.println( "=================================================" );
  Serial.println( "This is a test using sprintf functionality\r\n" );
  sprintf ( buffer, "Characters: %c %c \n", 'a', 65 );
  Serial.print( buffer );  
  sprintf ( buffer, "Decimals: %d %ld\n", 1977, 650000 );
  Serial.print( buffer );
  sprintf ( buffer, "floats: %f\n", 3.1416f );
  Serial.print( buffer );
  sprintf ( buffer, "Preceding with blanks: %10d \n", 1977 );
  Serial.print( buffer );   
  sprintf ( buffer, "Preceding with zeros: %010d \n", 1977 );
  Serial.print( buffer );    
  sprintf ( buffer, "Some different radices: DEC: %ld HEX:%lx \n", 650000, 650000 );
  Serial.print( buffer ); 
}

void loop(){}

On a Mega the code is 574 bytes smaller; on an Uno it is 594 bytes smaller.
This leaves more room for things like a precision option or even the nice improvements under way by robtillaart here; these would allow different notations ( sprintf 'e' specifier ), and the result might still be smaller.

If you like these improvements please leave a message and tell your friends :P, it would be great to have something of my own added to the Arduino core.

Note: You have to be logged in to download the files unfortunately.

The file is best viewed in an editor like WordPad or Notepad++, as the IDE doesn't handle tabs well.

THERE IS A NEW VERSION, CHECK MY POST BELOW!

GString.h (12.7 KB)

Had an interesting find. It appears the version of sprintf I'm modifying is the same one used by Atmel, which is included with the WiFi shield code, and most probably inside libc.a ( AVR Arduinos ).

You can find the resources I used here: Free source code for embedded software

So using the Arduino framework has proved quite powerful as I now have a system that slots right in and won't break anyone's code ( fingers crossed ).

I have now added a new parameter to the width option. It is the standard '*', which allows specifying the width in another parameter rather than in the format string.

There are other custom features I can put in if interest arises, like progmem strings and/or data. Also I have put switches ( defines ) in that can disable the extra features, allowing you to use an Arduino standard equivalent at a smaller price.

I'm going to modify the print library now and put the printf functionality there. This will allow the extensions I mentioned in my previous post.

Google code download link.

There is now a patch released for the Print library with all of the functionality built in. LINK

If you use a Due, though, this is the file you need, as the patch is AVR only ( it will be compatible soon ). It is out of date now, as the patch has more functionality. So if you like this version and want the full feature set, leave a post and I'll put a new version together.

GString.h (13.3 KB)

PROGMEM (+1) support would be great, as it allows building up an output string of (almost) any length.

Binary printing %b
int x = 15;
g.printf("%b", x); -> "1111" // just the bits needed
g.printf("%0b", x); -> "00001111" // preceding zeros
g.printf("%00b", x); -> "0000000000001111" // 2 zeros makes it 2 bytes
g.printf("%0.0b", x); -> "0000.0000.0000.1111" // . char as separator (to more easily recognize individual bits)

maybe a better alternative could be %4b %8b %16b %32b or %20b or %24b in short to explicitly give the length

Direct EEPROM printing would be great too as otherwise I need to make a local copy to insert into the sprintf buffer.
%r (r from rom?)

int offset = 65;
g.printf("%4r", offset); // 4 bytes from position 65-68

I would definitely like to do some abstract options too, I've focused mainly on the stuff supported by the Print library. PROGMEM support is within the print library so that'll get added no worries.

Also I'm almost finished testing a patch for the core, which requires my library and two other files. It inserts the print functionality and the new sprintf straight in. I'll finish up and post it shortly, but I just tested this code with flying colours.

  Serial.printf( "Characters: %c %c \n", 'a', 65 );
  Serial.printf( "Decimals: %d %ld\n", 1977, 650000 );
  Serial.printf( "floats: %f\n", 3.1416f );
  Serial.printf( "Preceding with blanks: %10d \n", 1977 );
  Serial.printf( "Preceding with zeros: %010d \n", 1977 );
  Serial.printf( "Preceding with zeros(width passed as param): %0*d \n", 10, 1977 );
  Serial.printf( "Some different radices: DEC: %ld HEX:%lx \n", 650000, 650000 );

Which produces this output:

Characters: a A
Decimals: 1977 650000
floats: 3.14
Preceding with blanks:       1977
Preceding with zeros: 0000001977
Preceding with zeros(width passed as param): 0000001977
Some different radices: DEC: 650000 HEX:9EB10

It replaced this ugly code:

sprintf ( buffer, "Characters: %c %c \n", 'a', 65 );
  Serial.print( buffer );  
  sprintf ( buffer, "Decimals: %d %ld\n", 1977, 650000 );
  Serial.print( buffer );
  sprintf ( buffer, "floats: %f\n", 3.1416f );
  Serial.print( buffer );
  sprintf ( buffer, "Preceding with blanks: %10d \n", 1977 );
  Serial.print( buffer );   
  sprintf ( buffer, "Preceding with zeros: %010d \n", 1977 );
  Serial.print( buffer );   
  sprintf ( buffer, "Preceding with zeros(width passed as param): %0*d \n", 10, 1977 );  
  Serial.print( buffer );    
  sprintf ( buffer, "Some different radices: DEC: %ld HEX:%lx \n", 650000, 650000 );
  Serial.print( buffer );

And that same functionality is available to all Print classes like: Ethernet, WiFi, SD, LiquidCrystal, Wire, SoftwareSerial.

The EEPROM idea is nice and I'll look into it. I'm thinking of making a collection of writeable objects, so the EEPROM would fit there nicely.

+++++

the eeprom printing should have additional modifiers to indicate the type

int offset = 65;
g.printf("%4br", offset); // default in this case 4 bytes from position 65-68

g.printf("%br", offset); // print byte 65 as a byte
g.printf("%cr", offset); // print byte 65 as an (ASCII) character
g.printf("%ir", offset); // print byte 65 66 as a int
g.printf("%ur", offset); // print byte 65 66 as an unsigned int
g.printf("%lr", offset); // print byte 65-68 as a long
g.printf("%ulr", offset); // print byte 65-68 as an unsigned long

the numeric offset makes printing strings easy.
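
The typed reads behind those modifiers might look roughly like this. It is only a sketch: `readStored` is a hypothetical helper, a plain byte array stands in for the EEPROM, and little-endian byte order is assumed (as on AVR).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reader: assembles a little-endian unsigned value from
// `size` bytes (1, 2 or 4) starting at `offset`.
inline std::uint32_t readStored(const std::uint8_t *storage,
                                std::size_t offset, std::size_t size) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < size && i < 4; ++i) {
        // Byte i holds bits 8*i .. 8*i+7 of the result.
        value |= static_cast<std::uint32_t>(storage[offset + i]) << (8 * i);
    }
    return value;
}
```

The format handler would then only need to map each modifier to a size and a signedness before passing the value to the normal number printing code.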

I have applied a few optimisations and PROGMEM strings are working now. It is quite different to the GString version as it uses the print pipeline for everything whereas GString could write to memory for a few things. Maybe a bit slower, but not larger.

p: PROGMEM string. No formatting takes place, the string is printed directly.

I'm going to add the EEPROM test asap, then I'll post the files for the patch.

I've been experimenting with some different ways to do things.

Here are a few common functions implemented using the Print library.

	inline int sprintf( char * str, const char * format, ... )
		{ 
			va_list v_List;
			va_start( v_List, format );
			GString g( str );
			int i_Return = g._printf( format, v_List );
			g.end();
			va_end( v_List );
			return i_Return;
		}
		
	inline char *itoa( int value, char * str, int base )
		{
			GString( str ).print( ( long ) value, base );
			return str;
		}
		
	inline char *utoa( unsigned int value, char * str, int base )
		{
			GString( str ).print( ( unsigned long ) value, base );
			return str;
		}		
		
	inline char *ltoa( long value, char * str, int base )
		{
			GString( str ).print( value, base );
			return str;
		}
		
	inline char *ultoa( unsigned long value, char * str, int base )
		{
			GString( str ).print( value, base );
			return str;
		}

On a large test sketch the optimisations made a huge difference.

  • 7762 bytes: Original Arduino code.
  • 7388 bytes: Print based sprintf
  • 7274 bytes: using Print based itoa, utoa, ltoa, ultoa
  • 7014 bytes: using printf
  • 6958 bytes: force itoa and utoa using 32-bit conversions. ( sprintf does not use 16 bit Print code. )

I have the EEPROM code ready too, and I'll test it soon.