A function to handle multiple datatypes

With "common implementation merging" in the linker, template functions that have the same actual bytecode (because they serialize values of the same size, say,) will all use the same "actual" function under the hood, leading to less bloat. The "pointer and size" function doesn't have any bloat in implementation, but instead there's a little bit of bloat each time you call it, as the size has to be passed as an argument, not hard-coded in the function.

If size REALLY matters, you have to implement it both ways and measure it :slight_smile:

jwatte:
With "common implementation merging" in the linker, template functions that have the same actual bytecode (because they serialize values of the same size, say,) will all use the same "actual" function under the hood, leading to less bloat. The "pointer and size" function doesn't have any bloat in implementation, but instead there's a little bit of bloat each time you call it, as the size has to be passed as an argument, not hard-coded in the function.

If size REALLY matters, you have to implement it both ways and measure it :slight_smile:

I've read that size always matters, but then again this is a family oriented site so I will leave it at that. :wink:

If you are not considering why we posted our concerns, how about asking for a proof why it is safe. C/C++ is built upon a set of rules; if it is allowed, the standard will state it.

Up until your post everything seemed to be a battle of opinions. You were the first to quote rules and I thank you for it. Where possible I shall avoid void pointers.

Code:

#include <Arduino.h>

template <typename T> void sendAnything (const T& value)
  {
  const byte* p = (const byte*) &value;
  for (unsigned int i = 0; i < sizeof value; i++)
      Serial.write (*p++);
  }  // end of sendAnything

Thanks for this, Nick Gammon, I shall implement this on the morrow.

So one last question: Overloading or Templating?

As far as I know templating is basically "automatic overloading". That is, to save you the trouble of writing (say) 5 overloaded functions, templates will write them for you, from the template.

So I would template, as, if you ever need to change how it works, you fix one function and not five.
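The two approaches discussed so far can also be combined. The following is only a sketch, not anyone's posted solution: a thin template forwards to a single pointer-and-size function, so the byte loop exists once and the template only supplies the address and size. The names `sendBytes` and `sentBytes` are made up for illustration, and a `std::vector` stands in for the serial port so the code can be tried on a PC.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the serial port (assumption, for PC testing only).
std::vector<uint8_t> sentBytes;

// Single non-template implementation: pointer and size.
void sendBytes(const void* data, size_t size) {
  const uint8_t* p = static_cast<const uint8_t*>(data);
  for (size_t i = 0; i < size; i++)
    sentBytes.push_back(p[i]);  // on an Arduino this would be Serial.write(p[i])
}

// Thin template wrapper: the compiler fills in the address and sizeof for each type.
template <typename T> inline void sendAnything(const T& value) {
  sendBytes(&value, sizeof value);
}
```

Whether this ends up smaller than either pure approach depends on the compiler and the call sites, so measuring is still the only way to know.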

If the code for all data types effectively just works out the size and address of the memory occupied by the data, it makes no odds which approach you take and could even be reduced to a macro. The important thing in my mind is whether you expect to ever need to apply this function to something where the decision about which bytes to serialise needs to be type-specific. For example, to serialise a string value it would be sensible to measure the string length rather than just output the pointer, or the first char pointed to. When exporting a struct or class that contains pointers, it might be necessary to do a deep copy rather than a shallow one. Same for any other non-trivial data structures.

However, if you're mainly interested in coping with primitive variables that have different sizes, all those considerations are irrelevant.

Nick has provided very convincing arguments for using templates. One more question still. The OP's requirements were:

I want a function that I can pass either a string of characters or an integer or an object ..

He probably expects to send the strings without terminating zeros. The code below probably won't work with the template version of sendAnything, will it? Can sendAnything be fixed to work with strings as well?

char *foo1 = "foo1";
char foo2[] = "foo2";

sendAnything(foo1);
sendAnything(foo2);

No, a pointer would be disastrous. p would be an address to a pointer.

You can overload the function, but that still isn't sufficient for arrays.

/* May cause ambiguities. If so, the function prototypes need more specific matches (all constant data can be accessed via references, so constant pointers may not be able to match a higher overload). */
template <typename T> void sendAnything (const T& value)
  {
  const byte* p = (const byte*) &value;
  for (unsigned int i = 0; i < sizeof( value ); i++) 
      Serial.write (*p++);
  }  // end of sendAnything

template <typename T> void sendAnything (const T* value)
  {
    const byte* p = (const byte*) value;
    for (unsigned int i = 0; i < sizeof( T ); i++) 
      Serial.write (*p++);
  }

Or partially specialise a class template:

template <typename T> struct Sender{

  static void Send( const T& value ){

    const byte* p = (const byte*) &value;
    for (unsigned int i = 0; i < sizeof( value ); i++) 
      Serial.write (*p++);
  }
};

template <typename T> struct Sender< T* >{

  static void Send( const T* value ){

    const byte* p = (const byte*) value;
    for (unsigned int i = 0; i < sizeof( T ); i++) 
      Serial.write (*p++);
  }
};

You can mix overloads with templates, so you can have a non-template function for char* and templates for everything else.
As with the class method you could do a full specialisation ( different from the partial specialisation above ) for the character arrays.
In the char* specific version, code could check for a null char.
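As a rough sketch of that mix (an illustration under stated assumptions, not a tested Arduino sketch): the generic template is kept, and two non-template overloads handle C strings by stopping at the null char. The `char*` overload is there because a non-const pointer or array would otherwise match the template better than `const char*` does. `writeByte` and `sentBytes` are hypothetical stand-ins for `Serial.write` so the code can run on a PC.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the serial port (assumption, for PC testing only).
std::vector<uint8_t> sentBytes;
void writeByte(uint8_t b) { sentBytes.push_back(b); }  // Serial.write(b) on an Arduino

// Generic version: sends the raw bytes of any object.
template <typename T> void sendAnything(const T& value) {
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&value);
  for (size_t i = 0; i < sizeof value; i++)
    writeByte(*p++);
}

// Non-template overload for C strings: stops at the null char and does not send it.
void sendAnything(const char* text) {
  for (; *text != '\0'; text++)
    writeByte(static_cast<uint8_t>(*text));
}

// Non-const pointers and arrays would otherwise pick the template, so forward them.
void sendAnything(char* text) {
  sendAnything(static_cast<const char*>(text));
}
```

With these overloads, `sendAnything(foo1)` and `sendAnything(foo2)` send four characters each. The price is that every `char*` is now treated as a null-terminated string, so a pointer to raw binary data in a char buffer would need a different function.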

After what pYro_65 just demonstrated, I am inclined to claim that for OP's case the sendAnything(const byte *value, unsigned int size) is more reasonable than all the extra hassle that the template alternative would require.

int MyInt = 42;
char *MyString = "foo";
sendAnything((const byte *)&MyInt, sizeof(MyInt));
sendAnything((const byte *)MyString, strlen(MyString));

pekkaa:
After what pYro_65 just demonstrated, I am inclined to claim that for OP's case the sendAnything(const byte *value, unsigned int size) is more reasonable than all the extra hassle that the template alternative would require.

Well let's not get too carried away. In five pages of this thread I am the only one who has posted an actual working sketch, and that sketch showed that the templated solution was simple, and produced shorter code. The rest is hypothetical talk about needing to serialize "an object" or "lots of different types".

Sure, you can invent an example where pointers might be simpler or produce less code, but how about seeing an actual example of the types of data that need to be serialized? In other words, the real requirement, not some sort of hypothetical requirement.

It would have been helpful if the original post had actually quoted some code, or the actual data types, rather than just "a string of characters or an integer or an object". That's pretty vague. Does the string of characters contain every possible value (eg. including 0x00)? Is it a 2-byte integer? What sort of object is it?

Is the receiving end an Arduino? Does it have the same endian-ness? Does it have the same float representation? Is Unicode involved? How do you handle errors? Do you want sumchecks? How do you know if you are receiving the string of characters, or the integer, or the object?

"all the extra hassle that the template alternative would require."

Agreed. All this talk of "this approach has some shortcomings so it should not be used" misses what coding is all about: making compromises so you produce a flawed but optimal solution to the task at hand.

It would have been helpful if the original post had actually quoted some code, or the actual data types, rather than just "a string of characters or an integer or an object". That's pretty vague. Does the string of characters contain every possible value (eg. including 0x00)? Is it a 2-byte integer? What sort of object is it?
Is the receiving end an Arduino? Does it have the same endian-ness? Does it have the same float representation? Is Unicode involved? How do you handle errors? Do you want sumchecks? How do you know if you are receiving the string of characters, or the integer, or the object?

OK, as requested, more information.
The receiving end is not an Arduino, but a machine running Ubuntu. I am fairly certain that it has the same endian-ness, as everything I have sent so far I have managed to receive without having to switch the byte order. Floats are tricky, due to different ways of representing them and their inherent inaccuracy, so I am not going to use any. Nope, no Unicode. Error checking will be via a checksum byte at the end of the serial packet (note: I have not implemented this bit yet). Immediately prior to sending the data the program outputs 0x7E, which signifies the start of the frame, followed by a one-byte code signifying the type of data being sent and another byte containing the address or destination code. After all the data is sent, the code 0x7E is sent again to signify the end of the frame.

With this in mind the sendAnything function can be re-written to look like this:

template <class T> void sendAnything(const T& value, byte type, byte address)
{
    const byte* p = (const byte*)&value;
    unsigned int i;
    Serial.write(0x7E);
    Serial.write(type);
    Serial.write(address);
    for (i = 0; i < sizeof(value); i++) {
      if (*p == 0x7E || *p == 0x7D) {
        Serial.write(0x7D);
        Serial.write(*p++ ^ 0x20);
      } else {
        Serial.write(*p++);
      }
    }
    Serial.write(0x7E);
}

The data types I will have to send are: 8, 16 and 32 bit integers, both signed and unsigned. Strings of up to 20 characters in length, null terminated as per normal strings.
At the moment I don't need to send objects or structures, but I may have to in the future.

So with that in mind, maybe sendAnything should be sendSomeThings :stuck_out_tongue:
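For the Ubuntu end, the framing above can be reversed along these lines. This is a sketch that assumes the frame layout exactly as posted (0x7E, type, address, payload with 0x7E and 0x7D escaped as 0x7D followed by the byte XOR 0x20, then 0x7E). The names `encodeFrame` and `decodeFrame` are made up, the checksum is left out because it has not been implemented yet, and, like the posted code, the type and address bytes are not escaped.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

const uint8_t FRAME_DELIM = 0x7E;  // start and end of frame
const uint8_t FRAME_ESC   = 0x7D;  // escape marker
const uint8_t FRAME_XOR   = 0x20;  // value XORed into an escaped byte

// Build one frame in memory, mirroring the posted sendAnything().
std::vector<uint8_t> encodeFrame(uint8_t type, uint8_t address,
                                 const uint8_t* data, size_t size) {
  std::vector<uint8_t> out;
  out.push_back(FRAME_DELIM);
  out.push_back(type);
  out.push_back(address);
  for (size_t i = 0; i < size; i++) {
    if (data[i] == FRAME_DELIM || data[i] == FRAME_ESC) {
      out.push_back(FRAME_ESC);
      out.push_back(static_cast<uint8_t>(data[i] ^ FRAME_XOR));
    } else {
      out.push_back(data[i]);
    }
  }
  out.push_back(FRAME_DELIM);
  return out;
}

// Undo the escaping of one complete frame. Returns false if it is malformed.
bool decodeFrame(const std::vector<uint8_t>& frame, uint8_t& type,
                 uint8_t& address, std::vector<uint8_t>& payload) {
  if (frame.size() < 4 || frame.front() != FRAME_DELIM || frame.back() != FRAME_DELIM)
    return false;
  type = frame[1];
  address = frame[2];
  payload.clear();
  for (size_t i = 3; i + 1 < frame.size(); i++) {
    uint8_t b = frame[i];
    if (b == FRAME_ESC) {
      if (i + 2 >= frame.size()) return false;  // escape with no data byte after it
      b = static_cast<uint8_t>(frame[++i] ^ FRAME_XOR);
    }
    payload.push_back(b);
  }
  return true;
}
```

Note that a type or address byte equal to 0x7E would confuse a receiver that scans for delimiters, so those values are best avoided or escaped as well.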

In that case, use pointer to an unsigned char + sizeof.

He probably expects to send the strings without terminating zeros.

foo1 and foo2 have terminating NULLs.

char nonsense[4] = { 'c', 'r', 'a', 'p' };

does not.

Though why one would want to do this is unclear. The definition of a string is "a NULL terminated array of chars", so your premise is flawed from the start.

PaulS:
Though why one would want to do this is unclear.

That depends which representation of the string we're talking about. It's reasonable to expect strings to be null-terminated in memory, but I wouldn't normally expect the terminator to be persisted or serialised.

but I wouldn't normally expect the terminator to be persisted or serialised.

Not serialized is reasonable. But, somehow, the size of the array (the number of non-NULL characters in the array) needs to be communicated.

I'm not sure what you mean by not persisted. If you mean, for instance, that when writing the string to a file, the NULL would not be written, I'd agree. But, still, the size of the string needs to be communicated somehow. Typically, that is done by replacing the NULL with a CR/LF, which allows some process that reads the data to know where the stream of characters ends.

PaulS:

He probably expects to send the strings without terminating zeros.

foo1 and foo2 have terminating NULLs.

char nonsense[4] = { 'c', 'r', 'a', 'p' };

does not.

Yes, I am very well aware that foo1 and foo2 have terminating nulls and that char nonsense[4] in your example doesn't. Instead of writing "He probably expects to send the strings without terminating zeros.", I perhaps should have written "When he sends strings, he probably doesn't want to send the terminating zeros."

PaulS:
Though why one would want to do this is unclear. The definition of a string is "a NULL terminated array of chars", so your premise is flawed from the start.

My premise wasn't flawed. My premise was that when he sends null-terminated char arrays, a.k.a. strings, he probably wants to send the non-null characters only and not the terminating nulls. I am not sure if that's what the OP wanted, but it certainly is a valid assumption. And that is something that was not possible with Nick's templated sendAnything().

Yes, the string length needs to be known or communicated somehow. My point was that this is not usually done by null-terminating the string in its serialised or persisted form; null termination is how the C runtime library happens to indicate the string length, but this method is not widely used once you step outside the C runtime. For example, when you write a string to a text file the text is not null-terminated; instead, the file system explicitly records the length of the file in most file systems. When you write a string to a network stream, the network protocol provides a way of indicating the length of the string or denoting the end of the string, but these are typically not null-terminated. By the same logic, there's no reason to expect a serialisation mechanism that transfers strings between computers to also transfer the null terminator; the null is only relevant within the context of a computer that is storing the string within a null-terminated array and is not part of the actual string value.
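One common way to communicate the length without a terminator is a length prefix. The sketch below assumes a single length byte followed by the characters, which is enough for the strings of up to 20 characters mentioned earlier. The function names are made up for illustration and nothing here comes from the posted code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append one length byte, then the characters, with no null terminator.
// Returns false if the string does not fit in a one-byte length.
bool appendLengthPrefixed(std::vector<uint8_t>& out, const char* text) {
  size_t n = std::strlen(text);
  if (n > 255) return false;
  out.push_back(static_cast<uint8_t>(n));
  for (size_t i = 0; i < n; i++)
    out.push_back(static_cast<uint8_t>(text[i]));
  return true;
}

// Read one length-prefixed string starting at pos and advance pos past it.
// Returns false if the buffer ends before the announced length is reached.
bool readLengthPrefixed(const std::vector<uint8_t>& in, size_t& pos, std::string& text) {
  if (pos >= in.size()) return false;
  size_t n = in[pos];
  if (in.size() - pos - 1 < n) return false;
  text.assign(reinterpret_cast<const char*>(in.data() + pos + 1), n);
  pos += 1 + n;
  return true;
}
```

The receiver rebuilds its own terminator (here `std::string` does that internally), which matches the point that the null belongs to the in-memory representation and not to the string value.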

Time to count the horse's teeth.

Here's the code:

#if (ARDUINO >= 100)
#include <Arduino.h>
#else
#include <WProgram.h>
#endif

#define PRINT_THAT_THING(X) print_anything((void *)&(X),sizeof(X)) // an evil macro!

struct S { int i; char j; char* p;} s; // a struct for testing
  
void setup()
{
  Serial.begin(115200);
  int n;
  n = PRINT_THAT_THING("1234567890");
  int a[] = {1,2,3,4,5,6,7,8,9,10};
  n = PRINT_THAT_THING(a);
  n = PRINT_THAT_THING(s);
  s.i = n; s.j = 'c'; s.p = NULL;
  n = PRINT_THAT_THING(s);
  n = PRINT_THAT_THING(Serial);
  n = PRINT_THAT_THING(n);
}

void loop()
{
}

int print_anything(void *v,int n) // get a raw address in the form of a void pointer
{
  uint8_t *p = (uint8_t *)v; // recast as a pointer to byte so we can use it for our evil purposes
  for (int i=0; i < n; i++) {
    uint8_t b = *p++; // get next byte value (BwaaHaHaHAha...)
    Serial.print("0x");
    Serial.print(b,HEX); // send it out the serial port
    Serial.print(" ");
  }
  Serial.print("\nThat thing had ");
  Serial.print(n);
  Serial.print(" bytes in it!\n\n"); 
  return n; // return no of bytes sent
}

which produces this:

0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x30 0x0 
That thing had 11 bytes in it!

0x1 0x0 0x2 0x0 0x3 0x0 0x4 0x0 0x5 0x0 0x6 0x0 0x7 0x0 0x8 0x0 0x9 0x0 0xA 0x0 
That thing had 20 bytes in it!

0x0 0x0 0x0 0x0 0x0 
That thing had 5 bytes in it!

0x5 0x0 0x63 0x0 0x0 
That thing had 5 bytes in it!

0x49 0x1 0x5D 0x1 0xC5 0x0 0xC4 0x0 0xC0 0x0 0xC1 0x0 0xC6 0x0 0x4 0x3 0x7 0x5 0x1 
That thing had 19 bytes in it!

0x13 0x0 
That thing had 2 bytes in it!

And I would just like to say that if I have inadvertently offended anyone's religion by posting this, I really don't care at all. :grin:

@pico, your code only aliases the data through a char type, which the standard says is acceptable for aliasing; therefore it has not subverted type safety, nor done any evil casts.

@all who are interested,
There are some important differences with types that I may not have made clear in my post earlier on ( here ).
When attempting to interpret data as a different type, the thing to remember is that char is the only type you can safely cast to ( unless casting operators for the appropriate types are implemented in user-defined types; this does not apply to primitive/built-in types like 'int' ). In my earlier post I had the example below.

void PrintIntFromStream( void *v_Stream ){
  int *i_StreamPtr = ( int* ) v_Stream;
  Serial.print( "Int aliased from stream: " );
  Serial.println( *i_StreamPtr  );
  return;
}

The difference here is that i_StreamPtr is aliasing the data behind the void* with a non-compatible type ( int ).
Here are also a few commonly accepted usages ( that are wrong ):

//Not just void* but any incompatible types.
uint32_t u_Value32 = 0x00ABCD00;
uint16_t u_Value16 = *( uint16_t* ) &u_Value32;

//And the incompatible types reversed.
uint16_t u_Array[] = { 0xBAD, 0xF00D, 0x00 };
uint32_t u_Int32 = * ( uint32_t* )  u_Array;

These are all examples of the strict aliasing rule being broken, which is why casts are evil, in particular casts from void*, as they always mask the original type of the incoming data. This is fundamentally why C++ constructs are far better for this job, especially templates: with the template definition, you are always provided with a 'known' type.
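For completeness, here is a sketch of two ways to get at part of a value without the aliasing casts shown above. Copying the bytes with memcpy is allowed because it works on the object representation, and plain arithmetic avoids reinterpreting memory at all. The function names are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Copy the first two bytes (in memory order) of a 32-bit value into a 16-bit one.
// Which half of the number this yields depends on the endian-ness of the machine.
uint16_t firstTwoBytes(uint32_t value) {
  uint16_t result;
  std::memcpy(&result, &value, sizeof result);
  return result;
}

// Endian-independent alternative: mask out the low 16 bits arithmetically.
uint16_t lowHalf(uint32_t value) {
  return static_cast<uint16_t>(value & 0xFFFFu);
}
```

On a little-endian machine such as an AVR-based Arduino both functions return the same result; on a big-endian machine `firstTwoBytes` would return the high half instead.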