
Topic: A function to handle multiple datatypes

pekkaa

#45
Nov 03, 2012, 10:29 pm Last Edit: Nov 03, 2012, 10:45 pm by pekkaa Reason: 1
C/C++ hasn't been my primary programming language in recent years and the strict aliasing issue has evaded me. I'm not sure I understood it correctly. There are plenty of standard C functions which have void * parameters, such as memset, memcpy, read, write, etc. Does the strict aliasing rule mean the code below is no longer valid?

Code: [Select]

#include <string.h>   // memcpy

typedef struct {
  int a;
  char b;
} my_struct;

my_struct foo;
my_struct bar;
char buffer[sizeof(my_struct)];

foo.a = 42;
foo.b = 'a';
// copy foo into a raw byte buffer, then copy the bytes back out into bar
memcpy ((void *)buffer, (const void *)&foo, sizeof(my_struct));
memcpy ((void *)&bar, (const void *)buffer, sizeof(my_struct));


If the code above is valid, is there then something fundamentally wrong with the code below?

Code: [Select]

void sendAnything(const byte *value, unsigned int size)
{
 for (unsigned int i = 0; i < size; i++)
   Serial.write(*value++);
}
sendAnything((const byte *)&MyInt, sizeof(MyInt));



These are meant as sincere questions, not as an attempt to reignite the argument  :D

edit: typos

Nick Gammon

Quote
Code: [Select]

void sendAnything(const byte *value, unsigned int size)
{
  for (unsigned int i = 0; i < size; i++)
    Serial.write(*value++);
}
sendAnything((const byte *)&MyInt, sizeof(MyInt));


The objection I see to this, compared to the template version, is that you always have to cast (as you did in the last line) unless you happen to be sending a byte pointer. As is explained in the C++ FAQ, "casts are evil", although sometimes they are a necessary evil. If you can code in a way that minimizes the number of casts you are reducing the evil.

I haven't addressed the question about the void pointer, yet ...
Please post technical questions on the forum, not by personal message. Thanks!

More info:
http://www.gammon.com.au/electronics

Nick Gammon

#47
Nov 03, 2012, 10:58 pm Last Edit: Nov 03, 2012, 11:01 pm by Nick Gammon Reason: 1
Your question about memcpy appears to be addressed here:

http://stackoverflow.com/questions/3275353/c-aliasing-rules-and-memcpy

In particular check answer 6 (at present)

Quote
The memcpy function takes void* arguments, meaning that no assumptions are made about what is being pointed to; no aliasing has occurred here. In contrast, *(unsigned*)&p interprets a pointer to void* as a pointer to unsigned, which is aliasing.


If I understand correctly (and I'm not sure I do) the strict aliasing rule is supposed to stop you accessing an object through a pointer to an unrelated type (for example by casting its address to void * and then to another type), in some sort of "trickiness" that may or may not work in a particular implementation and endianness.

(edit) In particular, it might confuse compiler optimization.

However a function like memcpy, or malloc, isn't trying to cast (say) char to int, it's trying to return (or copy) a block of memory of any type.

Disclaimer: I might be wrong about this.
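
To make the distinction a bit more concrete, here is a minimal sketch in standard C++. It is not from the thread: the names floatBits and firstByte are made up for illustration, and it assumes a 32-bit float. As far as I can tell, copying bytes with memcpy and inspecting an object through an unsigned char pointer are both permitted by the aliasing rules, while dereferencing a pointer that was cast to an unrelated type is not.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a float.
// memcpy copies the bytes without ever accessing the float as an integer,
// so the aliasing rules are not violated.
std::uint32_t floatBits(float f)
{
  static_assert(sizeof(std::uint32_t) == sizeof(float), "assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Hypothetical helper: reads the first byte of any object.
// Inspecting an object through an unsigned char pointer is explicitly allowed,
// which is why a byte-pointer sendAnything() is fine in this respect.
template <typename T>
unsigned char firstByte(const T &value)
{
  return *reinterpret_cast<const unsigned char *>(&value);
}

// By contrast, this would break the rule (undefined behaviour):
//   std::uint32_t bad = *(std::uint32_t *)&f;   // float accessed as an integer
```

So the memcpy round trip through a char buffer, and a loop that walks an object byte by byte, should both be on the safe side of the rule.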

Nick Gammon

Quote
Code: [Select]
void sendAnything(const byte *value, unsigned int size)
{
  for (unsigned int i = 0; i < size; i++)
    Serial.write(*value++);
}
sendAnything((const byte *)&MyInt, sizeof(MyInt));


The other objection to this, compared to the template version, is that it relies on you correctly sending down the size (sizeof(MyInt)) whereas the template version gets the size itself, reducing the chance for error (eg. during copy/paste).

pekkaa

Thanks Nick, I can see your point, especially on sending down the size. On the other hand, using templates makes the sketch bigger (consumes more of the precious flash memory), and people with a C background are already used to functions like memset() and read(), which use the same calling convention as my version of sendAnything(). I don't think that either alternative is absolutely better.
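
For what it's worth, the two styles can also be combined. The sketch below is only an illustration under assumptions: writeByte and sentBytes are stand-ins for Serial.write() so it can be tried on a PC, and sendBytes is a made-up name for the pointer-and-size worker. The worker is compiled once, and a thin template wrapper works out the address and size so the caller needs neither the cast nor the sizeof.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for Serial.write() so the sketch can be tried on a PC (assumption,
// not part of the thread): every "sent" byte is appended to this buffer.
static std::vector<unsigned char> sentBytes;
static void writeByte(unsigned char b) { sentBytes.push_back(b); }

// Pointer-and-size worker: compiled once, whatever the data type.
void sendBytes(const unsigned char *p, std::size_t size)
{
  for (std::size_t i = 0; i < size; i++)
    writeByte(*p++);
}

// Thin template front end: works out the address and size itself, so the
// caller needs neither a cast nor a sizeof.
template <typename T>
inline void sendAnything(const T &value)
{
  sendBytes(reinterpret_cast<const unsigned char *>(&value), sizeof value);
}
```

On an Arduino, writeByte would simply call Serial.write(b). Whether this ends up smaller than either pure version is something to measure rather than assume.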

Nick Gammon

Yes, I see.

Well, using templates ...

sendAnything.h

Code: [Select]

#include <Arduino.h>

template <typename T> void sendAnything (const T& value)
  {
  const byte* p = (const byte*) &value;
  for (unsigned int i = 0; i < sizeof value; i++)
      Serial.write (*p++);
  }  // end of sendAnything


Sketch:


Code: [Select]

#include "sendAnything.h"

void setup ()
  {
  int MyInt;
  sendAnything(MyInt);
  }  // end of setup
 
void loop ()  {    }  // end of loop


Size:

Code: [Select]

Binary sketch size: 1,404 bytes (of a 32,256 byte maximum)





Using pointers ...

Sketch:


Code: [Select]

void sendAnything(const byte *value, unsigned int size)
{
  for (unsigned int i = 0; i < size; i++)
    Serial.write(*value++);
}

void setup ()
  {
  int MyInt;
  sendAnything((const byte *)&MyInt, sizeof(MyInt));
  }  // end of setup
 
void loop ()  {   }  // end of loop


Size:

Code: [Select]

Binary sketch size: 1,442 bytes (of a 32,256 byte maximum)





So the templated version used less program memory.

Even if I send two different types (int and float) the templated version is still 6 bytes shorter. Of course, if you were sending dozens of types of data (would you actually do that?) then the pointer version would use less program memory.

Tested on a Uno with 1.0.1.

Nick Gammon

The reason? The compiler can optimize better with templates, in all likelihood. With pointers it can't assume anything.

pekkaa


Quote
So the templated version used less program memory.

Even if I send two different types (int and float) the templated version is still 6 bytes shorter. Of course, if you were sending dozens of types of data (would you actually do that?) then the pointer version would use less program memory.

Tested on a Uno with 1.0.1.



I thought that the whole point of using templates or generic pointer types was that the function would be used to send objects of several types (definitely more than two). If that weren't the case, instead of sendAnything(), I could just as well have implemented sendMyInt() and sendMyFloat(), or overloaded send(MyInt &data) and send(MyFloat &data).

pekkaa

And I'd guess that the memory consumption also depends on how complicated (or big) the function body is. sendAnything() was trivial. If the function's body is bigger, the overhead of the template alternative starts to pile up sooner, because each instantiated type gets its own copy of the body.

jwatte

With "common implementation merging" in the linker, template functions that compile to the same machine code (because they serialize values of the same size, say) will all use the same "actual" function under the hood, leading to less bloat. The "pointer and size" function doesn't have any bloat in implementation, but instead there's a little bit of bloat each time you call it, as the size has to be passed as an argument, not hard-coded in the function.

If size REALLY matters, you have to implement it both ways and measure it :-)
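
If you don't want to rely on the linker doing that merging, one way to get a similar effect by hand is to key the worker on the size instead of the type. This is just a sketch under assumptions: sendN is a made-up name, and writeByte is a stand-in for Serial.write() that only counts bytes so the code can run on a PC.

```cpp
#include <cstddef>

// Stand-in for Serial.write() (assumption): counts the bytes it is given.
static std::size_t bytesWritten = 0;
static void writeByte(unsigned char) { bytesWritten++; }

// One instantiation per distinct SIZE, not per distinct type.
template <std::size_t N>
void sendN(const unsigned char *p)
{
  for (std::size_t i = 0; i < N; i++)   // N is a compile-time constant
    writeByte(*p++);
}

// Front end: every type of the same size forwards to the same sendN<N>.
template <typename T>
inline void sendAnything(const T &value)
{
  sendN<sizeof(T)>(reinterpret_cast<const unsigned char *>(&value));
}
```

Here int and unsigned int share one worker, since they have the same size. Whether that actually saves flash still has to be measured on the real target.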

retrolefty


Quote
With "common implementation merging" in the linker, template functions that compile to the same machine code (because they serialize values of the same size, say) will all use the same "actual" function under the hood, leading to less bloat. The "pointer and size" function doesn't have any bloat in implementation, but instead there's a little bit of bloat each time you call it, as the size has to be passed as an argument, not hard-coded in the function.

If size REALLY matters, you have to implement it both ways and measure it :-)



I've read that size always matters, but then again this is a family oriented site so I will leave it at that.  ;)

Chris Parish

Quote

If you are not considering why we posted our concerns, how about asking for a proof why it is safe. C/C++ is built upon a set of rules; if it is allowed, the standard will state it.


Up until your post everything seemed to be a battle of opinions. You were the first to quote rules and I thank you for it. Where possible I shall avoid void pointers.

Quote
Code: [Select]
#include <Arduino.h>

template <typename T> void sendAnything (const T& value)
  {
  const byte* p = (const byte*) &value;
  for (unsigned int i = 0; i < sizeof value; i++)
      Serial.write (*p++);
  }  // end of sendAnything



Thanks for this, Nick Gammon, I shall implement this on the morrow.

So one last question: Overloading or Templating?

Nick Gammon

As far as I know templating is basically "automatic overloading". That is, to save you the trouble of writing (say) 5 overloaded functions, templates will write them for you, from the template.

So I would template, as, if you ever need to change how it works, you fix one function and not five.

PeterH

If the code for all data types effectively just works out the size and address of the memory occupied by the data, it makes no odds which approach you take and could even be reduced to a macro. The important thing in my mind is whether you expect to ever need to apply this function to something where the decision about which bytes to serialise needs to be type-specific. For example, to serialise a string value it would be sensible to measure the string length rather than just output the pointer, or the first char pointed to. When exporting a struct or class that contains pointers, it might be necessary to do a deep copy rather than a shallow one. Same for any other non-trivial data structures.

However, if you're mainly interested in coping with primitive variables that have different sizes, all those considerations are irrelevant.
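
A small sketch of why this matters, under assumptions: bytesSent is a hypothetical helper that reports how many bytes a sizeof-based sendAnything() would write for a given argument, and Message is a made-up struct. A pointer (or a struct holding one) is sent as the pointer itself, not as the data it points to.

```cpp
#include <cstddef>

// Hypothetical helper: how many bytes a sizeof-based sendAnything() would
// write for a given argument.
template <typename T>
constexpr std::size_t bytesSent(const T &)
{
  return sizeof(T);
}

// Made-up struct holding a pointer: a shallow copy sends only the pointer,
// never the text it points to.
struct Message
{
  const char *text;
};
```

For a char array such as char a[] = "hello", bytesSent(a) is 6 (five characters plus the terminating zero), while for a const char * pointing at the same text it is just the size of a pointer on that platform.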
I only provide help via the forum - please do not contact me for private consultancy.

pekkaa

Nick has provided very convincing arguments for using templates. One more question still. The OP's requirements were:

Quote
I want a function that I can pass either a string of characters or an integer or an object ..


He probably expects to send the strings without the terminating zeros. The code below probably won't work with the template version of sendAnything, will it? Can sendAnything be fixed to work with strings as well?

Code: [Select]

const char *foo1 = "foo1";   // string literals are const in C++
char foo2[] = "foo2";

sendAnything(foo1);
sendAnything(foo2);
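
One possible way to handle this, sketched under assumptions (writeByte and sent are stand-ins for Serial.write() so it runs on a PC, and foo1 is taken to be a const char *): keep the template for everything else and add plain overloads for C strings. As far as I understand the overload rules, a non-template function is preferred over a template when both match equally well, so the string overloads win for pointers and char arrays.

```cpp
#include <cstddef>
#include <string>

// Stand-in for Serial.write() (assumption): collects the output in a string.
static std::string sent;
static void writeByte(unsigned char b) { sent.push_back(static_cast<char>(b)); }

// Generic version from the thread: sends the raw bytes of any object.
template <typename T>
void sendAnything(const T &value)
{
  const unsigned char *p = reinterpret_cast<const unsigned char *>(&value);
  for (std::size_t i = 0; i < sizeof value; i++)
    writeByte(*p++);
}

// Overload for C strings: sends the characters up to, but not including,
// the terminating zero.
void sendAnything(const char *s)
{
  while (*s)
    writeByte(static_cast<unsigned char>(*s++));
}

// Needed as well, otherwise a non-const char* or char[] argument would still
// pick the template.
void sendAnything(char *s)
{
  sendAnything(static_cast<const char *>(s));
}
```

With these in place, sendAnything(foo1) and sendAnything(foo2) should each send four characters and no terminator, while ints, floats and structs still go through the template.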
