How to deal with nibbles?

Hi,

I want to write a simple message code based on 2 bytes to send sensor readings to a computer through serial communication.

For that I have created this simple table

byte 1                        | byte 2
op | sensor | unused | data   | data
OO | SS     | XX     | DD     | DDDD DDDD

So, in the first 2 bits (from left to right) I represent the operation code. The following 2 bits say which sensor the reading is from. Then there are two don't-care bits. The remaining 10 bits hold the 1024 possible values generated by the 10-bit ADC.

My problem is in creating the first byte. I have some difficulties in handling bits with bitwise operations.

If I have the opcode, the sensorPin and the sensor reading, how can I put the opcode in the first two bits (from left to right), followed by the sensor pin number, followed by two don't-care bits, and finally the 2 most significant bits of the reading value?

Thank you very much,

Nuno

You can create a struct with bitfields. This way you can access the individual parts of the byte using the structure. It also makes the code more readable than using those >> or << operators :)

typedef struct {
  unsigned int op : 2;
  unsigned int sensor : 2;
  unsigned int dontcare : 2;
  unsigned int data : 2;
} SENSOR_READING;

SENSOR_READING sr;

void some_function(void)
{
  sr.op = something;
  sr.sensor = something_else;
  sr.data = also_something;
}

Hmm, but isn't this going to make the structure incredibly big, since all the fields are int, when I only have a few codes?

I really only need 14 bits to represent everything.

Won't this overload the serial write?

Thanks,

Nuno

This struct is only 8 bits. The numbers after the : specify the number of bits in the individual fields. So the struct as specified matches the format of your byte #1.

Cool!

Thanks,

Nuno

I just thought of something - bitfields are compiler specific, and I’ve never used bitfields with avr-gcc so I don’t know how it arranges the bits. So you might need to reverse the order of the fields if this struct does not work for you.

skumlerud, have you actually tried this?

A quick test on Solaris (no Arduino handy) confirmed my suspicion that struct members will still occupy data on byte boundaries (e.g. no 2 bits here and 3 bits there).

To the original poster, I would advise either learning about bitwise operations (it can be a bit confusing but it isn't all that hard), or to change your protocol so that the members of the data structure are all whole bytes.

-j

skumlerud, have you actually tried this?

Not on the Arduino, but I've done this a lot on my Ataris when accessing bitfields in the AES structures.

kg–

A quick test on Solaris (no Arduino handy) confirmed my suspicion that struct members will still occupy data on byte boundaries (e.g. no 2 bits here and 3 bits there).

I am extremely surprised to hear that. The C spec does allow the compiler to make the decision on how to align the bitfields, but I have never heard of a platform that just willy-nilly byte aligns everything. How did you come to that conclusion? What happens when you try to print sizeof(SENSOR_READING)? If the members are indeed byte aligned, is it possible that certain optimizations were disabled?

In Arduino’s case, I did an experiment and observed that:

  1. skumlerud’s entire structure is indeed contained in a single byte
  2. an array of 100 of them takes up 100 bytes as expected.
  3. furthermore, the bits are allocated from low to high, i.e. the “op” field is stored in the low 2 bits.

Perhaps my experimental sketch would be the best demonstration:

typedef struct {
  unsigned int op : 2;
  unsigned int sensor : 2;
  unsigned int dontcare : 2;
  unsigned int data : 2;
} SENSOR_READING;

void setup()
{
  Serial.begin(9600);
  Serial.println(sizeof(SENSOR_READING));
  SENSOR_READING a[100];
  Serial.println(sizeof(a));

  for (byte i=0; i<255; ++i)
  {
    SENSOR_READING s;
    memcpy(&s, &i, 1);
    Serial.print("i="); Serial.print((int)i); Serial.print(" ");
    Serial.print("op="); Serial.print((int)s.op); Serial.print(" ");
    Serial.print("sensor="); Serial.print((int)s.sensor); Serial.print(" ");
    Serial.print("dontcare="); Serial.print((int)s.dontcare); Serial.print(" ");
    Serial.print("data="); Serial.print((int)s.data); Serial.println();
  }
}
void loop(){}

yields these results

1
100
i=0 op=0 sensor=0 dontcare=0 data=0
i=1 op=1 sensor=0 dontcare=0 data=0
i=2 op=2 sensor=0 dontcare=0 data=0
i=3 op=3 sensor=0 dontcare=0 data=0
i=4 op=0 sensor=1 dontcare=0 data=0
i=5 op=1 sensor=1 dontcare=0 data=0
i=6 op=2 sensor=1 dontcare=0 data=0
i=7 op=3 sensor=1 dontcare=0 data=0
i=8 op=0 sensor=2 dontcare=0 data=0
i=9 op=1 sensor=2 dontcare=0 data=0
i=10 op=2 sensor=2 dontcare=0 data=0
i=11 op=3 sensor=2 dontcare=0 data=0
i=12 op=0 sensor=3 dontcare=0 data=0
i=13 op=1 sensor=3 dontcare=0 data=0
i=14 op=2 sensor=3 dontcare=0 data=0
i=15 op=3 sensor=3 dontcare=0 data=0
i=16 op=0 sensor=0 dontcare=1 data=0
i=17 op=1 sensor=0 dontcare=1 data=0
...

Mikal

The C spec does allow the compiler to make the decision on how to align the bitfields, but I have never heard of a platform that just willy-nilly byte aligns everything. How did you come to that conclusion? What happens when you try to print sizeof(SENSOR_READING)? If the members are indeed byte aligned, is it possible that certain optimizations were disabled?

Solaris 9, GCC v 3.3.2. No command line flags supplied, used default optimization (and default everything). Code is

#include <stdio.h>
int main(void)
{
  typedef struct {
    unsigned int op : 2;
    unsigned int sensor : 2;
    unsigned int dontcare : 2;
    unsigned int data : 2;
  } SENSOR_READING;

  SENSOR_READING sr;

  /* sizeof yields a size_t, so cast it to match %d */
  printf("struct is %d (%d) bytes\n", (int)sizeof(SENSOR_READING), (int)sizeof(sr));

  return 0;
}

the output was

struct is 4 (4) bytes

I made the assumption that it was byte aligned based on the fact it occupies 4 bytes. I could be wrong about that, and it in fact occupies the lower 8 bits of a word aligned struct. Hmm, time for another test:

#include <stdio.h>
int main(void)
{
  typedef struct {
    unsigned int op : 2;
    unsigned int sensor : 2;
    unsigned int dontcare : 2;
    unsigned int data : 2;
  } SENSOR_READING;

  union {
    SENSOR_READING sr;
    unsigned int i;
  } u;

  u.i = 0;          /* clear the bits the four fields do not cover */
  u.sr.op=3;
  u.sr.sensor=3;
  u.sr.dontcare=3;
  u.sr.data=3;

  printf("%x\n", u.i);
  return 0;
}

output was

ff000000

So, it appears my assumption was wrong: the struct apparently uses 8 contiguous bits, but the struct itself is word aligned.

I would be extremely wary about making assumptions based on the internal data representation of a user-defined type.

-j

Good analysis, j. Just out of curiosity, when you make an array of 4 structs, is the sizeof the array 16 or 4? I'd expect 4.

I would be extremely wary about making assumptions based on the internal data representation of a user-defined type.

Me too, but it seems like in this case, we can experimentally determine what the avr compiler does, and make skumlerud's proposal (or a variant thereof) work.
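A compact way to run that experiment is sketched below in plain C. It assumes the same SENSOR_READING struct used earlier in the thread; probe_op_bits is a hypothetical helper name. A result of 0x03 would mean op sits in the low two bits of the first byte (what Mikal saw on the Arduino), and 0xC0 would mean the high two bits (consistent with the ff000000 seen on Solaris). Other compilers may give something else entirely.

```c
#include <string.h>

typedef struct {
    unsigned int op       : 2;
    unsigned int sensor   : 2;
    unsigned int dontcare : 2;
    unsigned int data     : 2;
} SENSOR_READING;

/* Hypothetical probe: returns the first byte in memory of a struct
   whose op field is 3 and whose other bits are all 0. */
unsigned char probe_op_bits(void)
{
    SENSOR_READING s;
    unsigned char first;

    memset(&s, 0, sizeof s);   /* clear every bit, including padding */
    s.op = 3;                  /* set only the op field */
    memcpy(&first, &s, 1);     /* look at the first byte of the object */
    return first;
}
```

The answer only holds for the compiler and settings it was run with, so it would need to be re-checked after any toolchain change.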

Mikal

Just out of curiosity, when you make an array of 4 structs, is the sizeof the array 16 or 4? I'd expect 4.

I'd expect 16 on the Sun, since it uses a 32 bit word and is word-aligned by default. A quick experiment verified that assumption.

On the ATmega, I'd expect it to be 4 bytes, as it's an 8 bit, byte-aligned system. Maybe when I get home I'll get out my Arduino and check it out.

-j

Hi,

I see that this has caused a lot of discussion! :)

I was digesting the information you all gave me and asking myself: how am I going to transfer the structure to the other side?

There is no Serial.print for structure. Do I have to pass all the values? One by one?

Maybe what I really want is to make a simple structure, based on a byte or two, send it through serial and have it decoded on the host side.

I would like to hear your opinion about this.

What about the bitwise operation to make the same thing the way I was asking? How could I achieve that?

Many thanks,

With my best regards,

Nuno

Take a look at the chunk of code with a union in it.

Your struct happens to be small enough to fit in a short int. In the few cases where I've used this before, I used an array of characters instead.
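The array-of-characters approach might look roughly like the sketch below. The helper names reading_to_bytes and reading_from_bytes are made up for illustration, and the byte layout in the buffer is whatever the compiler chose for the struct, so both ends would have to agree on it.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned int op       : 2;
    unsigned int sensor   : 2;
    unsigned int dontcare : 2;
    unsigned int data     : 2;
} SENSOR_READING;

/* Hypothetical helper: copies the raw bytes of *s into buf and returns
   how many bytes were written. buf needs sizeof(SENSOR_READING) bytes. */
size_t reading_to_bytes(const SENSOR_READING *s, unsigned char *buf)
{
    memcpy(buf, s, sizeof *s);
    return sizeof *s;
}

/* Hypothetical helper: rebuilds a struct from bytes produced by
   reading_to_bytes on a machine with the same struct layout. */
void reading_from_bytes(SENSOR_READING *s, const unsigned char *buf)
{
    memcpy(s, buf, sizeof *s);
}
```

The bytes in buf can then be written out one at a time. This only round-trips safely between programs built with the same compiler and target.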

-j

I would be extremely wary about making assumptions based on the internal data representation of a user-defined type.

I really don’t see the problem here, unless we’re trying to interface with a specific file format or perhaps a hardware register where the actual layout of the bits is important. For internal structures it doesn’t matter how the data is represented - you’re accessing it through the struct anyway. Using a struct will make the code easier to maintain than modifying bits directly.

If we’re accessing hardware or a file that must be compatible across platforms, the code will be platform- and compiler-specific anyway. Using a struct in this case is preferable, just modify the definition of the struct as necessary when porting to another compiler or platform. If you’re using bitwise operations to manipulate bytes directly in your code, porting could be a major job.

I was digesting the information you all gave me and asking myself: how am I going to transfer the structure to the other side?

There is no Serial.print for structure. Do I have to pass all the values? One by one?

That question is really hard to answer unless you tell us what the “other side” is ;) You can do everything from just printing the individual bytes via Serial.print() and decoding them with a similar struct on the other side, to printing elaborate XML statements for the individual struct members.
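If the other side is a C program on the host, decoding could be sketched as follows. This assumes the two-byte layout from the first post (op in bits 7..6 of the first byte, sensor in bits 5..4, the top two bits of the reading in bits 1..0, and the low eight bits of the reading in the second byte). The decoded_message type and unpack_message name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical decoded form of one two-byte message. */
typedef struct {
    uint8_t  op;       /* 0..3 */
    uint8_t  sensor;   /* 0..3 */
    uint16_t reading;  /* 0..1023 */
} decoded_message;

/* Splits the two received bytes back into their fields. */
decoded_message unpack_message(const uint8_t in[2])
{
    decoded_message m;

    m.op      = (uint8_t)((in[0] >> 6) & 0x03u);               /* bits 7..6 */
    m.sensor  = (uint8_t)((in[0] >> 4) & 0x03u);               /* bits 5..4 */
    m.reading = (uint16_t)(((in[0] & 0x03u) << 8) | in[1]);    /* 2 + 8 bits */
    return m;
}
```

Because it uses only shifts and masks on received bytes, this does not depend on how either compiler lays out bitfields.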

Maybe what I really want is to make a simple structure, based on a byte or two, send it through serial and have it decoded on the host side.

That depends on how many of these structs you need on the Arduino. It only has 1 KB of RAM, so if you need 150 of these structs it will be 450 bytes with one byte per field, and still only 150 bytes with your existing format. 300 bytes makes a huge difference on the Arduino :)

What about the bitwise operation to make the same thing the way I was asking? How could I achieve that?

Something like this:

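int reading, op, sensor, data;

op = something;      // 2-bit operation code
sensor = something;  // 2-bit sensor number
data = something;    // top 2 bits of the 10-bit ADC value

reading = (op << 6);       // bits 7..6
reading |= (sensor << 4);  // bits 5..4
reading |= data;           // bits 1..0 (bits 3..2 stay 0)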

Slightly messier than using a struct IMHO.
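For the full two-byte message from the first post, the same idea could be wrapped in a small function. This is only a sketch: pack_message is a hypothetical name, and the masks are added so that out-of-range inputs cannot spill into neighbouring fields.

```c
#include <stdint.h>

/* Hypothetical helper: packs op (2 bits), sensor (2 bits) and a 10-bit
   ADC reading into the two-byte layout OOSS XXDD | DDDD DDDD. */
void pack_message(uint8_t op, uint8_t sensor, uint16_t reading, uint8_t out[2])
{
    out[0] = (uint8_t)(((op & 0x03u) << 6)        /* bits 7..6: opcode       */
                     | ((sensor & 0x03u) << 4)    /* bits 5..4: sensor       */
                     | ((reading >> 8) & 0x03u)); /* bits 1..0: reading MSBs */
    out[1] = (uint8_t)(reading & 0xFFu);          /* low 8 bits of reading   */
}
```

On current Arduino cores the two bytes could then be sent with Serial.write(out, 2). With op = 2, sensor = 1 and reading = 1023, the bytes come out as 0x93 and 0xFF.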

I really don't see the problem here, unless we're trying to interface with a specific file format or perhaps a hardware register where the actual layout of the bits is important. For internal structures it doesn't matter how the data is represented - you're accessing it through the struct anyway.

But it's not internal use only - once you start using the struct to effectively format your output, it becomes external. The application is to generate bytes for a serial protocol that gets transmitted to other devices/programs.

-j

But it's not internal use only - once you start using the struct to effectively format your output, it becomes external. The application is to generate bytes for a serial protocol that gets transmitted to other devices/programs.

As I said, the problem is only when interfacing with files of a specific format or hardware. But the problem is still there if you skip the struct and manipulate the bits directly. You still have to deal with the actual format. Using a struct for this makes the code easier to maintain and port. You would of course need to know how the compiler you're using handles structs, but as long as proper documentation exists this is not a problem.

But the problem is still there if you skip the struct and manipulate the bits directly.

No, because if you declare an unsigned byte and twiddle the high bits in it, you know full well what you are changing; it is not implementation dependent (other than endianness, but we won't go there).
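As an illustration of that point, here is a minimal sketch that replaces and reads the two high bits of a byte with explicit masks. The names OP_SHIFT, OP_MASK, set_op and get_op are made up for this example, and uint8_t from <stdint.h> stands in for the "unsigned byte".

```c
#include <stdint.h>

#define OP_SHIFT 6
#define OP_MASK  (0x03u << OP_SHIFT)   /* 0xC0: the two high bits */

/* Returns byte with its two high bits replaced by op; other bits unchanged. */
uint8_t set_op(uint8_t byte, uint8_t op)
{
    return (uint8_t)((byte & ~OP_MASK) | ((op & 0x03u) << OP_SHIFT));
}

/* Returns the value (0..3) stored in the two high bits of byte. */
uint8_t get_op(uint8_t byte)
{
    return (uint8_t)((byte & OP_MASK) >> OP_SHIFT);
}
```

The bit positions are spelled out in the source, so the result is the same on any conforming compiler.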

Using a struct for this makes the code easier to maintain and port. You would of course need to know how the compiler you're using handles structs,

You do realize these are completely contradictory statements? :)

Anything that depends on a compiler implementation detail (or worse, an optimization setting) is not portable.

-j

You do realize these are completely contradictory statements?

Anything that depends on a compiler implementation detail (or worse, an optimization setting) is not portable.

Structs are used all the time in portable C code - this doesn't make the code non-portable. Sure, how the data ends up in memory when using structs is compiler dependent, but this is not a problem. When you need to specify a specific binary data structure using a struct, there are ways to ensure consistent results.

No, because if you declare an unsigned byte and twiddle the high bits in it, you know full well what you are changing; it is not implementation dependent (other than endianness, but we won't go there).

Now you're making assumptions yourself. First of all, there's no such thing as an unsigned byte in C. The size of an int is compiler dependent - in the old days an int was two bytes in TurboC and four bytes in gcc. short int was two bytes in both. Also, different compilers will arrange data in a struct differently, depending on optimization and other design decisions.

Secondly, you can manipulate bits directly but still get in trouble when dealing with "foreign" systems. The problem is not the use of a struct or not, the problem is that you still have to know the inner details of the data you're interfacing with either way. The big difference is when you need to change things, then a struct will make life easier.