bytesVal |= (bitVal << ((DATA_WIDTH-1) - i));
bytesVal <<= 1; bytesVal |= bitVal;
I've looked at that code 10 times and I can't see any reason why it would work for 2 chips and not for 3.
byte bitVal;
long bitVal;
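For anyone who lands here later, here is a rough sketch of why the type of bitVal could matter. On a board where int is 16 bits, a byte is promoted to int before the shift, so a shift count of 16 or more is undefined and the bit is effectively lost; with 2 chips the largest shift is 15, with 3 chips it reaches 23. The names NUMBER_OF_CHIPS, packWide, packShiftIn and packNarrow are made up for illustration, and packNarrow only models the lost bits as zero so the effect can be reproduced on a desktop compiler. It is not what the AVR compiler is guaranteed to produce.

```cpp
#include <cstdint>

// Hypothetical setup: three 8-bit shift registers chained together.
constexpr int NUMBER_OF_CHIPS = 3;
constexpr int DATA_WIDTH = NUMBER_OF_CHIPS * 8;

// Shift each bit into its final position using a 32-bit operand,
// which is the effect of declaring bitVal as long.
uint32_t packWide(const uint8_t *bits) {
    uint32_t bytesVal = 0;
    for (int i = 0; i < DATA_WIDTH; i++) {
        uint32_t bitVal = bits[i] & 1u;
        bytesVal |= (bitVal << ((DATA_WIDTH - 1) - i));
    }
    return bytesVal;
}

// Alternative: shift the accumulator left by one and OR in the new bit.
// The shift is done on the 32-bit accumulator, so the width of bitVal
// does not matter here.
uint32_t packShiftIn(const uint8_t *bits) {
    uint32_t bytesVal = 0;
    for (int i = 0; i < DATA_WIDTH; i++) {
        uint8_t bitVal = bits[i] & 1u;
        bytesVal <<= 1;
        bytesVal |= bitVal;
    }
    return bytesVal;
}

// Model of a 16-bit shift: any bit that would land at position 16 or
// above is treated as lost (an assumption; the real behavior is undefined).
uint32_t packNarrow(const uint8_t *bits) {
    uint32_t bytesVal = 0;
    for (int i = 0; i < DATA_WIDTH; i++) {
        int shift = (DATA_WIDTH - 1) - i;
        uint16_t shifted = 0;
        if (shift < 16) {
            shifted = static_cast<uint16_t>((bits[i] & 1u) << shift);
        }
        bytesVal |= shifted;
    }
    return bytesVal;
}
```

With only the first and last of the 24 input bits set, packWide and packShiftIn agree with each other, while packNarrow drops the bit that belongs to the third chip.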