Custom ShiftOut for 74HC595

Hello I am currently working on one project in which I need to have 31 HIGH or LOW outputs. For this problem I used four 74HC595 in Daisy Chain configuration. The hardware seemed to work fine using standard shiftOut() function for times in a row, meaning that I had to shift four times to activate a certain output.
As my project advanced I am forced to use unsigned long to output at a certain pin of 595’s for example if I wanna acces pin Q3 on shift register I would shift 2^3, Q5 = 2^5 and so forth.
So I tried to write my custom function so that I would be able to shift values up to 2^32 , but I had no luck it is either there is no output on any of the pins OR all outputs are active. Here is my code would there be anyone cabable of pointing out problems that might be wrong with this code? NOTE: this code is desgined to shift bits in most significant order.

void shiftOut_32b(uint8_t dataPin,uint8_t clockPin,unsigned long ID, uint8_t delay = 0)
{
  unsigned long one = 1;
  uint8_t i;
  
 for(i=31; i>=0;i--){
digitalWrite(dataPin,(ID & (one << i))); // writes to data pin in MSB order 
 digitalWrite(clockPin, HIGH);  
   delayMicroseconds(delay); // adds a delay to the clock if needed
 digitalWrite(clockPin, LOW);
} 
}

ID variable is the position or which output should be activated as mentioned previously 2^3 = 8; ID = 8, meaning Q3 would activated.

the most important part of the code is (ID & (one << i) this part of the code shifts variable one by i-positions and uses & operator to either output 0 OR some number >0.

thanks in advance :slight_smile:

Hi, try this:

digitalWrite(dataPin, bitRead(ID, i));

Paul

there is an error in this:

uint8_t i;
for(i=31; i>=0;i–){

if i == 0 and – is applied it becomes 255 and it never stops

try this…

void shiftOut_32b(uint8_t dataPin, uint8_t clockPin, uint32_t ID, uint8_t delay = 0)
{
  uint32_t mask = 1 << 31;

  for (uint8_t i=32; i>0; i--)
  {
    digitalWrite(dataPin,  ID & mask);    // writes to data pin in MSB order 
    mask >>= 1;
    digitalWrite(clockPin, HIGH);  
    if (delay>0) delayMicroseconds(delay); // adds a delay to the clock if needed
    digitalWrite(clockPin, LOW);
  } 
}
  • uses less shifts, (64 versus ~1000) => bit faster
  • the call to delayMicroseconds() is really made conditional

give it a try

If you need a faster shiftOut you might have a look at this code - http://forum.arduino.cc/index.php?topic=184002.0 -

robtillaart:
there is an error in this:

uint8_t i;
for(i=31; i>=0;i–){

if i == 0 and – is applied it becomes 255 and it never stops

try this…

void shiftOut_32b(uint8_t dataPin, uint8_t clockPin, uint32_t ID, uint8_t delay = 0)

{
 uint32_t mask = 1 << 31;

for (uint8_t i=32; i>0; i–)
 {
   digitalWrite(dataPin,  ID & mask);    // writes to data pin in MSB order
   mask >>= 1;
   digitalWrite(clockPin, HIGH);  
   if (delay>0) delayMicroseconds(delay); // adds a delay to the clock if needed
   digitalWrite(clockPin, LOW);
 }
}



+ uses less shifts, (64 versus ~1000) => bit faster
+ the call to delayMicroseconds() is really made conditional

give it a try

Thanks for the reply. Your code is neat and I like the mask idea .But I tried it and no luck. I even checked the hardware side of things for two times, but I can only conclude that the code is not working properly. When I uploaded your code there was no behaviour what so ever. Maybe its the rate data is being pushed to the registers? Should I tinker with the delay a little bit, but by what amount?

Megamorphf:
Thanks for the reply. Your code is neat and I like the mask idea .But I tried it and no luck. I

There are couple of issues.
There should have been a warning indicating a potential issue,
but the Arduino team silences all compiler warnings from the IDE by default.
If you turn on verbose output from the compiler using
[File]->[Preferences]
I would recommend always leaving this on to avoid missing important warnings.

When enabled, you will see a warning like this:
test1.pde: In function ‘void shiftOut_32b(uint8_t, uint8_t, uint32_t, uint8_t)’:
test1.pde:297:24: warning: left shift count >= width of type

The first issue is really insidious and often trips up people (including me).
The warning mentioned above will point to this line:

  uint32_t mask = 1 << 31;

The default size in C is int. But on AVR int is 16 bits.
If you look closely, you can notice an issue. The 1 is an int by default which
on AVR is a 16 bit value but is being shifted by 31 bits.
This will not create the mask as intended it will create a mask of 0.
This should be changed to:

  uint32_t mask = 1L << 31;

The next issue is silent and is from abusing the digitalWrite() API.
The API for digitalWrite() is:
digitalWrite(pin, value), where value is either HIGH or LOW.
The value parameter is not an arbitrary zero/nonzero value, it is HIGH or LOW.
The Arduino team should have used a stronger type for this rather than a simple define, that
way code like this would hard break rather than silently fail.
This particular case is quite interesting in that there are multiple issues cascading together.
The result of the mask expression

ID & mask

is 32 bits.
Normally what would be happening when using a 32 bit or even 16 bit “value” is that
since digitalWrite() is declared as a uint8_t (8 bits), digitalWrite() will only look at the lower 8 bits.
Code that abuses the API and attempts to pass down a zero/non zero values rather than HIGH or LOW
gets burned and will fail when the value is larger than 8 bits.

The safest way to ensure library code works, is not to abuse APIs.
This ensures reliability over time and portability.

So change the code to this:

 digitalWrite(dataPin,  (ID & mask) ? HIGH : LOW);    // writes to data pin in MSB order

But even if the digitalWrite() API were not abused, it wouldn’t matter in this particular case
because of another issue, which makes this case particularly interesting.

In this case because of the 31 bit shift issue earlier, the compiler realized that
mask is zero and hence since the result of masking anything with zero
is always zero, it always passes a zero down to digitalWrite().
The optimizer optimized out shifting of mask since it will always be zero
as well as the runtime mask with ID and simply calls digitalWrite(dataPin, 0);
(Optimizers are so smart…)

To make the code work, both issues need to be corrected:

void shiftOut_32b(uint8_t dataPin, uint8_t clockPin, uint32_t ID, uint8_t delay = 0)
{
  uint32_t mask = 1L << 31;

  for (uint8_t i=32; i>0; i--)
  {
    digitalWrite(dataPin,  (ID & mask) ? HIGH : LOW);    // writes to data pin in MSB order 
    mask >>= 1;
    digitalWrite(clockPin, HIGH);  
    if (delay>0) delayMicroseconds(delay); // adds a delay to the clock if needed
    digitalWrite(clockPin, LOW);
  } 
}

In the larger picture, 32 bit values on the AVR are not efficient.
Here is the output generated by the above code:

00000648 <_Z12shiftOut_32bhhmh>:
     648:       6f 92           push    r6
     64a:       7f 92           push    r7
     64c:       8f 92           push    r8
     64e:       9f 92           push    r9
     650:       af 92           push    r10
     652:       bf 92           push    r11
     654:       cf 92           push    r12
     656:       df 92           push    r13
     658:       ef 92           push    r14
     65a:       ff 92           push    r15
     65c:       0f 93           push    r16
     65e:       1f 93           push    r17
     660:       68 2e           mov     r6, r24
     662:       76 2e           mov     r7, r22
     664:       49 01           movw    r8, r18
     666:       5a 01           movw    r10, r20
     668:       c0 2e           mov     r12, r16
     66a:       e1 2c           mov     r14, r1
     66c:       f1 2c           mov     r15, r1
     66e:       01 2d           mov     r16, r1
     670:       20 e8           ldi     r18, 0x80       ; 128
     672:       12 2f           mov     r17, r18
     674:       90 e2           ldi     r25, 0x20       ; 32
     676:       d9 2e           mov     r13, r25
     678:       60 e0           ldi     r22, 0x00       ; 0
     67a:       d8 01           movw    r26, r16
     67c:       c7 01           movw    r24, r14
     67e:       88 21           and     r24, r8
     680:       99 21           and     r25, r9
     682:       aa 21           and     r26, r10
     684:       bb 21           and     r27, r11
     686:       00 97           sbiw    r24, 0x00       ; 0
     688:       a1 05           cpc     r26, r1
     68a:       b1 05           cpc     r27, r1
     68c:       09 f0           breq    .+2             ; 0x690 <_Z12shiftOut_32bhhmh+0x48>
     68e:       61 e0           ldi     r22, 0x01       ; 1
     690:       86 2d           mov     r24, r6
     692:       0e 94 9a 06     call    0xd34   ; 0xd34 <digitalWrite>
     696:       16 95           lsr     r17
     698:       07 95           ror     r16
     69a:       f7 94           ror     r15
     69c:       e7 94           ror     r14
     69e:       87 2d           mov     r24, r7
     6a0:       61 e0           ldi     r22, 0x01       ; 1
     6a2:       0e 94 9a 06     call    0xd34   ; 0xd34 <digitalWrite>
     6a6:       cc 20           and     r12, r12
     6a8:       21 f0           breq    .+8             ; 0x6b2 <_Z12shiftOut_32bhhmh+0x6a>
     6aa:       8c 2d           mov     r24, r12
     6ac:       90 e0           ldi     r25, 0x00       ; 0
     6ae:       0e 94 ad 05     call    0xb5a   ; 0xb5a <delayMicroseconds>
     6b2:       87 2d           mov     r24, r7
     6b4:       60 e0           ldi     r22, 0x00       ; 0
     6b6:       0e 94 9a 06     call    0xd34   ; 0xd34 <digitalWrite>
     6ba:       da 94           dec     r13
     6bc:       e9 f6           brne    .-70            ; 0x678 <_Z12shiftOut_32bhhmh+0x30>
     6be:       1f 91           pop     r17
     6c0:       0f 91           pop     r16
     6c2:       ff 90           pop     r15
     6c4:       ef 90           pop     r14
     6c6:       df 90           pop     r13
     6c8:       cf 90           pop     r12
     6ca:       bf 90           pop     r11
     6cc:       af 90           pop     r10
     6ce:       9f 90           pop     r9
     6d0:       8f 90           pop     r8
     6d2:       7f 90           pop     r7
     6d4:       6f 90           pop     r6
     6d6:       08 95           ret

It might be more efficient to call an 8 bit routine 4 times.

— bill

@bperrybap
thanx for fixing

bperrybap:

Megamorphf:
Thanks for the reply. Your code is neat and I like the mask idea .But I tried it and no luck. I

There are couple of issues.
There should have been a warning indicating a potential issue,
but the Arduino team silences all compiler warnings from the IDE by default.
If you turn on verbose output from the compiler using
[File]->[Preferences]
I would recommend always leaving this on to avoid missing important warnings.

When enabled, you will see a warning like this:
test1.pde: In function ‘void shiftOut_32b(uint8_t, uint8_t, uint32_t, uint8_t)’:
test1.pde:297:24: warning: left shift count >= width of type

The first issue is really insidious and often trips up people (including me).
The warning mentioned above will point to this line:

  uint32_t mask = 1 << 31;

The default size in C is int. But on AVR int is 16 bits.
If you look closely, you can notice an issue. The 1 is an int by default which
on AVR is a 16 bit value but is being shifted by 31 bits.
This will not create the mask as intended it will create a mask of 0.
This should be changed to:

  uint32_t mask = 1L << 31;

The next issue is silent and is from abusing the digitalWrite() API.
The API for digitalWrite() is:
digitalWrite(pin, value), where value is either HIGH or LOW.
The value parameter is not an arbitrary zero/nonzero value, it is HIGH or LOW.
The Arduino team should have used a stronger type for this rather than a simple define, that
way code like this would hard break rather than silently fail.
This particular case is quite interesting in that there are multiple issues cascading together.
The result of the mask expression

ID & mask

is 32 bits.
Normally what would be happening when using a 32 bit or even 16 bit “value” is that
since digitalWrite() is declared as a uint8_t (8 bits), digitalWrite() will only look at the lower 8 bits.
Code that abuses the API and attempts to pass down a zero/non zero values rather than HIGH or LOW
gets burned and will fail when the value is larger than 8 bits.

The safest way to ensure library code works, is not to abuse APIs.
This ensures reliability over time and portability.

So change the code to this:

 digitalWrite(dataPin,  (ID & mask) ? HIGH : LOW);    // writes to data pin in MSB order

But even if the digitalWrite() API were not abused, it wouldn’t matter in this particular case
because of another issue, which makes this case particularly interesting.

In this case because of the 31 bit shift issue earlier, the compiler realized that
mask is zero and hence since the result of masking anything with zero
is always zero, it always passes a zero down to digitalWrite().
The optimizer optimized out shifting of mask since it will always be zero
as well as the runtime mask with ID and simply calls digitalWrite(dataPin, 0);
(Optimizers are so smart…)

To make the code work, both issues need to be corrected:

void shiftOut_32b(uint8_t dataPin, uint8_t clockPin, uint32_t ID, uint8_t delay = 0)

{
  uint32_t mask = 1L << 31;

for (uint8_t i=32; i>0; i–)
  {
    digitalWrite(dataPin,  (ID & mask) ? HIGH : LOW);    // writes to data pin in MSB order
    mask >>= 1;
    digitalWrite(clockPin, HIGH); 
    if (delay>0) delayMicroseconds(delay); // adds a delay to the clock if needed
    digitalWrite(clockPin, LOW);
  }
}




In the larger picture, 32 bit values on the AVR are not efficient.
Here is the output generated by the above code:


00000648 <_Z12shiftOut_32bhhmh>:
    648:      6f 92          push    r6
    64a:      7f 92          push    r7
    64c:      8f 92          push    r8
    64e:      9f 92          push    r9
    650:      af 92          push    r10
    652:      bf 92          push    r11
    654:      cf 92          push    r12
    656:      df 92          push    r13
    658:      ef 92          push    r14
    65a:      ff 92          push    r15
    65c:      0f 93          push    r16
    65e:      1f 93          push    r17
    660:      68 2e          mov    r6, r24
    662:      76 2e          mov    r7, r22
    664:      49 01          movw    r8, r18
    666:      5a 01          movw    r10, r20
    668:      c0 2e          mov    r12, r16
    66a:      e1 2c          mov    r14, r1
    66c:      f1 2c          mov    r15, r1
    66e:      01 2d          mov    r16, r1
    670:      20 e8          ldi    r18, 0x80      ; 128
    672:      12 2f          mov    r17, r18
    674:      90 e2          ldi    r25, 0x20      ; 32
    676:      d9 2e          mov    r13, r25
    678:      60 e0          ldi    r22, 0x00      ; 0
    67a:      d8 01          movw    r26, r16
    67c:      c7 01          movw    r24, r14
    67e:      88 21          and    r24, r8
    680:      99 21          and    r25, r9
    682:      aa 21          and    r26, r10
    684:      bb 21          and    r27, r11
    686:      00 97          sbiw    r24, 0x00      ; 0
    688:      a1 05          cpc    r26, r1
    68a:      b1 05          cpc    r27, r1
    68c:      09 f0          breq    .+2            ; 0x690 <_Z12shiftOut_32bhhmh+0x48>
    68e:      61 e0          ldi    r22, 0x01      ; 1
    690:      86 2d          mov    r24, r6
    692:      0e 94 9a 06    call    0xd34  ; 0xd34
    696:      16 95          lsr    r17
    698:      07 95          ror    r16
    69a:      f7 94          ror    r15
    69c:      e7 94          ror    r14
    69e:      87 2d          mov    r24, r7
    6a0:      61 e0          ldi    r22, 0x01      ; 1
    6a2:      0e 94 9a 06    call    0xd34  ; 0xd34
    6a6:      cc 20          and    r12, r12
    6a8:      21 f0          breq    .+8            ; 0x6b2 <_Z12shiftOut_32bhhmh+0x6a>
    6aa:      8c 2d          mov    r24, r12
    6ac:      90 e0          ldi    r25, 0x00      ; 0
    6ae:      0e 94 ad 05    call    0xb5a  ; 0xb5a
    6b2:      87 2d          mov    r24, r7
    6b4:      60 e0          ldi    r22, 0x00      ; 0
    6b6:      0e 94 9a 06    call    0xd34  ; 0xd34
    6ba:      da 94          dec    r13
    6bc:      e9 f6          brne    .-70            ; 0x678 <_Z12shiftOut_32bhhmh+0x30>
    6be:      1f 91          pop    r17
    6c0:      0f 91          pop    r16
    6c2:      ff 90          pop    r15
    6c4:      ef 90          pop    r14
    6c6:      df 90          pop    r13
    6c8:      cf 90          pop    r12
    6ca:      bf 90          pop    r11
    6cc:      af 90          pop    r10
    6ce:      9f 90          pop    r9
    6d0:      8f 90          pop    r8
    6d2:      7f 90          pop    r7
    6d4:      6f 90          pop    r6
    6d6:      08 95          ret




It might be more efficient to call an 8 bit routine 4 times.

--- bill

Yay it works! I really want to thank you for your help on this project if it weren’t for you I still would be scratching my head on this code. I really admire your dedication to explain things also I love that you pointed out the specific problems related to this project. As for efficiency I don’t mind about the execution speeds that much. Thanks for help Bill and I owe you a firm handshake :slight_smile:

Megamorphf:
Thanks for help Bill and I owe you a firm handshake :slight_smile:

The usual way to thank someone on this forum is to give "karma" by clicking the green "+" under their name.

PS. Its also good etiquette, when replying, to keep your quotes short & relevant, rather than quoting the entire previous post! :wink:

Paul