SPI library for ATtiny MCUs

I wrote an SPI master library for ATtiny44/84 and ATtiny45/85 MCUs that uses the ATtiny USI hardware module. As compared with using shiftOut(), it's faster, smaller and provides a much more consistent bit clock. All feedback welcome.

Looks good, Jack.

Can I put a link to it on my SPI page?

Excellent work!

https://github.com/JChristensen/tinySPI/blob/master/tinySPI.cpp#L23
You are setting all the bits. I suggest something more “direct”…
USICR = _BV(USIWM0) | _BV(USICS1) | _BV(USICLK);
My preference is this so as to document the value of every bit…
USICR =
(0 << USISIE) |
(0 << USIOIE) |
(0 << USIWM1) | (1 << USIWM0) |
(1 << USICS1) | (0 << USICS0) |
(1 << USICLK) |
(0 << USITC);

https://github.com/JChristensen/tinySPI/blob/master/tinySPI.cpp#L24
Did you mean to strobe the clock in begin?

https://github.com/JChristensen/tinySPI/blob/master/tinySPI.cpp#L42
Is it necessary to block interrupts / ensure a consistent clock?

Have you considered moving the code (function bodies) to the header file? That would allow the compiler to inline most / all of the code.

I admit I’ve done the same, but for a reference implementation, remember the rules about underscores:

In C++ these variable names are reserved:

Reserved in any scope, including for use as implementation macros:

  • identifiers beginning with an underscore and an uppercase letter
  • identifiers containing adjacent underscores (or “double underscore”)

I’ve been gradually migrating my use of the _BV macro to the bit macro, eg.

  USICR = bit (USIWM0) | bit (USICS1) | bit (USICLK);

My preference is this so as to document the value of every bit…

As for that, I’m not sure.

When you are asked what you had for dinner, do you say what you didn’t have?

  • We had fish.
  • And chips.
  • No pasta.
  • No pizza.
  • No beef.
  • No chicken.
      (0 << USIWM1) | (1 << USIWM0) |

I look at that and think “he used USIWM1 and USIWM0”. Oh, no, wait. He didn’t use USIWM1. It’s adding to the workload.

Nick, absolutely, and thanks to you both very much! I'll try to explain my thinking a bit, please give me any further thoughts, your input is very much appreciated!

https://github.com/JChristensen/tinySPI/blob/master/tinySPI.cpp#L23
You are setting all the bits. I suggest something more "direct"...

I think Nick caught it, I am preserving the value of USIWM1. This is in case the user previously set SPI_MODE1, then did end() then another begin(), I wanted that setting to stick. That's probably a long shot; initializing USICR could have been coded more simply otherwise. I debated this for a while, but that's where I ended up. Of course, begin() could be documented as "resets the data mode to SPI_MODE0". Probably here and elsewhere I should add some more comments.

https://github.com/JChristensen/tinySPI/blob/master/tinySPI.cpp#L24
Did you mean to strobe the clock in begin?

I just meant to initialize the registers and pins in begin(). The example code in the datasheet loads the data register and then initializes USICR and strobes the clock immediately before the transfer loop, so that works there, but I did all the strobing in the transfer loop in the interest of consistency, although at the cost of one more trip through the loop.

https://github.com/JChristensen/tinySPI/blob/master/tinySPI.cpp#L42
Is it necessary to block interrupts / ensure a consistent clock?

Every once in a while I'd see a clock that was maybe ten times longer. I assumed this may have been the millis() timer overflow interrupt handler. That seemed like a long time (100-200µs with a 1MHz clock) but I didn't dig into it. Would you expect the overflow handler to take that long?

As I think about this some more, I'm not sure why a slave would care about the occasional stretched clock and I'd of course rather not inhibit interrupts. What are your thoughts? OTOH when I looked at shiftOut(), its clock is all over the board, amazingly inconsistent.

Have you considered moving the code (function bodies) to the header file? That would allow the compiler to inline most / all of the code.

I was not aware of that, sounds like something I should learn about.

I've been gradually migrating my use of the _BV macro to the bit macro

Is that your macro? I like it, I always thought _BV looked a "bit" clunky.
EDIT: Duh... That'd be Arduino's macro.

Here are examples of the SPI clock timings I’m seeing without inhibiting interrupts. These come from an input capture sketch running on a 328P*. For bit 0, the value in the count column is just the raw timer value, for bits 1-7 it’s the delta. Less than one percent of the bits get stretched with USI; shiftout() has a little different profile, more frequent stretches but not as extreme as compared to the average.

  • I’ve been wanting one of those logic analyzers, this may be justification :wink:

Could you post an example of this usage so I can reproduce it?

Never mind, I see you provided examples.

Let me ask this, Jack. What do you mean by "stretched clock" exactly?

My preliminary analysis does not indicate this. What are you measuring?

OK, I've spotted it.

The clock is stretched after the first clock bit. As you realize no doubt, because the clock bits are clocked out in the tight timing loop there is a chance for an interrupt to occur during it.

uint8_t tinySPI::transfer(uint8_t spiData)
{
    USIDR = spiData;
    USISR = _BV(USIOIF);                //clear counter and counter overflow interrupt flag
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE) { //ensure a consistent clock period
        while ( !(USISR & _BV(USIOIF)) ) USICR |= _BV(USITC);
    }
    return USIDR;
}

It is stretched by around 13 µS which sounds about right for a Timer 0 interrupt.

Since the time taken to clock out a byte is around 9 µS I am inclined to leave it how you had it. A no-interrupt pause to clock out a single byte would not inhibit the Timer 0 timing noticeably, and it is not as if you are going to be doing much else.

Unless your pin mappings are different to mine, aren't you using the same pin for MOSI and SS?

// ATMEL ATTINY 25/45/85 / ARDUINO
//
//                  +-\/-+
// Ain0 (D 5) PB5  1|    |8  Vcc
// Ain3 (D 3) PB3  2|    |7  PB2 (D 2) Ain1  - SCK  (USCK)
// Ain2 (D 4) PB4  3|    |6  PB1 (D 1) pwm1  - MISO (DO)
//            GND  4|    |5  PB0 (D 0) pwm0  - MOSI (DI)
//                  +----+

You used D0 for LATCH:

//pin definitions
const int LATCH_PIN = 0;           //storage register clock (slave select)
const int DATA_PIN = 1;            //data in
const int CLOCK_PIN = 2;           //shift register clock

The others agree (SCL = D2, MISO = D1)

In fact I think the terminology is confusing there:

const int LATCH_PIN = 0;           //storage register clock (slave select)

I call slave select "the thing brought low to start a transmission". However "register clock" would be SCK wouldn't it?


tinySPI::tinySPI() 
{
}

If the constructor doesn't do anything you can omit it.

Nick, it may not technically be a stretched clock pulse, it may be a stretched wait between two pulses, which is what your image shows. At any rate, it can only be seen if the ATOMIC_BLOCK is removed. Pin definitions and all are the same as yours.

I’m now leaning towards taking it out, and worrying about it if and when it actually causes a problem.

Thanks! Gotta get one of those Saleae analyzers!

Well, that stretched clock is much longer than the whole byte, so I am inclined to leave it how you had it. Putting the ATOMIC_BLOCK in stops the stretching and would only have a marginal side-effect.

Yeah, I don't know that it'd really cause an issue but it just seems cleaner to have the consistency. I'm good with it either way, we can pick one and see if it causes any problems down the road, burn that bridge when we get to it :smiley:

So your feeling on having interrupts inhibited for ~80µs is that that's not too long?

Coding Badly, any strong feelings one way or the other?

It wasn’t 80 µS by my measurements, more like 10 µS.

Given what I observed I would rather you leave the class like it was. The occasional delay of Timer 0 firing is insignificant, especially as it has a whole 1000 µS to fire and not lose a tick.

Nine or ten µS sounds about right for your loop:

------------------------------------------------------------
------------------------------------------------------------
uint8_t tinySPI::transfer(uint8_t spiData)
{
    USIDR = spiData;
 110:   6f b9           out     0x0f, r22       ; 15   (1)
    USISR = _BV(USIOIF);                //clear counter and counter overflow interrupt flag
 112:   80 e4           ldi     r24, 0x40       ; 64   (1)
 114:   8e b9           out     0x0e, r24       ; 14   (1)
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE) { //ensure a consistent clock period
 116:   8f b7           in      r24, 0x3f       ; 63   (1)
    return 1;
}

static __inline__ uint8_t __iCliRetVal(void)
{
    cli();
 118:   f8 94           cli   (1)
 11a:   01 c0           rjmp    .+2             ; 0x11e <_ZN7tinySPI8transferEh+0xe>   (2)
        while ( !(USISR & _BV(USIOIF)) ) USICR |= _BV(USITC);
 11c:   68 9a           sbi     0x0d, 0 ; 13   (2)
 11e:   76 9b           sbis    0x0e, 6 ; 14   (1/2/3)
 120:   fd cf           rjmp    .-6             ; 0x11c <_ZN7tinySPI8transferEh+0xc>   (2)
    (void)__s;
}

static __inline__ void __iRestore(const  uint8_t *__s)
{
    SREG = *__s;
 122:   8f bf           out     0x3f, r24       ; 63   (1)
    }
    return USIDR;
 124:   8f b1           in      r24, 0x0f       ; 15   (1)
}
 126:   08 95           ret   (4)

Clock cycles in brackets. Assuming you are running at 8 MHz, the inner loop will be 16 * 5 cycles * 125 nS (10 µS).

I had something similar here:

http://www.gammon.com.au/forum/?id=10892&reply=5#reply5

That's good enough for me, will leave as is. It does need a few more comments though.

I was running at 1MHz so that's the difference. I counted ten cycles for the transfer loop, and also calibrated the internal osc on the ATtiny using Coding Badly's TinyTuner, and my input capture sketch reported 160 timer counts (with its timer @ 16MHz) for each clock cycle, which works out to 62.5ns * 160 = 10µs ... pretty neat! :smiley:

Don't know why I didn't think to check your site, you have just about everything there! I did search just a bit for similar libraries and was surprised to not find more. Always fun to roll your own so I just went for it.

I counted five.

        while ( !(USISR & _BV(USIOIF)) ) USICR |= _BV(USITC);
 11c:   68 9a           sbi     0x0d, 0 ; 13   (2)
 11e:   76 9b           sbis    0x0e, 6 ; 14   (1/2/3)
 120:   fd cf           rjmp    .-6             ; 0x11c <_ZN7tinySPI8transferEh+0xc>   (2)

Since the sbis does not skip usually that is one cycle, plus the other four.

You haven’t commented on my remarks about the pin allocations.

I didn't do the Attiny84, which you did. I've got four of them too, but I suspect they are untouched so far. :wink: