Bit-bang SPI question about the receiving part.

The code below works fine for a mode 0 device. The device wants an idle byte between sending a read command and reading the result.
What puzzles me is that I have to start reading the result on MISO BEFORE sending the first clock pulse of the third byte. If I were using hardware SPI, I would just send the three bytes and the result would be in SPDR. In other words, the slave clocks out its first data bit on the falling edge of the 16th clock (the last clock of the second byte).

Does this make sense, and am I just over-thinking it? Maybe the way to think about it is that the AVR samples MISO on the rising edge of the 17th clock (for hardware SPI)?

  temp=cmd;
  for(i=0;i<8;i++){//first write the command
    digitalWrite(mosi,(temp&0x80));      //by setting mosi for each bit
    digitalWrite(sck,HIGH);              //then pulsing the clock
    digitalWrite(sck,LOW);
    temp<<=1;
  }
  for(i=0;i<8;i++){  //now we send 8 clock pulses while the enc thinks
    digitalWrite(sck,HIGH);
    digitalWrite(sck,LOW);
  }
  result=0;
  for(i=0;i<8;i++){  //now we read the data back
    result=(result<<1)|digitalRead(miso); //by grabbing miso for each bit
    digitalWrite(sck,HIGH);                //then pulsing the clock
    digitalWrite(sck,LOW);
  }
  return(result);
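
To sanity-check the timing argument without the hardware, here is a minimal host-side sketch in plain C. It assumes a made-up mode 0 slave that ignores the command and changes MISO only on falling edges, starting with the 16th one. The names slave_t, set_sck, transfer and sampling_points_agree are hypothetical and exist only for this illustration; this is not a model of the real device. It compares reading MISO just before the rising edge (as in the loop above) with reading it just after the rising edge (where mode 0 hardware SPI samples).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mode 0 slave: ignores the command and, after 16 clocks,
   shifts out one fixed byte MSB first, changing MISO on falling edges. */
typedef struct {
    uint8_t response;    /* byte the slave will send back */
    int sck;             /* last clock level seen */
    int falling_edges;   /* falling edges counted so far */
    int miso;            /* level currently driven on MISO */
} slave_t;

/* Drive the clock line; the slave model reacts to falling edges only. */
static void set_sck(slave_t *s, int level)
{
    assert(level == 0 || level == 1);
    if (s->sck == 1 && level == 0) {
        s->falling_edges++;
        int bit = s->falling_edges - 16;   /* 0 right after the 16th clock */
        if (bit >= 0 && bit < 8)
            s->miso = (s->response >> (7 - bit)) & 1;
    }
    s->sck = level;
}

/* Send 16 clocks (command + idle byte; MOSI is not modelled), then read 8 bits.
   sample_while_high == 0: read MISO before the rising edge (as in the post).
   sample_while_high == 1: read MISO after the rising edge (hardware style). */
static uint8_t transfer(slave_t *s, int sample_while_high)
{
    uint8_t result = 0;
    for (int i = 0; i < 16; i++) {
        set_sck(s, 1);
        set_sck(s, 0);
    }
    for (int i = 0; i < 8; i++) {
        if (!sample_while_high)
            result = (uint8_t)((result << 1) | s->miso);
        set_sck(s, 1);
        if (sample_while_high)
            result = (uint8_t)((result << 1) | s->miso);
        set_sck(s, 0);
    }
    return result;
}

/* Returns 1 if both sampling points recover the slave's byte. */
static int sampling_points_agree(uint8_t response)
{
    slave_t a = { response, 0, 0, 0 };
    slave_t b = { response, 0, 0, 0 };
    return transfer(&a, 0) == response && transfer(&b, 1) == response;
}
```

In this model sampling_points_agree(0xA5) returns 1: MISO holds the same bit from one falling edge until the next, so reading it just before or just after the rising edge of the 17th clock gives the same value. That would be consistent with the bit-bang loop and hardware SPI agreeing, under the stated assumptions about the slave.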