Bug in twi.c

Hello..
there is a bug in twi.c which effectively disconnects TWI/I²C after an error in Slave Receiver mode.

The bug is in SIGNAL(TWI_vect) at case label TW_SR_DATA_NACK/TW_SR_GCALL_DATA_NACK

the wrong code is:
~~ // nack back at master~~
~~ twi_reply(0);~~

This should be:
twi_reply(1);

Explanation: See ATmegaxxx Documentation (doc8161.pdf) from Atmel. In Fig. 21-16 on page 236 the states $88 (TW_SR_DATA_NACK) and $98 (TW_SR_GCALL_DATA_NACK) are final states. There is nothing to "nack back at master". This has already been done! Unlike the $80 (TW_SR_DATA_ACK) and $90 (TW_SR_GCALL_DATA_ACK) states, there will be no $A0 state (TW_SR_STOP) to terminate. Table 21-4. on page 235 shows that if we now leave TWEA off, we will never be able again to be selected as receiver. Calling twi_reply(1) will set TWEA and thus allow further selection as slave.

Also SIGNAL is deprecated, the ISR() directive should be used.

Added to the issue list: Google Code Archive - Long-term storage for Google Code Project Hosting.

aweiler:
(TW_SR_GCALL_DATA_NACK) are final states. There is nothing to "nack back at master". This has already been done! Unlike the $80 (TW_SR_DATA_ACK) and $90 (TW_SR_GCALL_DATA_ACK) states, there will be no $A0 state (TW_SR_STOP) to terminate. Table 21-4. on page 235 shows that if we now leave TWEA off, we will never be able again to be selected as receiver. Calling twi_reply(1) will set TWEA and thus allow further selection as slave.

The Atmel I2C architecture is so that when we set or clear the TWEA bit, this will determine the response to the next byte received. When the ISR is called, acknowledge (positive or negative) has already been sent and so we can only influence what will happen next time around. In short we can say that responses are deferred by one byte.

In this context it is true that leaving the TWEA bit cleared will prevent a slave from being addressed. That is, the address will be recognized, but the call will be actively rejected (NAK will be returned).

It is not so however that TW_SR_GCALL_DATA_NACK and TW_SR_DATA_NACK are final states. The master will still issue a bus stop condition and the slave receiver will catch this (TW_SR_STOP), enable TWEA and so be ready to accept new requests. Without a final stop condition the bus will be considered busy and so no communication between any I2C devices on the bus could take place. Figure 21-16 in the datasheet shows how this is handled.

If what you claim (no stop condition issued after data nack) is true, then there would also have been an issue with clearing TWEA for the buffer overflow condition in response to TW_SR_DATA_ACK (this is also what must happen for TW_SR_DATA_NACK to ever be called). This is however not the case as a stop condition will always follow.

In summary I think this is a non issue and implementing the proposed modification is not correct. Clearing TWEA has the effect of cutting the session short in cases when the receiver buffer overflows and I trust this is both intentional and desired.

BenF:
It is not so however that TW_SR_GCALL_DATA_NACK and TW_SR_DATA_NACK are final states. The master will still issue a bus stop condition and the slave receiver will catch this (TW_SR_STOP), enable TWEA and so be ready to accept new requests. Without a final stop condition the bus will be considered busy and so no communication between any I2C devices on the bus could take place. Figure 21-16 in the datasheet shows how this is handled.

I attached a screen shot of the figure in question. I added two big red arrows pointing at the states I mean. Yes it is true that a STOP is sent, but the fact that a stop is sent does not imply that a TW_SR_STOP ($A0) is reported in TWSR. This is the case after $80/$90. In the $88/$98 case, there is no state reported. $88/$98 is the last thing you see in TWSR. This is what I mean with "Final state". Afterwards there is $F8, which means "No relevant state information available"

BenF:
If what you claim (no stop condition issued after data nack) is true, then there would also have been an issue with clearing TWEA for the buffer overflow condition in response to TW_SR_DATA_ACK (this is also what must happen for TW_SR_DATA_NACK to ever be called). This is however not the case as a stop condition will always follow.

Not always. As mentioned above, a TW_SR_STOP is seen in TWSR only in the cases where it is shown in the figure.

I made this change in my code and it works perfectly. The TW_SR_DATA_ACK is not affected by this change.

If I understand you correct, you suggest that the absence of $A0 in the NACK path on figure 21-16 should be taken literally.

Do you have any test/trace to show that this is the case or a reference in the Atmel datasheet text to support this (other sources)?

Also consider the following:

    case TW_SR_DATA_ACK:       // data received, returned ack
    case TW_SR_GCALL_DATA_ACK: // data received generally, returned ack
      // if there is still room in the rx buffer
      if(twi_rxBufferIndex < TWI_BUFFER_LENGTH){
        // put byte in buffer and ack
        twi_rxBuffer[twi_rxBufferIndex++] = TWDR;
        twi_reply(1);
      }else{
        // otherwise nack
        twi_reply(0);
      }
      break;

In the fragment above, the else case will set up for NACK to be returned provided that master actually sends another byte (one byte is already lost in this case even though it has been acknowledged).

Next step would be for master to send another byte (in which case TW_SR_DATA_NACK will be called) or the master may send a stop condition. Have you considered both of these scenarios?

I set up a master/slave test and found that aweiler is correct when pointing out that the TWI/ISR is indeed not called once NACK has been returned in response to slave receive data. The issue then is that our slave receiver remains disabled and so further communication from master to slave is no longer possible.

The condition that triggers this issue is as follows (TWI_BUFFER_LENGTH as configured for slave):

  • Enable twi for slave receive
  • Master sends “TWI_BUFFER_LENGTH + 2” bytes or more to slave
  • Receiver is now disabled

A fix for this might be as follows:

    case TW_SR_DATA_NACK:       // data received, returned nack
    case TW_SR_GCALL_DATA_NACK: // data received generally, returned nack
      // reset buffer
      twi_rxBufferIndex = 0;
      // ack future responses and leave slave receiver state
      twi_releaseBus();
      break;

The above patch will reset state and allow communication as either slave, master or both to continue.

In the case when “TWI_BUFFER_LENGTH + 1” bytes are received, one byte will be lost with no error returned to master. A fix for this additional issue may be as follows:

    case TW_SR_DATA_ACK:       // data received, returned ack
    case TW_SR_GCALL_DATA_ACK: // data received generally, returned ack
      // if there is still room in the rx buffer
      if(twi_rxBufferIndex < TWI_BUFFER_LENGTH)
          twi_rxBuffer[twi_rxBufferIndex++] = TWDR;	// put byte in buffer

      // can we accept additional bytes
      if(twi_rxBufferIndex < TWI_BUFFER_LENGTH)
	  twi_reply(1);	// yes, ack additional bytes
      else
      	  twi_reply(0);	// no, nack if master sends additional byte
      break;

The above patch will return an error to master whenever number of bytes received exceeds maximum slave receiver buffer capacity.

BenF:

      twi_rxBufferIndex = 0;

twi_releaseBus();

OK, this is much better.

Also, you asked for a trace. Sorry, I cannot provide this, but it seems this is no longer necessary.