New and growing well-documented, feature-complete I2C device library

For a slave module, I think it would be best to model it on an I2C EEPROM:

On construction / init: the user passes the object a pointer to a preallocated buffer and the length of that buffer.
On master write: the first byte or first word (depending on whether byte or word addressing is used) parks the read/write pointer at the specified index.
Any further bytes sent are written sequentially to the buffer, starting at the pointer.
For example (using word addressing):
START 0x00 0x00 0x01 STOP
would write 0x01 to buf[0]
START 0x00 0x02 0x01 0x02 0x03 STOP
would write 0x01 to buf[0x02], 0x02 to buf[0x03], and so on.

On master read: return sequential data starting from where the pointer was parked by the previous write, incrementing the index after each byte.
On overflow (the master reads past index length - 1), keep returning the last byte.
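
A minimal sketch of the behaviour described above, assuming word (16-bit, high byte first) addressing. The class name I2CSlaveBuffer and the hook names beginWrite / onReceive / onRequest are made up for illustration and are not part of any existing library. Dropping writes that land past the end of the buffer is also an assumption, since the post only specifies the read overflow case.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical EEPROM-style slave model; names are illustrative only.
class I2CSlaveBuffer {
public:
    I2CSlaveBuffer(uint8_t* buf, size_t len) : buf_(buf), len_(len) {
        assert(buf != nullptr && len > 0);
    }

    // Call when the master addresses the slave for a write.
    void beginWrite() {
        addrBytesSeen_ = 0;
        pending_ = 0;
    }

    // Call for each byte the master writes.
    void onReceive(uint8_t b) {
        if (addrBytesSeen_ < 2) {
            // First two bytes form the index (high byte first) and park the pointer.
            pending_ = static_cast<uint16_t>((pending_ << 8) | b);
            if (++addrBytesSeen_ == 2) ptr_ = pending_;
            return;
        }
        // Data bytes are stored sequentially; writes past the end are dropped (assumption).
        if (ptr_ < len_) buf_[ptr_++] = b;
    }

    // Call for each byte the master reads.
    uint8_t onRequest() {
        if (ptr_ >= len_) return buf_[len_ - 1];  // overflow: keep returning the last byte
        return buf_[ptr_++];
    }

private:
    uint8_t* buf_;
    size_t len_;
    size_t ptr_ = 0;             // shared read/write pointer
    uint16_t pending_ = 0;       // index being assembled from the address bytes
    uint8_t addrBytesSeen_ = 0;  // how many address bytes have arrived in this write
};
```

In a real driver these three hooks would be called from the I2C peripheral's interrupt or callback handlers; how that wiring looks depends on the platform.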

I think this mostly removes the need for various custom protocols. It also has the advantage that you can pass a pointer to a struct as the byte* instead of an array, which makes it very easy to use after the initial setup.
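
To illustrate the struct idea, here is a rough sketch. RegisterMap, asBytes and writeByteAt are hypothetical names, and the field layout is just an example. The static_asserts guard against compiler padding, and the byte order of the 16-bit field follows the slave's CPU, so the master has to agree on it.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical register map exposed to the master as a flat byte buffer.
struct RegisterMap {
    uint8_t  mode;      // index 0
    uint8_t  flags;     // index 1
    uint16_t setpoint;  // index 2..3, byte order follows the slave's CPU
};

// Fail at compile time if the compiler inserted padding we did not expect.
static_assert(sizeof(RegisterMap) == 4, "unexpected padding in RegisterMap");
static_assert(offsetof(RegisterMap, setpoint) == 2, "unexpected field offset");

// View the struct as the byte buffer handed to the slave object.
inline uint8_t* asBytes(RegisterMap& regs) {
    return reinterpret_cast<uint8_t*>(&regs);
}

// Emulates the effect of a master writing one byte at a given index.
inline void writeByteAt(RegisterMap& regs, size_t index, uint8_t value) {
    assert(index < sizeof(RegisterMap));
    asBytes(regs)[index] = value;
}
```

The application code then just reads regs.mode or regs.setpoint by name, while the master addresses the same data by index.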

Another suggestion is to include a detect method for devices with multiple possible addresses. This method would return true if the address byte is ACKed, and false otherwise - that way the library can dynamically find the actual address to use.

For example:
Device can be at 0x4A or 0x4B

0x4A(W) -> no ACK, move to 0x4B
0x4B(W) -> ACK, use 0x4B for future communications.
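
The probing loop could look roughly like this. detectAddress is a hypothetical helper, and the bus access is passed in as a probe callable so the sketch does not assume any particular I2C API. On Arduino, for instance, the probe could wrap Wire.beginTransmission(addr) followed by Wire.endTransmission() == 0.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: tries each candidate 7-bit address in order and
// returns the first one that ACKs, or -1 if none respond.
// `probe(addr)` must return true when the address byte is ACKed.
template <typename Probe>
int detectAddress(const uint8_t* candidates, size_t count, Probe probe) {
    assert(candidates != nullptr || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (probe(candidates[i])) {
            return candidates[i];  // ACK: use this address from now on
        }
        // no ACK: move on to the next candidate
    }
    return -1;
}
```

The returned address would be stored by the device object and used for all later transfers.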