That is right: the Slave does not know how many bytes the Master is going to request. That is part of the normal I2C protocol.
This is almost never a problem, since the Master and the Slave should agree on a known data packet format in advance.
The Slave does have the possibility to reply with fewer bytes than requested.
For example, when a Master requests 10 bytes and the Slave decides to write only 2 bytes, the Master should check the number of bytes that were actually received.
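
The check on the Master side can be sketched with a small toy model in plain C++. This is not the Wire library or real bus code: `ToySlave`, `requestFrom` and `readPacket` are hypothetical names made up for this illustration, and the model simply assumes the Master can see how many bytes the Slave supplied. How a short reply shows up on real hardware depends on the library and platform (on some, the missing bytes may just read as 0xFF).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a Slave: it only knows what it has queued to send,
// not how many bytes the Master is going to ask for.
struct ToySlave {
    std::vector<std::uint8_t> reply;  // bytes the Slave decided to write
};

// Hypothetical helper: copies at most `requested` bytes into `out`
// and returns how many bytes the Slave actually supplied.
std::size_t requestFrom(const ToySlave& slave, std::size_t requested,
                        std::vector<std::uint8_t>& out) {
    const std::size_t n = std::min(requested, slave.reply.size());
    out.assign(slave.reply.begin(),
               slave.reply.begin() + static_cast<std::ptrdiff_t>(n));
    return n;
}

// Master-side check: accept the packet only if it has the agreed length.
bool readPacket(const ToySlave& slave, std::size_t expected,
                std::vector<std::uint8_t>& out) {
    return requestFrom(slave, expected, out) == expected;
}
```

With a Slave that queued 2 bytes, `requestFrom(slave, 10, buf)` returns 2 in this model, so `readPacket(slave, 10, buf)` reports a failed read instead of treating the short reply as a full packet.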
Another example is the I2C EEPROM. The Master can request just one byte from the EEPROM or many bytes. During the I2C session, the EEPROM keeps sending bytes, as long as the Master keeps requesting new data.
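
The EEPROM behaviour can be sketched the same way. This is a toy model, not a driver for any specific chip: `ToyEeprom` and `masterRead` are hypothetical names, the 8-byte memory size is arbitrary, and the roll-over at the end of memory is an assumption that matches many serial EEPROMs but should be checked in the datasheet of the actual part.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy EEPROM: it never learns the length of a read in advance.
// It just hands out the next byte each time the Master asks for one.
struct ToyEeprom {
    std::array<std::uint8_t, 8> cells{};  // tiny hypothetical memory
    std::size_t pointer = 0;              // internal address pointer

    // Called once for every byte the Master clocks out.
    std::uint8_t nextByte() {
        const std::uint8_t value = cells[pointer];
        pointer = (pointer + 1) % cells.size();  // assumed roll-over at the end
        return value;
    }
};

// The Master alone decides how long the read is: it sets the start
// address and then keeps requesting bytes until it has `count` of them.
std::vector<std::uint8_t> masterRead(ToyEeprom& eeprom, std::size_t start,
                                     std::size_t count) {
    eeprom.pointer = start % eeprom.cells.size();
    std::vector<std::uint8_t> data;
    for (std::size_t i = 0; i < count; ++i) {
        data.push_back(eeprom.nextByte());
    }
    return data;
}
```

The same `ToyEeprom` serves a one-byte read and a multi-byte read without any change on its side; only the `count` chosen by the Master differs.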