Thinking out loud:
Quickly going through this thread I had the following thoughts:
(datasheet)
The DHT21 measures temperature -40..80 C (+- 1C)
The DHT21 measures relative humidity 0..100% (+- 3%)
Time between two measurements >= 1.7 seconds.
Given the accuracy one needs just 1 byte for T and 1 byte for H.
This even allows 0.5 C and 0.5% steps if one wants to, although given the accuracy that is not meaningful.
For T one needs an offset of -40, i.e. one stores T + 40 so that -40 C maps to 0.
==> 2 bytes per measurement (in fact some values are not used YET)
So storage looks like [T][H][T][H][T][H][T][H][T][H][T][H].... etc
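A minimal sketch of that one-byte encoding, assuming the optional 0.5 steps mentioned above. The function names are made up for illustration and do not come from any DHT library; out-of-range readings are simply clamped.

```cpp
#include <stdint.h>
#include <math.h>

// Hypothetical helpers, not part of any DHT library.
// Temperature -40..80 C in 0.5 C steps -> codes 0..240 (241..255 stay unused).
uint8_t encodeTemp(float celsius) {
  if (celsius < -40.0f) celsius = -40.0f;   // clamp to the sensor range
  if (celsius > 80.0f) celsius = 80.0f;
  return (uint8_t)lroundf((celsius + 40.0f) * 2.0f);  // apply the offset, then scale
}

float decodeTemp(uint8_t code) {
  return code / 2.0f - 40.0f;
}

// Relative humidity 0..100 % in 0.5 % steps -> codes 0..200.
uint8_t encodeHum(float percent) {
  if (percent < 0.0f) percent = 0.0f;
  if (percent > 100.0f) percent = 100.0f;
  return (uint8_t)lroundf(percent * 2.0f);
}

float decodeHum(uint8_t code) {
  return code / 2.0f;
}
```

With these, 21.5 C becomes the byte 123 and 55% becomes 110, so one measurement is the pair [123][110].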
Once one has a number of measurements one can check whether run length compression applies: if multiple consecutive measurements are the same they can be compressed. This can be encoded as [Run length flag][length][T][H] or short [R][L][T][H].
As this takes 4 bytes, run length compression pays off as soon as there are 3 or more consecutive measurements that are the same (2 identical measurements take 4 bytes either way).
As we do not know the actual measurements it is impossible to calculate the savings in storage, but they could be substantial (which would be an indication you measure too often).
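The [R][L][T][H] idea could look roughly like this. It is only a sketch under one assumption of mine: the flag byte is 0xFF, which is free if T is stored in 0.5 C steps (valid codes 0..240). The names rleEncode and rleDecode are hypothetical.

```cpp
#include <stdint.h>
#include <stddef.h>

// Assumption: 0xFF never occurs as a temperature code (valid codes are 0..240).
const uint8_t RLE_FLAG = 0xFF;

// Compresses n [T][H] pairs from in[] (2*n bytes) into out[].
// Returns the number of bytes written; out must hold at least 2*n bytes.
size_t rleEncode(const uint8_t* in, size_t n, uint8_t* out) {
  size_t w = 0;
  size_t i = 0;
  while (i < n) {
    uint8_t t = in[2 * i];
    uint8_t h = in[2 * i + 1];
    size_t run = 1;  // count identical consecutive pairs, at most 255
    while (i + run < n && run < 255 &&
           in[2 * (i + run)] == t && in[2 * (i + run) + 1] == h) {
      ++run;
    }
    if (run >= 3) {  // 4 bytes instead of 6 or more
      out[w++] = RLE_FLAG;
      out[w++] = (uint8_t)run;
      out[w++] = t;
      out[w++] = h;
    } else {         // 1 or 2 pairs: store them raw
      for (size_t k = 0; k < run; ++k) {
        out[w++] = t;
        out[w++] = h;
      }
    }
    i += run;
  }
  return w;
}

// Expands len bytes produced by rleEncode back into [T][H] pairs.
// Returns the number of bytes written to out.
size_t rleDecode(const uint8_t* in, size_t len, uint8_t* out) {
  size_t w = 0;
  size_t i = 0;
  while (i + 1 < len) {
    if (in[i] == RLE_FLAG && i + 3 < len) {
      for (uint8_t k = 0; k < in[i + 1]; ++k) {
        out[w++] = in[i + 2];
        out[w++] = in[i + 3];
      }
      i += 4;
    } else {
      out[w++] = in[i];
      out[w++] = in[i + 1];
      i += 2;
    }
  }
  return w;
}
```

For example, four identical measurements followed by two other identical ones go from 12 bytes to 8: one [R][L][T][H] record plus two raw pairs.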
A second way of saving storage could be that one makes a measurement every 2 seconds and adds it to a circular buffer (size 30, so it holds the last minute).
Once a minute the average of that buffer is calculated and stored in a second circular buffer (size 60) for the last hour.
Once per 30 minutes the average of the last 30 minute values is calculated and stored in a 3rd buffer (so 48 samples per day).
So 3 buffers: one high frequency last minute buffer, one medium frequency last hour buffer and one low frequency half hour buffer. Only the last one needs to be stored in EEPROM.
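The three buffers could be sketched as below. This is just one way to do it; Ring and TieredLog are names I made up, it handles one channel (so you would use one instance for T and one for H), and the actual EEPROM write is left out.

```cpp
#include <stdint.h>

// Fixed-size circular buffer that can average its most recent entries.
template <uint8_t N>
struct Ring {
  float data[N];
  uint8_t head = 0;   // next write position
  uint8_t count = 0;  // number of valid entries (<= N)

  void push(float v) {
    data[head] = v;
    head = (head + 1) % N;
    if (count < N) ++count;
  }

  // Mean of the k most recent entries (fewer if the buffer holds less).
  float meanLast(uint8_t k) const {
    if (k > count) k = count;
    if (k == 0) return 0.0f;
    float sum = 0.0f;
    for (uint8_t i = 1; i <= k; ++i) {
      sum += data[(head + N - i) % N];
    }
    return sum / k;
  }
};

struct TieredLog {
  Ring<30> lastMinute;  // one raw sample every 2 seconds
  Ring<60> lastHour;    // one average per minute
  Ring<48> lastDay;     // one average per 30 minutes; only this tier goes to EEPROM
  uint32_t samples = 0;

  // Call once per 2-second measurement. Returns true when a new half-hour
  // average was produced, i.e. the moment to write it to EEPROM.
  bool add(float v) {
    lastMinute.push(v);
    ++samples;
    if (samples % 30 != 0) return false;   // minute not complete yet
    lastHour.push(lastMinute.meanLast(30));
    if (samples % 900 != 0) return false;  // 900 samples = 30 minutes
    lastDay.push(lastHour.meanLast(30));
    return true;
  }
};
```

In a sketch you would call add() from loop() every 2 seconds and store the newest lastDay value whenever it returns true. Holding the one-byte codes instead of floats in the buffers would cut the RAM use to a quarter, which matters on the smaller boards.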
Given you have 128 KB storage and 2 bytes per measurement, that is 65536 measurements which are half an hour apart. 65536 x half hour = about 1365 days, so more than 3 years.
By "playing" with the buffer sizes and the frequency of averaging you can optimize this idea to your needs. E.g. if the long term only needs one average per hour you go to 7+ years of storage.
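To play with the numbers, the arithmetic above fits in a small helper. The function name is made up; it assumes fixed-size records and no compression.

```cpp
#include <stdint.h>

// Whole days of history that fit in storageBytes when one record of
// bytesPerRecord bytes is written every intervalMinutes minutes.
constexpr uint32_t daysOfStorage(uint32_t storageBytes,
                                 uint32_t bytesPerRecord,
                                 uint32_t intervalMinutes) {
  return (storageBytes / bytesPerRecord) * intervalMinutes / (60UL * 24UL);
}

// 128 KB, 2 bytes per measurement:
static_assert(daysOfStorage(131072UL, 2, 30) == 1365, "half-hour averages");
static_assert(daysOfStorage(131072UL, 2, 60) == 2730, "hourly averages");
```

So half-hour averages give about 3.7 years and hourly averages about 7.5 years before the storage wraps around.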