That makes no sense to me.
If Vref is 5v the ADC will give the highest possible value for anything greater than (5v - 5v / Nsteps) - and it is probably irrelevant for this comment whether N is 1023 or 1024.
If your program then gets a reading of the highest value (let's say it gets 1023) your program must consider the real value to be between (5 - 5/N) volts and 5 volts.
I remain of the view that it is easier to understand what is happening if you imagine an ADC that produces the values 0, 1, 2, 3, 4 and 5 for the range 0v to 5v, because the errors are much more obvious.
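A minimal sketch of that idea in Python, assuming an idealised ADC where every code covers an equal slice of Vref and the top code takes everything above the last threshold. The function name code_interval and the 6-code ADC are made up for illustration; a real converter will not be this tidy.

```python
def code_interval(code, vref=5.0, n_codes=1024):
    """Return the (low, high) input voltage range that maps to `code`.

    Assumption: an idealised ADC with n_codes equal slices of
    vref / n_codes volts, where code k covers [k * step, (k + 1) * step).
    """
    if not 0 <= code < n_codes:
        raise ValueError("code out of range")
    step = vref / n_codes
    return code * step, (code + 1) * step


# 10-bit case: a reading of 1023 only says the input is in the top slice.
low, high = code_interval(1023)
print(f"1023 of 1024 codes: {low:.4f} V to {high:.4f} V")

# Imagined coarse ADC that only produces 0..5: each reading spans 5/6 V,
# so the uncertainty in every reading is easy to see.
for code in range(6):
    low, high = code_interval(code, n_codes=6)
    print(f"{code}: {low:.3f} V to {high:.3f} V")
```

With only six codes, a reading of 5 covers everything from about 4.167 V up to 5 V, which is the same effect as the 1023 case but far more visible.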
...R
In reality, stage 1023 measures the same range; it is the interpretation that makes it "look like" the data shifts up. Consider how the 4.995V to 4.999V stage gets dropped when the read is converted with /1024.
Each stage gets a slice. A 1/1023 slice is wider than a 1/1024 slice by 1/(1023*1024), which is roughly 1/(1024^2). The stretch comes from there being 1024 of those wider slices.
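The slice widths can be checked exactly. This is only a sketch using Python's exact fractions, with full scale taken as 1; the variable names are mine.

```python
from fractions import Fraction

wide = Fraction(1, 1023)    # slice width if full scale is split 1023 ways
narrow = Fraction(1, 1024)  # slice width if full scale is split 1024 ways

diff = wide - narrow
print(diff)                              # 1/1047552, i.e. 1/(1023*1024)
print(diff == Fraction(1, 1023 * 1024))  # True

# Total stretch if all 1024 slices are treated as the wider size
print(1024 * diff)                       # 1/1023
```

So the per-slice difference is tiny, close to 1/(1024^2), and over 1024 slices it adds up to one 1/1023 slice of full scale.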
Divide by 1024:
The highest read is 1023. 5000mV * 1023/1024 = 4995, so the data tops out at 4995 regardless of input, while the minimum can be 0.
I submit that the /1024 conversion does not represent the ADC data or the input voltage so well that it leaves room to point and preach as if the true meaning of the ADC were the whole of the process. If you can't be substantially more precise, you've got no room to pick.
Divide by 1023:
If the read is 1023 then /1023 gives 5000mV * 1023/1023 = 5000 in data. A read of 1022 gives 4995, 511 gives 2497, 512 gives 2502 and 0 gives 0.
For the step size, scaled by 1000: 5000 * 1000/1023 gives 4887.
5000 * 1000/1024 gives 4882, so the ADC step is 4.882mV to 4.887mV depending on the divisor.
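The numbers above can be reproduced with truncating integer math. This is a rough sketch, assuming a 5000mV reference and a 10-bit read; the helper name to_mv is hypothetical.

```python
VREF_MV = 5000  # assumed 5 V reference, in millivolts


def to_mv(reading, divisor):
    """Convert a 10-bit reading to millivolts, truncating like integer math."""
    return VREF_MV * reading // divisor


print("read  /1024  /1023")
for reading in (0, 511, 512, 1022, 1023):
    print(f"{reading:4d}  {to_mv(reading, 1024):5d}  {to_mv(reading, 1023):5d}")

# Step size scaled by 1000 (microvolts) for each divisor
print(VREF_MV * 1000 // 1023, VREF_MV * 1000 // 1024)  # 4887 4882
```

With /1024 the top read comes out as 4995 and with /1023 it comes out as 5000; the two never differ by more than about one step. On an 8-bit Arduino the same arithmetic would need a long, since 5000 * 1023 does not fit in a 16-bit int.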
I'm not telling you not to use /1024 conversions.
I give reasons why you shouldn't, but I don't condemn the use of /1024 outright.
People should have a choice as to which way to be ever so slightly wrong in at least some cases. Don't pre-judge.