Your reasoning is "flawed" because there is no angle and no continuous transfer function from voltage to digital value.

The math problem to solve is that you are looking for the inverse of a surjective but non-bijective piecewise affine function (the steps).

You really lost information when the ADC converted the input voltage into a digital representation. The amount of information you lose is directly related to the number of bits (and the hardware quality) of the representation, and there is no way to get back to the original value.

All you can do is take an estimate, based on the simple principle that it has to be a likely original value (i.e. within the interval). There is no good or bad estimate as long as it sits within the interval. You can add extra personal constraints like "I want to see a nice 5V when the ADC tells me 1023" or "I want to minimize the max average error" or "I want a constant max error"... Those are arbitrary choices you can impose on yourself.

That basically means that any affine function crossing every "step" is legit to consider.

But it does not need to be an affine function either: you might want to compensate for a specific bias you have at specific steps, and such a curve would be legit as well.

If you have 2 bits --> you have 4 possible values for the ADC. So it's exactly the same thing, just with less precision... and in a perfect world, they would map this way:

- 0.00V to 1.25V would map to 0
- 1.25V to 2.50V would map to 1
- 2.50V to 3.75V would map to 2
- 3.75V to 5.00V would map to 3 (and beyond 5V if the hardware does not burn)

So when you have an x value of 0, 1, 2 or 3, you need to decide arbitrarily what voltage was the input. The point I make above is that dividing by 4 to get a possible estimate of the original voltage, by using x * 5 / 4, always gives you the entry point of the interval (0, 1.25, 2.50, 3.75), whereas dividing by 3 gives you a point somewhere within the interval (0, 1.66, 3.33, 5). Those are not worse than the other values... they are just as likely possible estimates.

Now, assuming your input varies across the full range and you take millions of readings: when the ADC tells you 1, the second approach (dividing by 3) gives you 1.6666V for something that was between 1.25V and 2.50V, so the max error you are making is 0.83V, whereas when you divide by 4 the max error is 1.25V.
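To make the 2-bit example concrete, here is a small Python sketch that lists, for each code, the interval an ideal ADC would map to it, the two estimates discussed here (dividing by 4 and dividing by 3), and the worst-case error of each. The names `interval` and `worst_case_error` are mine, and the ADC is assumed to be perfectly ideal with a 5V reference.

```python
# Ideal 2-bit ADC with a 5 V reference (values taken from the example above).
VREF = 5.0
LEVELS = 4  # 2 bits -> codes 0..3


def interval(code):
    """Voltage range (low, high) that an ideal ADC maps to `code`."""
    step = VREF / LEVELS
    return code * step, (code + 1) * step


def worst_case_error(code, estimate):
    """Largest possible distance between `estimate` and the true input."""
    low, high = interval(code)
    return max(abs(estimate - low), abs(high - estimate))


for code in range(LEVELS):
    low, high = interval(code)
    by_4 = code * VREF / LEVELS        # lower edge of the interval
    by_3 = code * VREF / (LEVELS - 1)  # a point somewhere in the interval
    print(f"code {code}: {low:.2f}-{high:.2f} V | "
          f"/4 -> {by_4:.2f} V (max err {worst_case_error(code, by_4):.2f} V) | "
          f"/3 -> {by_3:.2f} V (max err {worst_case_error(code, by_3):.2f} V)")
```

For codes 1 and 2, dividing by 3 lands inside the interval and the worst case drops to about 0.83V. For codes 0 and 3 the estimate sits on an edge of the interval (0V and 5V), so the worst case there is still 1.25V.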

--> as said above, taking the middle of the interval, (x + 0.5) * Vref / 1024, is actually what minimizes this max error. You could decide to always take the min of the interval (divide by 1024), always take the max of the interval (same absolute max error), or shoot anywhere randomly within the interval... All are valid answers for an estimate.
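The same comparison can be sketched for a 10-bit ADC (codes 0 to 1023, so 1024 intervals). This is only an illustration under the assumption of an ideal converter with Vref = 5V; the estimator names are hypothetical, and `estimate_fullscale` is the "I want a nice 5V at 1023" choice.

```python
VREF = 5.0     # assumed reference voltage
LEVELS = 1024  # 10-bit ADC, codes 0..1023


def estimate_low(code):
    """Lower edge of the interval (divide by 1024)."""
    return code * VREF / LEVELS


def estimate_mid(code):
    """Middle of the interval."""
    return (code + 0.5) * VREF / LEVELS


def estimate_high(code):
    """Upper edge of the interval."""
    return (code + 1) * VREF / LEVELS


def estimate_fullscale(code):
    """Divide by 1023 so that code 1023 reads exactly VREF."""
    return code * VREF / (LEVELS - 1)


def max_error(estimator):
    """Worst-case error over all codes, input anywhere in each interval."""
    step = VREF / LEVELS
    worst = 0.0
    for code in range(LEVELS):
        low, high = code * step, (code + 1) * step
        est = estimator(code)
        worst = max(worst, abs(est - low), abs(high - est))
    return worst


for fn in (estimate_low, estimate_mid, estimate_high, estimate_fullscale):
    print(f"{fn.__name__}: max error = {max_error(fn) * 1000:.3f} mV")
```

With these assumptions the midpoint estimator has a worst case of half a step (about 2.44mV), while the other three have a worst case of one full step (about 4.88mV). All four stay within the interval for every code, so all four remain valid estimates.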