It's not as simple as that. Oversampling can give better time discrimination, better voltage discrimination, or both, and it's critical how much random noise is present: in the absence of noise, oversampling only gives more time information; too much noise makes the accuracy worse; with the right amount of noise, the averaging both reveals the missing fractional-sample data and cancels out the noise.
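For reference, here's a minimal Arduino-style sketch of the basic oversample-and-decimate step: 16 samples from a 10-bit ADC, summed and right-shifted by 2, give a 12-bit result. The pin choice is illustrative, and the extra bits are only real if at least ~1 LSB of noise is actually present on the input:

```cpp
// 16x oversampling of a 10-bit ADC to get a 12-bit result.
// Assumes >= 1 LSB of random noise on the input; without it,
// the extra bits carry no new voltage information.
const uint8_t ADC_PIN = A0;   // illustrative pin choice

uint16_t read12bit() {
  uint32_t sum = 0;
  // 4^n samples add n bits of resolution: 16 samples -> 2 extra bits.
  for (uint8_t i = 0; i < 16; i++) {
    sum += analogRead(ADC_PIN);      // 0..1023 per sample
  }
  return (uint16_t)(sum >> 2);       // decimate: 0..4092, ~12 bits
}

void setup() {
  Serial.begin(115200);
}

void loop() {
  Serial.println(read12bit());
  delay(100);
}
```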
A saw-tooth wave of 0.5 LSB amplitude works even better than random noise, because it sweeps the input uniformly across the nearest code transition.
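To see why, here's a small stand-alone C++ simulation (not from the original post) of an ideal round-to-nearest converter with a ±0.5 LSB sawtooth added; the 512.3 LSB test value is arbitrary:

```cpp
// Simulation: a +/-0.5 LSB (1 LSB peak-to-peak) sawtooth dither,
// stepped over one full period, lets averaging recover the fractional
// part of the input to within 1/N LSB. Ideal round-to-nearest
// quantizer assumed; the 512.3 LSB input value is arbitrary.
#include <cmath>
#include <cstdio>

int main() {
  const double input = 512.3;   // true value, in LSB units
  const int    N     = 64;      // samples per sawtooth period
  double sum = 0.0;
  for (int k = 0; k < N; k++) {
    double dither = (double)k / N - 0.5;   // -0.5 .. +0.5 LSB ramp
    sum += std::round(input + dither);     // ideal ADC conversion
  }
  // Plain quantization would read 512; the dithered average is ~512.3.
  std::printf("average = %.4f\n", sum / N);
  return 0;
}
```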
The pic shows some battery-voltage readings from Arduino-compatibles. Here the noise present is significantly less than one LSB of the ADC, so no amount of averaging will give 12 bits:
In particular, look at the top trace: for only about 1/4 of the time is the signal close enough to an ADC step for noise to cause any variation; for the rest of the time it just reads the exact same value. These systems were battery powered, so the power rails were very clean and the ADC shows its true performance. Extra noise would have to be injected to enable averaging to produce more bits, as in the sketch below.
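One common way to inject that noise on an Arduino-class board - the details here are an assumption, not from the original measurement - is to couple a PWM output into the ADC node through a large resistor, so the residual ripple acts as dither:

```cpp
// Hedged sketch: inject artificial dither by coupling a PWM pin into
// the ADC input through a large resistor (e.g. pin 9 -> ~1 Mohm ->
// ADC node). Pins and component values are illustrative only and
// would need scaling so the injected ripple is around 1-2 LSB.
const uint8_t ADC_PIN    = A0;
const uint8_t DITHER_PIN = 9;     // any PWM-capable pin

uint16_t read12bitDithered() {
  uint32_t sum = 0;
  for (uint8_t i = 0; i < 16; i++) {
    sum += analogRead(ADC_PIN);
  }
  return (uint16_t)(sum >> 2);
}

void setup() {
  Serial.begin(115200);
  analogWrite(DITHER_PIN, 128);   // 50% duty: PWM ripple acts as dither
}

void loop() {
  Serial.println(read12bitDithered());
  delay(100);
}
```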
If you have an inherently noisy sensor, though, this technique is very valuable - but you can't just blindly assume you'll get 12 bits without checking what the noise level actually is...
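A quick sanity check along those lines - pin and sample count are just placeholders - is to log the spread of raw readings and see whether the codes are toggling at all:

```cpp
// Noise check: take many raw readings and report the min/max spread.
// If the spread is only 0-1 codes, as on the clean battery-powered
// rails above, averaging cannot add real bits.
const uint8_t ADC_PIN = A0;

void setup() {
  Serial.begin(115200);
  uint16_t lo = 1023, hi = 0;
  for (uint16_t i = 0; i < 1000; i++) {
    uint16_t v = analogRead(ADC_PIN);
    if (v < lo) lo = v;
    if (v > hi) hi = v;
  }
  Serial.print("spread in LSB: ");
  Serial.println(hi - lo);        // want roughly 2+ codes of toggling
}

void loop() {}
```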