janost:
The mistake you made, Nick, with your 17 cycles and the 9th bit was not the hardware but not checking your software. A 16 MHz AVR can do a heck of a lot more than displaying 20 characters per line on VGA.
I would be interested to hear how, excluding external hardware, given that 20 characters is 160 pixels (at 8 pixels per character).
According to my calculations, as described here, you have 31.74 µs for each horizontal scan line (using 525 lines at 60 Hz).
(1/60) / 525 * 1e6 = 31.74 µs
Divide that by the 800 pixels in one full line: the 640 visible pixels plus the sync pulse width (96 pixels), the back porch (48 pixels) and the front porch (16 pixels).
((1/60) / 525 * 1e9) / 800 = 39.68 ns
So that is 39.68 ns per pixel.
Now clearly we can't clock out a pixel every 39.68 ns with a CPU whose clock period is 62.5 ns (16 MHz).
I don't think you can clock out a pixel in a single clock cycle (unless you clock out a fixed value), so the SPI hardware needs two clock cycles per pixel, which is 125 ns. The closest you can get is then:
125 / 39.68 = 3.15
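So, rounding up, each pixel I send covers four "VGA" pixels. That is, each pixel is stretched 4 x horizontally, and the 640 visible pixels become 640 / 4 = 160 pixels, or 20 characters at 8 pixels each.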
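For anyone who wants to check the arithmetic, here is a rough sketch of the same calculation in Python. The function name and parameter names are made up for illustration, and the two-clocks-per-pixel figure is the SPI assumption stated above, not a measured value.

```python
import math

def vga_budget(cpu_hz=16_000_000,      # 16 MHz AVR
               refresh_hz=60,          # frames per second
               total_lines=525,        # scan lines per frame, including blanking
               pixels_per_line=800,    # 640 visible + 96 sync + 48 back + 16 front
               visible_pixels=640,
               cycles_per_pixel=2,     # assumption: SPI needs two CPU clocks per bit
               char_width=8):          # pixels per character
    """Return the timing budget for bit-banging VGA from the CPU clock."""
    line_ns = 1e9 / (refresh_hz * total_lines)      # time for one scan line
    pixel_ns = line_ns / pixels_per_line            # time for one VGA pixel
    output_ns = cycles_per_pixel * 1e9 / cpu_hz     # time to emit one of our pixels
    stretch = math.ceil(output_ns / pixel_ns)       # VGA pixels covered, rounded up
    chars_per_line = visible_pixels // stretch // char_width
    return {
        "line_us": line_ns / 1000,
        "pixel_ns": pixel_ns,
        "output_ns": output_ns,
        "stretch": stretch,
        "chars_per_line": chars_per_line,
    }

if __name__ == "__main__":
    for name, value in vga_budget().items():
        print(f"{name}: {value}")
```

Changing cycles_per_pixel or cpu_hz shows how the characters-per-line figure would move if the pixel could be clocked out faster.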
So if you can demonstrate where my calculations are wrong, and how you can send "a heck of a lot more", I would be pleased to hear it.