Originally Super Mario Bros. was made for 256x240 px, so obviously a standard UNO ain't gonna cut it with standard SPI (honestly I'm not even sure it can handle preparing a frame buffer at 20 FPS or more). That leaves 2 options: use at least an 8-bit parallel port (if the UNO can handle the frame buffer) or use something beefier, like an STM32 or ESP32. Side note: the original NES CPU ran at about 1.79 MHz.
What I currently have in mind is a 240x240 px 1.3" RGB "IPS" SPI LCD or a 320x240 8-bit parallel one.
Originally there was only a 2-bit/pixel color index (4 values per pixel), but modern LCD drivers do not work in such a low color mode (I believe the lowest is 4/4/4), so driving the LCD will be a more complex task than calculating the frame buffer.
And to be honest I want more color on future games I'll make/remake for this setup.
I want to make this because I want to get better at C++ coding and learn how to drive LCDs without 3rd-party libraries, plus it's a kinda cool project to make.
All the tiles and level layouts are out there on the Internet already; now all I need is the animation parameters so I don't have to spend ages recreating them...
You MAY be able to do this on a Teensy or ESP32. Anything less won't cut it, and still you may struggle as you already notice the processors are a tad slower (by almost an order of magnitude) than the NES.
There's processing power to consider, but also memory (you have just 2k RAM on the ATmega328p!) and program/data storage space.
Overall this is going to be a HUGE undertaking. If it's just about the programming, why not use something with real performance, like the Pi?
You probably misunderstood: the NES runs at about 1.79 MHz and the Arduino UNO at 16 MHz, so on paper the UNO is almost 10 times faster.
It's kinda sorta (maybe) possible to make this work on an Arduino UNO... But due to its very poor specs/performance/price ratio I won't be using the UNO; I'll be using an STM32 (Blue Pill). It has a 72 MHz CPU, 20 KB RAM and 128 KB flash, and its SPI goes up to 36 Mbps, which is plenty for 25 FPS even with 16-bit color.
But the Nintendo has dedicated peripherals that accelerate the processor significantly for gaming purposes. It would be very slow as a quadcopter autopilot but putting sprites on the screen will be many many times faster than the faster-on-paper Arduino.
How?
I read somewhere about "memory shifting": the "GPU" shifts the entire "screen" and redraws only the moving elements. Is this what you are talking about?
Anyway, the STM32 should have enough power and memory to handle this game, even when redrawing the entire frame. If for some reason it doesn't, the next step will be an ESP32.
Dedicated hardware to do specific tasks. We actually do the same with Arduinos.
Don't drive a TFT display directly. Send a description of what's to be on screen to a dedicated chip that builds the screen and drives all the individual pixels.
Don't try to play MP3 from an Arduino, use a DF Player instead. The mp3 decoder chip can't do anything but decoding sound files.
Modern GPUs can do really complex things such as real-time ray tracing and other highly complex processing, stuff even the fastest Intel CPU couldn't even dream of doing, and with a fraction of the clock speed.
Memory shifting sounds like a very sensible thing to build into GPU hardware to help with all those side-scroller style games. There are no doubt other things it can do. Sprites are usually uploaded to graphics memory, after which the CPU only has to tell the GPU where each sprite has to go. The GPU may even handle collision detection between sprites for you.
The STM32 should be capable of driving the LCD (once I've made a library for it), so in my opinion there is no need to add extra hardware. I'm not going to play MP3s, that is way overkill for this task; a simple buzzer will do.
The SPI bus limit is one thing - the SPI data buffer another.
A 320x240 px, 16-bit colour screen holds 153,600 bytes of data. That's a lot: well over half of the 256 KB of a Teensy 3.6, and almost a third of the 520 KiB of the ESP32.
It'd require a 25 MHz SPI bus clock to get to 20 fps; that's fine, as long as the display can handle it.
I think I shared my calculations before; also, I wrote that I'll be using a 240x240 px display, not 240x320.
Calculations for the 240x240 LCD (5/6/5): ~18.4 Mbps at 20 FPS, ~23 Mbps at 25 FPS.
If I decide to use 320x240, I'll go with the 8-bit parallel interface instead of SPI.
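For anyone who wants to check bandwidth figures like these, here is a minimal sketch of the arithmetic. It only counts raw pixel payload (width x height x bits per pixel x frames per second) and ignores command bytes and gaps between transfers, so real-world needs will be somewhat higher. The function name `requiredBitsPerSecond` is made up for this example.

```cpp
#include <cstdint>

// Raw pixel payload in bits per second for a full-frame redraw.
// Assumption: command bytes and idle time between SPI transfers are ignored.
constexpr std::uint64_t requiredBitsPerSecond(std::uint32_t width,
                                              std::uint32_t height,
                                              std::uint32_t bitsPerPixel,
                                              std::uint32_t fps) {
    return static_cast<std::uint64_t>(width) * height * bitsPerPixel * fps;
}

// 240x240 at 16-bit (5/6/5) color:
static_assert(requiredBitsPerSecond(240, 240, 16, 20) == 18432000ULL,
              "about 18.4 Mbps at 20 FPS");
static_assert(requiredBitsPerSecond(240, 240, 16, 25) == 23040000ULL,
              "about 23 Mbps at 25 FPS");

// 320x240 at 16-bit color:
static_assert(requiredBitsPerSecond(320, 240, 16, 20) == 24576000ULL,
              "about 24.6 Mbps at 20 FPS");
```

Both 240x240 figures fit under a 36 Mbps SPI clock, while 320x240 at 20 FPS lands right around the 25 MHz mentioned earlier in the thread.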
wvmarle:
SPI is done in hardware, 8-bit parallel not. That's a major plus for SPI.
Anyway, you'll just have to start building and see how it goes. The graphics output is just part of the game.
I know 8-bit parallel isn't done in hardware. I've ordered (not for this project) a 3.5" 320x420 16-bit display to play with (as I said, I'm doing this for learning purposes, not for an end product).
I can't get the ST7789 240x240 to work with hardware SPI, but after a lot of googling I found that it might be an issue with the CS pin: there isn't one on this LCD, and instead of SPI mode 0 this one is supposed to use mode 3.
EDIT from the future:
To get the ST7789 working with the Adafruit library, find "Adafruit_SPITFT_Macros.h", which is located in the Arduino IDE library folder under Adafruit_GFX_Library. Replace "SPI_MODE0" with "SPI_MODE3" in two places to get the ST7789 to work with hardware SPI.
While waiting on my logic analyzer (so I can code my own LCD library) I decided to dissect the level design. I want to share a few interesting things I found:
Originally I thought sprites were 8 bits per pixel, but I was a bit suspicious/surprised that there are only a few colors per sprite (8 bits would give 256 colors), so what gives?
Well, I found out why. I noticed that all sprites use only 3 different colors, and many of them have transparency, so they used only 2 bits per pixel (2 bits give 4 different values: 3 for the color index, the 4th for alpha). I assume the "GPU" applied a specific color according to the sprite index, color index and a level lookup table. It's pretty clever; they saved a lot of space by doing this. A 16x16 pixel sprite takes only 64 bytes of memory.
Actually the entire game uses only 24 different colors. At this point I don't know why; even with the optimization described above they could have used more colors. I guess it's a thing to find out...
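To make the 2-bit idea concrete, here is a rough sketch of how a software version of that lookup could work on a modern MCU. Everything here is an assumption for illustration: which of the four values means "transparent" (0 in this sketch), the `Palette` struct, and the function names are all made up, and the output format is 16-bit 5/6/5 because that is what the LCD in this thread is meant to receive.

```cpp
#include <cstddef>
#include <cstdint>

// Pack 8-bit R, G, B into 16-bit RGB565 (5/6/5).
constexpr std::uint16_t rgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint16_t>(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

// Bytes needed to store a sprite at 2 bits per pixel.
constexpr std::size_t spriteBytes(std::size_t width, std::size_t height) {
    return (width * height * 2 + 7) / 8;
}

static_assert(spriteBytes(16, 16) == 64, "16x16 sprite at 2 bpp is 64 bytes");

// Hypothetical per-sprite palette: three visible colors, already in RGB565.
struct Palette {
    std::uint16_t colors[3];
};

// Assumption: index 0 is the transparent value, 1..3 select a color.
constexpr std::uint8_t kTransparent = 0;

// Returns false for transparent (or invalid) indices, otherwise writes the
// RGB565 color for this index into `out` and returns true.
inline bool lookupColor(const Palette& palette, std::uint8_t index, std::uint16_t& out) {
    if (index == kTransparent || index > 3) {
        return false;
    }
    out = palette.colors[index - 1];
    return true;
}
```

Swapping the `Palette` passed in recolors the same sprite data without touching the pixel bytes, which is presumably a big part of why the scheme saves so much space.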
I recreated Mario sprite binary code using 2bits per pixel, can you see him? :
Bit decoding part is working (4 pixels per byte, serial output):
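For reference, a minimal sketch of that kind of decoding (4 pixels per byte, dumped as text) might look like the following. The bit order is an assumption: here the leftmost pixel sits in the top two bits of each byte. The names `pixelAt` and `rowToText` and the glyphs used for the four values are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Extract pixel `i` (0..3) from one packed byte.
// Assumption: pixel 0 is stored in bits 7..6, pixel 3 in bits 1..0.
constexpr std::uint8_t pixelAt(std::uint8_t packed, unsigned i) {
    return static_cast<std::uint8_t>((packed >> (6 - 2 * i)) & 0x03);
}

// Turn one row of packed sprite bytes into a line of text, one character per
// pixel, so the sprite can be eyeballed on a serial monitor.
inline std::string rowToText(const std::uint8_t* row, std::size_t bytesPerRow) {
    static const char glyphs[4] = {'.', '1', '2', '3'};  // '.' = value 0
    std::string out;
    out.reserve(bytesPerRow * 4);
    for (std::size_t b = 0; b < bytesPerRow; ++b) {
        for (unsigned i = 0; i < 4; ++i) {
            out.push_back(glyphs[pixelAt(row[b], i)]);
        }
    }
    return out;
}
```

A 16-pixel-wide sprite row is 4 bytes, so calling `rowToText(rowPointer, 4)` once per row and printing each result on its own line gives a 16x16 character picture.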
Because the lookup is done in hardware, not software. Adding 1 more bit to the hardware will almost double the cost of the chip with the extra complication and interconnects.
When you are building millions, saving 10 cents per unit is significant.
If that's the case then I don't have to worry about it, I'm working with much more capable hardware.
I'm trying to figure out how the levels are constructed.
First I thought they stored some sort of 2D level array where every 16 px square (I refer to this square as a "unit") holds a sprite ID, but I noticed that a few tiles are not on this 16 px grid, and some levels are really huge (the longest is 6240 pixels, or 390 "units" wide), so this approach is not optimal. Another guess is that every sprite has its own array with its positions in that level, but since a level is up to 390 "units" wide, a byte is not enough to store a position, and I don't think they used variables bigger than a byte, so I'm out of ideas at the moment... Need to do some googling.
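One way around the "a byte is not enough" problem, sketched here purely as an illustration and not as a claim about how the original game stores its levels: keep the object list sorted by position, store only the column within a 16-unit page (16 units x 16 px = 256 px, one NES screen width), and add a one-bit flag that says "this object starts the next page". The struct layout and all names below are hypothetical.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint16_t kUnitsPerPage = 16;  // 16 units * 16 px = 256 px

// Hypothetical level object: the position within a page needs only 4 bits.
struct LevelObject {
    std::uint8_t xInPage;  // column inside the current page, 0..15
    std::uint8_t y;        // row, in units
    std::uint8_t id;       // which tile/object to place
    bool newPage;          // true if this object is the first one on the next page
};

// Walk a position-sorted object list and recover each absolute column.
// Limitation of this sketch: one flag can only advance a single page, so a
// completely empty page would need a dummy object to step over it.
inline std::vector<std::uint16_t> absoluteColumns(const std::vector<LevelObject>& objects) {
    std::vector<std::uint16_t> columns;
    columns.reserve(objects.size());
    std::uint16_t page = 0;
    for (const LevelObject& object : objects) {
        if (object.newPage) {
            ++page;
        }
        columns.push_back(static_cast<std::uint16_t>(page * kUnitsPerPage + object.xInPage));
    }
    return columns;
}
```

With this layout a 390-unit level spans 25 pages, and no stored field ever needs more than a byte; only the running page counter in the decoder grows.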
3Dgeo:
If that's the case then I don't have to worry about it, I'm working with much more capable hardware.
Really? But if the hardware does not have this very special function optimized for games, you have to do it in software, which is going to waste a lot of time.
"Sprite" refers to the things which move. The background is defined another way. How many types of objects are on the background? How many bits do you need to represent them?
3Dgeo:
If that's the case then I don't have to worry about it, I'm working with much more capable hardware.
Seriously?
Specialised hardware can easily be orders of magnitude faster than a software solution on a multi-purpose processor.
It's not for nothing that modern PCs still come with dedicated graphics chips!
MorganS:
Really? But if the hardware does not have this very special function optimized for games, you have to do it in software, which is going to waste a lot of time.
wvmarle:
Seriously?
Specialised hardware can easily be orders of magnitude faster than a software solution on a multi-purpose processor.
I'm grateful for your input, but man, what the hell is wrong with you two?
I'm using a 32-bit 72 MHz CPU, you think it can't handle simple bit math? It will crush a task like this with no problem. Even if dedicated hardware is way faster at this task, this CPU will brute force the sh** out of it.
Yes, it will be tricky to code it so that the MCU sends pixel data directly to the display (with no frame buffer), because a full frame buffer would be way too big for this MCU, but I think I can manage that.
It will be fine, relax, I did the calculations: the hardware is good enough, even with 16-bit color.
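As a rough sketch of what "no frame buffer" could look like: a full 240x240 frame at 16 bits per pixel is 115,200 bytes, far more than 20 KB of RAM, but a single scanline is only 480 bytes. So one option is to compute one line at a time and push it out before computing the next. The two callbacks below (`PixelSource`, `LineSink`) are placeholders invented for this example; on real hardware the sink would be whatever SPI or DMA write the display library provides.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWidth = 240;
constexpr std::size_t kHeight = 240;

constexpr std::size_t frameBufferBytes(std::size_t w, std::size_t h, std::size_t bytesPerPixel) {
    return w * h * bytesPerPixel;
}

static_assert(frameBufferBytes(kWidth, kHeight, 2) == 115200, "full 16-bit frame");
static_assert(frameBufferBytes(kWidth, kHeight, 2) > 20 * 1024, "does not fit in 20 KB of RAM");
static_assert(frameBufferBytes(kWidth, 1, 2) == 480, "a single scanline easily fits");

// Hypothetical hooks: `PixelSource` computes the RGB565 color at (x, y) from
// tiles and sprites; `LineSink` sends one finished line to the display.
using PixelSource = std::uint16_t (*)(std::size_t x, std::size_t y);
using LineSink = void (*)(const std::uint16_t* line, std::size_t count);

// Render and send the frame one scanline at a time, reusing a 480-byte buffer.
inline void drawFrame(PixelSource colorAt, LineSink sendLine) {
    std::uint16_t line[kWidth];
    for (std::size_t y = 0; y < kHeight; ++y) {
        for (std::size_t x = 0; x < kWidth; ++x) {
            line[x] = colorAt(x, y);
        }
        sendLine(line, kWidth);
    }
}
```

A possible refinement, if the SPI peripheral supports DMA, would be to use two line buffers so the next line is computed while the previous one is still being sent; that is left out here to keep the sketch small.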
On another note, I did more digging and I can confirm that sprites/tiles use the 2-bit method described earlier, but where I was wrong is the tile size: I thought tiles were 16x16, but they are actually 8x8.
Also I realized that the background pattern (clouds, trees, fences) repeats every 768 pixels. Odd number: 512+256, or 8*96??? Interesting... And I'm guessing the bottom 2 ground rows of tiles are filled entirely by the code and then masked by a "hole" X array.
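One observation on the 768: it is exactly 3 x 256, and the NES screen is 256 px wide, so the scenery may simply be three screens' worth of pattern repeated. Whatever the original reason, reproducing the repeat in software only takes a modulo. A tiny sketch, with made-up names, assuming 8 px tile columns:

```cpp
constexpr unsigned kPatternWidthPx = 768;
constexpr unsigned kTilePx = 8;

static_assert(kPatternWidthPx == 3 * 256, "three 256 px screens");
static_assert(kPatternWidthPx == 48 * 16, "48 units of 16 px");
static_assert(kPatternWidthPx == 96 * kTilePx, "96 tile columns of 8 px");

// Map an absolute world X coordinate (in pixels) to a tile column inside the
// repeating scenery pattern (0..95).
constexpr unsigned sceneryColumn(unsigned worldX) {
    return (worldX % kPatternWidthPx) / kTilePx;
}
```

With this, scenery for a 6240 px level only needs 96 tile columns of data instead of 780.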
For comparison: that hardware will have a REALLY hard time decoding an mp3 file. In contrast, a cheap dedicated chip is doing this without breaking a sweat. Same for graphics.
But if you think you can do all that on generic hardware... go ahead. I'm not stopping you. I do know my old 486DX2-80 computer used to rely on graphics hardware to get halfway decent frame rates for games, as it was unable to do the rendering by itself.