Topic: Arduino/ATmega328 C64 Emulator

janost

Hi
Yes, I'm doing what the subject says.
I got inspired by the thread "Arduino 6502 emulator + BASIC interpreter" by miker00lz and got his sketch running with 64 KB of RAM and some clever caching/virtual memory.

Yesterday, when I got it working with the SID library, it made me want to boot the original C64 ROMs on it.

It has not booted yet but I'm hours from doing so.

A big problem is video: it might never emulate the graphics of a real C64, but it will at least show something.

janost

First boot of the VIC-20 :)

This is with a 32 KB EEPROM, so it has 11 KB free.

nickgammon

Looks cool. Try feeding my GPascal compiler into it. In its simple form (compiling stage) it just does text input/output.

See here for what to expect. There is a link to the disk image:

http://www.gammon.com.au/forum/?id=11203

(It might be a bit big if you only have 11 kB free).
Please post technical questions on the forum, not by personal message. Thanks!

More info: http://www.gammon.com.au/electronics

janost




The test for the VIC-20 had 32 KB of memory.

The platform has 64 KB.

I'll try it out.

janost

One of my goals is to boot the Apple II.

I have never touched that platform other than clones.
But it would be very cool.

janost

It now has its own composite B/W video output.
Yes, still on a single chip.

Up to 40x25 text mode with only 2 resistors.
The video code is only around 1000 bytes and uses 12 bytes of RAM plus the video buffer.

But I'm still struggling to get it to sync correctly.

cosmicfrog


janost

Thanks

It is a bit tight on memory, but I think I can squeeze a 1000-byte video buffer into RAM.

To shift out pixels at a 4 MHz rate I had to give up the serial RX/TX port and run the USART in SPI master mode.
So the video is on the Arduino TX pin and the sync on digital pin 2.
Works really great.
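
For reference, the ATmega328 datasheet gives the bit rate of the USART in SPI master mode as f_osc / (2 x (UBRR + 1)). The sketch below only does that arithmetic. It assumes a 16 MHz clock, and the helper name `mspim_ubrr` is made up for illustration; it is not taken from the sketch discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: UBRR value for a wanted USART-as-SPI-master bit rate.
   Datasheet formula: rate = f_cpu / (2 * (UBRR + 1)).
   Only exact when f_cpu is an even multiple of the wanted rate. */
static uint16_t mspim_ubrr(uint32_t f_cpu_hz, uint32_t pixel_hz)
{
    /* f_cpu / 2 is the fastest rate the USART can shift at. */
    assert(pixel_hz > 0 && pixel_hz <= f_cpu_hz / 2);
    return (uint16_t)(f_cpu_hz / (2UL * pixel_hz) - 1UL);
}
```

With a 16 MHz clock this gives UBRR = 1 for a 4 MHz pixel rate and UBRR = 0 for 8 MHz, which is the fastest setting available.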

janost

Nov 06, 2013, 10:04 am (last edited Nov 06, 2013, 11:18 am)
I managed to get it to sync.
The problem was my 3.5" LCD TV. I replaced it with a 5" LCD TV and it syncs in both NTSC and PAL modes.
Now I have 40x25 text mode output.
Yes, it also supports a 160x100 graphics mode, using a character ROM set that contains 2x4 block-drawing characters.

Using the UART in SPI mode, I noticed no 9th-bit problem. It shifts 8 bits per byte.

Because the pixel clock is now 8 MHz, I'm struggling to load each video shift byte in just 16 clock cycles.
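
The 16-cycle figure follows from the clock ratio: each byte holds 8 pixels, and at a 16 MHz CPU clock (assumed here) an 8 MHz pixel clock leaves 2 CPU cycles per pixel. A minimal sketch of that budget, with a made-up function name:

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles that pass while one 8-pixel byte is shifted out.
   The next byte has to be fetched and loaded within this budget. */
static uint32_t cycles_per_video_byte(uint32_t f_cpu_hz, uint32_t pixel_hz)
{
    assert(pixel_hz > 0);
    return (uint32_t)((uint64_t)f_cpu_hz * 8u / pixel_hz);
}
```

At 4 MHz the same calculation gives 32 cycles per byte, which would explain why the slower mode was easier to feed.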

It may have to be done with inline assembly, but I have no idea how to pass a pointer to my video RAM/char ROM into the inline assembly code.

Code:

Pointer registers
A very special extra role is defined for the register pairs R26:R27, R28:R29 and R30:R31. The role is so important that these pairs have extra names in assembler: X, Y and Z. These pairs are 16-bit pointer registers, able to point to 16-bit addresses in SRAM (X, Y or Z) or in program memory (Z).

The lower byte of the 16-bit address is located in the lower register, the higher byte in the upper register. Both parts have their own names, e.g. the higher byte of Z is named ZH (=R31) and the lower byte is ZL (=R30). These names are defined in the standard header file for the chips. Splitting a 16-bit pointer value into its two bytes is done as follows:

.EQU Adress = RAMEND ; RAMEND is the highest 16-bit address in SRAM
   LDI YH,HIGH(Adress) ; Set the MSB
   LDI YL,LOW(Adress) ; Set the LSB

Accesses via pointers are programmed with specially designed instructions. Read access is named LD (LoaD), write access is named ST (STore), e.g. with the X pointer:

Pointer  Operation                                                              Example
X        Read/write from/to address X, don't change the pointer                 LD R1,X    ST X,R1
X+       Read/write from/to address X, then increment the pointer by one        LD R1,X+   ST X+,R1
-X       Decrement the pointer by one, then read/write from/to the new address  LD R1,-X   ST -X,R1

Similarly you can use Y and Z for that purpose.

There is only one instruction for read access to program memory. It is defined for the pointer pair Z and is named LPM (Load from Program Memory). It copies the byte at address Z in program memory to register R0. As program memory is organised word-wise (one instruction at one address consists of 16 bits, i.e. two bytes or one word), the least significant bit of Z selects the lower or higher byte (0 = lower byte, 1 = higher byte). Because of this the original word address must be multiplied by 2, and access is limited to 15-bit word addresses, i.e. 32 K words (64 KB) of program memory. Like this:

   LDI ZH,HIGH(2*Adress)
   LDI ZL,LOW(2*Adress)
   LPM

Following this instruction the address must be incremented to point to the next byte in program memory. As this is needed very often, a special pointer increment instruction has been defined:

   ADIW ZL,1
   LPM

ADIW means ADd Immediate Word, and a maximum of 63 can be added this way. Note that the assembler expects the lower register of the pair, ZL, as first parameter. This is somewhat confusing, as the addition is done as a 16-bit operation.
The complementary instruction, subtracting a constant between 0 and 63 from a 16-bit pointer register, is named SBIW (SuBtract Immediate Word). ADIW and SBIW are possible for the pointer register pairs X, Y and Z and for the register pair R25:R24, which has no extra name and does not allow access to SRAM or program memory locations. R25:R24 is ideal for handling 16-bit values.

As incrementing after reading is very often needed, newer AVR types have the instruction

   LPM R,Z+

This transfers the byte read to any register R and auto-increments the pointer register.


Hmm.., Thinking..

janost

This is the pgm_read_byte function from pgmspace.h

Code:

(__extension__({                \
    uint16_t __addr16 = (uint16_t)(addr); \
    uint8_t __result;           \
    __asm__                     \
    (                           \
        "lpm" "\n\t"            \
        "mov %0, r0" "\n\t"     \
        : "=r" (__result)       \
        : "z" (__addr16)        \
        : "r0"                  \
    );                          \
    __result;                   \
}))


How do I interpret this?
Does it load Z from addr16?

janost

OK, I'm beginning to understand this :)

Code:

Constraint  Used for                                      Range
a           Simple upper registers                        r16 to r23
b           Base pointer register pairs                   y, z
d           Upper register                                r16 to r31
e           Pointer register pairs                        x, y, z
q           Stack pointer register                        SPH:SPL
r           Any register                                  r0 to r31
t           Temporary register                            r0
w           Special upper register pairs                  r24, r26, r28, r30
x           Pointer register pair X                       x (r27:r26)
y           Pointer register pair Y                       y (r29:r28)
z           Pointer register pair Z                       z (r31:r30)
G           Floating point constant                       0.0
I           6-bit positive integer constant               0 to 63
J           6-bit negative integer constant               -63 to 0
K           Integer constant                              2
L           Integer constant                              0
l           Lower registers                               r0 to r15
M           8-bit integer constant                        0 to 255
N           Integer constant                              -1
O           Integer constant                              8, 16, 24
P           Integer constant                              1
Q           (GCC >= 4.2.x) A memory address based on Y or Z pointer with displacement
R           (GCC >= 4.3.x) Integer constant               -6 to 5
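
Read against this table, the "z" input constraint in the pgm_read_byte macro asks the compiler to place the address in the Z pair (r31:r30) before the assembly runs, so a C pointer can be handed to inline assembly the same way. The following is a sketch under assumptions, not the code used in this project: the name `rom_read_byte` is invented, the AVR branch uses the single-instruction `lpm Rd, Z` form that the ATmega328 supports, and the non-AVR branch exists only so the example compiles on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one byte through a pointer passed from C.
   On AVR the pointer must refer to program memory (e.g. a PROGMEM table). */
static inline uint8_t rom_read_byte(const uint8_t *p)
{
    assert(p != NULL);             /* debug check, compiled out with NDEBUG */
#if defined(__AVR__)
    uint8_t value;
    __asm__ __volatile__(
        "lpm %0, Z" "\n\t"         /* load from program memory at Z        */
        : "=r" (value)             /* output: any register the compiler picks */
        : "z" (p)                  /* input: compiler loads p into r31:r30 */
    );
    return value;
#else
    return *p;                     /* portable stand-in for testing on a PC */
#endif
}
```

For data in SRAM such as a video buffer, the "e" constraint (any of X, Y, Z) with an `ld` instruction would be the analogous route.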

fungus


Quote from janost: "How do I interpret this?"


Inline assembly is messy with this compiler.

This page helped me a lot: http://www.nongnu.org/avr-libc/user-manual/inline_asm.html

No, I don't answer questions sent in private messages (but I do accept thank-you notes...)

janost






Thanks. Yes, it helps.

janost

Nov 06, 2013, 02:42 pm (last edited Nov 06, 2013, 03:01 pm)
The character ROM needs 1024 bytes for char codes 0-127.
Char codes 128-255 are the same characters in reverse video, so a simple XOR lets them share the same images.
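
The reverse-video sharing can be sketched as below. This is an illustration under assumptions: the table is laid out as 128 glyphs of 8 consecutive row bytes, the names `glyph_row` and `rom` are made up, and the table is read as ordinary memory (on the AVR a table in flash would be read with pgm_read_byte instead).

```c
#include <assert.h>
#include <stdint.h>

/* Return the 8 pixels of one scanline row for a character code.
   rom points at an assumed 1024-byte table: 128 glyphs x 8 rows.
   Codes 128-255 reuse glyph (code - 128) with every pixel inverted. */
static uint8_t glyph_row(const uint8_t *rom, uint8_t code, uint8_t row)
{
    assert(row < 8);
    uint8_t bits = rom[(uint16_t)(code & 0x7F) * 8u + row];
    return (code & 0x80) ? (uint8_t)(bits ^ 0xFF) : bits;
}
```

A layout with all glyphs' row 0 first, then row 1, and so on, would work equally well and may be cheaper to index per scanline; the XOR step is the same either way.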

There are 2 character sets on a CBM. Supporting switching between them takes another 1024 bytes.

To support hi-res 160x100 graphics, the block-graphics character set requires 2048 bytes.
Could the same XOR trick work here as well, to reduce it to 1024 bytes?
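
One way to check the XOR idea for block graphics is to generate the patterns instead of storing them. The sketch assumes an encoding that is not stated in this thread: each character code is a bitmap of a 2-wide by 4-high block grid, with bit (2 x block row + block column) lighting one block that is 4 pixels wide and 2 pixels tall. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-pixel scanline row of a 2x4 block-graphics character.
   Assumed encoding: bit (2*r + c) of code lights the block in
   block row r (0 = top) and block column c (0 = left). */
static uint8_t block_glyph_row(uint8_t code, uint8_t row)
{
    assert(row < 8);
    uint8_t pair = (uint8_t)((code >> (2 * (row / 2))) & 0x03);
    uint8_t bits = 0;
    if (pair & 0x01) bits |= 0xF0;   /* left block lit  */
    if (pair & 0x02) bits |= 0x0F;   /* right block lit */
    return bits;
}
```

Under this encoding the pattern for code c ^ 0xFF is always the pixel inverse of the pattern for c, so 128 stored patterns would be enough. Note that the partner of a code is its bitwise complement, not the code plus 128 as in the text character set. The same symmetry should hold for any assignment of bits to blocks.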

The video code clocks in at just a bit over 1 KB.

My video code supports a 22x23 video map with a 4 MHz pixel clock and a 40x25 video map with an 8 MHz pixel clock.
Both work in PAL and NTSC. It also has a selectable border color, CBM style, of black or white.
Using SCART for color, or 4 resistors for 16-shade B/W, it can do color but needs additional RAM for storing the colors.

It is interrupt-driven with Timer 0 at 15625/15748 Hz, so other tasks can run when it is not blasting pixels.
Pixels go out on 184/200 of the 262/312 lines, so it's like running the AVR at 6 MHz instead of 16 MHz.
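
The figures can be checked with a little arithmetic. The sketch below assumes that the CPU is fully occupied during lines that carry pixels and free during the rest, and it ignores interrupt entry and exit overhead; the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles between two line interrupts. */
static uint32_t cycles_per_line(uint32_t f_cpu_hz, uint32_t line_hz)
{
    assert(line_hz > 0);
    return f_cpu_hz / line_hz;
}

/* Clock rate in kHz effectively left for other work when `active`
   of `total` scanlines per frame are spent shifting out pixels. */
static uint32_t effective_khz(uint32_t f_cpu_khz, uint32_t active, uint32_t total)
{
    assert(total > 0 && active <= total);
    return f_cpu_khz * (total - active) / total;
}
```

At 16 MHz the two line rates correspond to 1024 and 1016 cycles per line. With 200 of 312 lines active about 5.7 MHz worth of CPU time remains, and with 184 of 262 about 4.8 MHz, which is in the same range as the 6 MHz estimate.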

All on a single 328 chip and 2 resistors.

I would say it's much more capable than the TVout library?
Perhaps I will release this as a standalone cbmTVout library?

fungus


Quote from janost: "I would say it's much more capable than the TVout library?"


Definitely. 320x200 is more than double the resolution of 'tvout'.

I'm currently looking to build a Space Invaders machine based on Arduino. Space Invaders needs 256x224 resolution so that fits perfectly. With character mapping it should be possible to squeeze all the graphics in.


Quote from janost: "Perhaps I will release this as a standalone cbmTVout library?"


Can we see it? Just a demo of the video display code will do.
