XBOX360->GC converter

Hi,

I am currently working on an Xbox 360 to GameCube pad converter.
It's just a fun project. If it won't work in the end, it's fine.

I haven't bought an Arduino yet, but I have read up on Arduinos and the Xbox/GC protocols.
I also read a lot of the theory written up by a guy who built N64/SNES->GC converters.

The code is basically done and working in a simulated environment with a simulated console.
I can read and write the protocol correctly: detect the console, detect probes, detect polls, and send input.
The problem is that it is probably way too slow to work with a real GC.
My simulated console reads/writes at the same speed as my board, which is the only reason it works.
I'm using the emulator on Tinkercad, btw.

The real GC protocol runs at 4us per bit. A 1 is signalled by 0111 and a 0 by 0001. So I basically have to read/write every 1us.
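
To make the encoding concrete, here is a minimal host-side sketch (plain C++, not timing-accurate Arduino code) that expands data bits into four 1us line levels and decodes them again. The function names are made up for illustration, and sending the most significant bit first is an assumption to check against the protocol notes.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Expand one data bit into four 1us line levels (0 = low, 1 = high).
// A 1 is sent as 0111 and a 0 as 0001, as described above.
std::array<uint8_t, 4> encodeBit(bool bit) {
    return bit ? std::array<uint8_t, 4>{{0, 1, 1, 1}}
               : std::array<uint8_t, 4>{{0, 0, 0, 1}};
}

// Expand a whole byte into 32 line levels.
// ASSUMPTION: bits go out most significant bit first.
std::vector<uint8_t> encodeByte(uint8_t value) {
    std::vector<uint8_t> levels;
    levels.reserve(32);
    for (int i = 7; i >= 0; --i) {
        const auto sub = encodeBit(((value >> i) & 1) != 0);
        levels.insert(levels.end(), sub.begin(), sub.end());
    }
    return levels;
}

// Decode four sampled levels back into a bit.
// Returns 1 or 0 for a valid pattern, -1 for anything else.
int decodeBit(const std::array<uint8_t, 4>& s) {
    if (s[0] != 0 || s[3] != 1 || s[1] != s[2]) {
        return -1;  // not 0111 or 0001
    }
    return s[1];
}
```

The two patterns differ only in the middle two sub-bits, so a real receiver would not necessarily need all four samples per bit; one sample taken around the middle of the bit could be enough.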
Question 1: To achieve that, I probably have to write cycle-exact ASM, right? That is, speed the code up to that level and pad every code path with NOPs so that all paths take the same time? I assume it is a lot of work, but it should be possible, right?
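
For a feel of the budget, here is a small sketch of the arithmetic. The 16 MHz clock is an assumption (it is what an Arduino Uno runs at; no board has been picked yet), and the names are made up for illustration.

```cpp
#include <cstdint>

// ASSUMPTION: 16 MHz CPU clock, as on an Arduino Uno. Change to match the board.
constexpr uint32_t kAssumedCpuHz = 16000000UL;

// CPU cycles available in one 1us sub-bit.
constexpr uint32_t cyclesPerSubBit(uint32_t cpuHz) {
    return cpuHz / 1000000UL;
}

// CPU cycles available in one full 4us protocol bit (four sub-bits).
constexpr uint32_t cyclesPerBit(uint32_t cpuHz) {
    return 4 * cyclesPerSubBit(cpuHz);
}

static_assert(cyclesPerSubBit(kAssumedCpuHz) == 16, "16 cycles per 1us at 16 MHz");
static_assert(cyclesPerBit(kAssumedCpuHz) == 64, "64 cycles per 4us bit at 16 MHz");
```

So at 16 MHz every read or write of a sub-bit, including the branch on the current state, has to fit into 16 cycles, which is why the NOP padding comes up at all.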

My current loop looks like this:
loop() {
-4 times: read/write one "sub"bit, depending on state. Pad with NOPs so each sub-bit takes 1us
-Combine the 4 subbits into one real information bit, match the bit pattern against a buffer and react by sending controller input
}

Don't worry about the GC data bus being bidirectional and 3.3 V (compared to a possibly 5 V Arduino).
The bus has an internal pull-up in the console, so if I want to send a 1 I just release the line, and if I want to send a 0 I pull it down. I never drive the line high myself, so I think it's no problem.
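
A tiny model of that bus behaviour, as a sketch only. The type and function names are made up for illustration, and the mapping to Arduino calls in the comments is an assumption about how it would be wired up on a real board.

```cpp
// What one device does to the shared data line.
// On an Arduino, Released would correspond to pinMode(pin, INPUT) (pin floating,
// no internal pull-up enabled) and PullLow to pinMode(pin, OUTPUT) with the
// output value set to LOW. At 1us timings this would more likely be done with
// direct register writes than with pinMode().
enum class Drive { Released, PullLow };

// The line is high only when nobody pulls it low; the console's pull-up
// supplies the high level.
constexpr bool lineIsHigh(Drive console, Drive adapter) {
    return console == Drive::Released && adapter == Drive::Released;
}

// The adapter never drives the line high, so it never puts 5 V on the bus.
constexpr Drive driveFor(bool level) {
    return level ? Drive::Released : Drive::PullLow;
}
```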

The Xbox 360 pad protocol seems rather complex, so I didn't bother to write an interface for it, but planned to use a USB Host Shield with its library.
Question 2:
The library is probably going to mess up my timings completely, right?
Its overhead seems large compared to the 1us timings I need.
Also, it probably has to run periodically, so a three-state model like this would not work either:
-state1: read GC, goto state2, when poll pattern detected
-state2: read XBOX pad, goto state3 when done
-state3: send converted pad input, goto state1
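
Written out as code, the three-state model could look like the sketch below. The names and the event flags are made up for illustration (in particular, the "reply sent" condition for leaving state3 is an assumption), and nothing here addresses the timing problem itself.

```cpp
// The three states from the list above.
enum class State { ReadGc, ReadPad, SendReply };

// Hypothetical event flags that the surrounding code would set.
struct Events {
    bool pollDetected;  // poll pattern seen on the GC bus
    bool padReadDone;   // Xbox pad state has been read
    bool replySent;     // converted input has been sent (assumed exit condition)
};

// Pure transition function: stay in the current state until its event fires.
State nextState(State current, const Events& e) {
    switch (current) {
        case State::ReadGc:
            return e.pollDetected ? State::ReadPad : State::ReadGc;
        case State::ReadPad:
            return e.padReadDone ? State::SendReply : State::ReadPad;
        case State::SendReply:
            return e.replySent ? State::ReadGc : State::SendReply;
    }
    return State::ReadGc;  // unreachable with valid input
}
```

The weak spot is state2: the console is waiting for its reply while the pad is being read, so whatever the USB library does there has to finish before the reply is due.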

Maybe someone here has some experience with this and can give an estimate.

Thanks and best regards
Jan

There are "Arduinos" available now with much greater speed and memory capacity, while still being small and inexpensive. Maybe with those you could avoid writing assembler and stick to C++. Then, if you are successful, you have software that is more easily portable to other MCUs in the future. Examples of these are the Teensy 3.x family, the Maple Mini, the Adafruit Feather M0, ItsyBitsy M0 and Trinket M0.