Oh, you had the same problem as I do: Arduino is slow. Well, I had another problem too. I can't easily create libraries with bare AVR C, so I started a project to overcome this.
As a result, I have a very nice implementation for digital pins. That is the only thing really working so far. Timers and analogRead are next.
I wanted to let you know what I have found out, so here is the project: GitHub - raphendyr/yaal: Yet another AVR Abstraction Library