Combine the I/O control so you write a whole byte at once instead of going bit by bit; it gives better performance.
And there are other tricks like that.
Originally I used C, and it took 100 us just to do that one thing. Then I did the I/O and the timing-period work together in assembly, and it takes only 10 us. That's very good for me in small real-time system programming.
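
To make the byte-versus-bit point concrete, here is a rough C sketch. It is not the original code: `PORT_SIM`, `port_write`, `port_read` and `write_count` are hypothetical stand-ins for a real 8-bit memory-mapped port, so the example can run on a PC. It only counts port accesses; it does not reproduce the 100 us and 10 us timings, which depend on the actual chip and compiler.

```c
#include <stdint.h>

/* Hypothetical stand-in for an 8-bit output port register.
   On real hardware this would be the device's own port address. */
static volatile uint8_t PORT_SIM = 0;

/* Counts port writes, for illustration only. */
static unsigned write_count = 0;

static void port_write(uint8_t v)
{
    PORT_SIM = v;
    write_count++;
}

static uint8_t port_read(void)
{
    return PORT_SIM;
}

/* Bit by bit: one read-modify-write of the port for every pin. */
static void set_pins_bitwise(uint8_t pattern)
{
    for (int bit = 0; bit < 8; bit++) {
        uint8_t mask = (uint8_t)(1u << bit);
        if (pattern & mask)
            port_write((uint8_t)(port_read() | mask));
        else
            port_write((uint8_t)(port_read() & (uint8_t)~mask));
    }
}

/* Whole byte: all eight pins change in a single write. */
static void set_pins_bytewise(uint8_t pattern)
{
    port_write(pattern);
}
```

Both functions leave the same value on the port, but the bitwise version touches it eight times and the bytewise version once. On a small microcontroller each of those accesses costs cycles, and with the bitwise version the pins also change one after another instead of all together.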