You have to keep track of two things:
- what to do next
- when to do it
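For the "what", a state machine may be a good solution. For the "when", determine the interval after which the next step should be taken, and check on every pass through loop() whether that time has elapsed rather than blocking with delay(). See the BlinkWithoutDelay example, which does this with millis().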
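A minimal sketch of how the two pieces can fit together, in case it helps. The step names, the intervals, and the Sequencer struct are made up for illustration and are not taken from your project; the time is passed in as a parameter so the logic does not depend on any Arduino call.

```cpp
#include <stdint.h>

// Hypothetical steps of a sequence; replace them with your own.
enum class Step : uint8_t { LedOn, LedOff, Pause };

struct Sequencer {
    Step step = Step::LedOn;   // "what" to do next
    uint32_t lastChange = 0;   // time (ms) at which the last step was taken
    uint32_t interval = 0;     // "when": how long to wait before the next step

    // Call this often with the current time in milliseconds.
    // Returns true if a step was taken on this call.
    bool update(uint32_t now) {
        // Unsigned subtraction keeps working when the millisecond counter rolls over.
        if (now - lastChange < interval) {
            return false;      // not time yet, do nothing and return immediately
        }
        lastChange = now;
        switch (step) {
            case Step::LedOn:
                // the action for this step goes here, e.g. switch the LED on
                step = Step::LedOff;
                interval = 500;    // assumed value: wait 500 ms before the next step
                break;
            case Step::LedOff:
                // e.g. switch the LED off
                step = Step::Pause;
                interval = 200;    // assumed value
                break;
            case Step::Pause:
                step = Step::LedOn;
                interval = 1000;   // assumed value
                break;
        }
        return true;
    }
};
```

In an Arduino sketch you would create one global Sequencer and call its update(millis()) from loop(). Because update() never blocks, loop() stays free to do other work between steps, such as checking the IR receiver.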
For the IR part you need the IRremote library; I think you have found it already.