Most instructions take 1 clock cycle. Things that access memory or branch take about 2, and a few take more than that. So at a 16 MHz clock, figure nearly 16 million instructions per second.
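To put numbers on that, here is a small sketch of the cycle arithmetic. It assumes a 16 MHz clock; `kClockHz` and `cyclesToMicros` are made-up names for illustration, not part of any Arduino library.

```cpp
#include <cstdint>

// Assumption: the board runs at 16 MHz.
constexpr uint32_t kClockHz = 16000000UL;

// Hypothetical helper: convert a cycle count to microseconds.
constexpr double cyclesToMicros(uint32_t cycles) {
    return cycles * 1e6 / kClockHz;
}
```

So a 1-cycle instruction costs 0.0625 microseconds, and 16 of them fit in one microsecond.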
It would be possible to make a multi-pin pulseIn-style call that watches all of the pins you care about for pulse start and end. It gets tricky to tune, because you end up with a bunch of code paths that each have to be timed and corrected for.
If you can live with 64 microsecond accuracy, which gives you about 20 distinct positions for an input, then it can be done fairly well by watching all the inputs and one of the PWM counters (which tick every 64 microseconds). Record the counter when a pin goes high, record it again when the pin goes low, and subtract to get the length. If you get a negative value, the counter rolled over between the two readings, so add one full counter period to it (256 for an 8-bit counter).
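A minimal sketch of that record/subtract/correct idea, under some loud assumptions: the pins all sit on one 8-bit input port, the counter is 8 bits wide, and no pulse lasts longer than one full counter period. `PulseWatcher` and `poll` are hypothetical names, not an Arduino API. It is written as plain C++ so the logic can be checked off the board.

```cpp
#include <cstdint>

// Hypothetical sketch: tracks pulse widths on up to 8 pins that share one
// input port, using an 8-bit free-running counter as the time base.
struct PulseWatcher {
    uint8_t lastPins = 0;       // pin levels seen on the previous poll
    uint8_t startTick[8] = {};  // counter value when each pin went high
    int     width[8] = {};      // last measured pulse length, in counter ticks

    // Call as often as possible with the current port bits and counter value.
    void poll(uint8_t pins, uint8_t tick) {
        uint8_t changed = pins ^ lastPins;
        for (int i = 0; i < 8; ++i) {
            uint8_t mask = uint8_t(1u << i);
            if (!(changed & mask)) continue;      // no edge on this pin
            if (pins & mask) {
                startTick[i] = tick;              // rising edge: record start
            } else {
                int len = int(tick) - int(startTick[i]);  // falling edge: subtract
                if (len < 0) len += 256;          // counter rolled over (8-bit assumed)
                width[i] = len;
            }
        }
        lastPins = pins;
    }
};
```

On the board you would presumably call `poll()` in a tight loop with a port input register and a timer count register as the arguments. Multiply `width[i]` by the tick length (64 microseconds here) to get the pulse time. Because every pin goes through the same loop, there is only one code path to time, at the cost of the coarser resolution.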