@Idahowalker, my question is about how to efficiently handle a large number of inputs/outputs. What my code currently looks like doesn't matter, as my question is a general one.
@anon57585045, I'm already kind of doing that, except with 10x8 bits rather than 8x10. My code just didn't reflect any of that, and I'm still not sure whether I find that easier than "just" 80 bits. I'm currently in the process of finding that out.
@madmark2150, the bumpers (not the flippers) are connected directly to the switches, but the inputs are still connected as well. Internally the pinball machine gathers all kinds of information about how often certain switches are triggered, probably to find out how "fair" a certain game is. Still, in my version I want everything to be controlled by the CPU. Call me stupid, but that is what I want.
@anon57585045, actually my first version did exactly that: store each byte in a separate variable and compare it with the previous state. If nothing changed, I skipped comparing any of the bits. This worked well, but as @westfw mentioned, "just" comparing ~80 bits on a 16 MHz CPU is peanuts, so I bit-shifted all bytes into a single variable to test the difference. Only the Arduino Editor just lost my complete sketch after renaming it, so I have to start over.
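To make the byte-wise comparison concrete, here is a minimal sketch of the idea, not my lost sketch. It assumes the 10 bytes have already been read from the shift registers into an array; `kNumBytes`, `ChangeHandler` and `reportChanges` are names made up for this example. It is written as plain C++ so it also compiles outside the Arduino IDE.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 10 chained 8-bit shift registers, i.e. 10x8 = 80 inputs.
constexpr std::size_t kNumBytes = 10;

// Hypothetical callback, called once for every input that changed.
using ChangeHandler = void (*)(std::size_t inputIndex, bool newState);

// Compares a fresh snapshot with the previous one. Bytes without any
// difference are skipped entirely; only changed bytes are scanned bit by bit.
// Afterwards `previous` holds the new snapshot.
// Returns the number of inputs that changed.
std::size_t reportChanges(const std::uint8_t (&current)[kNumBytes],
                          std::uint8_t (&previous)[kNumBytes],
                          ChangeHandler onChange) {
    std::size_t changes = 0;
    for (std::size_t byteIdx = 0; byteIdx < kNumBytes; ++byteIdx) {
        // XOR leaves a 1 in every bit position that differs.
        const std::uint8_t diff = current[byteIdx] ^ previous[byteIdx];
        if (diff == 0) {
            continue;  // nothing changed in this byte
        }
        for (std::uint8_t bit = 0; bit < 8; ++bit) {
            if (diff & (1u << bit)) {
                const bool state = ((current[byteIdx] >> bit) & 1u) != 0;
                if (onChange != nullptr) {
                    onChange(byteIdx * 8 + bit, state);
                }
                ++changes;
            }
        }
        previous[byteIdx] = current[byteIdx];
    }
    return changes;
}
```

On a typical scan nothing has changed, so the cost is ten XORs and ten comparisons; the inner bit loop only runs for bytes that actually differ.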
@westfw, unfortunately I get an average of 2 to 3 bounces within ~50 μs on switches with capacitors, and an average of 4 to 5 bounces within ~100 μs on switches without capacitors. But today I thought of a different kind of debouncing, which in theory should work without any delay by just flipping the logic: instead of acknowledging the new state after it has been stable for a certain time (which introduces a delay), acknowledge the new state immediately if the previous state was stable for a certain time (and then ignore any further state changes for a certain time). I have to rewrite my sketch after losing it, so I hope to be able to test this out tonight.
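Roughly what I have in mind for the flipped logic, as an untested sketch for a single input. The names `Debounced`, `update` and `kLockoutUs` are made up for this example, and the 200 μs lockout is only an assumption chosen to be longer than the ~100 μs of bouncing I measured. The timestamp is passed in as a parameter; on the Arduino it would come from `micros()`.

```cpp
#include <cstdint>

// Assumption: lockout window, longer than the longest observed bounce.
constexpr std::uint32_t kLockoutUs = 200;

struct Debounced {
    bool state = false;              // last acknowledged state
    std::uint32_t lastChangeUs = 0;  // time of the last acknowledged change
    bool locked = false;             // true while changes are being ignored
};

// Feeds one raw sample into the debouncer.
// Returns true if the acknowledged state changed on this call.
bool update(Debounced& d, bool raw, std::uint32_t nowUs) {
    // Unsigned subtraction stays correct when the microsecond counter wraps.
    if (d.locked && (nowUs - d.lastChangeUs) >= kLockoutUs) {
        d.locked = false;
    }
    if (d.locked || raw == d.state) {
        return false;  // still bouncing, or nothing new
    }
    // The previous state was stable long enough: accept the edge right away.
    d.state = raw;
    d.lastChangeUs = nowUs;
    d.locked = true;
    return true;
}
```

The trade-off is that the first edge is trusted unconditionally, so a single noise spike is acknowledged as a real change, whereas the classic wait-until-stable approach would filter it out.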
@noiasca, my question is about how to efficiently handle a large number of inputs/outputs. I already mentioned that I use chained shift registers, meaning that I "just" read bytes and have to compare bits. What my schematic currently looks like doesn't matter, as my question is a general one. I'm currently working on the inputs, so the outputs are mostly out of scope for now. I'm using the Arduino Mega.