I will have at least one 4x4 keypad, most likely two, so yes, a diode matrix might be more suitable. I've never worked with one; so far I've just used code to scan through the 4 rows one column at a time. Does a diode matrix have any advantages over that?
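For reference, a column-at-a-time scan of the kind described above might look roughly like this. It is only a sketch in plain C++ so the logic can be checked off the board: `readRows` is a hypothetical stand-in for the real pin handling (activate one column, read the 4 row inputs), and the bit layout of the result is an assumption, not anything a particular keypad library defines.

```cpp
#include <cstdint>
#include <functional>

constexpr int kRows = 4;
constexpr int kCols = 4;

// Hypothetical hardware hook (assumption): activates the given column and
// returns a 4-bit mask of the rows that read as pressed (bit 0 = row 0).
using ReadRowsFn = std::function<std::uint8_t(int column)>;

// Scans the matrix one column at a time.
// Returns a 16-bit mask where bit (row * kCols + col) is set for each
// key that was seen as pressed during the scan.
std::uint16_t scanKeypad(const ReadRowsFn& readRows) {
    std::uint16_t pressed = 0;
    for (int col = 0; col < kCols; ++col) {
        // Only the low 4 bits carry row information.
        const std::uint8_t rows = readRows(col) & 0x0F;
        for (int row = 0; row < kRows; ++row) {
            if (rows & (1u << row)) {
                pressed |= static_cast<std::uint16_t>(1u << (row * kCols + col));
            }
        }
    }
    return pressed;
}
```

On an actual Arduino, `readRows` would presumably wrap the `digitalWrite`/`digitalRead` calls for the column and row pins; the scan loop itself stays the same.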
Or maybe I should use one shift register per 4x4 keypad matrix? That is, hook up the matrix's 4 row and 4 column pins to the 8 pins on the shift register, and then have the Arduino code translate the shift register's output?
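The "translate the output" step could be sketched as below. Everything about the byte layout is an assumption made for illustration: bits 0-3 are taken to be the row lines, bits 4-7 the column lines, active-high, with exactly one bit set in each half for a single key press. The key legend in `kKeymap` is likewise just a typical 4x4 layout, and how the byte is actually obtained from the hardware is left out.

```cpp
#include <cstdint>

// Assumed key legend for a 4x4 keypad, indexed as kKeymap[row][col].
constexpr char kKeymap[4][4] = {
    {'1', '2', '3', 'A'},
    {'4', '5', '6', 'B'},
    {'7', '8', '9', 'C'},
    {'*', '0', '#', 'D'},
};

// Returns the index (0-3) of the single set bit in a 4-bit value,
// or -1 if zero bits or more than one bit are set.
int singleBitIndex(std::uint8_t nibble) {
    switch (nibble & 0x0F) {
        case 0x1: return 0;
        case 0x2: return 1;
        case 0x4: return 2;
        case 0x8: return 3;
        default:  return -1;
    }
}

// Decodes one byte read from the shift register into a key character.
// Assumed layout: low nibble = rows, high nibble = columns.
// Returns '\0' when the byte does not describe exactly one key.
char decodeKey(std::uint8_t reg) {
    const int row = singleBitIndex(reg & 0x0F);
    const int col = singleBitIndex(static_cast<std::uint8_t>(reg >> 4));
    if (row < 0 || col < 0) {
        return '\0';
    }
    return kKeymap[row][col];
}
```

With this layout, `decodeKey(0x21)` gives `'2'` (row 0, column 1), and a byte with no bits set, or with several bits set in one half, gives `'\0'`.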