miminuno:
Overall, I would need an additional 32 output pins and 32 input pins.
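Have you thought of using shift registers? You can use serial-in/parallel-out registers for the outputs and parallel-in/serial-out registers for the inputs. These parts are usually 8 bits wide, so four of them daisy-chained give you 32 pins in each direction. You can hook them to your SPI port and transfer data at up to 8 million bits per second, so updating or reading all 32 pins takes only a few microseconds.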
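
To make the idea concrete, here is a rough sketch that models the two chains in plain C++ so it can run on a PC. It is not driver code for any particular chip or board. The names `OutputChain`, `InputChain`, `writeOutputs` and `readInputs` are made up for illustration, and the MSB-first byte order is an assumption (it is a common SPI default, but check your setup). On real hardware the inner bit loops would be done by the SPI peripheral, and the latch and load steps would be a pulse on an extra control pin.

```cpp
#include <cstdint>

// Hypothetical model of four daisy-chained 8-bit serial-in/parallel-out
// registers (32 outputs in total).
struct OutputChain {
    uint32_t shift = 0;  // bits currently moving through the chain
    uint32_t pins = 0;   // what the 32 output pins show after a latch pulse

    // One clock pulse: every bit moves up one place, the new bit enters at bit 0.
    void clockBit(bool bit) { shift = (shift << 1) | (bit ? 1u : 0u); }

    // Latch pulse: copy the shift stages onto the output pins.
    void latch() { pins = shift; }
};

// Hypothetical model of four daisy-chained 8-bit parallel-in/serial-out
// registers (32 inputs in total).
struct InputChain {
    uint32_t shift = 0;

    // Load pulse: capture the state of all 32 input pins at once.
    void load(uint32_t pinStates) { shift = pinStates; }

    // One clock pulse: the top bit appears on the serial output,
    // the remaining bits move up one place.
    bool clockBit() {
        bool out = ((shift >> 31) & 1u) != 0;
        shift <<= 1;
        return out;
    }
};

// Send one byte, most significant bit first (assumed bit order).
void sendByte(OutputChain& chain, uint8_t value) {
    for (int i = 7; i >= 0; --i) {
        chain.clockBit(((value >> i) & 1u) != 0);
    }
}

// Update all 32 outputs: four bytes, most significant byte first, then latch.
void writeOutputs(OutputChain& chain, uint32_t value) {
    for (int shiftBy = 24; shiftBy >= 0; shiftBy -= 8) {
        sendByte(chain, static_cast<uint8_t>(value >> shiftBy));
    }
    chain.latch();
}

// Read all 32 inputs: capture the pins, then clock out 32 bits.
uint32_t readInputs(InputChain& chain, uint32_t pinStates) {
    chain.load(pinStates);
    uint32_t value = 0;
    for (int i = 0; i < 32; ++i) {
        value = (value << 1) | (chain.clockBit() ? 1u : 0u);
    }
    return value;
}
```

The point of the latch step is that the output pins only change once all 32 bits are in place, so the outputs do not flicker while the data is being shifted through. At 8 million bits per second, 32 bits take 4 microseconds on the wire, not counting software overhead.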