Well done for getting it sorted.
It currently works great with 2 devices, but I'm not sure how much modification will be needed to support 3 or more. I will try to enhance it later to see if I can make it universal for any number of displays.
The advantage of the way you address the MAX7219 is that, for scrolling, you only need to add 1 byte from the vertically defined font per step, whereas with the horizontal way I used you need to do a lot of AND and OR operations on 8 bytes to achieve the same thing. The code I currently use could theoretically drive a 24x8 matrix with little modification, but going beyond that needs a bit more work. The vertical font approach is a lot easier to expand to larger matrix arrays and could scroll a lot faster. I like the idea of using the 8x8 matrix blocks; it beats the hell out of hand soldering 128 individual LEDs. I might be tempted to do a smaller wordclock using these.
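To make the difference between the two font layouts concrete, here is a rough host-side sketch in Python. It is not the code from either of our projects and it does not talk to a MAX7219; it only models the frame buffer. The function names, the 16-pixel width, and the bit ordering (bit 0 = top row in a column byte, MSB = leftmost pixel in a row) are all assumptions made for illustration.

```python
# Sketch only: models the frame buffer, not the MAX7219 itself.
# All names and the bit ordering below are assumptions for illustration.

def scroll_left_columns(cols, new_col):
    """Column-wise buffer: one byte per display column.
    Scrolling left drops the first column and appends one font byte."""
    return cols[1:] + [new_col & 0xFF]


def scroll_left_rows(rows, glyph_rows, glyph_col, width=16):
    """Row-wise buffer: one integer per display row, MSB = leftmost pixel.
    Every one of the 8 rows has to be shifted, masked, and have one bit
    of the row-defined glyph OR-ed in."""
    mask = (1 << width) - 1
    out = []
    for row, glyph_row in zip(rows, glyph_rows):
        bit = (glyph_row >> (7 - glyph_col)) & 1    # AND: pick the pixel
        out.append(((row << 1) & mask) | bit)       # shift, then OR it in
    return out


def rows_to_columns(glyph_rows):
    """Convert a row-defined 8x8 glyph into column bytes (bit r = row r)."""
    cols = []
    for c in range(8):
        col = 0
        for r in range(8):
            col |= ((glyph_rows[r] >> (7 - c)) & 1) << r
        cols.append(col)
    return cols
```

With the column layout each scroll step touches one new byte regardless of display width; with the row layout each step costs 8 shift, AND, and OR operations, and wider displays need wider integers or carries between bytes, which is where the extra work for larger matrices comes from.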