I have no idea why this is happening.
The two implementations send out the bits in completely different ways: one is timer/interrupt based, the other uses no-op loops for its delays. Their timing almost certainly differs because of that. Neither will ever come close to the reliability of a hardware serial interface.
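To get a feel for why a timing mismatch matters, here is a rough back-of-the-envelope sketch in plain C++ (not taken from either library). It assumes an 8N1 frame, a receiver that resynchronises on every start bit and samples in the middle of each bit, and a constant percentage mismatch between the sender's and receiver's bit time. The function names are made up for illustration, and a real setup will tolerate less than this simple model suggests because of interrupt latency and edge jitter.

```cpp
#include <cmath>

// Nominal duration of one bit in microseconds at the given baud rate.
double bitTimeUs(double baud) {
    return 1e6 / baud;
}

// Accumulated sampling offset (in microseconds) at the middle of the last
// data bit of an 8N1 frame, assuming the sender's bit time is off by
// errorPercent relative to the receiver's. The error only builds up within
// one frame because the receiver resynchronises on each start bit.
double driftAtLastBitUs(double baud, double errorPercent) {
    // start bit + 7 full data bits + half of the 8th data bit
    const double bitsFromStartEdge = 8.5;
    return bitTimeUs(baud) * bitsFromStartEdge * errorPercent / 100.0;
}

// Under this simplified model the last data bit is still sampled inside
// its own bit cell as long as the drift stays below half a bit time.
bool frameLikelyReadable(double baud, double errorPercent) {
    return std::fabs(driftAtLastBitUs(baud, errorPercent)) < bitTimeUs(baud) / 2.0;
}
```

At 9600 baud one bit lasts about 104 µs, so in this model `frameLikelyReadable(9600, 2.0)` comes out true while `frameLikelyReadable(9600, 7.0)` comes out false. Two software implementations with different delay mechanisms can easily end up on opposite sides of that margin.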
You haven't explained why you can't use one of the implementations on both Arduinos.