Which protocol would be the best fit, and why?

MorganS:
Imagine what happens if the main CPU sends a command to the GSM one during that delay(5000). It can't respond or even take a note of what was said so that it can do it later.

You have to design the system to work without delays. So you can do all of that on one Arduino and it will be much simpler than trying to get five to talk to each other.

Right! That delay(5000) (for example) will cause the whole thing to clog, and it will eventually become the biggest pitfall.

I think I'd rather go the other way, as you and others said. Otherwise I would need an ACK mechanism after each command sent, and that still means the same multitasking, so what's wrong with using a single Arduino?
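
For reference, here is roughly how the delay-free approach could look. This is only a minimal sketch under assumptions: `IntervalTimer`, `pollGsm()` and `handleCommands()` are hypothetical names, not from any library, and the 5000 ms period just mirrors the delay(5000) mentioned above. The timer takes the current time as an argument, so on an Arduino you would pass `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread or any library): reports when
// `interval` milliseconds have passed, without ever blocking.
struct IntervalTimer {
    uint32_t interval;  // period in milliseconds
    uint32_t last;      // timestamp of the last firing

    IntervalTimer(uint32_t intervalMs, uint32_t startMs = 0)
        : interval(intervalMs), last(startMs) {}

    // Returns true once at least `interval` ms have elapsed since the last
    // firing, then restarts the period. Unsigned subtraction keeps the
    // comparison correct when the millisecond counter rolls over.
    bool due(uint32_t now) {
        if (now - last >= interval) {
            last = now;
            return true;
        }
        return false;
    }
};

// Assumed usage inside an Arduino sketch (pollGsm and handleCommands are
// placeholders for your own functions):
//
//   IntervalTimer gsmTimer(5000);
//
//   void loop() {
//       if (gsmTimer.due(millis())) {
//           pollGsm();        // runs about every 5 s
//       }
//       handleCommands();     // runs on every pass, never held up
//   }
```

Since loop() never stops for 5 seconds, incoming commands can be handled right away on the same board, which is why the extra ACK handshake between several Arduinos stops being necessary.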