I'll convert the flow to onRequest, instead of onReceive, and poll Slave 1 from the Master. Did I get that correct?
Not really. The usual way to use I2C communication is to have a register-based device. To control the device you write to some registers or read from them. To set a register you first send the register address and then the value. Reading is similar: you first write the register address, and when you then start reading, the device starts at the last register address you wrote.
To get the communication between your Arduinos working you have to define your set of registers and what each one is for. Then implement the slave so that it acts like such a chip: when the master writes to one register it does one thing, when the master writes to a second register it does another. The master can also read the value of a register to get results or sensed values back from the slave. This is a completely different way of talking to the slaves than a two-way serial connection.
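To make the register idea concrete, here is a minimal sketch in plain C++ that only models the logic, so it can be tried on a PC. The register names (REG_LED, REG_MODE, REG_SENSOR) and the RegisterSlave type are made up for illustration. On a real slave the two methods would be the bodies of your Wire.onReceive() and Wire.onRequest() handlers, reading bytes with Wire.read() and answering with Wire.write().

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical register map; the names and meanings are assumptions.
enum Register : uint8_t { REG_LED = 0, REG_MODE = 1, REG_SENSOR = 2, REG_COUNT = 3 };

struct RegisterSlave {
    std::array<uint8_t, REG_COUNT> regs{};  // register file, all zero at start
    uint8_t pointer = 0;                    // last register address the master wrote

    // What a receive handler would do with the bytes of one write transfer:
    // the first byte selects the register, any following bytes are stored
    // starting at that register.
    void onReceive(const uint8_t* data, std::size_t len) {
        if (len == 0) return;
        pointer = static_cast<uint8_t>(data[0] % REG_COUNT);  // wrap instead of overrunning
        for (std::size_t i = 1; i < len; ++i) {
            regs[pointer] = data[i];
            pointer = static_cast<uint8_t>((pointer + 1) % REG_COUNT);
        }
    }

    // What a request handler would send for one byte of a read transfer:
    // the value at the current register, then move on to the next one.
    uint8_t onRequest() {
        const uint8_t value = regs[pointer];
        pointer = static_cast<uint8_t>((pointer + 1) % REG_COUNT);
        return value;
    }
};
```

With this model, the master sets REG_MODE by sending the two bytes {REG_MODE, value}. To read it back it sends just {REG_MODE} and then requests one byte. The auto-increment after each byte is a design choice many chips make, not something I2C itself requires.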
Please help me understand why they work in one instance and not the other.
Whether a write to the serial interface succeeds inside an interrupt handler depends on several factors. The problem is that the handler runs while another part of your program is interrupted. Each write puts new characters into the buffer of the serial interface. If the buffer fills up to its upper limit, the call blocks until the buffer has been emptied. The buffer is itself emptied by an interrupt, and if the call was made inside an interrupt handler, other interrupts are disabled, so the buffer will never be processed and the call never returns.
You have to take care with the variables too. Every variable you use both inside an interrupt handler and in your normal code has to be declared "volatile"; otherwise the compiler may optimize accesses to it away, and your normal code never sees the change made by the handler.
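The usual way around both problems is to keep the handler short: it only stores the data and sets a flag, and the main code does the printing. Below is a minimal sketch of that pattern in plain C++ so it can be tried on a PC. The names receiveHandler and reportIfReady are made up, and the string stands in for what you would pass to Serial.println() from loop().

```cpp
#include <cstdint>
#include <string>

// Shared between the handler and the main code, so both are volatile.
volatile bool dataReady = false;
volatile uint8_t lastByte = 0;

// Stands in for the interrupt-driven receive handler:
// store the data, set the flag, return. No serial output here.
void receiveHandler(uint8_t value) {
    lastByte = value;
    dataReady = true;
}

// Called from the main loop, outside any interrupt.
// Returns true and fills `out` if a new byte has arrived since the last call.
bool reportIfReady(std::string& out) {
    if (!dataReady) return false;
    dataReady = false;
    const uint8_t value = lastByte;
    out = "received " + std::to_string(value);
    return true;
}
```

A single byte is fine as shown. If the handler fills something larger than one byte, the main code should copy it between noInterrupts() and interrupts(), so the handler cannot change it halfway through the copy.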
Also, do calls to different slave devices take any processing cycles away from the non-called slaves?
No, the hardware takes care of that. The I2C hardware of each MCU knows its own address and calls the interrupt routine only if that device has been addressed.
I'm having a difficult time understanding the distinctions when it's truly needed.
In most situations they are not needed. If the datasheet of a device specifies that the MCU has to wait some time before driving some other line HIGH or LOW, you may have to insert some delays, but in almost all cases I know these delays are calls to delayMicroseconds(), not to delay().
The delay() call is handy if you want or have to wait some time and you know that the MCU doesn't have any other task to do. Many people are lazy and just insert some delay() calls without knowing why they are needed. Often they experimented and it happened to work with some delay. Later other people use the same sketch under minimally different circumstances and the code fails. You should always do it the correct way (for example, wait for some pin to become HIGH or for a value to appear in some register).
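As a rough sketch of what "the correct way" can look like, here is a small helper that polls a condition and gives up after a timeout instead of sleeping for a guessed time. The name waitUntil and its parameters are made up for illustration. The clock is passed in so the helper can be tried on a PC; on an Arduino it could simply be a function that returns micros().

```cpp
#include <cstdint>

// Polls `condition` until it returns true or `timeoutMicros` has elapsed.
// `nowMicros` is any callable that returns the current time in microseconds.
// Returns true if the condition became true, false on timeout.
template <typename Condition, typename Clock>
bool waitUntil(Condition condition, Clock nowMicros, uint32_t timeoutMicros) {
    const uint32_t start = nowMicros();
    while (!condition()) {
        // Unsigned subtraction stays correct when the counter wraps around.
        if (static_cast<uint32_t>(nowMicros() - start) >= timeoutMicros) {
            return false;
        }
    }
    return true;
}
```

On a board, a call might look like waitUntil([] { return digitalRead(2) == HIGH; }, [] { return (uint32_t)micros(); }, 5000), where pin 2 and the 5000 microsecond limit are only placeholders. The timeout matters: without it, a device that never answers hangs the sketch forever.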