It looks like you've got them reversed (or maybe I'm reading it wrong). The HV side connects to the Arduino and the LV side connects to the ESP. So TXI on the LV side takes the signal from the ESP, and it is sent on to the Arduino at 5V through TXO on the HV side. Going the other way, the Arduino's 5V output goes into RXI on the HV side and comes out of RXO on the LV side at 3V for the ESP. TXO is an OUTPUT and TXI is an INPUT (same idea for RXO and RXI).
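If it helps to see the two signal paths written out, here is a rough sketch that models them as a small table. The pin names and the 5V/3V levels are taken from the description above. The `Channel` type, the `describe` helper, and the "Arduino RX" / "ESP RX" endpoint labels are my own assumptions for illustration, not anything from a datasheet.

```python
from typing import NamedTuple


class Channel(NamedTuple):
    """One direction through the level converter (hypothetical model)."""
    source: str      # device pin driving the signal
    input_pin: str   # converter pin the source is wired to
    output_pin: str  # converter pin the destination is wired to
    dest: str        # device pin receiving the signal
    out_level: str   # logic level the destination sees


CHANNELS = [
    # ESP transmits: in on the LV-side TXI, out on the HV-side TXO at 5V.
    Channel("ESP TX", "LV TXI", "HV TXO", "Arduino RX", "5V"),
    # Arduino transmits: in on the HV-side RXI, out on the LV-side RXO at 3V.
    Channel("Arduino TX", "HV RXI", "LV RXO", "ESP RX", "3V"),
]


def describe(ch: Channel) -> str:
    """Format one signal path as a single readable line."""
    return (
        f"{ch.source} -> {ch.input_pin} -> {ch.output_pin} "
        f"-> {ch.dest} ({ch.out_level})"
    )


for ch in CHANNELS:
    print(describe(ch))
```

In both rows the pin ending in I is where the signal enters the converter and the pin ending in O is where it leaves, which is a quick way to check the wiring.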