Have you checked the return type of Serial.read()? It returns an 'int': a single byte from the receive buffer, or -1 if nothing has arrived yet. So you're reading at most one byte, then print 'yes', read a byte again, etc. In the meantime, your Arduino is probably running into timing issues, since it can't keep up with printing and reading soft-serial at the same time.
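Here's a rough sketch of what I mean by reading properly: drain whatever is in the buffer, collect it into a line, and only print once the line is complete. I don't know what your code looks like, so the names are mine: FakeSerial is a hypothetical stand-in that mimics available()/read() so this compiles on a PC, and pollLine() is a made-up helper. I'm also assuming the other side ends its lines with "\n" (optionally "\r\n").

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical stand-in for an Arduino-style serial port, so the logic can be
// tried on a PC. On real hardware you would pass Serial or a SoftwareSerial.
struct FakeSerial {
    std::deque<unsigned char> rx;  // bytes waiting in the receive buffer

    int available() const { return static_cast<int>(rx.size()); }

    // Same contract as Arduino's read(): one byte as an int, or -1 when empty.
    int read() {
        if (rx.empty()) return -1;
        int b = rx.front();
        rx.pop_front();
        return b;
    }

    // Test helper: pretend these bytes just arrived on the wire.
    void feed(const std::string& s) { rx.insert(rx.end(), s.begin(), s.end()); }
};

// Drain everything currently buffered into `line`.
// Returns true once a '\n' has completed the line; call it again on the next
// pass through loop() if it returns false. Do the printing only after true.
template <typename StreamT>
bool pollLine(StreamT& port, std::string& line) {
    while (port.available() > 0) {
        int c = port.read();
        if (c < 0) break;          // buffer turned out to be empty
        if (c == '\r') continue;   // assumption: peer may send "\r\n"
        if (c == '\n') return true;
        line += static_cast<char>(c);
    }
    return false;
}
```

On the Arduino you'd call pollLine(yourSoftSerial, buffer) from loop() and clear the buffer after handling each complete line. Whether that's enough depends on your baud rate; soft-serial is still fragile while you're printing.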
Anyway, as always when people try to knit together an Arduino and an ESP board, my question is: what limitation of the ESP8266 do you think justifies keeping an Arduino in the system as well? I'm not sure which Arduino you're using, but people often pair the ESP with a Nano, an UNO or something else that's completely outclassed by it (even the old 8266!), without realizing they're bogged down by the complexity of having an inferior microcontroller in the circuit for no good reason at all.