There are two sources (microphones), so the phase is the relative phase of one with respect to the other.
That raises a thought: perhaps an FFT is not the best way to compare the phase of two sine waves (which is what the non-technical term "peak frequency" actually refers to).
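
As a rough sketch of one alternative, assuming the tone frequency is already known, each microphone signal can be correlated against a complex exponential at that frequency and the two resulting phasors compared. This is effectively a single DFT bin evaluated at one chosen frequency rather than a full FFT, and it is only an illustration: the function names, the sample rate, and the synthetic test signals below are all made up for the example.

```python
import cmath
import math


def tone_phasor(samples, freq_hz, sample_rate_hz):
    """Correlate the samples with a complex exponential at freq_hz.

    The angle of the result is the phase of the tone in `samples`
    (cosine reference); the magnitude scales with its amplitude.
    """
    step = -2j * math.pi * freq_hz / sample_rate_hz
    return sum(x * cmath.exp(step * n) for n, x in enumerate(samples))


def relative_phase(mic_a, mic_b, freq_hz, sample_rate_hz):
    """Phase of mic_b relative to mic_a, in radians, wrapped to [-pi, pi]."""
    za = tone_phasor(mic_a, freq_hz, sample_rate_hz)
    zb = tone_phasor(mic_b, freq_hz, sample_rate_hz)
    # Multiplying by the conjugate subtracts the two angles.
    return cmath.phase(zb * za.conjugate())


# Synthetic data (assumed values): a 500 Hz tone sampled at 8 kHz,
# with mic_b delayed in phase by 0.7 rad and at half the amplitude.
fs = 8000.0
f = 500.0
n = 800  # a whole number of cycles, so leakage terms cancel
mic_a = [math.cos(2 * math.pi * f * k / fs + 0.3) for k in range(n)]
mic_b = [0.5 * math.cos(2 * math.pi * f * k / fs + 1.0) for k in range(n)]

print(relative_phase(mic_a, mic_b, f, fs))  # should be close to 0.7
```

The result does not depend on the two amplitudes, only on the phase difference. With real recordings the window will rarely hold a whole number of cycles, so some error from the double-frequency term is to be expected unless the window is long or tapered.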