There is no easy way to distinguish noise from voice; that's an artificial intelligence problem!
There might be measurable parameters that are good clues, though, like the power spectrum
of the sound envelope, or even just the max level averaged over a few seconds.
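
As a rough illustration of those two clues, here is a minimal sketch in plain Python. The function names (frame_peaks, mean_peak_level, envelope_power_spectrum), the 8 kHz sample rate, the 10 ms frame length and the synthetic test tone are all assumptions made up for the example. It only computes the measurements; deciding what counts as "voice" from them is still up to you.

```python
import math

def frame_peaks(samples, frame_len):
    """Peak absolute level of each non-overlapping frame (a crude envelope)."""
    return [max(abs(s) for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def mean_peak_level(samples, frame_len):
    """Max level per frame, averaged over the whole buffer."""
    peaks = frame_peaks(samples, frame_len)
    return sum(peaks) / len(peaks)

def envelope_power_spectrum(envelope):
    """Naive DFT power spectrum of the envelope (mean removed), bins 0..n//2."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [e - mean for e in envelope]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = -sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        spectrum.append((re * re + im * im) / n)
    return spectrum

# Assumed test input: 2 seconds of a 440 Hz tone whose loudness pulses at 4 Hz.
# This is a stand-in signal, not real speech.
sample_rate = 8000          # assumed, in Hz
frame_len = 80              # assumed, 10 ms per frame
samples = [(0.5 + 0.5 * math.sin(2 * math.pi * 4 * i / sample_rate))
           * math.sin(2 * math.pi * 440 * i / sample_rate)
           for i in range(2 * sample_rate)]

level = mean_peak_level(samples, frame_len)
envelope = frame_peaks(samples, frame_len)
spectrum = envelope_power_spectrum(envelope)

# Strongest envelope modulation frequency, skipping the DC bin.
envelope_rate = sample_rate / frame_len
dominant_bin = max(range(1, len(spectrum)), key=spectrum.__getitem__)
dominant_hz = dominant_bin * envelope_rate / len(envelope)
print("mean peak level:", level)
print("dominant envelope frequency (Hz):", dominant_hz)
```

The idea would be to compare these numbers between recordings you know contain speech and recordings you know are just noise, and see whether either one separates them well enough for your purpose. The naive DFT is fine for a few hundred envelope points; for longer buffers an FFT library would be the usual choice.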