This is likely because there is no sound card or other sound decoding device to convert the seemingly random electrical input from the microphone into usable sound frequency and amplitude data.
Sound cards don't do that. They digitize the microphone's electrical signal into a stream of sample values, which is the same kind of "seemingly random electrical input" you see on the Arduino.
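Frequency and amplitude come from processing those samples in software, not from the hardware. As a rough illustration only, here is a minimal sketch in plain C++ that estimates the overall loudness (RMS) of a buffer of raw readings and the strength of one chosen frequency, using the Goertzel algorithm. The function names `rmsAmplitude` and `toneMagnitude`, and the idea that you already know the sample rate, are my own assumptions for the example, not anything from your code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: average of the raw readings (the DC offset).
static double meanOf(const std::vector<int>& samples) {
    double sum = 0.0;
    for (int s : samples) sum += s;
    return sum / static_cast<double>(samples.size());
}

// Hypothetical helper: RMS amplitude of the readings around their mean.
// This is the "how loud is it" number.
double rmsAmplitude(const std::vector<int>& samples) {
    assert(!samples.empty());
    const double mean = meanOf(samples);
    double sumSq = 0.0;
    for (int s : samples) {
        const double d = s - mean;
        sumSq += d * d;
    }
    return std::sqrt(sumSq / static_cast<double>(samples.size()));
}

// Hypothetical helper: approximate peak amplitude of a single frequency
// (targetHz) in the buffer, computed with the Goertzel algorithm.
// sampleRateHz is assumed to be known and constant.
double toneMagnitude(const std::vector<int>& samples,
                     double targetHz, double sampleRateHz) {
    assert(!samples.empty());
    const double pi = std::acos(-1.0);
    const double mean = meanOf(samples);
    const double coeff = 2.0 * std::cos(2.0 * pi * targetHz / sampleRateHz);

    double s1 = 0.0, s2 = 0.0;
    for (int s : samples) {
        const double s0 = (s - mean) + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    // Squared magnitude of the DFT bin nearest targetHz; clamp tiny
    // negative values caused by floating-point rounding.
    const double power = std::max(0.0, s1 * s1 + s2 * s2 - coeff * s1 * s2);
    // Scale so a pure sine of peak amplitude A gives roughly A.
    return 2.0 * std::sqrt(power) / static_cast<double>(samples.size());
}
```

On an Arduino the buffer would be filled from `analogRead()` at a steady rate, and you would probably want to avoid `std::vector` and `double` on the smaller boards. Note that this only tells you how much of one frequency is present, which is a long way from recognizing a sound.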
Reliably recognizing a sound is an almost impossible task even on a large computer. There are a few projects that recognise a word or two, but they are not very reliable, suffering from both false positives and false negatives.