...Our ears are very sensitive to noise, and building audio electronics can be frustrating!
And just so we are clear: "noise" is unwanted sound that's always present (usually, hopefully, in the background) and is most noticeable with quiet sounds or when there is no signal.
"Distortion" only exists when there is a signal. The most common type is clipping, which happens during loud parts, like when you try to get 2W out of a 1W amplifier.
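And since a microprocessor's output can't swing below ground, it can't directly put out the negative half of the audio waveform. If you don't deal with that properly (normally by biasing the signal at mid-supply and AC-coupling it with a capacitor), you get rectification distortion at all levels.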
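To make the negative-half problem concrete, here is a rough Python sketch (not ESP32 code) of what happens to signed samples on the way to an unsigned DAC. The 8-bit width and both function names are my assumptions for illustration. The first function centres the signal on mid-scale so both halves survive; the second skips the bias, so everything below zero gets flattened.

```python
import math

def to_dac_code(sample, bits=8):
    """Map a signed sample in [-1.0, 1.0] to an unsigned code centred on mid-scale."""
    full_scale = (1 << bits) - 1           # 255 for an assumed 8-bit DAC
    mid = full_scale / 2
    code = round(mid + sample * mid)       # bias first, then scale
    return max(0, min(full_scale, code))   # keep the code inside the DAC range

def to_dac_code_unbiased(sample, bits=8):
    """Same mapping without the bias: negative samples are clamped to 0."""
    full_scale = (1 << bits) - 1
    code = round(sample * full_scale)
    return max(0, min(full_scale, code))

# One cycle of a sine wave, 8 samples (illustrative only)
sine = [math.sin(2 * math.pi * n / 8) for n in range(8)]
biased = [to_dac_code(s) for s in sine]               # both halves preserved
rectified = [to_dac_code_unbiased(s) for s in sine]   # negative half flattened to 0
```

The biased output sits on a DC offset, which is why you normally put a series capacitor between the processor and the amplifier input.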
You should test the amplifier and ESP32 separately, i.e. you can test the ESP32 with powered computer speakers or by plugging it into a low-power stereo, etc. (I wouldn't plug anything "unknown" into a good hi-fi system.)
And you can test the amplifier with your computer's sound card or with your phone, etc.
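If you want a known-clean signal for the amplifier test, something like this Python sketch will generate a quiet sine tone you can save as a WAV file and play from the computer or phone. It only uses the standard `wave`, `struct` and `math` modules. The function names, the 1 kHz frequency, and the 0.25 amplitude are just my choices for illustration.

```python
import math
import struct
import wave

def sine_frames(freq_hz=1000.0, seconds=1.0, rate=44100, amplitude=0.25):
    """Return 16-bit mono PCM bytes for a sine tone (amplitude is a fraction of full scale)."""
    n = int(seconds * rate)
    peak = int(amplitude * 32767)
    return b"".join(
        struct.pack("<h", round(peak * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )

def write_wav(path, frames, rate=44100):
    """Write the PCM bytes to a mono 16-bit WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)      # 2 bytes = 16-bit samples
        w.setframerate(rate)
        w.writeframes(frames)

# Example use (the file name is arbitrary):
# write_wav("tone_1khz.wav", sine_frames())
```

A steady tone makes hum, whine and hiss easier to pick out than music does, and starting at a low amplitude keeps you well away from clipping.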
The amplifier probably has WAY too much gain. Is there a volume control? Any noise will be amplified, and the amplifier can easily be driven into clipping. If the processor and amplifier are sharing the same power supply, you essentially need zero voltage gain (0 dB, i.e. unity), because the processor's output already swings roughly as far as the amplifier's output can. (You do need power/current gain because the processor can't drive an 8-Ohm speaker.)
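Here's a back-of-the-envelope version of that gain argument in Python. The voltages are hypothetical (I'm assuming a 3.3 V peak-to-peak signal and an amplifier that can swing the same amount), and the clipping model is an idealised hard limit, not how any particular amplifier chip behaves.

```python
import math

def gain_db(voltage_gain):
    """Convert a voltage gain ratio to decibels."""
    return 20 * math.log10(voltage_gain)

def max_clean_gain(source_vpp, amp_max_vpp):
    """Largest voltage gain that keeps a full-scale source inside the amp's output swing."""
    return amp_max_vpp / source_vpp

def amplify(sample_v, voltage_gain, amp_max_vpp):
    """Apply gain, then hard-clip at the output limits (idealised model)."""
    limit = amp_max_vpp / 2
    return max(-limit, min(limit, sample_v * voltage_gain))

# Hypothetical numbers: 3.3 V p-p source, amp that can swing 3.3 V p-p
print(max_clean_gain(3.3, 3.3))    # 1.0  -> unity gain is all you can use
print(gain_db(1.0))                # 0.0  -> i.e. 0 dB
print(amplify(1.0, 20.0, 3.3))     # 1.65 -> a 1 V sample at gain 20 hits the rail
```

With a gain of 20 in this model, anything above about 0.08 V at the input is already clipped, which is why turning the source way down (or adding a volume control or attenuator ahead of the amp) matters so much.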
Low-pitch hum is power-line hum (50 or 60 Hz and its harmonics). That can come from the power supply, or it can be picked up electromagnetically if the circuit isn't shielded, or it can sometimes be a ground loop.
High-pitch whine is some kind of switching noise. That's usually from a switching power supply. But it can also come from the processor.
Hiss is "normal" analog-amplifier noise. It's usually more of a problem in preamps (where you have a weak signal and high gain) than in power amps.