I am considering using an ESP32 in a stereo audio EQ and filtering application. This is what I would like to do:
Hardware:-
Use an external codec that has a DAC/ADC at a sampling rate of 8 kHz.
On processor 1
Use the external codec to take in the audio stream at 8 kHz
Process both channels with their individual IIR or FIR filters.
Use the codec to output the audio at 8 kHz.
Optionally, dynamically change filter parameters based on user input
On processor 2
Run a user app that hosts a WiFi server showing an HTML user interface to the user.
The user connects to the device via a web browser from a PC or Android device. The user interface shows the current filters, and the user may optionally change the filters dynamically.
Is this possible to accomplish using Arduino IDE and ESP32 with suitable codec shield?
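One piece of the two-processor split described above is handing new filter parameters from the web-UI side to the audio side without stalling the audio loop. A minimal sketch of that hand-off in portable C++ follows. `BiquadCoeffs`, `CoeffMailbox`, `post` and `tryFetch` are hypothetical names made up for illustration, and on a real ESP32 a FreeRTOS queue or semaphore would probably take the place of `std::mutex` (that mapping is an assumption, not something tested here).

```cpp
#include <mutex>

// Hypothetical coefficient set for one second-order filter section.
struct BiquadCoeffs {
    float b0, b1, b2, a1, a2;
};

// Hypothetical one-slot mailbox between the UI task and the audio task.
class CoeffMailbox {
public:
    // Called from the UI side: may block briefly, which is fine there.
    void post(const BiquadCoeffs& c) {
        std::lock_guard<std::mutex> guard(mutex_);
        pending_ = c;
        dirty_ = true;
    }

    // Called from the audio side once per block: never blocks.
    // Returns true and fills 'out' only if a new set is waiting.
    bool tryFetch(BiquadCoeffs& out) {
        std::unique_lock<std::mutex> guard(mutex_, std::try_to_lock);
        if (!guard.owns_lock() || !dirty_) {
            return false;  // keep using the old coefficients
        }
        out = pending_;
        dirty_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    BiquadCoeffs pending_{};
    bool dirty_ = false;
};
```

The audio loop would call `tryFetch` at the start of each block and carry on with its current coefficients whenever it returns false, so a slow web request only delays the update and does not interrupt the audio.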
My first comment is why 8KHz? That means a top frequency of 4KHz max. This means you need good filtering in the analogue domain before you digitise it, and the quality is equivalent to an analogue land line.
Yes you could do all that but how will you receive it? You will need to encode the sound with a voice over IP algorithm because Wi-Fi is not real time. It seems a lot of effort for something that will work so poorly.
I have written some IIR filters for ESP32 that are controlled via an HTML interface. I like this Adafruit DAC for outputting good quality audio. The hard part is the ADC. I'm not aware of a cheap 16-bit ADC breakout that can sample that fast. If someone knows of one I'd like to hear about it. To avoid this I transmit stuff via Bluetooth to the ESP32, do the DSP on the ESP32 and output to the Adafruit DAC, or to the Adafruit class D amp if I want to connect to a speaker.
Following on from Mike's question regarding the sampling rate. Is this project a speech only project or is it for music?
Also... To prevent aliasing (false frequencies) the analog signal has to be low-pass filtered to below half the sample rate (4kHz in your case) before digitizing (Nyquist sampling theory). For "communications quality" you might be able to get away without that, but almost all audio ADCs (and soundcards) include an anti-aliasing filter.
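To make the "false frequencies" concrete, here is a small sketch that computes where an unfiltered tone would land after sampling. The function name `aliasFrequency` is made up for this illustration, and it assumes an ideal sampler with no anti-aliasing filter at all.

```cpp
#include <cmath>

// Apparent frequency (Hz) of a tone at f_in after sampling at fs,
// assuming an ideal sampler with no anti-aliasing filter.
double aliasFrequency(double f_in, double fs) {
    double f = std::fmod(f_in, fs);        // fold into [0, fs)
    return (f > fs / 2.0) ? fs - f : f;    // reflect into [0, fs/2]
}
```

With an 8 kHz sample rate, a 5 kHz tone comes out as `aliasFrequency(5000.0, 8000.0)`, which is 3 kHz, while a 1 kHz tone is left where it is.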
Grumpy_Mike:
A rate of 8KHz is not very fast. If I were doing this I would use one of those I2S boards.
Could you provide a link to one of these boards? Fast is meant in a relative sense. A hobbyist breakout at 16-bit resolution and 8KHz is "fast" because I'm not aware of a hobby board that is faster at that resolution. For instance, the ADS1115 breakout can only reach 860Hz at 16 bit.
Obviously, professional ADCs can reach much faster speeds at this resolution but that's not what I'm talking about.
I'm referring to easy to use hobby boards
https://www.pjrc.com/store/teensy3_audio.html
It is not specific to the Teensy it will work with anything that you can get a driver for.
It is a total waste, from the audio point of view, to get a 16 bit A/D and only run it at 8KHz.
Grumpy_Mike:
My first comment is why 8KHz? That means a top frequency of 4KHz max. This means you need good filtering in the analogue domain before you digitise it, and the quality is equivalent to an analogue land line.
Yes you could do all that but how will you receive it? You will need to encode the sound with a voice over IP algorithm because Wi-Fi is not real time. It seems a lot of effort for something that will work so poorly.
The input is LF (less than 200Hz) so I choose the minimum of what codecs usually support which happens to be 8KHz.
WiFi is just for the user interface, html etc. There is no streaming over WiFi.
On the processor that does the filtering, a timer could be run that is triggered by the ADC clock. Is that correct?
I have written some IIR filters for ESP32 that are controlled via an HTML interface. I like this Adafruit DAC for outputting good quality audio. The hard part is the ADC. I'm not aware of a cheap 16-bit ADC breakout that can sample that fast. If someone knows of one I'd like to hear about it. To avoid this I transmit stuff via Bluetooth to the ESP32, do the DSP on the ESP32 and output to the Adafruit DAC, or to the Adafruit class D amp if I want to connect to a speaker.
Following on from Mike's question regarding the sampling rate. Is this project a speech only project or is it for music?
Thanks for the links. It provides me the building blocks for my project. Can ESP32 also do FIR?
DVDdoug:
Also... To prevent aliasing (false frequencies) the analog signal has to be low-pass filtered to below half the sample rate (4kHz in your case) before digitizing (Nyquist sampling theory). For "communications quality" you might be able to get away without that, but almost all audio ADCs (and soundcards) include an anti-aliasing filter.
wonderfuliot:
Thanks for the links. It provides me the building blocks for my project. Can ESP32 also do FIR?
There is no reason why it can't. It was just easier to implement the IIR filters because they operate only on past values (a zero-phase FIR centred on the current sample needs future values too). Also, IIR filters only require the storage and update of 2 state values for every second-order section in the cascade. An FIR filter requires the storage and update of a buffer as long as the filter order. While not impossible, it just requires a bit more thought.
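The storage difference can be sketched as follows. This is a generic illustration, not the code referred to in the post: `Biquad` and `Fir` are hypothetical names, the biquad uses the transposed direct form II with a0 assumed normalised to 1, and the FIR is the ordinary causal form, which uses only the current and past inputs.

```cpp
#include <array>
#include <cstddef>

// One second-order IIR section: two state values (z1, z2), whatever the
// length of its impulse response.
struct Biquad {
    float b0, b1, b2, a1, a2;  // a0 assumed normalised to 1
    float z1, z2;

    Biquad(float b0_, float b1_, float b2_, float a1_, float a2_)
        : b0(b0_), b1(b1_), b2(b2_), a1(a1_), a2(a2_), z1(0.0f), z2(0.0f) {}

    float process(float x) {
        const float y = b0 * x + z1;
        z1 = b1 * x - a1 * y + z2;
        z2 = b2 * x - a2 * y;
        return y;
    }
};

// Causal N-tap FIR: needs a buffer holding the last N input samples.
template <std::size_t N>
struct Fir {
    std::array<float, N> h;    // tap weights
    std::array<float, N> x{};  // circular buffer of recent inputs
    std::size_t pos = 0;       // where the next input is written

    explicit Fir(const std::array<float, N>& taps) : h(taps) {}

    float process(float in) {
        x[pos] = in;
        float acc = 0.0f;
        std::size_t idx = pos;  // start at the newest sample, walk backwards
        for (std::size_t k = 0; k < N; ++k) {
            acc += h[k] * x[idx];
            idx = (idx == 0) ? N - 1 : idx - 1;
        }
        pos = (pos + 1) % N;
        return acc;
    }
};
```

Per sample, the biquad costs five multiplies and touches two stored values, while the FIR costs N multiplies and touches N stored values, which is the bookkeeping difference described above.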
Depends what you mean by "real-time". I think you need to specify a tolerance for "real-time". For some people waiting for X samples and constructing an acausal filter is "real-time" enough. For instance, waiting for 10 samples at 44.1KHz results in a delay of 0.2ms. That might be fine, depending on the application. The filter now depends on "future" values relative to the output sample of the filter (i.e. the nth output depends on the n-10 to n+10 input samples) and is sufficiently "real-time".
A reductionist view is that IIR and FIR are just doing math operations, so one could do FIR filtering on an abacus, just not very quickly.
In this view, the difference between an IIR implementation and an FIR implementation is that the IIR generally requires significantly fewer arithmetic operations than an FIR implementation.
ESP32 has a 32-bit floating point arithmetic logic unit that can perform a multiply/accumulate (macc) in about 100 ns. Macc is the core operation in most digital signal processing (DSP) filter operations. Ignoring overhead (which isn't realistic, but gives a back-of-the-envelope upper bound), an 8 kHz sample rate leaves 125 µs per sample period, which is about 1250 maccs. That means something on the order of 1000 FIR filter taps per stereo sample at 8 kHz sample rate, shared between the two channels. One would have to know the specific requirements for your filtering, but that seems like quite a lot of processing power, even if real world filter throughput is a quarter of that.
the2ndtierney:
Depends what you mean by "real-time". I think you need to specify a tolerance for "real-time". For some people waiting for X samples and constructing an acausal filter is "real-time" enough. For instance, waiting for 10 samples at 44.1KHz results in a delay of 0.2ms. That might be fine, depending on the application. The filter now depends on "future" values relative to the output sample of the filter (i.e. the nth output depends on the n-10 to n+10 input samples) and is sufficiently "real-time".
I think what I was grasping at is that FIR filter delays are usually very many times larger than the period of the signal at the cutoff frequency.
In control systems such large delays mean instability issues, so FIR is usually precluded. This is what I was really getting at by real-time: being part of a feedback loop.
In audio you can get pre-echo effects on percussive transients with large FIR filters.