Using Arduino to isolate frequencies and volume?

Hello, this is a beginner's question: I'm wondering if an Arduino is right for my project, where I need to analyse some audio from a microphone.
I have tried to read up on this forum, but haven't found a satisfactory answer to my question.

I have a project in university and I want to build a prototype device that just has a light that turns on when the noise level of a specific range of frequencies reaches a threshold, and I would like to monitor the sound with a microphone. Specifically, I want to build a device with a light that turns on when a child is speaking loud enough, and does not react to adults speaking.

I have found this project, a clap switch, which could probably be very similar to what I want to build, but how do I isolate the frequencies?

My questions are:
Is an Arduino and a microphone + code enough to make this happen?
What is the simplest way to code this? I have been looking at FFT, but to me it seems like it's overly complex for this task.

Coming from a music production background, I was first thinking that I would be able to set a band-pass filter and then a volume threshold on an Arduino somehow, and that this would suffice with an LED also hooked up.

Thanks in advance for any response. I hope I posted this in the right forum.
Happy Wednesday!

Specifically, I want to build a device with a light that turns on when a child is speaking loud enough, and does not react to adults speaking.

That's tough. As you probably know, real-world sound is complex and it contains many frequency components. Throw in the usual volume variations and it gets more complicated.

You might just try some experiments with Audacity, which can record sounds and then display a spectrum. If you can reliably see the difference between adults and children, maybe there's a chance of doing it in software.

Is an Arduino and a microphone

You need a preamp with a biased output. (The output has to be biased because the Arduino can't read the negative half of the AC waveform.) The cheapest and easiest solution is a microphone board with a mic and all of the electronics built in. (That particular board probably doesn't have enough gain and it doesn't have adjustable gain, but there are other similar options.)
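To illustrate why the bias matters: the ADC only returns non-negative counts, so a biased mic signal sits around mid-scale and swings either side of it. Here is a minimal sketch in Python (for trying on a PC, not Arduino code) of removing that offset and measuring loudness as RMS. The 10-bit range, the 512-count bias and the sample values are assumptions for illustration.

```python
import math

def rms_level(adc_samples):
    """Return the RMS amplitude of biased ADC readings, in ADC counts."""
    # Estimate the DC bias as the mean of the block, then subtract it
    bias = sum(adc_samples) / len(adc_samples)
    centered = [s - bias for s in adc_samples]
    return math.sqrt(sum(c * c for c in centered) / len(centered))

# Hypothetical readings: a 100-count sine riding on a 512-count bias
# (assumes a 10-bit ADC, i.e. readings between 0 and 1023)
samples = [512 + 100 * math.sin(2 * math.pi * n / 32) for n in range(64)]
level = rms_level(samples)   # loudness of the AC part only
```

Without subtracting the bias, the "level" would be dominated by the constant 512 and would barely change with the sound.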

What is the simplest way to code this? I have been looking at FFT, but to me it seems like it's overly complex for this task.

FFT (or FHT) is "the way" to get frequency information. You might be able to get away with a high-pass filter, but I don't know if there is a filter library for the Arduino, so you might have to write your own code from scratch. (There are FFT and FHT libraries.)
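For what "write your own" could look like, here is a rough sketch of a first-order high-pass filter followed by a level threshold, written in Python so it can be tried on a PC first. The sample rate, cutoff, threshold and test tones are all made-up values for illustration, and a first-order filter is only the simplest possible choice, not a recommendation.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order (RC-style) high-pass filter."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = [0.0] * len(samples)
    for n in range(1, len(samples)):
        # Pass changes in the input, let steady levels decay away
        out[n] = alpha * (out[n - 1] + samples[n] - samples[n - 1])
    return out

def tone(freq_hz, sample_rate_hz, count):
    """Unit-amplitude test sine (stands in for real microphone data)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

def peak(samples):
    return max(abs(s) for s in samples)

SAMPLE_RATE_HZ = 8000   # assumed
CUTOFF_HZ = 1000        # assumed, pick yours from the Audacity spectrum
THRESHOLD = 0.5         # assumed "loud enough" level

low = high_pass(tone(100, SAMPLE_RATE_HZ, 2000), CUTOFF_HZ, SAMPLE_RATE_HZ)
high = high_pass(tone(3000, SAMPLE_RATE_HZ, 2000), CUTOFF_HZ, SAMPLE_RATE_HZ)

led_for_low = peak(low) > THRESHOLD    # 100 Hz is attenuated, LED stays off
led_for_high = peak(high) > THRESHOLD  # 3 kHz passes, LED turns on
```

The same loop translates almost line for line into an Arduino sketch, but whether any single cutoff separates children from adults is exactly the open question in this thread.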

Specifically, I want to build a device with a light that turns on when a child is speaking loud enough, and does not react to adults speaking.

Forget it until you are no longer a beginner. Then you can forget it.

I would endorse DVDdoug's suggestion of getting Audacity and having a look at the sort of spectrums you get. If you can't spot the difference between a child's voice and an adult female's voice with the pattern recognition algorithms in your head, you have no chance of doing it on a computer, let alone an Arduino with its 1980s level of processing power.

We get this sort of thing asked at least once a week in one guise or other.

Thank you very much for your answer!

With the responses I got from you two, I will probably try to do this in some other way than with Arduino then! Haha

Sounded very simple in my head first, but now with your explanations, I seem to understand the issues more.

Audacity sounds interesting, gonna look into that more! :)

Grumpy_Mike:
If you can't spot the difference between a child's voice and an adult female's voice with the pattern recognition algorithms in your head, you have no chance of doing it on a computer, let alone an Arduino with its 1980s level of processing power.

We get this sort of thing asked at least once a week in one guise or other.

Ok, yes, but the very point of this would be to hopefully demonstrate that there is no need for too much analysis.
The point would be to show that it is possible to tell that a kid is "screaming" or being annoying in a scenario where we are sure that adults speak in a natural tone of voice.

It would however have to distinguish between a kid's loud voice and loud background noise, and not perfectly at that either.
The whole idea was that the device could identify that a kid in a closed room was behaving badly in this simple sense, much like how SoundEars work, but simply a bit more specific towards kids...

Does this change any thoughts? Or does it still seem unfeasible for a newbie?

Thank you for your previous answer! :)

Does this change any thoughts?

Well no, sorry. Given that the page you linked to says:-

We do this through our 20 years of experience in the noise monitoring industry,

And that is a company, not an individual trying to learn stuff. They are very cagey about exactly how they do it. It is almost certain that they use a DSP (Digital Signal Processor). These sorts of things have instructions specifically designed for the sort of operations you need to do, and they are very powerful processors.

It would however have to distinguish between a kid's loud voice and loud background noise,

Sometimes what seems easy to you is very hard for a computer. This is why computer vision was only 10 years away in 1970, and the same goes for AI.

Audacity sounds interesting, gonna look into that more!

Yes I would strongly recommend you do that.

First, the older 8 bit boards are woefully under-powered for this sort of project. Plan on using 32 bit hardware, at least ARM Cortex M3 or M4.

Personally, I don't believe in the philosophy expressed on this thread, that this stuff is so hard and can't be made accessible for beginners. Why? Well, because I've put quite a lot of work into a large audio library over the last 4 years. It has 2 types of tone detection, plus 1024-point FFT analysis, which really are easy to use. The library runs on Teensy 3.2, 3.5 & 3.6.

Here's the library on github:

A few years ago we made a tutorial for getting started. The tutorial was written before either of the tone detection features, but it does cover FFT and the general way you use the library. There's a 31 page PDF and a 45 minute full walkthrough video, in case you get stuck and want to see how to do any part.

https://www.pjrc.com/store/audio_tutorial_kit.html

Usually FFT frequency bins end up being too coarse if you're looking for very specific frequencies. If you use the audio library design tool, scroll to near the end of the long list of audio features. In the "analyze" section, you'll find "tone" and "notefreq".
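To put a number on "too coarse": each FFT bin covers the sample rate divided by the FFT size. A small sketch of that arithmetic, assuming a 44.1 kHz sample rate for illustration (check the library documentation for the actual rate); the 300 Hz target is also just a made-up example.

```python
def fft_bin_width_hz(sample_rate_hz, fft_size):
    """Width of one FFT frequency bin in Hz."""
    return sample_rate_hz / fft_size

def bin_for_frequency(freq_hz, sample_rate_hz, fft_size):
    """Index of the FFT bin whose center is nearest to freq_hz."""
    return round(freq_hz / fft_bin_width_hz(sample_rate_hz, fft_size))

SAMPLE_RATE_HZ = 44100   # assumed sample rate
FFT_SIZE = 1024          # the 1024-point FFT mentioned above

width = fft_bin_width_hz(SAMPLE_RATE_HZ, FFT_SIZE)          # about 43 Hz
voice_bin = bin_for_frequency(300, SAMPLE_RATE_HZ, FFT_SIZE)
```

At roughly 43 Hz per bin, two voice pitches that differ by 20 Hz can land in the same bin, which is why the dedicated tone and pitch analyzers exist.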

The "tone" analysis looks for the amount of a single frequency (or a narrow band of frequencies). It's actually using the Goertzel algorithm. You can configure how long the analysis computes, where longer times give higher selectivity, but slower response. Like everything in the design tool, just click on the object you want and the right-side panel updates with the documentation for the functions you can use in Arduino to control it. There are also a couple of examples (in the library's File > Examples > Audio menu) for using 7 of these tone detection objects to decode DTMF dial tones.
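For the curious, the Goertzel algorithm itself is short. This is a generic textbook sketch in Python, not the library's code; the function name, sample rate and test tones are assumptions for illustration.

```python
import math

def goertzel_power(samples, target_hz, sample_rate_hz):
    """Squared magnitude of `samples` at the DFT bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)   # nearest DFT bin
    coeff = 2 * math.cos(2 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence tuned to the target frequency
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

SAMPLE_RATE_HZ = 8000   # assumed
N = 200                 # block length; longer blocks are more selective

def tone(freq_hz):
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
            for i in range(N)]

on_target = goertzel_power(tone(1000), 1000, SAMPLE_RATE_HZ)
off_target = goertzel_power(tone(2000), 1000, SAMPLE_RATE_HZ)
```

The detector's bandwidth is roughly the sample rate divided by the block length (40 Hz here), which is the selectivity versus response time trade-off described above.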

The other one is "notefreq". It uses the very advanced YIN algorithm to search for the strongest fundamental frequency. Unlike "tone" analysis, where you choose the frequency, it searches for the frequency and tells you what it finds. Usually the main application is recognizing musical notes. It works very well, even on complex sounds like guitars and tubas. It also consumes a lot of CPU time, nearly all the power of a Teensy 3.2, so if you're going to use this and want to also have FFT or other computationally heavy stuff, you'd probably want to use the faster Teensy 3.6 board.

As you can see in the example and design tool documentation panel, these fairly advanced DSP features come packaged in a library that's very easy to use and you can access their results from Arduino sketches using the familiar available() and read() functions, to know when the analysis has produced more data, and to read it into your code.

Sure, inside the library is rather complicated DSP code. But just like you don't need to be an expert mechanic to simply drive a car, you really don't need an advanced engineering degree specializing in DSP to simply use these powerful analysis tools. All you really need is a good library. That's what I've tried to make for you, and for everyone who might need to do this sort of sound processing. Hope it helps?

While I completely agree with that and admire the tools you have produced, they are only tools. And what you need to know is how to use these tools to achieve your aim. Not only that, you need to know if your aim is indeed achievable using those tools.

It is like being given a lathe and wanting to make a round rod into a hexagonal rod. It takes a very special lathe to do that.

Similar sentiments were regularly expressed about microcontroller programming in general, before Arduino came along and dramatically lowered the barrier to entry. What used to be a path of formal learning with a daunting list of prerequisites became a platform for experimentation, with a focus on accessibility for novices.

Sure, a master craftsman with years of experience can do tremendous work with powerful tools. But is that any reason to discourage novices from experimenting with those tools? Especially in a case like this question, where the expressed goal is recognizing tones, why not jump in feet first and try stuff?

On the tools analogy, indeed if you want to make a hexagonal rod, that's going to seem like a daunting task if you have only a lathe.

However, modern technology is providing new types of tools, with decreasing cost and improving usability. Today you can use a 3D printer to make hexagonal shapes. Sure, like any tool it comes with all its own challenges and limitations. Maybe if you're a grumpy machinist you'll immediately dismiss additive manufacturing tools. But a quick glance around any maker gathering shows people - relative novices without formal machinist training - are indeed putting 3D printing to all sorts of imaginative uses.

With audio processing too, some tasks that formerly seemed daunting now have excellent tools. Every month or so when I look at this part of Arduino's forum, there seem to be more threads with defeatist attitudes about detecting the frequency of natural sounds and musical notes having complex and strong overtones / harmonics. Indeed if your imagination revolves around high-Q bandpass filters and envelope detection, that would seem pretty much impossible. But times have changed. 16 years ago a paper was published with a highly effective algorithm, and now the upper end of today's microcontrollers are plenty fast enough to run it in real time. New tools make this formerly very hard task fairly easy to do. I really wish there were some way to get that idea across to people who keep saying it's so hard.

This trend is only going to accelerate in the coming years. Soon Cortex-M7 at 500+ MHz speed (and often 2 instructions per clock) will become a mainstream microcontroller, and GHz speeds with more advanced peripherals and coprocessors are on the horizon. After so many years of microcontroller fabrication with 1990s era silicon nodes, we're finally starting to see migration to 65 & 40 nm (hardly state of today's art) and soon 28 nm (predicted to become the cost-effective "sweet spot" for the next decade or so).

Very soon low-end machine learning (likely with training on PC-class hardware) will become quite feasible on microcontrollers. So will far more advanced algorithms computed in real time. Personally, I'm excited about the amazing tools I and others will be able to provide. I really want to encourage people to play and experiment with them. Even in the hands of novices, pretty amazing feats that formerly seemed impractical will become fairly easy to do.

Of course you can still turn parts on a lathe if you want, and indeed for certain tasks that will always be the best way, but the lathe is no longer the only powerful tool in the shop!

I would not disagree with anything you said, but given that this is a beginner we are talking to:-

But is that any reason to discourage novices from experimenting with those tools? Especially in a case like this question, where the expressed goal is recognizing tones, why not jump in feet first and try stuff?

Well, the key thing is what he wants to do, which is:-

Specifically, I want to build a device with a light that turns on when a child is speaking loud enough, and does not react to adults speaking.

Given that an adult female is often used to voice a male child in things like cartoons and advert voice-overs, and that fools us, with all the processing power in the brain, then the chances of picking up any measurable distinguishing features are slim. This is the sort of project I would have given to a final year degree student to investigate when I was a Senior University Lecturer (for U.S. readers == professor), expecting a report explaining exactly what the problems were.

Now it is a debated point if you inspire a beginner by telling them something could not be done or if you discourage them by saying something could be done while clearly there is no obvious path.

My observation from experience of this forum is that people want to know how to do something, and if you can't tell them how to go about it they get discouraged. We have enough problems explaining how to do something as simple (to me) as driving an LED matrix.

I thought that pointing the OP towards exploring the, admittedly not real time, audio processing available in Audacity was a positive response. And I would by no means discourage him from exploring the tools you have produced. However, I would not like to overstate my expectations of his success.

Wow! I'm amazed by your help in this topic!

Big props to PJRC and Grumpy_Mike, but I think you might overestimate the essence of my goals?

Well, as I said, my idea only needs to work well enough, with a lot of room for error. I downloaded an Ableton 10 free trial and some audio tracks that fit the purpose, with background noise, adults talking, and annoying and screaming kids. I simply put a band-pass filter on it at around 11 kHz, plus a noise gate, and I'm very pleased with the outcome. The gate clicks in on all the occasions it should, with an acceptable margin of error.
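For anyone wanting to try the same band-pass plus gate idea outside a DAW, here is a rough Python sketch using a standard biquad band-pass (the widely used "Audio EQ Cookbook" form) and a simple RMS threshold standing in for the noise gate. It is not what Ableton does internally; the Q, the threshold, the sample rate and the test tones are assumptions, and only the 11 kHz center comes from the post above.

```python
import math

def bandpass_coefficients(center_hz, q, sample_rate_hz):
    """Biquad band-pass with unity gain at the center frequency."""
    w0 = 2 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """Run samples through a direct-form I biquad section."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def gate_open(samples, threshold):
    """Gate-style decision: open when the block's RMS exceeds threshold."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold

SAMPLE_RATE_HZ = 44100   # assumed
GATE_THRESHOLD = 0.2     # assumed

def tone(freq_hz, count=4410):
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ)
            for n in range(count)]

b, a = bandpass_coefficients(11000, 2.0, SAMPLE_RATE_HZ)  # Q of 2 is a guess
in_band = biquad(tone(11000), b, a)   # passes almost unchanged
out_band = biquad(tone(300), b, a)    # strongly attenuated
```

Calling gate_open(in_band, GATE_THRESHOLD) opens the gate, while the 300 Hz tone leaves it closed. A real gate would also add attack and hold times so the light doesn't flicker.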

While I'm reading this I get both excited and a headache. I'm not meant to spend more than around 5-10 hours building this prototype, and if it is as complex to build on Arduino as it now seems to me, then I might have to take a step back and just elaborate some more on the filters I have used in Ableton.

I will have to do some more testing, but I'm quite happy with the simplicity of the solution.

Happy Monday!

Funny you should mention lathe and 3D printing work. The lathe is the machine I have the most experience with, but just last week I bought a 3D printer. I guess I was tired of only making round things haha

Well I have seen very specialised lathes that can turn square bars, but they are not very common.

As to the filters, they would be better implemented on a Teensy, but you might want to implement them in hardware. You can do that simply with an op-amp or two. If you know the order of the filter that gives you good results in Ableton, you can simply implement this in hardware.
Where digital filters score is in the simplicity of adjustment of roll off and Q.
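As a starting point for the hardware route, the corner frequency of a single RC stage is 1/(2πRC), and each filter order adds roughly 6 dB per octave of roll-off. A small Python sketch of that arithmetic; the component values are made up for illustration, not a tested design.

```python
import math

def rc_cutoff_hz(resistance_ohms, capacitance_farads):
    """Corner (-3 dB) frequency of a single RC filter stage."""
    return 1.0 / (2 * math.pi * resistance_ohms * capacitance_farads)

def rolloff_db_per_octave(filter_order):
    """Approximate asymptotic roll-off: about 6 dB/octave per order."""
    return 6.02 * filter_order

# Hypothetical values: 1.5 kOhm and 10 nF give a corner near 10.6 kHz
corner = rc_cutoff_hz(1.5e3, 10e-9)
slope = rolloff_db_per_octave(2)   # e.g. a second-order stage
```

Changing the corner in hardware means swapping parts, whereas in a digital filter it is one number in the code.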