Hello everyone, I am thinking about a project but I don't know how to do it, so I want to ask here. I haven't worked with sound yet. What I want to do is count a target word or voice. What I don't understand is how I would separate voices if there is noise. For example, let's say the target word is "yes". I want to count only the "yes" that I say myself, not anyone else's. I have an idea, but I don't know if it is right: if I put the microphone close to me and measure my own "yes", let's say it reads between 300 and 500. If I count the readings in the 300-500 range, will my project work? I found this graphic in a YouTube video. I don't have a microphone yet, so I wanted to ask you guys first.
That requires "advanced processing" beyond what the Arduino can do.
There is a shield (add-on board) called EasyVR that supports a limited vocabulary. (I don't know how well it works but I'm pretty sure it's not "perfect".)
Just for reference... CD quality audio has 44,100 samples (data points) per second, and you do have to know the sample rate for the data to "make sense". You don't need CD quality for voice, but 500 samples is "nothing". And of course, you're not going to get the exact same samples every time you say "hello". Even if you play back a recording of the exact same sound, the digital data will be different, because the sample points are not synchronized with the acoustic/analog signal and you're sampling different points on the "wave".
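To illustrate the point about sample points not lining up with the wave, here is a minimal Python sketch (an added illustration, not code from anyone in this thread). It samples the same tone twice, with the second capture starting a quarter of a sample period later. The 8000 samples-per-second rate, the 440 Hz tone and the offset are all arbitrary assumptions.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed for the demo)
TONE_HZ = 440.0      # test tone frequency (assumed)

def sample_tone(start_time, count):
    """Return `count` samples of a unit sine tone, starting at `start_time` seconds."""
    return [
        math.sin(2 * math.pi * TONE_HZ * (start_time + n / SAMPLE_RATE))
        for n in range(count)
    ]

# Same tone, same sample rate; the second capture starts 1/4 sample period later.
first = sample_tone(0.0, 8)
second = sample_tone(0.25 / SAMPLE_RATE, 8)

# Print the two captures side by side: the numbers differ on every row.
for a, b in zip(first, second):
    print(f"{a:+.4f}  {b:+.4f}")
```

The sound is identical in both captures, yet no pair of numbers matches, which is why comparing raw sample values against a fixed range cannot identify a word.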
The Audacity website has a little introduction to digital audio. I don't know exactly how voice recognition works, but it involves FFT to analyze the frequency content, and then there is some advanced processing/analysis beyond that. There is a program called Dragon NaturallySpeaking that runs on a regular computer, and Siri & Alexa run on powerful servers on the Internet.
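As a rough idea of what "analyzing the frequency content" means, here is a small Python sketch. It uses a naive discrete Fourier transform rather than a real FFT (the result is the same, it is just slower), and it runs on a synthetic signal made of two tones, not on a voice. The sample rate, block size and tone frequencies are assumptions chosen for the demo; this is only the first step, not voice recognition.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
N = 256             # number of samples analysed in one block (assumed)

def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns magnitudes for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        mags.append(abs(acc))
    return mags

# Synthetic test signal: a 500 Hz tone plus a quieter 1500 Hz tone.
signal = [
    math.sin(2 * math.pi * 500 * i / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 1500 * i / SAMPLE_RATE)
    for i in range(N)
]

mags = dft_magnitudes(signal)

# Skip bin 0 (the DC offset) and find the strongest frequency bin.
peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
peak_hz = peak_bin * SAMPLE_RATE / N  # each bin is SAMPLE_RATE / N Hz wide
print(f"strongest component near {peak_hz:.1f} Hz")
```

Each bin here is 31.25 Hz wide, so both test tones fall exactly on a bin. Real speech spreads its energy over many bins that change from moment to moment, which is where the further processing comes in.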
For FFT, I can use MATLAB with Arduino. That module is quite expensive for me. I saw a dog barking module for 0.32150 USD. It is really, really cheap. The voice part is hard, I guess. What about animals? For example, can I count a cat's meows? The datasheet for this module lists dog, tiger, duck, pig, etc. How does this module pick out the right sound? I am asking because I really don't know.
It isn’t.
That link shows a chip that will produce the sound of a bark; it will not recognise the sound of a bark. It has a pre-programmed sound in it, so you need a different chip to produce the sounds of other animals.
What you are trying to do is not really possible on a small Arduino.
In that case you'll need a computer for MATLAB, so you don't need an Arduino. But FFT is only the beginning. Like I said, I don't know how voice recognition actually works... You'll have to do more research to figure out what to do with the FFT data. This is "advanced stuff".
If you want to play about with voice recognition you will need an Arduino Nano 33 BLE Sense board.
Then you need to read about sound recognition
Sound recognition data
Then read how to do it with AI, using Google’s flagship machine learning library, "TensorFlow":
Tensor Flow
It is not an easy ride.
Okay, thank you. I will look into machine learning. The Arduino RP2040 has a microphone, so maybe I will buy it and try to do something with it. What about a microphone module? I researched and found this: "The microphone module's sound waves are translated into resistance, and its resistance is exactly the same as the amount of sound. This means that where there is a lot of sound there is also a high resistance, and where there is almost silence there is almost no resistance." With the potentiometer I can adjust the high-resistance side, and I will put the microphone close to me. This way I can't count "yes", but I can count the times that I speak, right? With the potentiometer I can cancel out the other voices and count only the times I speak, right?
No, you can’t; you can only affect the sensitivity. There are many types of microphones and each one works in a different way. That description is quite a crude one.
The signal from a microphone is tiny and it needs amplifying to a 3.3 or 5.0 volt peak to peak signal before you apply it to the computer’s A/D converter.
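To put rough numbers on "tiny", here is a small Python calculation. The 20 mV peak-to-peak microphone level and the 10-bit converter are assumptions picked for illustration only; real microphones and boards vary a lot, so check the datasheets for actual figures.

```python
import math

MIC_PEAK_TO_PEAK_V = 0.020  # assumed raw microphone output: 20 mV peak to peak
ADC_FULL_SCALE_V = 3.3      # converter input range from the post above
ADC_BITS = 10               # assumed converter resolution

def required_gain(mic_vpp, target_vpp):
    """Voltage gain needed to stretch the microphone signal to the target swing."""
    return target_vpp / mic_vpp

gain = required_gain(MIC_PEAK_TO_PEAK_V, ADC_FULL_SCALE_V)
gain_db = 20 * math.log10(gain)

# How many converter steps the signal would span with no amplifier at all.
counts_unamplified = MIC_PEAK_TO_PEAK_V / ADC_FULL_SCALE_V * (2 ** ADC_BITS - 1)

print(f"gain needed: about {gain:.0f}x ({gain_db:.1f} dB)")
print(f"without amplification the signal spans about {counts_unamplified:.1f} ADC counts")
```

Under these assumptions the unamplified signal would cover only a handful of converter steps out of 1023, which is why an amplifier stage is needed before the A/D converter.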
The sort of microphone built into Arduino boards is called a MEMS type and works on a totally different principle. It works more like an accelerometer than a conventional type.
Okay, thank you. I will try to understand sound classification using machine learning then.
I would suggest the Nano 33 BLE Sense board, because that is the board used in the AI examples in that link.
I think that trying to cope with those examples on a very different processor would be too much for you to do, given the level of questions you are asking.
Thank you, I will look at it. I need to examine them thoroughly. I said Arduino RP2040 because I found this. It uses Python. I also found an article while researching, but I don't know which sensors they are using (cheap ones or expensive ones): "The system collected animal's certain data through the acoustic sensor nodes and used an audio recognition algorithm to collect animal's certain sounds."
I also found this. Some of them use two microphones: "Acquisition of chewing signals by two microphones that constitute a sensing unit where a first microphone captures the chewing sounds made by the animal and a second microphone captures the environmental sounds;".