Voice Recognition Techniques

Robo_Pi · October 11, 2015, 9:13pm

Hello,

I would like to start a project to set up an Arduino dedicated to voice recognition. I would like to do this without having to buy any special voice recognition shields or hardware, save for a microphone.

I'm just wondering if this is possible to do just using an Arduino analog input and a microphone?

And if so, can anyone point to any tutorials or library examples along these lines?

Thank you.

DVDdoug · October 11, 2015, 9:35pm

I don't think that's possible. Even with a shield, from what I understand its voice recognition capabilities are very limited.

The amount of processing (and memory) required depends on how much vocabulary you need (it's not too hard if you just need to distinguish between "Yes" or "No", or the numbers one through ten) and it's easier if it doesn't need to be speaker independent.

Dragon Naturally Speaking requires the power of a PC or Mac, and Siri uses powerful servers on the Net.

And, there's the issue of accuracy. No speech recognition system is 100% accurate. Even humans don't have 100% accurate speech recognition, and computers are worse.

And on the analog side, you'll need a preamp for the microphone.

Henry_Best · October 12, 2015, 4:19am

Voice recognition :- recognising WHO is speaking.
Speech recognition:- recognising WHAT is being said.
The two terms are not interchangeable.

Robo_Pi · October 12, 2015, 5:05am

Henry_Best:
Voice recognition :- recognising WHO is speaking.
Speech recognition:- recognising WHAT is being said.
The two terms are not interchangeable.

In that case, I'm more interested in speech recognition. In fact, it would be great if the program worked for everyone. It's my understanding that often times speech recognition software only understands the person it was trained to listen to.

The main thing I would like to do is learn how to program speech recognition. Actually I guess this doesn't even need to be specific to the Arduino. If I could just get a generic idea of how to write a speech recognition algorithm I could adapt it to the Arduino.

Mainly I was wondering if anyone has already done this with an Arduino?

I saw a few videos that use a $50 "Voice Recognition Shield" for Arduino. (Evidently the commercial industry is calling these things "voice recognition" when they should be calling them "speech recognition".) Although they probably are fairly sensitive to the voice that trains them. So they might be voice specific even if they can't really recognize different people's voices.

In any case, here's the video.

Voice Recognition (VRBot/EasyVR) + Arduino P1

This is pretty interesting, but I'm wondering if this can be done without the $50 shield.

Here's the shield, and they are calling this a "Voice Recognition Shield"

EasyVR Shield 3.0 - Voice Recognition Shield

I would like to do something along these lines, except with software only, just using the Arduino analog input pins, a microphone, and preamp.

I would be happy if I could just learn how to write a program to recognize a single word or phrase. Once I see how that's done I could expand it from there myself.

anatolyk69_gmail.c · October 12, 2015, 7:15am

Mainly I was wondering if anyone has already done this with an Arduino?

I did it, with an Uno and a Leonardo: mic + preamplifier (gain 100 - 250) + Arduino. In the case of the Leonardo you don't need an amplifier; the internal PGA works beautifully well with a regular condenser mic.
With 2 KB of memory, the Uno is capable of storing a 1-second audio track as compressed FFT output, then either storing it to EEPROM or running a cross-correlation against an already saved track. One word only, without external SD cards or other storage. That is voice recognition.
Speech recognition is a "deteriorated" version of voice recognition, so more compression could be applied to store ~10 words in the Arduino itself.
There is a web archive of the project; the original post is lost.
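
For readers who want to see the matching step in code, here is a minimal sketch of comparing a stored word template against a new recording by normalized cross-correlation. It is not the code from the linked sketch. The function names (`correlationPercent`, `bestMatchPercent`) and the layout (one byte per frequency band per time frame) are assumptions made for illustration, and the FFT stage that would produce those bytes is left out.

```cpp
#include <stdint.h>
#include <math.h>

// Pearson correlation between two feature vectors, scaled to -100..100.
// Each byte is assumed to hold one band magnitude for one time frame.
int correlationPercent(const uint8_t *a, const uint8_t *b, int n) {
  if (n <= 0) return 0;
  long sumA = 0, sumB = 0;
  for (int i = 0; i < n; i++) {
    sumA += a[i];
    sumB += b[i];
  }
  float meanA = (float)sumA / n;
  float meanB = (float)sumB / n;
  float num = 0, denA = 0, denB = 0;
  for (int i = 0; i < n; i++) {
    float da = a[i] - meanA;
    float db = b[i] - meanB;
    num += da * db;
    denA += da * da;
    denB += db * db;
  }
  if (denA == 0 || denB == 0) return 0;  // a flat signal carries no shape to match
  float r = num / sqrt(denA * denB);
  return (int)(100.0f * r + (r >= 0 ? 0.5f : -0.5f));  // round to nearest percent
}

// Slide the input against the template by up to maxLag frames in each
// direction and keep the best score, so a word spoken slightly early or
// late can still match. Only the overlapping frames are compared.
int bestMatchPercent(const uint8_t *tmpl, const uint8_t *input,
                     int frames, int bands, int maxLag) {
  int best = -100;
  for (int lag = -maxLag; lag <= maxLag; lag++) {
    int shift = lag < 0 ? -lag : lag;
    int n = (frames - shift) * bands;
    if (n <= 0) continue;
    const uint8_t *a = tmpl + (lag > 0 ? shift * bands : 0);
    const uint8_t *b = input + (lag < 0 ? shift * bands : 0);
    int score = correlationPercent(a, b, n);
    if (score > best) best = score;
  }
  return best;
}
```

Subtracting the mean and dividing by the energy of both vectors makes the score insensitive to overall loudness, which matters when the same word is spoken at different distances from the mic. One byte per band keeps a template small enough for the Uno's SRAM or EEPROM; the frame and band counts are yours to choose.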

Robo_Pi · October 12, 2015, 5:25pm

Thanks Magician, this is exactly the type of thing I'm looking for. Do you still have your Arduino sketch? It's no longer available in the archives. Could you post it here?

anatolyk69_gmail.c · October 12, 2015, 6:27pm

Certainly.

https://drive.google.com/file/d/0Bw4tXXvyWtFVLUFyMmdLb2RBam8/view?usp=sharing

https://drive.google.com/file/d/0Bw4tXXvyWtFVaktLMExzeUZnU1U/view?usp=sharing

You need only 1 mic; these drawings are from another project, "sound localization".

https://drive.google.com/file/d/0Bw4tXXvyWtFVT2tERVkwUEVBUHc/view?usp=sharing

https://drive.google.com/file/d/0Bw4tXXvyWtFVOUVaRk5NbS1hVzg/view?usp=sharing

AFAIR, the spectrogram is of my laptop saying "test left", the speaker-test phrase in Ubuntu Linux.

https://drive.google.com/file/d/0Bw4tXXvyWtFVWjU5aUlWdXphY2M/view?usp=sharing

DVDdoug · October 12, 2015, 7:09pm

In fact, it would be great if the program worked for everyone. It's my understanding that often times speech recognition software only understands the person it was trained to listen to.

And how much vocabulary do you need?

The main thing I would like to do is learn how to program speech recognition. Actually I guess this doesn't even need to be specific to the Arduino. If I could just get a generic idea of how to write a speech recognition algorithm

I believe that's an advanced topic, but you can probably search the Net or get a book. If you took a university class in speech recognition it would probably be a 3rd or 4th year class, if not a postgraduate class. :o

It should be a LOT easier on a computer since you already have a soundcard, and if you're on a laptop you've already got a microphone. Or, you can read from a WAV or MP3 file, etc., and you don't need a microphone (and you won't have to talk to the computer over-and-over during development). And, you'll have lots more memory & processing power, and a video monitor to display the text.

It's a little more effort to install and configure an IDE/compiler on your computer, but that's just "overhead" that you only have to do once. GUI programming adds another level of complexity, but you don't have to do GUI just because you're on a PC or Mac.

P.S.
Although you probably won't be writing your own FFT/DFT library or anything like that, it might be good to have some digital signal processing under your belt. There is a good FREE online DSP book called The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith, Ph.D.

Robo_Pi · October 12, 2015, 8:51pm

@DVDdoug, Thanks for the suggestions. I actually have several IDEs on my computer and I've been searching for examples. Typically what I find are examples that already use other software or libraries that handle the actual voice or speech recognition. For example, "using speech.RecognizationEngine;". And then they just explain how to use that class.

I actually have quite a bit of background in digital and analog signal processing. But I've never applied this knowledge specifically to speech recognition. I just bought a bunch of Arduino boards, and I thought I would dedicate one on my robot specifically for speech recognition. So that's the main reason I posted here on the Arduino site.

@Magician, Thanks for posting your Arduino sketch. That's exactly the type of approach I had in mind. I did get some compile errors when I tried to compile the code though. This might be because I'm using the Arduino 1.6.5 IDE. I might try downloading the 1.0.1 version and see if it will compile with that.

The errors I'm getting on the 1.6.5 version are as follows:

Build options changed, rebuilding all
VOR_remix_3f:40: error: 'prog_int16_t' does not name a type
VOR_remix_3f:51: error: 'prog_int16_t' does not name a type
In file included from VOR_remix_3f.ino:27:0:
VOR_remix_3f.ino: In function 'void fft_radix4_I(int*, int*, int)':
VOR_remix_3f:227: error: 'Sinewave' was not declared in this scope
VOR_remix_3f:228: error: 'Sinewave' was not declared in this scope
VOR_remix_3f:229: error: 'Sinewave' was not declared in this scope
VOR_remix_3f:231: error: 'Sinewave' was not declared in this scope
VOR_remix_3f:232: error: 'Sinewave' was not declared in this scope
VOR_remix_3f:233: error: 'Sinewave' was not declared in this scope
VOR_remix_3f.ino: In function 'void loop()':
VOR_remix_3f:394: error: 'Anatoly' was not declared in this scope
VOR_remix_3f.ino:73:24: note: in definition of macro 'mult_shf_s16x16'
'prog_int16_t' does not name a type
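
The `'prog_int16_t' does not name a type` errors are most likely a toolchain change rather than a bug in the sketch: the `prog_*` typedefs from `avr/pgmspace.h` were deprecated in avr-libc and are no longer defined by default in the compiler bundled with newer IDE versions. The `'Sinewave' was not declared` errors would then simply follow from the failed declaration, and `'Anatoly'` is presumably another table declared the same way. A minimal sketch of the usual replacement is below. The four table values are placeholders, not the table from `VOR_remix_3f`; the helper `sineAt` and the host fallback macros are assumptions added so the snippet also compiles off-target.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
// Host fallback (assumption, for compiling off-target only).
#define PROGMEM
#define pgm_read_word(addr) (*(addr))
#endif

// Old style, rejected by newer toolchains:
//   prog_int16_t Sinewave[] PROGMEM = { ... };
// Replacement: a const array of a standard fixed-width type, kept in flash.
// Placeholder values only; the real sketch has its own table.
const int16_t Sinewave[] PROGMEM = { 0, 12539, 23170, 30273 };

// Hypothetical helper: read one table entry back out of flash.
int16_t sineAt(uint8_t i) {
  return (int16_t)pgm_read_word(&Sinewave[i]);
}
```

The `const` is required: newer avr-gcc versions refuse to place a non-const variable in program memory.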

anatolyk69_gmail.c · October 12, 2015, 10:31pm

Tempora mutantur, as they say.
https://drive.google.com/file/d/0Bw4tXXvyWtFVWklIVWZoQm03WEU/view?usp=sharing

Try it. You will see an FFT radix-4 algorithm in the main sketch; since then I have written a library, a faster and better split-radix one. But try this first, and see if you can get a 90% match. Later on it would make sense to switch to the library.

VOR_remix_3f.ino (22 KB)
