Continuous Speech to Text in Arduino

Hi all,

I'm working on a school project with my Arduino. I want the user to talk at a steady pace (or slightly slower) and have the Arduino match the spoken words against words programmed into its database.

I've done some research, and it seems two of the big speech add-ons for Arduino are BitVoicer and the EasyVR shield. The problem is that these seem geared towards command recognition rather than converting speech to text. I can imagine that if they exposed APIs for changing things in code, these solutions might still do what I want, but I haven't found that capability in the documentation yet. Does anyone have experience in this area?

For illustration here is how the COMMAND solutions above seem to work:

(command) "Turn on", (option) "blue"/"red"/"green".

  • it's just a quickly recognized phrase, then a word which corresponds to executing an action.

And this is what I WANT to be able to do:

text in database or coded into Arduino, "The quick brown fox jumped over the lazy log"
inputted through mic, "The quick orange goat jumped over the lazy dog"
converted to strings and compared in my code, "The" (hit) "quick" (hit) "orange" (miss) "goat" (miss) "jumped" (hit) "over" (hit) "the" (hit) "lazy" (hit) "dog" (miss).

I don't need a solution that can execute actions or do comparisons (although that would help). If I can just get speech to text, I can parse the strings and do the hits/misses myself.
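A minimal sketch of the hit/miss comparison described above, assuming some recognizer has already turned the speech into a text string. The function names (`normalize`, `splitWords`, `compareWords`) are made up for illustration. It is written in standard C++ for a PC or Raspberry Pi and would need adapting for an AVR-based Arduino, which typically lacks `std::string` and `std::vector`.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Lowercase a word and drop punctuation so "The" and "the," compare equal.
std::string normalize(const std::string& word) {
    std::string out;
    for (unsigned char c : word) {
        if (std::isalnum(c)) out += static_cast<char>(std::tolower(c));
    }
    return out;
}

// Split text on whitespace and normalize each word.
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string w;
    while (in >> w) {
        std::string n = normalize(w);
        if (!n.empty()) words.push_back(n);
    }
    return words;
}

// Compare position by position: true = hit, false = miss.
// Returns one entry per spoken word; spoken words past the end of the
// expected text count as misses.
std::vector<bool> compareWords(const std::string& expected,
                               const std::string& spoken) {
    const std::vector<std::string> e = splitWords(expected);
    const std::vector<std::string> s = splitWords(spoken);
    std::vector<bool> hits;
    for (std::size_t i = 0; i < s.size(); ++i) {
        hits.push_back(i < e.size() && e[i] == s[i]);
    }
    return hits;
}
```

Calling `compareWords` with the two example sentences gives hit, hit, miss, miss, hit, hit, hit, hit, miss. Note that this comparison is purely positional: if the speaker skips or adds a word, every later word shifts and is scored as a miss, so a real implementation would probably want some form of sequence alignment instead.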

Hope someone can help with this. :slight_smile:

I don't think the Arduino has enough processing power to do speech recognition.

nilton61:
I don't think the Arduino has enough processing power to do speech recognition.

I think you may be right. I have been researching a little bit, and I still haven't seen anything other than command-based speech recognition.

Can anyone else confirm? I may have to move to a Raspberry Pi, as I have found a speech recognition package from CMU named Sphinx that might do the trick.

Arduinos are powerful enough to replicate the
historic machine that detects the word "Watermelon"
from the uniquely unusual sequence of vowels.

Peace
--Devon

Hi!
I am working on a project related to speech processing. I have to convert speech from a mic to text and show it on a 16*8 display. Kindly help me with the speech-to-text coding using an Arduino Uno.

A small 8-bit microcontroller is 3 or more orders of magnitude too feeble for this task. A proper CPU is where to start; check out what there is for the RPi, for instance.
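To get a rough feel for the gap, here is a back-of-the-envelope sketch that looks only at memory. It assumes an Arduino Uno (ATmega328P, 2 KB of SRAM) and raw 16 kHz, 16-bit mono audio, a format commonly used for speech recognition; the constant and function names are made up for illustration, and the processing needed for recognition comes on top of this.

```cpp
#include <cstdint>

// Arduino Uno (ATmega328P) SRAM size in bytes.
constexpr std::uint32_t kUnoSramBytes = 2048;

// Assumed audio format: 16 kHz, 16-bit mono.
constexpr std::uint32_t kSampleRateHz = 16000;
constexpr std::uint32_t kBytesPerSample = 2;

// Raw audio data rate in bytes per second.
constexpr std::uint32_t audioBytesPerSecond() {
    return kSampleRateHz * kBytesPerSample;
}

// Milliseconds of raw audio that would fit in SRAM if nothing else used it.
constexpr std::uint32_t millisecondsThatFit() {
    return kUnoSramBytes * 1000u / audioBytesPerSecond();
}
```

Under these assumptions, one second of audio is 32,000 bytes, and the Uno's entire SRAM holds about 64 ms of it, less than a single spoken word, before any recognition work begins.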

Dragon NaturallySpeaking runs on a PC or Mac. Siri and Alexa run on powerful servers on the Internet.

And this is what I WANT to be able to do:

text in database or coded into Arduino, "The quick brown fox jumped over the lazy log"
inputted through mic, "The quick orange goat jumped over the lazy dog"
converted to strings and compared in my code, "The" (hit) "quick" (hit) "orange" (miss) "goat" (miss) "jumped" (hit) "over" (hit) "the" (hit) "lazy" (hit) "dog" (miss).

That's particularly difficult... The incorrect sentence is perfectly valid English and it makes sense, except that I've never seen an orange goat. It turns out that humans and machines both need context, and there are no context clues indicating that the words are incorrect.

I've forgotten the statistics, but if someone reads a random list of words, we'll only understand something like 80 percent of them. Our brain is usually processing in the background, so we aren't aware of all the words we're missing. But sometimes somebody says something to you and you don't "get it" right away... You might say, "What?", or after thinking about it for half a second you'll figure out what was said.