
Topic: Doing Text to Speech without a Dedicated Chip?

tnecniv

Hello, now that I have my speaker working, I am interested in making my Arduino do some very basic text to speech. More specifically, I need it to say any number from zero to sixty.

The simplest approach seems to be a dedicated chip like http://www.sparkfun.com/products/9578, but I would prefer a more cost-effective solution.

I was told to check out phonemes, but I have not come across a good library to generate them. The best thing that I have found is http://code.google.com/p/tinkerit/wiki/Cantarino, but it seems rather complex and lacks documentation.

Is there any solution that I am missing?

CrossRoads

Sure - do a switch:case routine, have the sounds stored in serial/SPI EEPROM, read them out and write them to a serial/SPI DAC, add a little bit of lowpass filtering (R/C filter) to cut out digital switching noise, then go into an amp to drive your speaker.
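As a rough sizing aid for the R/C filter mentioned above: the cutoff of a single-pole R/C lowpass is f_c = 1/(2πRC). The helper below is only a sketch in plain C++ (the function name is made up for illustration), and the part values in the note after it are assumptions, not parts recommended in this thread.

```cpp
// Cutoff frequency of a first-order R/C lowpass: f_c = 1 / (2 * pi * R * C).
// rcCutoffHz is a hypothetical helper name, used here only for sizing.
constexpr double kPi = 3.14159265358979323846;

constexpr double rcCutoffHz(double ohms, double farads) {
    return 1.0 / (2.0 * kPi * ohms * farads);
}
```

For example, rcCutoffHz(3900.0, 10e-9) (a 3.9 kΩ resistor with a 10 nF capacitor, both assumed values) comes out a little under 4.1 kHz.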
Record all your sounds on a PC, then write a sketch to load them into specific EEPROM locations so you know where to read them back from.
You could even do the same to capture them - hit a button to start recording from a serial/SPI ADC & store to EEPROM, hit the button again to stop.
Use 8-, 12-, or 16-bit devices at a sample rate that gives the fidelity you want (at least 2X the highest frequency you want to reproduce).
For recording, you may want to store the sampled sound in serial SRAM first, for playback confirmation, before writing it to EEPROM, since EEPROM generally has slower write speeds.
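One way the switch:case idea could be reduced to a lookup, as a minimal sketch only: instead of 61 separate recordings, numbers above nineteen are spoken as a "tens" clip plus a "ones" clip. The clip numbering, the fixed 4096-byte slot size, and both function names are assumptions made up for this example; nothing here is a library API.

```cpp
#include <stdint.h>
#include <stddef.h>

// ASSUMED clip layout: clips 0..19 hold "zero".."nineteen",
// clips 20..24 hold "twenty", "thirty", "forty", "fifty", "sixty".
constexpr uint8_t kFirstTensClip = 20;
// ASSUMED fixed slot size, so every clip starts at a known EEPROM address.
constexpr uint32_t kSlotBytes = 4096;

// Fills clips[] with the clip indices to play for n (0..60).
// Returns how many clips to play (1 or 2), or 0 if n is out of range.
size_t clipsForNumber(uint8_t n, uint8_t clips[2]) {
    if (n > 60) return 0;
    if (n < 20) {
        clips[0] = n;
        return 1;
    }
    clips[0] = static_cast<uint8_t>(kFirstTensClip + (n / 10 - 2));
    if (n % 10 == 0) return 1;
    clips[1] = static_cast<uint8_t>(n % 10);
    return 2;
}

// Start address of a clip in the external memory.
constexpr uint32_t clipAddress(uint8_t clip) {
    return static_cast<uint32_t>(clip) * kSlotBytes;
}
```

With this layout, 42 plays clip 22 ("forty") followed by clip 2 ("two"), and the playback loop only needs clipAddress() to know where to start reading.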
Designing & building electrical circuits for over 25 years.  Screw Shield for Mega/Due/Uno,  Bobuino with ATMega1284P, & other '328P & '1284P creations & offerings at  my website.

tnecniv

Jun 30, 2011, 09:53 pm
Thanks, that sounds fairly doable, but I have not done anything with EEPROM yet (in fact, I had not heard of it until you mentioned it). Googling is just giving me hits to the library so how would I load EEPROM with my sounds? Do I write a "load" sketch that is different from my normal sketch? What sound format would you recommend using?

I am reading some stuff on EEPROM now, but do you have any recommended reading for an introduction on the subject?

CrossRoads

Sure, write an EEPROM load sketch.
You can do it byte by byte (there is library code for that, and it takes about 3.3 ms per byte), or you can load a page at a time, which is much quicker overall.
Depends on the EEPROM you select. I would recommend an ATMEL part to maintain compatibility with the ATMEL uCs.

Try this application note.
http://www.atmel.com/dyn/resources/prod_documents/doc8546.pdf

I would write a sketch to capture 2 bytes from ADC, write to 2 bytes of SRAM, do that to capture your sound/spoken number/whatever.
Then write those bytes in blocks to the EEPROM.
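To put rough numbers on byte-by-byte versus page writes, here is a back-of-the-envelope sketch. The 3.3 ms per byte is the figure quoted above; the 64-byte page and 5 ms page write cycle are assumptions for illustration only (check the datasheet of whatever part you pick), and bus transfer time is ignored.

```cpp
#include <stdint.h>

// Figure quoted in this thread for byte-by-byte writes: 3.3 ms per byte.
constexpr uint64_t kByteWriteUs = 3300;
// ASSUMPTIONS: 64-byte pages, 5 ms write cycle per page. Not from a datasheet.
constexpr uint64_t kPageBytes = 64;
constexpr uint64_t kPageWriteUs = 5000;

// Estimated time to load nBytes one byte at a time, in microseconds.
constexpr uint64_t byteWiseLoadUs(uint64_t nBytes) {
    return nBytes * kByteWriteUs;
}

// Estimated time to load nBytes a page at a time (last page may be partial).
constexpr uint64_t pageWiseLoadUs(uint64_t nBytes) {
    return ((nBytes + kPageBytes - 1) / kPageBytes) * kPageWriteUs;
}
```

Under those assumptions a 4000-byte clip takes about 13.2 seconds byte by byte, against roughly a third of a second in pages.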

CrossRoads

Maybe serial flash would be better. I think of it generically as EEPROM too, but it is available in much larger sizes.

Here is another good app note, and these data rates may be all you need.

http://www.atmel.com/dyn/resources/prod_documents/doc1456.pdf
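To see why memory size matters, a quick storage estimate can help. This is only a sketch: the 4 kHz speech bandwidth, the half-second clip length, and the count of 25 clips are assumptions chosen for illustration, and the function names are hypothetical.

```cpp
#include <stdint.h>

// Sample rate must be at least 2X the highest frequency you want to keep.
constexpr uint32_t minSampleRateHz(uint32_t upperHz) {
    return 2 * upperHz;
}

// Bytes needed for one clip; samples wider than 8 bits are assumed to be
// stored in whole bytes (so 12-bit samples take 2 bytes each).
constexpr uint64_t clipBytes(uint32_t sampleRateHz, uint8_t bitsPerSample,
                             uint32_t clipMs) {
    return static_cast<uint64_t>(sampleRateHz) *
           static_cast<uint64_t>((bitsPerSample + 7) / 8) * clipMs / 1000;
}
```

With an assumed 4 kHz upper frequency the rate is 8000 samples per second, so a half-second word at 8 bits is 4000 bytes, and 25 such clips come to 100,000 bytes. Going to 16-bit samples doubles that.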

tnecniv

Thanks, I will look at those.

I lack a microphone that I could interface with my Arduino to record the audio segments, but I do have one in my laptop. Is there some way I can record the sound clips on my laptop and transfer the data over?

CrossRoads

Yes.
Your (Windows?) laptop may have a program called Sound Recorder that can capture sound and store it in some file format. You will have to adapt that capture format to suit the DAC you are using. For example, I believe Sound Recorder can be set to various quality levels/sampling rates. Output may go from -32768 to 32767, that is, 16-bit values swinging around 0. If your DAC is 16 bit and you are using a single supply, the data may have to be shifted to run from 0 to 2^16 - 1 with 2^15 as the centerpoint, then output through a capacitor to let it swing around 0 going to your speaker.
Or use dual supply DAC so everything can swing around 0.
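The shift from signed samples to a single-supply DAC range described above is just an offset of 2^15. A minimal sketch, with hypothetical function names; the 8-bit variant is an added assumption for the case where you pick an 8-bit DAC:

```cpp
#include <stdint.h>

// Map a signed 16-bit sample (-32768..32767) to 0..65535,
// so that silence (0) lands on the 2^15 midpoint.
constexpr uint16_t toOffsetBinary(int16_t sample) {
    return static_cast<uint16_t>(static_cast<int32_t>(sample) + 32768);
}

// For an 8-bit DAC (assumed), keep only the top 8 bits of the shifted value.
constexpr uint8_t toUnsigned8(int16_t sample) {
    return static_cast<uint8_t>(toOffsetBinary(sample) >> 8);
}
```

You could run this conversion on the PC before loading the EEPROM, so the playback sketch just copies bytes to the DAC.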
Lots of ways to get there. Start with capturing some sound, and see what the data looks like.
Or just get a microphone somewhere. I used to make little recording boxes with a Radio Shack condenser microphone, a 9V battery, and two 1/4" jacks. You could do something similar. That might be easier, since it keeps you independent of .wav files & stuff.
Where are you located?
We had a local surplus store that had nice Sony microphones with 1/8" jack for $10, I picked up a couple to play with. Maybe you can find something similar.

CrossRoads

I was just reading this thread - this might be even easier to adapt to your needs

http://arduino.cc/forum/index.php/topic,65192.0/topicseen.html

See the links that fat16lib has posted.
