Hi! I want to use a text-to-speech (TTS) library on an ESP8266 that sounds human-like, but I don't want it to depend on the internet, since I am planning to give the device to people in low-resource areas. Is there a library that will help me do this? Would it be better to use an ESP32? Or is it impossible on both? Thanks!!
It should work provided you find such a library. No suggestions from me.
Great! But what library could I use?
What libraries have you considered using? Post links.
What will you use for text input? What amplifier/speaker for speech output?
I was planning on using earlephilhower/ESP8266Audio, but I'm already using an SD card and the library says it uses the CS pin. I would need to change that, which I'd prefer not to do since getting the SPI communication working was so difficult.
Then I saw earlephilhower/ESP8266SAM, but it depends on the previous library.
I also saw GitHub - horihiro/esp8266-google-tts, but it uses the internet, so it doesn't meet my requirements.
Finally, I saw GitHub - jscrane/TTS: Arduino Text-to-Speech Library, but I saw on the internet that it sounds like a robot.
I need something child-friendly. If that is impossible to achieve without the internet, then it's fine, but I would prefer the ESP8266 or ESP32 to do the processing itself.
P.S. Sorry for only posting 2 links; the forum only allowed me to post 2.
Should be easy to fix. But that is an audio player library. What will you use to generate "human sounding speech"?
Yes it will.
I think that any TTS that doesn't involve the internet will sound like a robot.
Why do you think TTS only stopped sounding like a robot with the arrival of TTS that uses the internet?
The internet brings many more resources to the problem than you could ever get from a standalone system.
Ohh you are completely right. It's ok then, thank you very much!!!
Not true. You need only code that produces human speech sounds, but since that has largely been developed by major corporations for use "on the internet", much of it is proprietary. The hobby market for such code is small, so there is little incentive to develop it.
There are offline text to speech apps for Android and iPhone. It may be worthwhile to investigate whether that source code is available.
Current work, especially if it uses deep learning, seems to be mainly coded in Python. This open source example is worth a look: TTS download | SourceForge.net
You need more than human speech sounds. The sounds change depending on the context the words are used in, to get the emphasis correct. That takes vast amounts of trained AI to analyse the words before and after the one you want to say; it is not just a matter of one word, one sort of sound.
If it were that easy, it would have been done decades ago. Remember that most standalone controllers have the computing power and storage of an early-90s desktop computer.
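To put rough numbers on the storage point, here is a back-of-envelope sketch in Python. The RAM figures (about 80 KB of user data RAM on the ESP8266, 520 KB of SRAM on the ESP32) and the audio format (16 kHz, 16-bit mono) are assumptions for illustration, not figures from this thread, and the memory actually free in a real sketch will be lower.

```python
# Back-of-envelope: how many seconds of raw PCM audio fit in RAM?
# Every figure here is an illustrative assumption, not a measurement.

SAMPLE_RATE_HZ = 16_000   # assumed speech-quality sample rate
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono samples

# Assumed nominal on-chip RAM sizes; usable RAM in practice is smaller.
ASSUMED_RAM_BYTES = {
    "ESP8266": 80 * 1024,
    "ESP32": 520 * 1024,
}


def seconds_of_audio(ram_bytes: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ,
                     bytes_per_sample: int = BYTES_PER_SAMPLE) -> float:
    """Return how many seconds of uncompressed audio fit in ram_bytes."""
    return ram_bytes / (sample_rate_hz * bytes_per_sample)


if __name__ == "__main__":
    for board, ram in ASSUMED_RAM_BYTES.items():
        print(f"{board}: about {seconds_of_audio(ram):.1f} s of raw audio")
```

Under these assumptions, a few seconds of uncompressed speech already fills the RAM, so any recorded voice data or larger model would have to live in flash or on an SD card and be streamed from there.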
Understood. Good code of course takes context into account. Check out the open source TTS language model I linked in post #9. The sound quality is excellent and it will run on a Raspberry Pi.
Festival also runs on the RPi, and is probably a lot faster. It has pretty reasonable voice quality. Demo here.
I just tried the phrase "the cat sat on the mat". I wasn't too impressed with the way it said "mat"; it was more like "mut".
I'm assuming you used "Alan" for the voice? It has a Scottish accent.
Others aren't so bad.
This is old but funny (voice recognition and Scottish accents):
--- bill
Yes I did. I tried some of the other voices, and one missed out the opening word "the".
By the way, I remember that clip from the original transmission; it is funny.
I did look at some of the instructions about how it worked and how you install it. It looks like it is not very easy to get going and also that you need some sort of software server, so it needs a network but not access to the internet. I could be wrong but that is the impression I got.
It also requires some sort of hardware sound output module, otherwise it says the voice quality is poor. It is also composed of about five or six modules that need downloading and compiling.
Still, it is interesting and quite a bit better than I expected.