Voice tone analysis for mental health

Hi,

I am currently in the process of building an Arduino project to detect human voice tone for mental health purposes. The aim is to tell the difference between a low-toned hello and a higher-toned hello. Humans can quickly identify how someone is feeling from their tone of voice, so I wanted to build something that does the same from a person's first words and then has the Arduino produce a relevant reply depending on the tone of their hello (low for sad, high for happier).

Please let me know if this is something someone can help me with. I currently have an Arduino Uno with a sound detector connected to the board, but I am rather new to Arduino and I'm guessing I will need more components. COVID-19 is also limiting my access to university. Any help getting me on the right path would be greatly appreciated.

The Arduino doesn't have enough processing power or memory for anything like that.

This is a hard AI problem really - yes you could measure pitch, but simply recognizing words
needs serious DSP grunt to start with - some of the Arduino models with faster processors
might get somewhere with this, but tone of voice sounds like advanced research in machine
learning, not a simple microcontroller hack...
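Measuring pitch on its own is the tractable part. Below is a minimal sketch of one common approach, autocorrelation, written in Python for trying out on a PC. The function names, the 70 to 400 Hz search range and the synthetic sine-wave input are all assumptions made for illustration. Real speech is noisier and would need framing and voiced/unvoiced detection on top of this.

```python
import math

def estimate_pitch_hz(samples, sample_rate, fmin=70.0, fmax=400.0):
    """Estimate the fundamental frequency of one voiced frame by autocorrelation.

    fmin/fmax are an assumed search range for speech, not a standard.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]            # remove any DC offset
    min_lag = int(sample_rate / fmax)          # shortest period considered
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

def sine(freq_hz, sample_rate=8000, duration_s=0.05):
    """Synthetic test tone standing in for a recorded voice frame."""
    count = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]

low = estimate_pitch_hz(sine(110.0), 8000)
high = estimate_pitch_hz(sine(220.0), 8000)
print(f"low tone:  {low:.1f} Hz")
print(f"high tone: {high:.1f} Hz")
```

The estimate is only as fine as one sample of lag, so expect it to be off by a few Hz at this sample rate. It says nothing about mood, only about pitch.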

But when you get it perfected I need one... I'm not "the sensitive type"... :smiley: :smiley: :smiley: :smiley:

How will your project take into account that voice pitch differs quite widely from person to person?
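One possible way to handle this, sketched below in Python, is to compare each hello against that speaker's own usual pitch instead of against a fixed threshold. The function names, the use of the median as a baseline and the 2-semitone threshold are assumptions for illustration only, and the device would need some way of knowing who is speaking and what their earlier pitches were.

```python
import math
from statistics import median

def semitones_from_baseline(pitch_hz, baseline_hz):
    """Pitch shift relative to a baseline, in semitones (12 per octave)."""
    return 12.0 * math.log2(pitch_hz / baseline_hz)

def classify_hello(pitch_hz, previous_pitches_hz, threshold_semitones=2.0):
    """Label a pitch as higher, lower or about usual for this speaker.

    The threshold is an arbitrary assumed value, not a validated one.
    """
    baseline = median(previous_pitches_hz)
    shift = semitones_from_baseline(pitch_hz, baseline)
    if shift >= threshold_semitones:
        return "higher than usual"
    if shift <= -threshold_semitones:
        return "lower than usual"
    return "about usual"

# The same 190 Hz hello means different things for different speakers.
print(classify_hello(190.0, [215.0, 220.0, 228.0]))  # speaker who is usually around 220 Hz
print(classify_hello(190.0, [115.0, 120.0, 124.0]))  # speaker who is usually around 120 Hz
```

Working in semitones instead of Hz keeps the comparison proportional, so a shift counts the same for a deep voice as for a high one.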

It would be a good idea to get this working on a PC before trying to fit it into an Arduino.

I am currently working on something similar. I want to build a voice authentication device where you create something like a "voice fingerprint". I think your project would need much the same data that I use. But believe me, it's not an easy one. A time-domain voice fingerprint would require tons of audio material of you speaking, which just wouldn't fit on a small microcontroller. The frequency domain is more interesting. I analyzed the spectrum of my voice and it has, like every other voice, a fundamental frequency with an overtone series that is quite distinctive compared to other people's. I would suggest you start from there. First you need to find out which person is talking into the mic, and then you can examine the audio further. As jremington points out, voices differ a lot from person to person. So to build a device for sensing mental health issues, you'd first need to know who is talking.
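To make the fundamental-plus-overtones idea concrete, here is a rough Python sketch that measures the relative strength of the first few harmonics of a known fundamental. It is not the poster's code. The function names, the choice of five harmonics and the synthetic test signal are assumptions for illustration, and it presumes the fundamental frequency has already been estimated some other way.

```python
import math

def magnitude_at(samples, sample_rate, freq_hz):
    """Amplitude of one frequency component (a single-bin DFT)."""
    re = im = 0.0
    for i, s in enumerate(samples):
        angle = 2 * math.pi * freq_hz * i / sample_rate
        re += s * math.cos(angle)
        im -= s * math.sin(angle)
    return 2 * math.hypot(re, im) / len(samples)

def harmonic_profile(samples, sample_rate, f0_hz, harmonics=5):
    """Relative strength of the fundamental and its overtones, summing to 1."""
    mags = [magnitude_at(samples, sample_rate, f0_hz * k)
            for k in range(1, harmonics + 1)]
    total = sum(mags) or 1.0   # avoid dividing by zero on silence
    return [m / total for m in mags]

# Synthetic "voice": 100 Hz fundamental plus two weaker overtones.
RATE, F0 = 8000, 100.0
AMPLITUDES = [1.0, 0.5, 0.25]
signal = [
    sum(a * math.sin(2 * math.pi * F0 * (k + 1) * i / RATE)
        for k, a in enumerate(AMPLITUDES))
    for i in range(800)        # 0.1 s, a whole number of periods
]
profile = harmonic_profile(signal, RATE, F0)
print([round(p, 3) for p in profile])
```

Dividing by the total makes the profile independent of how loud the recording is, which matters when two recordings are compared later.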

Let me know if you are interested in how far I've got. Just write me a message or, better, post in this thread. Maybe I'll start a new thread for my project too, once I can show you some results. At the moment I can only compute a discrete/fast Fourier transform of recordings of my voice. I'm having trouble writing an algorithm that compares the loudness of each frequency in recording 1 against recording 2. It needs a little tolerance, as my voice samples differ slightly from recording to recording, and that's the exact issue I need to solve before I can present anything.
Interesting project though! Keep us updated.
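One possible way to do a comparison with tolerance, sketched in Python: express each spectrum in dB relative to its own peak, so overall loudness drops out, then count how many frequency bins agree within a few dB. The function names and every threshold here (3 dB tolerance, 80 % of bins, ignoring bins below -40 dB) are assumed values for illustration and would need tuning on real recordings.

```python
import math

def to_relative_db(magnitudes, floor=1e-9):
    """Convert magnitudes to dB relative to the loudest bin (peak = 0 dB)."""
    peak = max(max(magnitudes), floor)
    return [20 * math.log10(max(m, floor) / peak) for m in magnitudes]

def spectra_match(mags_a, mags_b, tolerance_db=3.0, min_fraction=0.8,
                  ignore_below_db=-40.0):
    """True if enough bins of two spectra agree within tolerance_db.

    Both inputs must cover the same frequency bins in the same order.
    """
    db_a, db_b = to_relative_db(mags_a), to_relative_db(mags_b)
    compared = within = 0
    for a, b in zip(db_a, db_b):
        if a < ignore_below_db and b < ignore_below_db:
            continue               # both bins are near silence, skip them
        compared += 1
        if abs(a - b) <= tolerance_db:
            within += 1
    return compared > 0 and within / compared >= min_fraction

recording_1 = [1.0, 0.5, 0.25, 0.1]
recording_2 = [2.1, 0.9, 0.55, 0.2]   # louder, slightly different shape
someone_else = [0.2, 1.0, 0.1, 0.6]   # different spectral shape

print(spectra_match(recording_1, recording_2))
print(spectra_match(recording_1, someone_else))
```

Comparing in dB treats a difference as a ratio, which is closer to how loudness is perceived than a raw difference in magnitude.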

I think there is a fundamental misunderstanding about what “tone of voice” means in this context. It does not mean the pitch of your voice, but a whole lot of other stuff about how your voice sounds.

Have you ever noticed that at a passport control or customs point the officer will try to get you to say a few words? They are not interested in what you say but in how you say it, or to put it another way, your tone of voice. They can tell if you are stressed or not. It is not foolproof, but it is good.

Some insurance companies run an automatic analysis of your voice when you make a claim, as a sort of lie detector. So this sort of thing can be done, but I suspect it needs something more powerful than an Arduino. This problem requires some quite sophisticated AI. I recommend you look up “TensorFlow” to see how this might be done on one of the upper-end Arduinos. But it will take a lot of good-quality training data to make it work.
