Topic: rMP3 response time

damixor

I am thinking about using a Rogue Robotics rMP3 on an Arduino Mega to play sound files spliced together to form sentences.  Can anyone tell me what the typical delay is between the end of one file and the beginning of the next, and whether there are any noticeable blips in the sound at the beginning or end of a file?  I would like to be able to splice things mid-sentence or mid-word, such as "33" being a combination of "thirty" and "three" from two separate files.  I apologize if this has been posted already, but I have not been able to find any information on it.

Thank you!

gbulmer

#1
Apr 24, 2010, 05:21 am
How about asking the makers? http://www.roguerobotics.com/

They appear to offer support and have their own forums.

mowcius

There are no 'blips' at the beginning or end of a file, but I am not sure how quickly you can get the next file to play after the first one, or whether there is a gap, as I have not needed to make it any faster for my purposes. I did think of using it to count numbers aloud, but I have not got anywhere with that yet.

Bhagman is on these forums, but he might not see this thread, so I would recommend contacting him directly (the website has contact info).

All in all, though, the board is great if you want high-quality audio.

Mowcius

I haven't done any extensive tests on timing, but there are two sources for delays:

1) Time to open the file - this is on the order of tens of milliseconds (typically 10 to 50 milliseconds).

2) MP3 files are notorious (depending on the software used to compress them) for having massive amounts of data in the header (for older tags, and other header information) that is pushed through the decoder.
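To get a feel for point 2, you can check on a PC how much tag data sits in front of the audio in a given file. The sketch below assumes the leading data is an ID3v2 tag, which stores its length in a 10-byte header as a 28-bit "synchsafe" integer. The function name `id3v2TagSize` is made up for illustration and is not part of any rMP3 library, and how much decoder delay a given tag size actually causes on the rMP3 has not been measured here.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the total size in bytes of an ID3v2 tag at the start of a file,
// or 0 if the buffer does not begin with a valid ID3v2 header.
// `header` holds the first bytes of the file; at least 10 are required.
std::uint32_t id3v2TagSize(const std::uint8_t* header, std::size_t length) {
    if (length < 10) return 0;
    if (header[0] != 'I' || header[1] != 'D' || header[2] != '3') return 0;

    // Bytes 6..9 hold the tag body size, 7 bits per byte (top bit must be 0).
    std::uint32_t body = 0;
    for (int i = 6; i < 10; ++i) {
        if (header[i] & 0x80) return 0;  // not a valid synchsafe byte
        body = (body << 7) | static_cast<std::uint32_t>(header[i]);
    }

    std::uint32_t total = body + 10;     // add the 10-byte header itself
    if (header[5] & 0x10) total += 10;   // ID3v2.4 footer flag adds 10 more
    return total;
}
```

If this returns a large number for your clips, stripping the tags with your encoder or a tag editor before copying the files to the card is one thing to try.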

We can't do anything about #1.  It takes time to open the file.  (That said, I'm looking into a new way to queue files with the "PC N" command.)

If you want to eliminate the time wasted with the MP3 headers, you can use PCM WAV files instead, but because of the bandwidth required for PCM files, you'll be limited to mono 22kHz or stereo 11kHz audio.  If you're just doing speech, mono 11kHz or 22kHz is fine (e.g. AT&T Labs Natural Voices only produces mono 16kHz PCM data).
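As a rough illustration of why mono 22kHz and stereo 11kHz land on the same limit, here is the data-rate arithmetic. It assumes 16-bit samples and the nominal rates 22050 Hz and 11025 Hz; neither figure is stated above, and `pcmBytesPerSecond` is just a helper name for this sketch.

```cpp
#include <cstdint>

// Uncompressed PCM data rate in bytes per second.
// Assumes bitsPerSample is a whole number of bytes (8, 16, ...).
constexpr std::uint32_t pcmBytesPerSecond(std::uint32_t sampleRateHz,
                                          std::uint32_t channels,
                                          std::uint32_t bitsPerSample) {
    return sampleRateHz * channels * (bitsPerSample / 8);
}

// Under the 16-bit assumption, both suggested formats need the same bandwidth:
//   mono   22050 Hz: 22050 * 1 * 2 = 44100 bytes/s
//   stereo 11025 Hz: 11025 * 2 * 2 = 44100 bytes/s
// Mono 16 kHz speech needs less: 16000 * 1 * 2 = 32000 bytes/s
```

For example, a one-second "thirty" clip in mono 22050 Hz, 16-bit PCM would be about 44 KB of sample data under these assumptions.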

b

mowcius

Quote
AT&T Labs Natural Voices

That was also what I was looking at for speech, but I had not thought about using WAV files.
10ms is pretty good; I would have thought that would be short enough not to notice.

I knew you would turn up and save me on this thread  ;D

Mowcius