I'm new to the Arduino universe. I'm a postdoc researcher, and someone from my lab suggested using an Arduino for a project of mine. Essentially, I want to use an external TTL signal to trigger the delivery of different sounds (up to 3 different 50 ms sounds in a sequence, about 8 kB per sound, 44100 Hz, 16 bits). So far, that seems feasible, I think.
My problem is this: I don't want to turn the Arduino into a DAC (which I understand is rather simple) because of sound quality concerns given the Arduino's capabilities. Hence, I would like to send a digital signal of the sound to an external USB DAC/headphone amplifier to drive my headphones. This would have the advantage of making sound level calibration possible by adjusting the volume of the external DAC/amplifier.
Would anyone have a suggestion as to how to proceed?
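For reference, the sizes quoted above can be checked with a little arithmetic. This is only a sketch: the function name `pcm_size_bytes` is made up for illustration, and it counts raw sample data only (a canonical WAV header would add 44 bytes).

```python
def pcm_size_bytes(duration_s, sample_rate_hz, bits_per_sample, channels):
    """Size in bytes of raw (headerless) PCM audio."""
    n_frames = round(duration_s * sample_rate_hz)  # samples per channel
    return n_frames * channels * bits_per_sample // 8


# A 50 ms sound at 44100 Hz, 16 bits, as described in the question.
mono = pcm_size_bytes(0.050, 44100, 16, 1)
stereo = pcm_size_bytes(0.050, 44100, 16, 2)
print(mono, stereo)  # 4410 8820
```

So roughly 8 kB per sound matches a stereo file; a mono version would be about half that. Either way, three such sounds are small enough to fit in the flash of most sound modules.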
You could use a 32-bit Arduino or clone that can provide 16-bit output quality from PWM. The only issue I can think of with using PWM as a DAC is that you need a higher level of oversampling to get CD-quality audio. We're talking 1980s technology here.
I am using this I2S DAC for a Bluetooth audio streaming receiver and it gives excellent audio quality. The Arduino that I'm using is ESP32-based, which has I2S digital output built in, but there may be Arduino libraries to bit-bang I2S from I/O pins. Not sure.
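To put rough numbers on the PWM trade-off, here is a minimal sketch. It assumes the simplest case, a single counter-based PWM channel whose timer runs at the CPU clock, so the carrier frequency is the clock divided by 2^bits. The function names are made up for illustration, and real boards may offer faster timers or tricks such as combining two PWM outputs.

```python
def pwm_carrier_hz(timer_clock_hz, resolution_bits):
    """Carrier frequency of a simple counter-based PWM at a given resolution."""
    return timer_clock_hz / (2 ** resolution_bits)


def clock_needed_hz(carrier_hz, resolution_bits):
    """Timer clock needed for one PWM period per audio sample."""
    return carrier_hz * (2 ** resolution_bits)


print(pwm_carrier_hz(16_000_000, 8))   # 62500.0 Hz on a 16 MHz board at 8 bits
print(pwm_carrier_hz(16_000_000, 16))  # 244.140625 Hz at 16 bits
print(clock_needed_hz(44_100, 16))     # 2890137600 Hz for 16 bits at 44.1 kHz
```

Under that assumption, 8-bit PWM comfortably outruns the audio sample rate on a 16 MHz board, while plain 16-bit PWM at 44.1 kHz would need a timer clock of almost 2.9 GHz.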
[edit]
It occurs to me that if these sounds aren't being generated on the fly, then it may be simpler to use an MP3 module to play back prerecorded audio. Would that work?
Thanks for the reply! The sounds are indeed not generated on the fly but pregenerated as WAV files. I stumbled upon this: the Adafruit Audio FX Sound Board - WAV/OGG Trigger with 2MB Flash (ID 2133, $19.95, Adafruit Industries). I think something like that could work. I could output it to a headphone amplifier and then calibrate my headphones' output level by adjusting the amplifier volume. Could the Arduino take the TTL trigger as input and transform it based on pre-established rules (a sound sequence, for instance) so that it sends the appropriate trigger to the Adafruit sound board? Could the processing delay be estimated? I must stay at around a millisecond!
The problem with that is, the devices to make the Arduino into a working USB host are awkward. The RPI is a more plug and play platform for what you describe because it can easily support headphones and other audio devices like that.
You can use the Arduino with a sound player module as you describe. I see no reason why it wouldn't work. The processing delay would be measured in microseconds not milliseconds.
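The "pre-established rule" part is small. As an illustration only, here is the logic for stepping through a fixed sound sequence, one sound per incoming trigger, written in Python for readability (on the Arduino it would be ported to C++). The class name, the sound labels, and the pin numbers are all hypothetical.

```python
class TriggerSequencer:
    """Returns the next sound in a fixed, repeating sequence on each trigger."""

    def __init__(self, sequence):
        if not sequence:
            raise ValueError("sequence must contain at least one sound")
        self._sequence = list(sequence)
        self._position = 0

    def on_trigger(self):
        sound = self._sequence[self._position]
        # Advance and wrap around at the end of the sequence.
        self._position = (self._position + 1) % len(self._sequence)
        return sound


# Hypothetical mapping from sound label to the output pin wired to the player.
SOUND_PINS = {"A": 2, "B": 3, "C": 4}

sequencer = TriggerSequencer(["A", "B", "C"])
pins = [SOUND_PINS[sequencer.on_trigger()] for _ in range(4)]
print(pins)  # [2, 3, 4, 2]
```

A lookup like this is only a handful of instructions, which is why the Arduino side of the delay is small. Whatever latency the sound player itself adds is a separate question.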
The Adafruit board might be the way to go. Or there are other "audio shields" for the Arduino. The audio shield has a DAC, the clock for the DAC, a slot for a memory card to hold the audio file, etc., and the Arduino just acts as a controller to tell it to start/stop, what file to play, and maybe to adjust the volume.
That won't work... USB soundcards & USB audio interfaces need a driver and the driver runs on Windows, OSX, or Linux.
Adafruit has pretty good support: you can ask them. For just about any other audio player, you'll have to get one and test it. It's not likely to be something they already specify. There's going to be overhead with responding to the command, reading the filesystem and then playing back the audio.
For most applications, a few tens of milliseconds of response time wouldn't be a problem, so I would not assume that the delay is a millisecond or less.
It's perhaps not useful due to the complexity, but this might not be completely hopeless as some devices use a generic driver, such that it is just plug and play with no driver installation. You could potentially dig into the Linux source code and adapt that for the Arduino. But it's a mammoth project.
It is 8-bit sound, but have you ever listened to 8-bit sound? It is not as bad as some people would have you believe. This is the sort of result you can get.
Why are you interested in the delay?
Have you had much practical experience of doing stuff with sound?
What do you actually want your project to do overall? Because it might be an X-Y problem.
I am sure. There is not enough processing power in an Arduino Uno to bit bang the I2S protocol.
Thanks for your reply. I have played my fair share on a NES so yeah, I know 8 bit. Indeed, 8 bit would be OK. The most important thing is the delay, since I am recording an EEG signal time-locked to the presentation of the audio signal. The sound must also play with the shortest possible delay after the trigger input, because I am interested in the precise timing of that trigger relative to the sound.
I'm not sure a microcontroller is necessary for these requirements. I would recommend simply feeding the audio voltage (from the PC or an external sound card) to both the headphones and the ADC that you are using to record the EEG. Then whatever jitter and delays are introduced by having the sound generated on a PC can be accounted for, because you have an exact record of when the audio reached the subject's headphones. You don't even need TTL pulses.
That is totally true, BUT the microcontroller must be present to accept the input trigger, which is responsible for playing the sound. Without this constraint, your solution would be totally right.
We perform very similar experiments and the audio is generated on a PC. The sound is then output using the headphone output of the PC and sent to both the headphones and the EEG ADC. No triggering is required, as the signals are then inherently time-locked. I must be missing something in your experimental paradigm.
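The loop-back approach described above can be sketched as follows: once a copy of the audio is recorded on a spare ADC channel, the sound onset can be found offline and compared with the trigger sample. This is a minimal sketch with a plain amplitude threshold. The function names, the threshold, and the 1000 Hz sampling rate are assumptions for illustration, and the timing resolution is limited to one ADC sample.

```python
def first_onset_index(samples, threshold):
    """Index of the first sample whose magnitude exceeds threshold, or None."""
    for i, value in enumerate(samples):
        if abs(value) > threshold:
            return i
    return None


def onset_latency_ms(audio_channel, trigger_index, sample_rate_hz, threshold):
    """Delay from the trigger sample to the first audio sample above threshold."""
    onset = first_onset_index(audio_channel[trigger_index:], threshold)
    if onset is None:
        return None  # no sound found after the trigger
    return 1000.0 * onset / sample_rate_hz


# Toy recording: silence, then a short burst starting at sample 13.
recording = [0.0] * 13 + [0.5, -0.4, 0.6] + [0.0] * 4
latency = onset_latency_ms(recording, trigger_index=10,
                           sample_rate_hz=1000, threshold=0.1)
print(latency)  # 3.0
```

Running this for every trial gives the actual trigger-to-sound delay and its jitter, whichever device ends up playing the sound.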
Almost everyone has. 1980s pop rock hits were full of Yamaha DX-7 backings. Nobody cared that it was 8-bit sound.
I went to the music store to check one out, and at that level, I could hear the scratchiness. That's when I learned that it was 8 bit. Sales person winced and admitted it.
The math used for the linear predictive coding approach was very sophisticated, but the speech model was not, which explains why that chip never went anywhere.
They also produced the much more sophisticated SP1000 speech recognition and synthesis chip, which has equally bad sound and poor recognition performance, as demonstrated by this excellent but long video. Hearsay 1000 voice synthesis and recognition for the C64 - YouTube
(Great historical introduction, though, so it is worth watching the entire thing)
Yes, I understand this. The thing is that the audio sequence is composed of different sounds, which are presented under different conditions. Some sounds are simply presented according to a predetermined sequence, and other sounds are presented in response to an external trigger sent to the stimulus presenter from a BNC port. There is a certain amount of looping in the script to make it happen. The microcontroller's job is simply to present the trigger to the sound stimulus presenter. Once the sound is generated then, yes, we will do it as you say. Also, the system has to implement some sort of motor response loop to modify the presentation delay of some sounds relative to others. The latter part is called a Stop-Signal Task.
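For the delay-adjustment loop, one common choice in stop-signal tasks is a one-up/one-down staircase on the stop-signal delay (SSD). The post above does not say which rule is used, so the rule here is an assumption, as are the 50 ms step, the limits, and the function name.

```python
def next_ssd_ms(ssd_ms, stopped, step_ms=50, min_ms=0, max_ms=1000):
    """One-up/one-down update of the stop-signal delay after a stop trial.

    A successful stop makes the next trial harder (longer delay);
    a failed stop makes it easier (shorter delay).
    """
    new_ssd = ssd_ms + step_ms if stopped else ssd_ms - step_ms
    return max(min_ms, min(max_ms, new_ssd))  # clamp to the allowed range


# Hypothetical run: start at 250 ms, then stop, stop, fail.
ssd = 250
history = []
for stopped in (True, True, False):
    ssd = next_ssd_ms(ssd, stopped)
    history.append(ssd)
print(history)  # [300, 350, 300]
```

The update itself costs nothing in terms of timing. What matters for the 1 ms requirement is how accurately the chosen delay is then realized between the two sounds.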