I'm pretty sure that can't be done with the Arduino (alone), and certainly not on a "basic" Arduino like the Uno or Mega.
It's normally done by a powerful server (or multiple powerful computers) somewhere "in the cloud".
...Your iPhone doesn't understand English (or whatever language you speak) but Siri (in the cloud) does.
I've seen little self-contained gizmos that can translate text. They are probably limited to common phrases and single words.
It MIGHT be possible on a computer. The most difficult part is "universal" speech recognition. It's not as difficult if it only has to recognize a limited vocabulary of single words like "yes", "no", or numbers. Understanding spoken language in context is a lot more difficult.
You can get programs like Dragon for speech recognition (speech-to-text). Text-to-speech is built into Windows. The hard part is the actual translation in between.
There are text translation applications like DeepL, but I think they require the Internet, with the "work" done on servers.
Welcome to the Arduino forum.
You seem to be a real beginner. The project idea of making a language translator is really not a beginner project.
Here is a visual description of why.
At the moment, with almost zero knowledge about software development, you might imagine something like this:
"I do not know much (yet), it is like this."
"Can somebody show me how to modify it to be like this?"
This is what you imagine at the moment.
The reality, though, is:
even this is faaaaar away from language recognition / translation.
Even once you have learned programming at this level,
you are still 2 to 5 years of software development by a TEAM of specialists on HIGH-END computers (5 GHz, 128 GB RAM) away.
Compare that to an Arduino Uno: 0.000002 GB RAM at 0.016 GHz clock frequency
or in smaller units
Arduino Uno 2 kB RAM, high-end computer 134,217,728 kB RAM
Arduino Uno 16 MHz 8 bit, high-end computer 5000 MHz 64 bit
As a very rough throughput figure (clock * cores * word width, not a real MIPS benchmark): Arduino Uno single core 16 MHz * 8 bit = 128, high-end computer 5000 MHz * 10 cores * 64 bit = 3,200,000
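If you want to check the comparison yourself, here is a small sketch of the arithmetic. It assumes 1 GB = 1024 * 1024 kB and uses the same crude "clock * cores * word width" figure as above, which is only a back-of-the-envelope number and not a measured benchmark. All variable names are made up for this example.

```python
# Rough Uno vs. high-end PC comparison (back-of-the-envelope only).

UNO_RAM_KB = 2                    # Arduino Uno: 2 kB SRAM
PC_RAM_KB = 128 * 1024 * 1024     # 128 GB expressed in kB (assuming 1 GB = 1024 * 1024 kB)

ram_ratio = PC_RAM_KB // UNO_RAM_KB

# Crude throughput figure: clock (MHz) * cores * word width (bits).
# This is NOT real MIPS, just a way to show the order of magnitude.
uno_figure = 16 * 1 * 8
pc_figure = 5000 * 10 * 64
throughput_ratio = pc_figure // uno_figure

print(f"PC RAM in kB:        {PC_RAM_KB:,}")
print(f"RAM ratio PC / Uno:  {ram_ratio:,}")
print(f"Uno rough figure:    {uno_figure:,}")
print(f"PC rough figure:     {pc_figure:,}")
print(f"Throughput ratio:    {throughput_ratio:,}")
```

Whatever way you count it, the gap is several orders of magnitude.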
Do you understand why your idea is great but unachievable on a microcontroller?
I think the best you could do is digitize the speech with a CODEC (coder/decoder IC), upload it to the cloud for translation, then bring the result back down through a CODEC to a speaker of some type. The Arduino, or more likely one of the ESP devices, would do the control work for you, while the cloud does the actual translation. This link will give you an idea of what is available. This would be a difficult project even for somebody with experience, so I would recommend you pick another project.
Recording audio in good enough quality for an online service to do voice recognition needs a good amount of computing power for the analog-to-digital conversion and a good amount of RAM. Even if you record just 20 seconds, it will be multiple megabytes if stored as uncompressed WAV data at CD quality.
Alternatively, you need far more computing power to compress the audio into MP3 or OGG format, or you need specialised encoder chips that are developed for the specific purpose of MP3 compression/encoding.
All of these are specialties that are not very common on microcontrollers.
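To put numbers on the WAV size, here is a minimal sketch of the calculation. The two formats (CD quality, and 16 kHz mono as a typical speech setting) are my assumptions for illustration, and the function name is made up. The 44-byte WAV header is ignored because it does not matter at this scale.

```python
# Size of uncompressed PCM audio: seconds * sample rate * bytes per sample * channels.

def pcm_size_bytes(seconds, sample_rate_hz, bits_per_sample, channels):
    """Return the raw PCM payload size in bytes (WAV header not included)."""
    return seconds * sample_rate_hz * (bits_per_sample // 8) * channels

UNO_RAM_BYTES = 2 * 1024  # Arduino Uno: 2 kB SRAM

cd_quality = pcm_size_bytes(20, 44_100, 16, 2)   # assumed: 44.1 kHz, 16 bit, stereo
speech_mono = pcm_size_bytes(20, 16_000, 16, 1)  # assumed: 16 kHz, 16 bit, mono

print(f"20 s CD quality:   {cd_quality:,} bytes")
print(f"20 s 16 kHz mono:  {speech_mono:,} bytes")
print(f"16 kHz mono is {speech_mono / UNO_RAM_BYTES:.1f} times the Uno's RAM")
```

Even the reduced speech format is hundreds of times larger than the Uno's entire RAM, so the audio would have to be streamed out in small chunks rather than buffered.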
Things will become easier if you change to a Raspberry Pi 4
(not a Raspberry Pi Pico, and not a Raspberry Pi Zero).
Read that carefully: Raspberry Pi 4, "Raspberry Pi FOUR".
A Raspberry Pi 4
has enough computing power to use a USB microphone for audio recording
and offers everything else you need, like a high-speed internet connection and the ability to run Python, Java, etc.,
for using online cloud services that do the voice recognition and translation.
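As a rough idea of what the Python side on a Raspberry Pi 4 could look like, here is a sketch that packs 16-bit samples into WAV bytes using only the standard library. It is not a complete recorder: the sine tone stands in for real microphone input, the 16 kHz mono format is an assumption, and the upload step is left out because it depends entirely on which cloud service you choose. The function names are made up for this example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # assumed sample rate; check what your chosen service expects

def tone_samples(seconds, freq_hz=440.0):
    """Stand-in for microphone input: a sine tone as signed 16-bit samples."""
    count = int(seconds * SAMPLE_RATE)
    return [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]

def to_wav_bytes(samples):
    """Wrap signed 16-bit mono samples in a WAV container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes = 16 bit
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()

payload = to_wav_bytes(tone_samples(1.0))
print(len(payload), "bytes ready to send to a speech service")
```

On a Pi this runs without trouble. On an Uno, the one second of audio alone would not fit in RAM.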
Hey man! Did you develop it? I want to use my Python program, which uses various Python libraries, to translate real-time speech into another spoken language. Can you guide me a little if you know Arduino? I can help you with the software/coding part.