Need some advice on building a humanoid assistant robot

Hey there,

I'm actually pretty new in this field. I'm not a robotics enthusiast, just a software engineer with a curiosity in tech, so I don't have much knowledge about robotics.

I have a project in mind though. I want to build a small "humoid" robot (probably just a small box) that can receive my voice and send to OpenAI servers through API, receive the response and read it loud. I want to use the latest GPT-4o.

I'll be needing a voice recording module and also some type of speaker I guess. I looked at this product, but I'm not sure if it's enough for this type of project: [ISD1820 Voice Recording and Playback Module - with speaker, buy at an affordable price - Direnc.net®](Voice Recording and Playback Module)

I'm aware that this project requires a solid knowledge about robotics in general, but I trust myself that with enough research everything is possible.

Do you guys have any idea or suggestion on this matter? Any advice would be super helpful.

Thanks a lot.

How can a small talking box be described as a humanoid robot?

I would suggest using a raspberry pi. Similar projects have already been done using Google Home, Amazon Alexa etc.

It's because I was planning on making a face out of cardboard haha. Thanks. Do you have any specific resources that you can share?

There's no robotics to this. There are libraries for Arduino to work with chatGPT APIs. But the hard part of this project will be the speech to text and text to speech engines. That will be close to impossible with an Arduino. I haven't seen any solution for that yet that was reliable.

There are a few speech to text modules, but they have to be trained and only recognize a few words.

https://www.reddit.com/r/RASPBERRY_PI_PROJECTS/comments/125ojua/you_can_make_a_chatgpt_virtual_assistant_using/