Your original idea that this would be a "simple" robot is slightly understated, :-).
First off, you're going to have to build a real robot, meaning doing all the basic mechanical, electronics, computer, sensor, and software aspects involved with any robot. Right there, that's probably 5X or 10X more work than most Arduino projects attempt.
Once past the basics above, then comes the real work: proper sensors and software that works correctly. This is where many robot projects fall down. So, you already see one big problem - how to recognize and follow the one correct person, when several are running around and getting in the way? No easy answers.
You have to do the project in stages, and each stage is a learning curve that leads to the next stage.
1. basic mobility platform and controller.
2. forget about speech recognition until maybe the 4th or 5th iteration of the project [if it ever works].
3. add some simple sensors that get you in the right direction, to do the basic job.
4. build from there.
I can think of several fairly easy ways to get a robot to track someone. First, a PIR sensor works if there's only 1 person, plus it's always helpful for telling whether something moving is a person. Secondly, instead of the person calling it, the robot could either track a light carried by the person, or else detect a sound made by the person. The light could be an IR LED blinking at a certain rate. The sound could come from a "clicker" - i.e., a castanet or dog clicker, or even an ultrasonic dog whistle that people cannot hear.
https://www.google.com/search?site=&tbm=isch&source=hp&biw=1163&bih=886&q=dog+clicker
https://www.google.com/search?site=&tbm=isch&source=hp&biw=1163&bih=886&q=dog+whistle
https://www.google.com/search?site=&tbm=isch&source=hp&biw=1163&bih=886&q=castanet
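To make the blinking-LED idea a bit more concrete, here's a rough sketch of how the robot might decide whether the light it sees is "its" beacon: count how often the sensor reading crosses a threshold and compare that rate to the known blink rate. Everything here is my own assumption for illustration: the function names, the threshold, the sample rate, and the tolerance are made up, and you'd have to tune them for a real sensor. It's plain C++ so you can try the logic on a PC before moving it onto the Arduino.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Estimate the blink rate (Hz) of a light from evenly spaced sensor samples
// by counting low-to-high crossings of a threshold.
// All names and parameters are hypothetical, for illustration only.
double estimateBlinkHz(const std::vector<uint16_t>& samples,
                       double sampleRateHz,
                       uint16_t threshold) {
    if (samples.size() < 2 || sampleRateHz <= 0.0) {
        return 0.0;  // not enough data to say anything
    }
    std::size_t risingEdges = 0;
    bool wasHigh = samples[0] > threshold;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        bool isHigh = samples[i] > threshold;
        if (isHigh && !wasHigh) {
            ++risingEdges;  // one rising edge per blink
        }
        wasHigh = isHigh;
    }
    double windowSeconds = samples.size() / sampleRateHz;
    return risingEdges / windowSeconds;
}

// True if the measured rate is close enough to the beacon's known rate.
bool isBeacon(double measuredHz, double beaconHz, double toleranceHz) {
    return std::fabs(measuredHz - beaconHz) <= toleranceHz;
}
```

A steady light (sunlight, a lamp) gives no edges and so a rate of zero, which is the whole point of blinking the LED: it lets the robot ignore everything that isn't the person's beacon. The same edge-counting idea would apply to a clicker if you feed it microphone samples instead.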
All in all, you just have to do it in stages, see what works, and use what you learn at each stage to go on to the next stage. In the end, this is how you do every 'new' project. Good luck.