A ragged spy device

Detecting sound is easy.
Differentiating conversation from the background noise, not so much.
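
To show the easy half, here is a rough sketch of flagging "something louder than the background". It does not recognise speech at all. The `NoiseGate` name, the smoothing factor and the ratio are all my own assumptions, and `level` stands for whatever loudness number you get from your mic (for example a peak-to-peak ADC reading).

```cpp
// Hypothetical helper: tracks a slow-moving background level and
// reports readings that are clearly louder than it.
struct NoiseGate {
    float floorLevel;  // running estimate of the background loudness
    float alpha;       // smoothing factor, 0..1 (smaller = slower to adapt)
    float ratio;       // how many times the background counts as "loud"

    // Feed one loudness reading; returns true if it stands out.
    bool update(float level) {
        bool loud = level > floorLevel * ratio;
        if (!loud) {
            // Only quiet readings are allowed to move the background estimate.
            floorLevel += alpha * (level - floorLevel);
        }
        return loud;
    }
};
```

You would seed it with a reading taken in a quiet room, something like `NoiseGate gate{10.0f, 0.1f, 2.0f};`, then call `gate.update(level)` on every new reading. A slammed door will trip it just as well as a conversation, which is exactly the hard part.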

Identifying the direction of the loudest noise is doable, probably with a rotating directional mic.
Turning a vehicle to move in that direction, sure.
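
As a minimal sketch of the direction-finding step, assuming you sweep the mic through a few angles and store one loudness reading per angle, picking the heading is just a search for the maximum. The `Reading` struct and `loudestAngle` function are names I made up for illustration, and how you actually measure `level` is left open.

```cpp
#include <stddef.h>

// Hypothetical sample: one loudness reading taken at one mic angle.
struct Reading {
    int angleDeg;  // mic angle relative to the vehicle's forward axis
    int level;     // loudness at that angle, e.g. peak-to-peak ADC counts
};

// Returns the angle of the loudest reading in the sweep,
// or 0 (straight ahead) if the sweep is empty.
int loudestAngle(const Reading* sweep, size_t count) {
    int bestAngle = 0;
    int bestLevel = -1;  // assumes levels are never negative
    for (size_t i = 0; i < count; ++i) {
        if (sweep[i].level > bestLevel) {
            bestLevel = sweep[i].level;
            bestAngle = sweep[i].angleDeg;
        }
    }
    return bestAngle;
}
```

It uses a plain array rather than a container so the same function should compile in an Arduino sketch as well as on a PC.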

Object avoidance opens up a large subject. I will leave it to you to search for any of the zillion projects where people have tried to do this, some with more success than others.

Recording sound for some time is a bigger problem. The Arduino is not much of a data processor.
Do you need it recorded? Would turning on a transmitter be enough?

How will your vehicle know when to stop moving? If it simply moves towards the loudest noise and avoids objects, won't it drive right up to the people having the conversation? This does not appear to meet your goal in the title.
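
One possible answer, purely as a sketch: treat loudness as a crude stand-in for distance and stop once the sound is "loud enough". Everything here is an assumption on my part: the `Action` names, the convention that negative angles mean left, and the `stopLevel` and `deadbandDeg` values, which you would have to find by experiment.

```cpp
// Hypothetical set of things the vehicle can do next.
enum class Action { Stop, TurnLeft, TurnRight, Forward };

// Decide the next move from the loudest direction and its loudness.
// angleDeg:    direction of the loudest sound, negative = left (assumed)
// level:       loudness in that direction
// stopLevel:   loudness at which we consider ourselves close enough
// deadbandDeg: how far off-centre the sound may be before we bother turning
Action nextAction(int angleDeg, int level, int stopLevel, int deadbandDeg) {
    if (level >= stopLevel) {
        return Action::Stop;  // close enough, do not drive up to the talkers
    }
    if (angleDeg < -deadbandDeg) {
        return Action::TurnLeft;
    }
    if (angleDeg > deadbandDeg) {
        return Action::TurnRight;
    }
    return Action::Forward;
}
```

Loudness is a poor distance measure, since a shout from far away looks the same as a whisper up close, so take this as a starting point only.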