The main issue with an Ethernet streaming MP3 player is buffering the data packets. The AVR has only 2K of RAM, which is not enough to buffer the TCP/IP packets from the Ethernet controller.
The VS1053 has 2K of internal memory dedicated to buffering incoming data. I wonder if that would be enough of a buffer, so I'm going to try this out. At 160 kbps, that buffer holds about 100 ms of audio, so that's how long you've got to refill it with data over the network. In theory, that should be plenty of time. It will be interesting to try!
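
For anyone who wants to check the 100 ms figure, here is a quick back-of-the-envelope sketch. It assumes "2K" means 2048 bytes and that the stream is constant bitrate; the function name `buffer_duration_ms` and the extra bitrates are just made up for illustration.

```python
def buffer_duration_ms(buffer_bytes: int, bitrate_bps: int) -> float:
    """Milliseconds of audio that a buffer holds at a constant bitrate."""
    bits_in_buffer = buffer_bytes * 8
    return bits_in_buffer * 1000 / bitrate_bps


# Assumption: the "2K" buffer is 2048 bytes.
VS1053_BUFFER_BYTES = 2048

# Print how long a full buffer lasts at a few common MP3 bitrates.
for kbps in (128, 160, 320):
    ms = buffer_duration_ms(VS1053_BUFFER_BYTES, kbps * 1000)
    print(f"{kbps} kbps: {ms:.1f} ms")
# 128 kbps: 128.0 ms
# 160 kbps: 102.4 ms
# 320 kbps: 51.2 ms
```

So the margin shrinks as the bitrate goes up: at 320 kbps a full buffer only lasts about 51 ms, which is worth keeping in mind if the stream isn't limited to 160 kbps.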