I'm sure that'll "round off" the square wave, but I don't think that will center it; in other words, you'll get audio output from approx 0-5 volts, not -2.5 to +2.5...?
Sure it will. A key characteristic of a capacitor is that it blocks DC voltages but passes AC voltages. Look at most common multi-stage audio amplifier circuits that run from a single-polarity DC supply. They use capacitive coupling to isolate the DC bias of each stage while allowing the audio (AC) to pass to the next stage for amplification.
There are audio amplifiers using fully complementary circuits and dual-polarity supplies that don't require capacitive coupling, but they are more complex and expensive. Classic op-amps are DC coupled internally, which is why an external input offset voltage is sometimes required to remove or change the offset of the applied input signal.
Simple two-winding transformers are also used to isolate the high-voltage DC bias but pass audio (AC) in classic single-tube class A amplifiers.
As long as the final load, the speaker in this case, has one lead referenced to ground (circuit common), the other lead will 'see' a true 'centered' waveform with no DC bias or offset. It's no different when sending a 0-5 V square wave from an output pin through a series capacitor to a grounded load. It won't be a sine wave, but it will be 'centered'.
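If you'd rather see it than take my word for it, here is a rough Python sketch that models the series capacitor and grounded load as a simple first-order RC high-pass filter. The component values (1 kHz square wave at 50% duty, 1 kΩ load, 100 µF capacitor) and the function name are my own assumptions for illustration, not anything from your circuit, and a real speaker is not a pure resistor.

```python
def simulate_ac_coupling(v_high=5.0, freq_hz=1000.0, r_ohms=1000.0,
                         c_farads=100e-6, dt=1e-5, duration_s=1.0):
    """Pass a 0..v_high square wave through a series capacitor into a
    grounded resistive load and return the voltage across the load.
    All default values are assumed, illustrative numbers."""
    rc = r_ohms * c_farads
    alpha = rc / (rc + dt)                      # discrete high-pass coefficient
    samples_per_period = round(1.0 / (freq_hz * dt))
    half = samples_per_period // 2              # 50% duty cycle
    n_steps = round(duration_s / dt)

    out = []
    prev_in = 0.0
    prev_out = 0.0
    for n in range(n_steps):
        # 0-5 V square wave, as it would come from an output pin
        v_in = v_high if (n % samples_per_period) < half else 0.0
        # The capacitor passes changes in the input; the load slowly
        # bleeds off any steady (DC) level.
        v_out = alpha * (prev_out + v_in - prev_in)
        out.append(v_out)
        prev_in, prev_out = v_in, v_out
    return out, samples_per_period


load_v, period = simulate_ac_coupling()
tail = load_v[-10 * period:]    # last 10 full cycles, after the start-up transient
mean_v = sum(tail) / len(tail)
print(f"mean = {mean_v:+.3f} V, min = {min(tail):+.2f} V, max = {max(tail):+.2f} V")
```

With these assumed values the voltage across the load should settle to an average of about zero and swing roughly between -2.5 V and +2.5 V. A duty cycle other than 50% would still average to zero, but the positive and negative peaks would no longer be equal.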
Lefty