Another optimization question - can I speed up this 32 bit multiply?

robtillaart:

// dh, dl -> 12 bit tmp:
uint32_t tmp = (dh << 4) | (dl >> 4);
tmp = (tmp * playing->volume) >> 10; // Volume is 10 bit, so 0..1023. 4095*1023 / 1024 = 4091 max.

You could try to keep it 16-bit:

uint16_t tmp = dh >> 2; // top 6 bits of the sample, i.e. ((dh << 4) | (dl >> 4)) >> 6: do the shift first
tmp = tmp * playing->volume; // 6 bit * 10 bit fits in 16 bits (max 63 * 1023 = 64449)

It may have fewer distinct values, but maybe it does the work well enough?
It should definitely be faster :slight_smile:

I'm afraid I need a lot better than 6-bit audio quality. And I really don't want to go lower than 12-bit.
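
One possible middle ground, sketched here as an assumption rather than something proposed in the thread: split the 12-bit sample into two 6-bit halves, so that each partial product with the 10-bit volume still fits in 16 bits (63 * 1023 = 64449). The function name `scale_sample` is hypothetical; `dh`, `dl` and `volume` are meant as in the code above. The result should match `(tmp * volume) >> 10` exactly, but whether two 16-bit multiplies actually beat one 32-bit multiply depends on the compiler and target, so it needs to be measured.

```c
#include <stdint.h>

// Hypothetical helper (not from the thread): scale a 12-bit sample by a
// 10-bit volume (0..1023) using only 16-bit intermediates.
// The sample is (dh << 4) | (dl >> 4), as in the original code.
static inline uint16_t scale_sample(uint8_t dh, uint8_t dl, uint16_t volume)
{
    uint8_t hi = dh >> 2;                                    // top 6 bits of the sample
    uint8_t lo = (uint8_t)(((dh & 0x03) << 4) | (dl >> 4));  // low 6 bits of the sample

    uint16_t a = (uint16_t)(hi * volume);  // max 63 * 1023 = 64449, fits in 16 bits
    uint16_t b = (uint16_t)(lo * volume);  // max 63 * 1023 = 64449, fits in 16 bits

    // sample * volume = 64 * a + b, so (sample * volume) >> 10
    // equals (a + (b >> 6)) >> 4. The sum is at most 65456, no overflow.
    return (uint16_t)((uint16_t)(a + (b >> 6)) >> 4);
}
```

With `dh = 0xFF`, `dl = 0xF0` (sample 4095) and `volume = 1023` this gives 4091, the same maximum as the 32-bit version.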