The problem is that you are clearing the sleep enable (SE) bit when you set the BODS and BODSE bits, because you are using a plain assignment rather than a bitwise OR. In this case, however, you have to use an assignment, because the second write must also clear BODSE.
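To make the effect visible, here is a minimal sketch that runs on a PC rather than on the chip. It models the register as a plain byte and compares the original order with setting SE in the final write. The `SIM_*` bit positions and the `sim_*` function names are assumptions made up for this illustration; take the real bit positions from your device header.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only.
   Use the definitions from your device header on real hardware. */
enum { SIM_BODSE = 2, SIM_SE = 5, SIM_BODS = 7 };

/* Roughly what sleep_enable() amounts to: OR the SE bit into the register. */
static uint8_t sim_sleep_enable(uint8_t reg) {
    return (uint8_t)(reg | (1u << SIM_SE));
}

/* Original order: SE is set first, then two plain assignments overwrite it. */
static uint8_t sim_original_order(uint8_t reg) {
    reg = sim_sleep_enable(reg);
    reg = (uint8_t)((1u << SIM_BODS) | (1u << SIM_BODSE)); /* SE is lost here */
    reg = (uint8_t)((1u << SIM_BODS) | (0u << SIM_BODSE));
    return reg;
}

/* SE is written together with BODS in the final assignment, so it survives. */
static uint8_t sim_se_in_final_write(uint8_t reg) {
    reg = (uint8_t)((1u << SIM_BODS) | (1u << SIM_BODSE));
    reg = (uint8_t)((1u << SIM_SE) | (1u << SIM_BODS) | (0u << SIM_BODSE));
    return reg;
}
```

Starting from a register value of 0, `sim_original_order` ends with only BODS set, while `sim_se_in_final_write` ends with both BODS and SE set.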
You need to replace this code:
sleep_enable();
MCUCR = (1<<BODS) | (1<<BODSE); // set both BODS and BODSE to 1
MCUCR = (1<<BODS) | (0<<BODSE); // set BODS to 1 and BODSE to 0
with one of these two options:
(1)
MCUCR = (1<<BODS) | (1<<BODSE); // set both BODS and BODSE to 1
MCUCR = (1<<BODS) | (0<<BODSE); // set BODS to 1 and BODSE to 0
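sleep_enable(); // sleep_cpu() must still follow within 3 clock cycles. This only works if sleep_enable() compiles to a single sbi instruction; sbi only reaches the lowest 32 I/O addresses, so if MCUCR lies above that range this becomes a slower read-modify-write and option (2) is the safer choice.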
(2)
MCUCR = (1<<BODS) | (1<<BODSE); // set both BODS and BODSE to 1
MCUCR = (1<<SE) | (1<<BODS) | (0<<BODSE); // set BODS to 1 and BODSE to 0, set the sleep enable bit at the same time.
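Note that plain assignments also overwrite every other bit in the register. If any of those bits matter on your part, one possible variation is to compute both values from the current register contents before the timed sequence starts, so the two writes stay back to back. This is only a sketch under assumptions: `demo_mcucr` is a hypothetical stand-in variable so the code compiles on a PC, and the `DEMO_*` bit positions are illustrative, not taken from a specific datasheet.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
enum { DEMO_BODSE = 2, DEMO_SE = 5, DEMO_BODS = 7 };

/* Hypothetical stand-in for the hardware register. */
static volatile uint8_t demo_mcucr;

static void demo_disable_bod_keep_bits(void) {
    /* Work out both values up front, starting from the current contents. */
    uint8_t first = (uint8_t)(demo_mcucr | (1u << DEMO_BODS)
                                         | (1u << DEMO_BODSE)
                                         | (1u << DEMO_SE));
    uint8_t second = (uint8_t)(first & ~(1u << DEMO_BODSE));

    demo_mcucr = first;  /* BODS = 1, BODSE = 1 */
    demo_mcucr = second; /* BODS = 1, BODSE = 0, SE and other bits kept */
    /* On hardware, sleep_cpu() would follow immediately after this. */
}
```

On the real device you would write to MCUCR instead of `demo_mcucr` and check the generated code to confirm the two writes and sleep_cpu() fit the timing window.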