Much ado about nothing new. The Wikipedia article on "floating point" describes a number of floating point calculation anomalies, and has this quote:
The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations.
Every computer science text I've ever seen has warned about expecting floating point calculations to be absolutely precise. It may surprise you to know it, but you're not the first to uncover an issue like this one.
Speaking qualitatively, you're defining a value whose precision is beyond the capacity of 32-bit floating point to represent. From http://arduino.cc/it/Reference/Float:
Floats have only 6-7 decimal digits of precision. That means the total number of digits, not the number to the right of the decimal point.
You're asking it to give you meaningful results to 8 decimal digits of precision. This platform wasn't built for that, and its developers tell you so quite clearly.
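If you want to see that limit for yourself, here is a small sketch you can run on a desktop machine. It's Python rather than Arduino code, and it uses the standard struct module to round values to 32-bit IEEE 754 floats; the helper names f32_bits and f32_from_bits are my own. It shows that 8.3199996 and 8.32 land on the same 32-bit pattern, and how far apart the neighbouring representable values are.

```python
import struct

def f32_bits(x):
    """Bit pattern of x after rounding to a 32-bit IEEE 754 float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def f32_from_bits(bits):
    """Value of the 32-bit float that has the given bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

a = f32_bits(8.3199996)
b = f32_bits(8.32)
print(hex(a), hex(b))   # both print 0x41051eb8

# The representable values one step below and one step above.
below = f32_from_bits(a - 1)
here = f32_from_bits(a)
above = f32_from_bits(a + 1)
print(repr(below), repr(here), repr(above))

# Spacing between adjacent 32-bit floats in the range 8 to 16: 2**-20.
print(above - here)
```

The spacing near 8.32 is about 0.00000095, so the seventh digit after the decimal point is already below what a 32-bit float can resolve there.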
Quantitatively, here's what happens: 8.3199996, as a 32-bit FP number, has sign bit 0, exponent 3, offset exponent 82H, mantissa 851EB8H - 051EB8H with the MSB suppressed - and an FP representation of 41051EB8H. That representation isn't exact, and every number between 8.3199993 and 8.3200001 has the same representation.
1000 has sign bit 0, exponent 9, offset exponent 88H, mantissa FA0000H - 7A0000H with the MSB suppressed - and an FP representation of 447A0000H. Because 1000 is a reasonably-sized integer, the representation is exact.
Multiplying them yields sign bit 0, exponent 12, and a 25-bit mantissa of 103FFFFH with 011B following. That product overflows the 24-bit mantissa space, so the exponent bumps to 13, offset exponent 8CH, and the mantissa shifts to 81FFFFH with 1011B following. That rounds up to 820000H, and results in an FP representation of 46020000H, which is the 32-bit FP representation of 8320, exactly.
floor(8320) is, well, 8320. Dividing 8320 by 1000 yields 8.320, and its 32-bit FP representation is, as described above, 41051EB8H - identical to the representation of 8.3199996. When you ask for that number with seven digits after the decimal, you get what you got, and it's correct within the well-known limits of floating point.
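The walk-through above can be checked with a short desktop sketch. Again this is Python, not Arduino code: it emulates 32-bit arithmetic by doing each operation in double precision and rounding the result to a 32-bit float with struct. I'm assuming that matches what the AVR floating point library produces for these particular values; the helper names to_f32, f32_bits and fields are mine.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_bits(x):
    """IEEE 754 single-precision bit pattern of x, as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def fields(x):
    """Split x into (sign, offset exponent, stored mantissa bits)."""
    bits = f32_bits(x)
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

x = to_f32(8.3199996)
k = to_f32(1000.0)

# The product of two 24-bit mantissas fits in a double exactly, so rounding
# it once to 32 bits behaves like a correctly rounded 32-bit multiply.
product = to_f32(x * k)
floored = to_f32(float(math.floor(product)))
result = to_f32(floored / k)

for name, value in [("x", x), ("1000", k), ("product", product), ("result", result)]:
    sign, exp, frac = fields(value)
    print(f"{name:8s} {f32_bits(value):08X}H  "
          f"sign={sign} exp={exp:02X}H mantissa={frac:06X}H")
```

The product line should show 46020000H (8320 exactly), and the result line should show the same 41051EB8H as x.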
So, you got exactly the result that you could expect with a 32-bit floating point engine. There are no optimization quirks, no bugs, no problems with the compiler, and everything is kosher.
I'm well aware of how computers do arithmetic ...
That's great, because it means that you can work out for yourself exactly how this result is calculated, and quantitatively demonstrate in this forum what, if anything, is wrong.
Now I don't know much AVR assembly language, but this *surely* doesn't look like floor() is being called... let's not even talk about the multiplication and division operations being performed!!!
We can't tell what code led to the *.s files you quoted, but, assuming that it's the code from your original post, there are some good reasons why the program wouldn't call floor(), or do any other arithmetic. You define x, and never change it. The optimizer might well decide, correctly, that x makes more sense as a constant. That would make x*1000.0, floor(x*1000.0), and floor(x*1000.0)/1000.0 into constants as well. It's likely that the compiler did the math itself, and just plugged the results - in 32-bit floating point - into the output code. For this program, that's valid optimization, without fault.
Summarizing: You gave the program a number - 8.3199996 - that it finds indistinguishable from 8.32, asked it to distinguish between them, and complained when it couldn't tell the difference. If you need to reliably discern differences between numbers that differ in the eighth significant decimal digit, you've selected the wrong platform. The Arduino does other things very well, but it doesn't claim to be a floating point calculation engine - in fact, it claims that it's not. If you have to get exact results using floor() for every possible number, then floating point isn't your vehicle either - a 64-bit floating point engine will show the same kinds of anomalies, just less frequently and further downstream from the decimal point. Maybe you can buy or program something to work in BCD. The fault, dear Brutus, is not in your compiler, but in yourself.
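To illustrate the point about 64-bit floating point, here is a minimal desktop sketch in Python, whose float type is a 64-bit double. The value 4.35 is my own example, not something from this thread; it's a well-known case where scaling and then calling floor() lands one below the answer you'd get on paper.

```python
import math

x = 4.35                     # hypothetical value, not from the thread
scaled = x * 100             # 435 on paper, but 4.35 isn't exact in binary
print(repr(scaled))          # 434.99999999999994
print(math.floor(scaled))    # 434, not 435
```

More bits push the error further to the right of the decimal point, but floor() still sees a value just under the integer.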
Summarizing some of the histrionic statements that have been made in this thread:
Seriously now, this is one EGREGIOUS compiler error... HOW can one trust ANYTHING coming out of such a compiler? And more importantly, what is the solution? Working with the optimizer turned off all the time? If so, where can I purchase an additional 512KB of flash for my Arduino? ;-/
Full of sound and fury, signifying nothing.