Your Wish List

I haven't looked in detail, so I might be wrong, but I believe the ARM Cortex-M0 only supports the ARMv6-M Thumb instruction subset, which includes neither hardware divide nor floating point. The multiply unit is optionally either a 1-cycle or a 32-cycle implementation; I assume the LPC1114FN28 uses the 32-cycle variant, but check the datasheet to be sure. The LPC1114FN28 has 32 KB of flash memory. GCC does accept -mcpu=cortex-m0 to target the chip, so you should be able to write programs that use division and floating point through software emulation, just like the Arduino compiler does.

Even so, it should still be faster than the Arduino, which is only an 8-bit processor and needs multiple instructions for anything wider than 8 bits. If your algorithms are real time and use multiply, division or floating point heavily, you might want to upgrade to a Cortex-M4F. See: ARM Cortex-M - Wikipedia
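To give a feel for what "emulation" costs on a core with no divide instruction, here is a minimal sketch of a shift-and-subtract unsigned divide in C. The function name soft_udiv32 is made up for illustration, and this is not the actual routine the compiler's runtime library uses (that one is hand-tuned); it only shows the general idea that one division turns into a loop of shifts, compares and subtracts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: divide n by d using nothing
 * but shifts, compares and subtracts, one quotient bit per loop pass.
 * The remainder is stored through rem when rem is not NULL. */
uint32_t soft_udiv32(uint32_t n, uint32_t d, uint32_t *rem)
{
    assert(d != 0);                 /* division by zero is not handled */

    uint32_t q = 0;                 /* quotient, built bit by bit */
    uint32_t r = 0;                 /* running remainder */

    for (int i = 31; i >= 0; i--) {
        /* Remember the bit that is about to be shifted out of r. */
        uint32_t carry = r >> 31;

        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((n >> i) & 1u);

        /* If the (33-bit) remainder is at least d, subtract and set
         * the matching quotient bit. Unsigned wrap-around makes the
         * subtraction correct even when carry is set. */
        if (carry || r >= d) {
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }

    if (rem) {
        *rem = r;
    }
    return q;
}
```

That is 32 loop passes for a single divide, which is why divide-heavy real-time code benefits from a core with a hardware divider. In your own code you would just write n / d and let the compiler call its library routine.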