Do you have compiler warnings enabled in the IDE's File > Preferences menu?
So, you don't know the timing requirements. And you don't know what code will actually be generated by calling `memcpy`. But you've already decided that one is not good enough to satisfy the other.
Ugh.
I did not include the application here because I wanted the simplest example of the issue. This is for an ISR. I have 35 clock cycles to process 6 floats. The original code took 30 clock cycles, but did not work. Adding volatile now works and still takes 30 clock cycles. I think if I memcpy() all 6 floats it will go over budget. Anyway, thanks for all the help.
It is not a good idea to do a significant amount of processing in an ISR. PERIOD.
Hard to advise without a view to the application. Where do these floats come from? What "processing" must you do on them? Why do you need them first "cast" to uint32_t before processing them? Can the processing be off-loaded to non-ISR code?
I agree, in general. Here is some more info for people who are interested. I get IEEE 32-bit floats from a sensor, which come in randomly but at a max frequency of 450 kHz, or a min separation of 2.2 µs, which gives 35.5 clock cycles at 16 MHz. These floats are converted to a special 16-bit float format to be sent to another device.
Thanks. It is working now, so if I want to go further to test these ideas, I would probably have to go to assembly to see exactly what the compiler is doing. The casting was more for the compiler since in principle I could do the bit manipulations directly on float addresses. Thanks again for all of your help. I now understand what is going on.
(I've reached my reply limit, so I'm editing this to reply.)
It is bursts of streaming data. You are correct. We need new hardware. We put this together from stuff we had lying around. For version 2 we will definitely re-tool. I mostly asked the question because I could not parse how two memory locations with the same bit pattern could behave differently. Now I see that the compiler was making some assumption about the order of operations that tripped me up. Thanks again for your help in understanding this issue.
Continuous stream of data or bursty? If bursty, the processing could be off-loaded to non-ISR code between bursts.
Either way, my first-line, go-to solution is to throw better hardware at the problem. Hardware is cheap. Teensy 3.2 clocks at 96 MHz, Teensy 3.6 at 240 MHz, and Teensy 4.0 / 4.1 even faster.
Get real! 35 clocks is NOT "a significant amount of processing"!
Which is not enough to do ANY "processing" on those floats, other than copying them somewhere to be processed later. Almost ANY single floating point math operation will consume the entire 35 clocks and more.
So where do those 6 floats get stored until they CAN be processed? You clearly cannot do the processing in anything close to real-time, so your only option is to buffer a bunch of them, then process the buffer-ful.
Get real. The entire idea is suspect. Who needs to process floating point sensor data in an interrupt? And why is this an Arduino project?
I haven't seen anything yet that makes a lick of sense.
I have discovered the following:
- The C++ standard seems to say that casting negative floats to unsigned integers is undefined behaviour. My surprise was that with avr/gcc `volatile float` was treated differently to `float`.
- Turning off optimisation (replacing `-Os` with `-O0` in `platform.txt`) will allow the original code to work as wanted, i.e. negative floats will cast to unsigned integers. Note that I do not recommend this; it's an observation.
- Casting a negative `float` to `long` before casting to `unsigned long` probably works as wanted. Again, not really recommended. I didn't try 'big' numbers.
- There is no intrinsic harm in casting the address of a `float` to `unsigned long *` when long is the same size. Note that casting to `uint32_t *` might be more portable.
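The third point can be made well-defined with a range check. The following is only a sketch: the helper name `float_to_u32_wrapping` and the choice to return 0 for out-of-range or NaN input are assumptions, not something from the original code. It relies on two rules: a float-to-signed conversion is defined when the truncated value fits in the target type, and a signed-to-unsigned conversion is always defined (it wraps modulo 2^32).

```cpp
#include <stdint.h>

// Hypothetical helper (not from the thread): converts a float to uint32_t
// so that negative values wrap around instead of hitting undefined behaviour.
// Returning 0 for NaN or out-of-range input is an assumed policy.
uint32_t float_to_u32_wrapping(float f) {
    // float -> int32_t is only defined if the truncated value fits in int32_t.
    // Any comparison with NaN is false, so NaN also takes the early return.
    if (!(f >= -2147483648.0f && f < 2147483648.0f)) {
        return 0;
    }
    int32_t s = static_cast<int32_t>(f);  // truncates toward zero
    return static_cast<uint32_t>(s);      // modular conversion, always defined
}
```

With this sketch, `float_to_u32_wrapping(-1.0f)` yields `0xFFFFFFFF` and `float_to_u32_wrapping(3.7f)` yields `3`. Positive values from 2^31 up to 2^32 would fit in a `uint32_t` but are rejected here to keep the check simple.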
Your fourth "discovery" is only partially correct. Casting a float * to a uint32_t * will not cause harm. However, dereferencing that punned pointer will cause undefined behavior.
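For reference, the usual way to look at a float's bit pattern without dereferencing a punned pointer is to copy the bytes with `memcpy`. This is a minimal sketch; the helper names `float_bits` and `float_from_bits` are made up for illustration, and it assumes `float` is a 32-bit IEEE 754 type, as it is on AVR and ARM. Compilers commonly turn a fixed-size `memcpy` like this into plain loads and stores, but that is worth confirming in the generated assembly if cycle counts matter.

```cpp
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

// Hypothetical helper: returns the raw bit pattern of a float.
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  // byte copy, no aliasing violation
    return bits;
}

// Hypothetical helper: rebuilds a float from a raw bit pattern.
float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under the IEEE 754 assumption, `float_bits(1.0f)` is `0x3F800000`, so something like `Serial.print(float_bits(V), HEX);` could replace the pointer-cast version.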
Technically, you are correct. Practically, the discovery works on every platform I tried it on (several dozen). I find it very handy in determining what exactly is going on bit/hex-wise with computer memory. If ever it fails, I guess I'll find a workaround, such as a union or switching to `char*`, although even a char length in bits isn't fixed.
I enjoy sailing close to the wind, especially with debugging code, with compact self-defining code like
enum eYN {
  eYes = 0x736559, // "Yes" stored low byte first; swap bytes for big-endian
  eNo = 0x6f4e,    // "No" stored low byte first; swap bytes for big-endian
  eY = 'Y',
  eN = 'N',
};
...
eYN e;
e = eY; Serial.print(F("eY=")); Serial.println((char)e);
e = eN; Serial.print(F("eN=")); Serial.println((char)e);
e = eYes; Serial.print(F("eYes=")); Serial.println((const char*)&e);
e = eNo; Serial.print(F("eNo=")); Serial.println((const char*)&e);
giving
eY=Y
eN=N
eYes=Yes
eNo=No
I don't understand... These:
W=*((float*)&tmp);
Serial.print(*((unsigned long*)&V),HEX);
are not "casts."
`(unsigned long)(-U)` is a cast, but I don't see how it is expected to do anything useful, and if U is a positive floating point value I don't know what the expected result is supposed to be. And it's "sure" to be different than a type-punning copy. (And it ought to be the same as `-(unsigned long)U`, right? Aside from being "undefined.")
C++ standard seems to say that casting negative floats to unsigned integers is undefined behaviour. My surprise was that with avr/gcc `volatile float` was treated differently to `float`.
The 0 result does surprise me, but "undefined" means anything is possible. I get 0 for the `(ul)-W` compiled for an ARM, and more expected results for AVRs.
I get IEEE 32-bit floats from a sensor, which come in randomly but at a max frequency of 450 kHz, or a min separation of 2.2 µs, which gives 35.5 clock cycles at 16 MHz.
The sensor is presumably read a byte at a time in a way that has to be carefully assembled into a float. If time is critical, just store the bytes and do that sort of processing later.
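A rough sketch of that "store now, process later" idea, assuming the ISR can hand over each sample as a raw 32-bit word. All names here (`kRingSize`, `ring_push`, `ring_pop`) are hypothetical, and the buffer size of 64 is arbitrary. The ISR side does no floating point at all; the main loop reassembles the float when it has time. Whether the push fits in the 35-cycle budget would still have to be checked against the generated assembly.

```cpp
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

// Hypothetical single-producer/single-consumer ring buffer.
// One slot is always left empty so that "full" and "empty" can be told apart.
const uint8_t kRingSize = 64;        // assumed size; must be a power of two
volatile uint32_t ring[kRingSize];
volatile uint8_t head = 0;           // written only by the ISR
volatile uint8_t tail = 0;           // written only by the main loop

// Intended to be called from the ISR: stores the raw bit pattern only.
bool ring_push(uint32_t raw) {
    uint8_t next = static_cast<uint8_t>((head + 1) & (kRingSize - 1));
    if (next == tail) {
        return false;                // buffer full, sample is dropped
    }
    ring[head] = raw;
    head = next;                     // publish only after the slot is written
    return true;
}

// Intended to be called from the main loop: fetches one sample as a float.
bool ring_pop(float *out) {
    if (tail == head) {
        return false;                // nothing buffered
    }
    uint32_t raw = ring[tail];
    tail = static_cast<uint8_t>((tail + 1) & (kRingSize - 1));
    memcpy(out, &raw, sizeof raw);   // reinterpret the bits outside the ISR
    return true;
}
```

The indices are single bytes so that each one can be read or written in a single access on an 8-bit AVR. The conversion to the 16-bit format would then happen in the main loop, after `ring_pop` returns true.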