I'm looking at some timing issues, and C code (and its disassembly) is not my strong suit. So, as the subject suggests, how many instructions does this code snippet compile to, please?
PIND >> 7
I'm specifically interested in the >> part. Is a shift by seven places seven individual right-shift operations, or does the CPU have this as a single opcode as a performance measure? In other words, is the duration of a right shift proportional to the number of places? I'm compiling for an Uno.
I'm actually looking at the fastest way to make a byte out of two high nibble reads of I/O port D. The >>7 was actually just an arbitrary number to illustrate the question. Sorry. So in reality what I have is,
byte b = (PIND & 0b11110000) + (PIND >> 4)
with data on 4 high pins of the port, which will require 4 right shift operations which is 4 clock cycles.
I suggest you stop playing games. By failing to include important details you have wasted @KeithRB's, @el_supremo's, @Whandall's, and @oqibidipo's time.
Bear in mind that the high nibble and low nibble of b can have different values.
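To make that concrete, here is a minimal sketch that runs on a PC rather than on the Uno. read_port() and its two sample values are hypothetical stand-ins for two successive reads of PIND; they are not from the original code. Note also that in a single expression such as (PIND & 0b11110000) + (PIND >> 4), C does not specify which operand of + is evaluated first, so reading into two locals makes the order of the reads explicit.

```c
#include <stdint.h>

/* Hypothetical stand-in for PIND: returns a different value on each
 * successive read, as a live input port might. Only the high nibble
 * carries data, matching the wiring described in the thread. */
static const uint8_t samples[] = {0xA0, 0x50};
static unsigned next_sample = 0;

static uint8_t read_port(void) {
    uint8_t value = samples[next_sample % 2];
    next_sample++;
    return value;
}

/* Build one byte from the high nibbles of two separate port reads.
 * The first read supplies the high nibble of the result, the second
 * read supplies the low nibble. */
static uint8_t pack_two_reads(void) {
    uint8_t first = read_port();   /* read order is fixed here */
    uint8_t second = read_port();
    return (uint8_t)((first & 0xF0) | (second >> 4));
}
```

With the sample values above, the first read (0xA0) ends up in the high nibble and the second read (0x50) ends up in the low nibble, so the two halves of the result really do come from different moments in time.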
That's all very nice, but I can't understand what lines 502-50c are really telling me. As I explained in the very first bit of my original post, I'm not familiar with disassembly. Are those lines the actual instructions? So my entire "byte b = (PIND & 0b11110000) + (PIND >> 4)" will compile to only 6 instructions in total? And that'll only take 6 cycles to make byte b?
That's even better then
PS. How are you making those instructions appear? Is there something in the standard IDE?
You don't know how many cycles it will take until you put it into context, with the exact code you'll have in the final version.
The AVR only has one-bit shifts.
But it also has "swap", which swaps the two 4-bit nibbles of a register. The compiler will (sometimes?) optimize a shift longer than 3 bits to use swap (as in reply 6).
But that won't work if you have a value longer than a single byte, or a shift value that isn't a constant (in which case the compiler will put together a loop).
Shifts of 8 bits or more on a multi-byte value can take advantage of byte moves, though. Theoretically, anyway.
If you're doing bit tests or something, the compiler may optimize away the shift entirely.
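As a rough illustration of the swap trick and the loop fallback, here is a sketch in plain C that can be checked on a PC. The function names are made up for this example, and the functions only model what the instructions do; what the compiler actually emits for a given sketch has to be checked in the disassembly.

```c
#include <stdint.h>

/* Models the AVR "swap" instruction: exchange the two nibbles of a byte. */
static uint8_t swap_nibbles(uint8_t r) {
    return (uint8_t)((r << 4) | (r >> 4));
}

/* An 8-bit shift right by 4 done as swap plus mask: after the swap the
 * old high nibble sits in the low nibble, and the mask clears the rest. */
static uint8_t shift_right_4_via_swap(uint8_t r) {
    return swap_nibbles(r) & 0x0F;
}

/* Models the fallback: n one-bit shifts in a loop, which is roughly what
 * is needed when the shift count is not a compile-time constant. The
 * cost grows with n. */
static uint8_t shift_right_by_steps(uint8_t r, uint8_t n) {
    while (n--) {
        r >>= 1;
    }
    return r;
}
```

Both routes give the same result as r >> 4 for every 8-bit value; the difference is only in how many steps it takes to get there.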
So my entire "byte b = (PIND & 0b11110000) + (PIND >> 4)" will compile to only 6 instructions in total?
That leaves the result in a register; if b needs to get stored somewhere that will be an additional couple of cycles.
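To put cycle counts like these into time, here is a small helper, assuming the Uno's standard 16 MHz clock. ASSUMED_F_CPU_HZ and cycles_to_ns() are names invented for this example, and the real cycle count still has to come from the disassembly of the final code.

```c
#include <stdint.h>

/* Assumption: a stock Uno running at 16 MHz. */
#define ASSUMED_F_CPU_HZ 16000000UL

/* Convert a number of CPU cycles to nanoseconds, rounding down. */
static uint32_t cycles_to_ns(uint32_t cycles) {
    return (uint32_t)(((uint64_t)cycles * 1000000000ULL) / ASSUMED_F_CPU_HZ);
}
```

For example, cycles_to_ns(6) gives 375, so six single-cycle instructions would take 375 ns at that clock, plus whatever it costs to store the result.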