Simple code:
DDRB=0xff;
for(unsigned char i=0;i<4;i++) PORTB|=1<<i;
for(unsigned char i=0;i<4;i++) PORTB&=~(1<<i);
After disassembly:
DDRB=0xff;
00000011 SER R24 Set Register
00000012 OUT 0x17,R24 Out to I/O location
for(unsigned char i=0;i<4;i++) PORTB|=1<<i;
00000013 SBI 0x18,0 Set bit in I/O register
00000014 SBI 0x18,1 Set bit in I/O register
00000015 SBI 0x18,2 Set bit in I/O register
00000016 SBI 0x18,3 Set bit in I/O register
00000017 LDI R24,0x00 Load immediate
00000018 LDI R25,0x00 Load immediate
for(unsigned char i=0;i<4;i++) PORTB&=~(1<<i);
00000019 LDI R22,0x01 Load immediate
0000001A LDI R23,0x00 Load immediate
--- No source file -------------------------------------------------------------
0000001B IN R18,0x18 In from I/O location
0000001C MOVW R20,R22 Copy register pair
0000001D MOV R0,R24 Copy register
--- No source file -------------------------------------------------------------
0000001E RJMP PC+0x0003 Relative jump
0000001F LSL R20 Logical Shift Left
00000020 ROL R21 Rotate Left Through Carry
00000021 DEC R0 Decrement
00000022 BRPL PC-0x03 Branch if plus
00000023 COM R20 One's complement
00000024 AND R18,R20 Logical AND
00000025 OUT 0x18,R18 Out to I/O location
00000026 ADIW R24,0x01 Add immediate to word
00000027 CPI R24,0x04 Compare with immediate
00000028 CPC R25,R1 Compare with carry
00000029 BRNE PC-0x0E Branch if not equal
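The first loop compiles to four SBI instructions, but when I write 0s to the PORTB pins in a for loop, the generated code has a lot of overhead: the loop is kept as a real loop, and each pass does an IN, a run-time shift, COM, AND and OUT. I tried all optimization levels, and the output never contains CBI.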
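For comparison, here is a minimal sketch of the same clear written with one constant mask instead of a loop. It is not a drop-in equivalent: with a multi-bit mask the compiler would typically emit an IN/ANDI/OUT sequence rather than CBI, and all four pins change in the same write instead of one after another. The `PORTB` variable below is only a host-side stand-in so the sketch compiles off-target, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the AVR I/O register (assumption for this sketch).
   On the real chip, remove this line and include <avr/io.h> instead. */
volatile uint8_t PORTB;

/* Clear bits 0..3 one at a time, as the second loop in the question does. */
void clear_low_nibble_loop(void)
{
    for (uint8_t i = 0; i < 4; i++)
        PORTB &= (uint8_t)~(1u << i);
}

/* Clear bits 0..3 in a single read-modify-write with a constant mask. */
void clear_low_nibble_mask(void)
{
    PORTB &= (uint8_t)~0x0Fu;
}

/* Self-check: both versions clear the low nibble and keep the high nibble. */
void check_same_result(void)
{
    PORTB = 0xFF;
    clear_low_nibble_loop();
    uint8_t after_loop = PORTB;

    PORTB = 0xFF;
    clear_low_nibble_mask();
    assert(after_loop == 0xF0 && PORTB == 0xF0);
}
```

The final register value is the same either way; what differs is the instruction sequence and the pin timing, so the mask form only fits if the pins do not have to change one at a time.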
Code without the second for loop:
DDRB=0xff;
for(unsigned char i=0;i<4;i++) PORTB|=1<<i;
PORTB&=~(1<<PB3);
PORTB&=~(1<<PB2);
PORTB&=~(1<<PB1);
PORTB&=~(1<<PB0);
Output:
DDRB=0xff;
00000011 SER R24 Set Register
00000012 OUT 0x17,R24 Out to I/O location
for(unsigned char i=0;i<4;i++) PORTB|=1<<i;
00000013 SBI 0x18,0 Set bit in I/O register
00000014 SBI 0x18,1 Set bit in I/O register
00000015 SBI 0x18,2 Set bit in I/O register
00000016 SBI 0x18,3 Set bit in I/O register
PORTB&=~(1<<PB3);
00000017 CBI 0x18,3 Clear bit in I/O register
PORTB&=~(1<<PB2);
00000018 CBI 0x18,2 Clear bit in I/O register
PORTB&=~(1<<PB1);
00000019 CBI 0x18,1 Clear bit in I/O register
PORTB&=~(1<<PB0);
0000001A CBI 0x18,0 Clear bit in I/O register
0000001B RJMP PC-0x0000 Relative jump
Does this mean that manipulating bits in a for loop uses more ROM and wastes more cycles than setting or clearing the bits manually, one by one?
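The second listing shows that a clear with a constant bit number compiles to a single CBI. A small macro keeps that one-bit, constant-index form while cutting down the repetition. This is only a sketch: `CLEAR_BIT` and `clear_pb3_to_pb0` are hypothetical names, `PORTB` is a host-side stand-in so the code compiles off-target, and whether each line really becomes a CBI should be confirmed in the disassembly for your compiler version and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the AVR I/O register (assumption for this sketch).
   On the real chip, remove this line and include <avr/io.h> instead. */
volatile uint8_t PORTB;

/* Hypothetical helper: clear one bit of a register.
   'bit' is meant to be a compile-time constant so the mask folds to a constant. */
#define CLEAR_BIT(reg, bit) ((reg) &= (uint8_t)~(1u << (bit)))

/* Same order as the hand-written version in the question: PB3 down to PB0. */
void clear_pb3_to_pb0(void)
{
    CLEAR_BIT(PORTB, 3);
    CLEAR_BIT(PORTB, 2);
    CLEAR_BIT(PORTB, 1);
    CLEAR_BIT(PORTB, 0);
}

/* Self-check: only the low four bits are cleared. */
void check_clear_pb3_to_pb0(void)
{
    PORTB = 0xFF;
    clear_pb3_to_pb0();
    assert(PORTB == 0xF0);
}
```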