Converting a C++ Statement into AVR Instructions and then into Binary Codes and then Storing into Flash

I am sharing my work with readers to receive comments and criticism that will correct and improve my understanding.

1. byte y1 = 0x35; is a C++ statement. A statement is a complete prescription for the computer to take an action. The result of executing this statement is to store the number 0x35 (as 00110101) in a free RAM location (say: 0x0120) of the ATmega328P MCU. The action is conceptually presented in Fig-1.


Figure-1:

2. I know that the Compiler converts the C++ code into Assembly Instructions and then into Machine/Binary Codes.

3. What is an Instruction?
It is a low-level command, together with its data/operand, that tells the MCU to carry out a rudimentary task. For example, ldi r16, 0x35 is a data movement instruction: the MCU reads 0x35 from (flash) memory, where it is embedded in the instruction word, and keeps it in the r16 register for future use.

The ldi (load immediate) is called the operation code (opcode), and "r16, 0x35" is called the operand, of which r16 is the "data destination" and 0x35 is the "data source" (in this example, the data itself).

4. With the understanding of Step-2, I may write the following AVR Assembly Instructions for the statement: byte y1 = 0x35.

ldi  r16, 0x35; //immediate data 0x35 goes into r16 register

ldi  r31, 0x01; //high byte of target address is kept into r31 reg.
ldi  r30, 0x20; //low byte of target address is stored into r30 reg.

//r31:r30 forms z-pointer register. (r16) goes in RAM location whose address is in z-pointer register
st  z, r16; // now (0x0120) = 0x35 means location 0x0120 holds 0x35
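
To check the intent of the four instructions on a PC, here is a minimal C++ sketch that models them. The names Cpu, ldi, st_z and run_step4 are hypothetical helpers invented for this illustration, and the data space is simplified to a flat array covering addresses 0x0000 to 0x08FF; this is not how the real core is built.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Simplified model: 32 working registers plus a flat data space.
// (Assumption: the register file and I/O mapping are ignored here.)
struct Cpu {
    std::array<std::uint8_t, 32> r{};        // r0..r31
    std::array<std::uint8_t, 0x0900> mem{};  // data addresses 0x0000..0x08FF
};

// ldi Rd, K : copy the immediate constant K into register Rd (r16..r31 only).
void ldi(Cpu& c, unsigned rd, std::uint8_t k) {
    assert(rd >= 16 && rd <= 31);
    c.r[rd] = k;
}

// st Z, Rr : store Rr at the data address held in r31:r30 (the Z pointer).
void st_z(Cpu& c, unsigned rr) {
    assert(rr < 32);
    const unsigned z = (static_cast<unsigned>(c.r[31]) << 8) | c.r[30];
    assert(z < c.mem.size());
    c.mem[z] = c.r[rr];
}

// The four instructions of Step-4, in order.
Cpu run_step4() {
    Cpu c;
    ldi(c, 16, 0x35);  // data
    ldi(c, 31, 0x01);  // high byte of 0x0120
    ldi(c, 30, 0x20);  // low byte of 0x0120
    st_z(c, 16);       // (0x0120) = 0x35
    return c;
}
```

After run_step4(), mem[0x0120] holds 0x35 and the neighbouring locations are still 0.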

5. Let me use Microchip Studio Platform to get the binary codes by creating list file for Assembly Codes of Step-4.

                        .org	$0000			;
000000 0000    RESET:	nop
000001 c03e             rjmp	START
                                 
                        .org	$0040	   ;application space	
000040 ef0f    START:	ldi	    r16, 0xFF  ; stack top at 0x08FF
000041 bf0d             out     spl, r16
000042 e008             ldi	    r16, 0x08		; 
000043 bf0e             out	    sph, r16   ; ATmega328P
                                 
000044 e305             ldi	    r16, 0x35  ;data for RAM location 0x0120
                                 		
000045 e0f1             ldi	    r31, 0x01;
000046 e2e0             ldi	    r30, 0x20;
                                 		
000047 8300             st	    z, r16   
                        .exit                            			

6. Execution model (Fig-2) of Assembly Program of Step-5 in Flash memory.


Figure-2:

(1) At power up, the PC (Program Counter) of ATmega328P (fresh chip and NOT of UNO Board) holds 0x0000. After the nop at 0x0000, it fetches the instruction word C03E (rjmp START) from word address 0x0001. The word is decoded to get the offset used to compute the target address 0x0040, which is then stored into the PC.

Question: Do opcode/instruction fetching, decoding, and reaching the target location all happen within a single clock cycle (62.5 ns for a 16 MHz clock)?

(2) At word location 0x0044 of the list file of Step-5, the instruction digits are arranged as E305 (as per the instruction template) and not as E035 (8085 format: MVI A, 0x35 ===> 3E 35). Why are the digits transposed in the AVR instruction?


byte y1 = 0x23; will not store 0x35 :)

It will store 0x23. In order to store 0x35, the code would be byte y1 = 0x35;. I have corrected my post. Thank you very much.

Each instruction has a defined number of clock cycles. Many of them are 1-cycle instructions. The count is chip-type specific, so the same instruction can take a different number of clock cycles on different AVRs, e.g. the 328 and the 2560.
While an instruction is being executed, the next instruction is fetched so that it is ready for execution.
All information is in the datasheet.
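
As a rough illustration of how cycle counts turn into time, here is a small C++ sketch. The per-instruction cycle counts are the values I recall for the ATmega328P (nop 1, rjmp 2, ldi 1, out 1, st 2), so treat them as assumptions and check them against the datasheet. The names cycles_to_ns and listing_cycles are made up for this example.

```cpp
#include <cassert>

// Assumed cycle counts for the ATmega328P; verify against the datasheet.
constexpr unsigned kNop = 1, kRjmp = 2, kLdi = 1, kOut = 1, kStZ = 2;

// Time in nanoseconds taken by `cycles` clock cycles at `clock_hz`.
double cycles_to_ns(unsigned cycles, double clock_hz) {
    assert(clock_hz > 0.0);
    return cycles * 1e9 / clock_hz;
}

// Straight-line cycle total for the list file of Step-5:
// nop, rjmp, (ldi, out) twice for the stack pointer, three ldi, one st Z.
unsigned listing_cycles() {
    return kNop + kRjmp + 2 * (kLdi + kOut) + 3 * kLdi + kStZ;
}
```

With a 16 MHz clock, cycles_to_ns(1, 16e6) gives 62.5 ns. Under the assumed counts, rjmp alone would take 125 ns and the whole listing 12 cycles, i.e. 750 ns.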


I don't know why. It is defined that way. That's it. The instruction set is described in the Atmel Studio help. Each instruction has a defined opcode. You can also read this:
https://ww1.microchip.com/downloads/en/DeviceDoc/AVR-InstructionSet-Manual-DS40002198.pdf


This post will give you all the details: https://stackoverflow.com/questions/61158931/convert-avr-assembly-program-to-hex

basically:
ldi r16, 0x35

ldi opcode = 1110 KKKK dddd KKKK

d = 16 - 16 = 0 ( only R16 - 31 work so 16 is an offset)
K = 0x35 = 0011 0101
opcode = 1110 0011 0000 0101 = E3 05
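
The same packing can be sketched in C++. The function name encode_ldi is hypothetical (it is not part of any toolchain); it only follows the 1110 KKKK dddd KKKK template above.

```cpp
#include <cassert>
#include <cstdint>

// Pack "ldi Rd, K" into the 16-bit template 1110 KKKK dddd KKKK.
std::uint16_t encode_ldi(unsigned rd, std::uint8_t k) {
    // ldi only works with r16..r31, so the 4-bit d field stores rd - 16.
    assert(rd >= 16 && rd <= 31);
    const unsigned d = rd - 16;
    const unsigned k_high = (k >> 4) & 0x0Fu;  // upper nibble of the constant
    const unsigned k_low = k & 0x0Fu;          // lower nibble of the constant
    return static_cast<std::uint16_t>((0xEu << 12) | (k_high << 8) | (d << 4) | k_low);
}
```

encode_ldi(16, 0x35) returns 0xE305, and encode_ldi(31, 0x01) and encode_ldi(30, 0x20) return 0xE0F1 and 0xE2E0, matching the list file in the first post.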


@blh64
I was just going to write the same thing. :)


Figure-1:

In E305, the opcode and operands (data source and destination) are intermixed/embedded. Recall that the 8085 has an 8-bit opcode, which allows 256 possible operations.

In the ldi instruction of AVR, the opcode is apparently 4-bit (1110), which would give only 16 possible operations. This cannot be the whole story, as AVR offers many more operations. So the question remains how to identify the exact opcode bits from the template of Fig-1 above.
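
One way to look at it: the number of fixed (opcode) bits is not the same for every AVR instruction. The C++ sketch below lists the bit templates of only the five instructions used in the list file of post #1, written as a mask (which bits are fixed) and a value (what those bits must be). Pattern, kPatterns, classify and fixed_bit_count are names invented for this example, and the table is far from the full instruction set.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct Pattern {
    std::uint16_t mask;   // 1 = fixed opcode bit, 0 = operand bit
    std::uint16_t value;  // required state of the fixed bits
    const char* name;
};

const Pattern kPatterns[] = {
    {0xFFFF, 0x0000, "nop"},   // 0000 0000 0000 0000
    {0xF000, 0xC000, "rjmp"},  // 1100 kkkk kkkk kkkk
    {0xF000, 0xE000, "ldi"},   // 1110 KKKK dddd KKKK
    {0xF800, 0xB800, "out"},   // 1011 1AAr rrrr AAAA
    {0xFE0F, 0x8200, "st Z"},  // 1000 001r rrrr 0000
};

// Return the name of the first template that the word matches.
std::string classify(std::uint16_t word) {
    for (const Pattern& p : kPatterns) {
        assert((p.value & ~p.mask) == 0);  // value may only use fixed bits
        if ((word & p.mask) == p.value) return p.name;
    }
    return "unknown";
}

// Count how many of the 16 bits a template fixes as opcode.
int fixed_bit_count(std::uint16_t mask) {
    int n = 0;
    for (; mask != 0; mask >>= 1) n += mask & 1u;
    return n;
}
```

Here ldi and rjmp fix only 4 bits, out fixes 5, st Z fixes 11 and nop fixes all 16. Instructions that need many operand bits get short opcodes, and the rest share the remaining bit patterns with longer ones.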

It can be a single machine instruction or a (high level) language instruction. A language instruction (operation, statement...) can expand into any number of machine instructions.

What's that?

Look into the documentation of your tool

Only for 8/16 bit data bus of the controller.

Loading an instruction occurs at the leading edge of the clock, execution at the second edge. The next instruction can be fetched only if its address can be placed on the address bus in time. All this can be much more complicated in CISC than in RISC machines.

I mean 0x0040 from: rjmp START (Fig-2 of #1) of the following

        .org   $0040 ; (0x0040)
START:  .........

Please explain in detail how your wording (reaching ...) should be understood.

Does anything happen during the High and Low periods of the clock cycle (Fig-1)?
Figure-1:

Whatever is required to perform the just started action.

See e.g. setup and hold times in the data sheet.

It is the target address (or jump address, or destination address) whose value (it is 14-bit for the ATmega328P MCU, though coded as 4 hex digits) is loaded into the Program Counter (PC). I would be glad to hear the appropriate wording. I am not sure if the PC is incremented one-by-one or if 0x0040 is directly loaded.
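
For reference, the instruction set manual describes rjmp as the operation PC <- PC + k + 1, where k is the signed 12-bit field of the word 1100 kkkk kkkk kkkk. A minimal C++ sketch of that arithmetic follows. The names rjmp_offset and rjmp_target are hypothetical, and wrapping the result to a 14-bit PC is an assumption of this sketch, not something I have verified on the chip.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the 12-bit k field of an rjmp word (1100 kkkk kkkk kkkk).
int rjmp_offset(std::uint16_t word) {
    int k = word & 0x0FFF;
    if (k & 0x0800) k -= 0x1000;  // bit 11 set means a negative offset
    return k;
}

// PC <- PC + k + 1, wrapped to pc_bits (assumed 14 for the ATmega328P).
std::uint16_t rjmp_target(std::uint16_t pc, std::uint16_t word, unsigned pc_bits = 14) {
    assert(pc_bits >= 1 && pc_bits <= 16);
    const std::uint32_t mask = (1u << pc_bits) - 1u;
    const int next = static_cast<int>(pc) + rjmp_offset(word) + 1;
    return static_cast<std::uint16_t>(static_cast<std::uint32_t>(next) & mask);
}
```

For the word C03E at word address 0x0001, rjmp_offset gives 0x3E and rjmp_target(0x0001, 0xC03E) gives 0x0040, the address of START in the list file.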

E.g. The PC (or whichever register) is changed to...

Does "Instruction Decoding" take place during High period of the clock?

This is interesting/good for me to know.

In what fundamental respect does a CISC (Complex Instruction Set Computer, like the 80x86) machine differ from a RISC (Reduced Instruction Set Computer, like the ATmega328P) machine?

The only important times are when an action can start and when it is guaranteed to have finished. Details depend on the propagation time of the logic stages involved.

Is it a 16-bit load all at once, or the high byte first and then the low byte?