Why does JMP take three clock cycles?

In the AVR Instruction Set Manual for the ATmega328P, it says that a jmp takes three clock cycles to execute. Why is that?

I would think it would be able to fetch and then set the program counter. Why does it take three?

Hope there is someone who can help :slight_smile:

It's not a simple jmp but an indirect jump that has to fetch the indirect target address from memory.

After all I also don't understand why a simple jump takes longer than an indirect jump.

So first fetch from program memory to instruction register, then fetch the address to general purpose register and then from general purpose to the program counter? Are those the three steps?

Because it is constructed this way :wink: .

The JMP instruction is a long jump with a 22-bit address (it is not an indirect jump). The 6 MSBs of the address are stored in the first word (together with the instruction code); the 16 LSBs are stored in the second instruction word. Two cycles are needed to fetch the two instruction words. I think the third cycle is needed to load the Program Counter from these two values. You cannot load the 6 MSBs directly while reading the first instruction word, because the PC must stay unchanged to fetch the second word.
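To make the bit layout concrete, here is a minimal Python sketch that reassembles the 22-bit target from the two instruction words. It assumes the JMP encoding given in the AVR Instruction Set Manual (`1001 010k kkkk 110k` followed by sixteen `k` bits); the function name `decode_jmp` is my own, purely for illustration.

```python
def decode_jmp(word1: int, word2: int) -> int:
    """Return the 22-bit word address encoded in a two-word AVR JMP."""
    # Fixed opcode bits: 1001 010x xxxx 110x (the x bits carry address bits).
    if (word1 & 0xFE0E) != 0x940C:
        raise ValueError("first word is not a JMP opcode")
    high5 = (word1 >> 4) & 0x1F   # address bits 21..17
    bit16 = word1 & 0x1           # address bit 16
    low16 = word2 & 0xFFFF        # address bits 15..0 (second word)
    return (high5 << 17) | (bit16 << 16) | low16
```

For the word pair `0x940C`, `0x0030`, `decode_jmp` returns `0x30`. Note that the target cannot be assembled until both words are available.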

In most cases, the compiler will use RJMP (Relative Jump), which takes 2 clock cycles and can reach about 2K words in either direction (a signed 12-bit offset, -2048 to +2047 words).

JMP takes longer because it is a 32-bit opcode containing a 22-bit destination address. The RJMP is a 16-bit opcode with a 12-bit offset.
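For comparison, a rough Python sketch of how the single-word RJMP yields its target. It assumes the encoding from the Instruction Set Manual (`1100 kkkk kkkk kkkk`, with the target being PC + k + 1) and ignores any wrap-around of the PC at the end of flash; `rjmp_target` is just an illustrative name.

```python
def rjmp_target(pc: int, opcode: int) -> int:
    """Return the word address an RJMP located at word address `pc` jumps to."""
    # Fixed opcode bits: 1100 kkkk kkkk kkkk
    if (opcode & 0xF000) != 0xC000:
        raise ValueError("not an RJMP opcode")
    k = opcode & 0x0FFF
    if k & 0x800:          # sign-extend the 12-bit offset
        k -= 0x1000
    return pc + 1 + k      # offset is relative to the following instruction
```

The whole offset fits in one 16-bit word, so there is no second word to fetch. An offset of -1 (`0xCFFF`) jumps back to the RJMP itself.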

          		.org	$0000
          RESET:	;nop
000000 940c 0030 	jmp	START
          		.org	$0030

          START: ;----- stack initialize--------
000030 0000      		nop

1. ATmega328P has word-organized flash (Fig-1): there are 16,384 locations, each of which holds a 16-bit code/data word. So, a 14-bit PC (Program Counter) is enough to address all of these locations. In fact, the PC of the ATmega328P is 14 bits wide.


2. In the quoted program segment, we observe that the jmp START instruction (direct addressing mode) has been encoded as 940C 0030, which occupies two flash locations (0x0000 and 0x0001).

3. The MCU takes one cycle to fetch the 16-bit opcode (940C) and another cycle to read the 16-bit operand (0030). It then takes one more cycle to arrive at the execution point (address 0030), having stored 00 0000 0011 0000 into the Program Counter (PC).
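The arithmetic behind the three points above can be checked with a short Python sketch. It assumes the ATmega328P's 32 KB of flash organized as 16-bit words; the names `FLASH_BYTES`, `WORDS`, `PC_BITS` and `pc_bits` are mine, not from the datasheet.

```python
FLASH_BYTES = 32 * 1024              # ATmega328P program flash, in bytes
WORDS = FLASH_BYTES // 2             # flash is addressed in 16-bit words
PC_BITS = (WORDS - 1).bit_length()   # bits needed to address every word

def pc_bits(addr: int) -> str:
    """Format a word address as the PC would hold it, grouped in nibbles."""
    if not 0 <= addr < WORDS:
        raise ValueError("address outside program flash")
    bits = format(addr, f"0{PC_BITS}b")
    head = len(bits) % 4             # leftover bits in the leading group
    groups = [bits[:head]] if head else []
    groups += [bits[i:i + 4] for i in range(head, len(bits), 4)]
    return " ".join(groups)
```

Here `WORDS` comes out as 16384 and `PC_BITS` as 14, and `pc_bits(0x0030)` gives the `00 0000 0011 0000` pattern quoted in point 3.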

(How is 0030 loaded into the PC? Is the value 0030 loaded directly, or is the PC reset to 0 and then incremented until it matches 0030? We may review the direct address loading mechanism of the classic 8085.)

The third cycle is "lost" because the "next instruction prefetch" is discarded when you change the PC (which is also why RJMP takes 2 cycles).
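One way to picture this is a toy cycle count in Python: one cycle per instruction word fetched, plus one extra cycle whenever the PC is changed and the prefetched word has to be thrown away. This is only a simplified model I am assuming for illustration, not a description of the actual hardware, and it does not cover calls, returns, or conditional branches.

```python
# (words occupied in flash, does the instruction change the PC?)
INSTRUCTIONS = {
    "nop":  (1, False),
    "rjmp": (1, True),
    "jmp":  (2, True),
}

def cycles(mnemonic: str) -> int:
    """Toy estimate: one cycle per word, plus one if the prefetch is discarded."""
    words, changes_pc = INSTRUCTIONS[mnemonic]
    refill = 1 if changes_pc else 0   # cycle spent refetching from the new PC
    return words + refill
```

Under this model `cycles("nop")` is 1, `cycles("rjmp")` is 2, and `cycles("jmp")` is 3, which agrees with the cycle counts listed in the Instruction Set Manual for these three instructions.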

The first clock cycle is used to fetch the opcode from flash and put it into the Instruction Register (including decoding). The second clock cycle is needed to read the operand (the address) from flash and put it into a temporary address/data buffer. The third clock cycle is required to move the operand (the address) from the temporary buffer into the Program Counter.

The meaning of "lost" is not understood.