What's inside an ATmega328 chip that allows it to read C code?

Sorry to ask this, I'm sure some curious George has asked it in the past but I'm not even sure how to ask it so I wasn't sure how to search for it either.

I understand the code I write, more or less. :slight_smile:

Just the same as I understand the Objective-C and Swift code I write in Xcode.

And I know that somehow that code gets turned into 1s and 0s. And that those 1s and 0s are basically "turned" into on and off current flows.

But the whole part in the middle is still kinda hazy. Could someone explain how my code in the Arduino IDE gets turned into something the chip can read? I have heard the chip comes with a bootloader "burned in" but I'm not sure what that means and how that converts my code into chip or machine code.

The bootloader simply copies the binary the compiler and linker have created into the flash memory of the microcontroller, bit for bit.
The bootloader isn't very clever; it even has to be told where to put the ones and zeroes.

OK but what is the sequence?

We wrote the C code.

The C code gets compiled.

That compiled code is...?

just zeros and ones...

C code changes to machine code...

The compiler generates a .hex file - you can read that and see what's being written to the flash (it's machine code). When you upload, this .hex file is read by avrdude, which feeds it to the bootloader in appropriately sized chunks, which in turn writes them to the flash.
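For the curious, each line of the .hex file is an "Intel HEX" record: a byte count, an address, some machine-code bytes, and a checksum, all in ASCII. A data record looks something like this (the data bytes here are just an example):

:10010000214601360121470136007EFE09D2190140

:           start of record
10          byte count (0x10 = 16 data bytes follow)
0100        load address (where these bytes go in flash)
00          record type (00 = data, 01 = end of file)
2146...01   the 16 bytes of machine code
40          checksum (two's complement of the sum of all the other bytes)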

The bootloader itself is a short piece of code (512 bytes for Optiboot) that sits at the end of the flash. When the chip is reset (provided the BOOTSZ1..0 and BOOTRST fuses are set properly), it jumps to the bootloader, which runs; if nothing tries to upload new code, it then jumps to your program.

The process of putting the bootloader on the chip is called "burning" for reasons unclear to me. It's done using an in-circuit serial programmer (ISP or ICSP), which uses the reset and SPI pins. This programmer can be an Arduino running the ArduinoISP sketch, or a dedicated device like a USBasp. Typically, the "fuses" are set at the same time - these are 3 bytes of dedicated configuration memory (separate from the flash) that set very basic operating parameters, like the clock source, whether a bootloader is used, etc. Note that you can upload anything via ISP, not just the bootloader - it's just less convenient than using the bootloader, so people rarely do it; the ATtiny series chips are often used without a bootloader, since many of them don't have a serial port.
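To make that concrete, burning Optiboot onto an ATmega328P with a USBasp might look something like this (the file name and fuse values here are typical Uno-style settings; yours may differ):

avrdude -c usbasp -p atmega328p -U flash:w:optiboot_atmega328.hex:i
avrdude -c usbasp -p atmega328p -U lfuse:w:0xFF:m -U hfuse:w:0xDE:m -U efuse:w:0x05:m

The first command writes the bootloader file to flash; the second sets the three fuse bytes (clock source, boot section size, brown-out detection, etc.).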

DrAzzy:
The process of putting the bootloader on the chip is called "burning" for reasons unclear to me.

Before EPROMs, PROMs were made with metallisation fuses which literally were fuses - to zero a bit, the metallisation at that point in the matrix was vaporised, or "burnt".

What I understand of the programming process of Arduino:

  1. You write code in C.
  2. avr-gcc compiles the C code into assembly and then into binary.
  3. The binary is transferred to the chip: the bootloader receives it and writes it to flash (example commands below).
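In command-line terms, steps 2 and 3 might look something like this (file names and the serial port are illustrative; the Arduino IDE runs the equivalent for you behind the scenes):

avr-gcc -mmcu=atmega328p -Os -o sketch.elf sketch.c        # compile and link
avr-objcopy -O ihex -R .eeprom sketch.elf sketch.hex       # extract the .hex file
avrdude -c arduino -P /dev/ttyACM0 -b 115200 -p atmega328p -U flash:w:sketch.hex:i   # hand it to the bootloader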

mart256 is correct, except that, to be precise, you actually write code in C++ rather than C (but most C is valid C++, so... :slight_smile: ).

Marciokoko:
Sorry to ask this, I'm sure some curious George has asked it in the past but I'm not even sure how to ask it so I wasn't sure how to search for it either.

I understand the code I write, more or less. :slight_smile:

Just the same as I understand the Objective-C and Swift code I write in Xcode.

And I know that somehow that code gets turned into 1s and 0s. And that those 1s and 0s are basically "turned" into on and off current flows.

But the whole part in the middle is still kinda hazy. Could someone explain how my code in the Arduino IDE gets turned into something the chip can read? I have heard the chip comes with a bootloader "burned in" but I'm not sure what that means and how that converts my code into chip or machine code.

Well, since nobody actually answered your QUESTION: every microprocessor and microcontroller runs on machine code. Here's a VERY simplified example... imagine that you have a display device located at memory address 0x1000 and you want to display a string on it. You would write some C code like this:

#include <stdint.h>
#include <string.h>

void show_greeting(void) {
    volatile char *display = (volatile char *) 0x1000; /* memory-mapped display */
    const char *string = "Hello there";
    uint8_t size = strlen(string);

    for (uint8_t x = 0; x < size; x++) {
        *display++ = *string++;   /* copy one character to the display */
    }
}

Now, when compiled into assembly language, it might look something like this (note this is Motorola 68HC11 code, but the idea is the same):

size    dc.b    0          ;one byte for "size", initially 0

string  dc.b    $48,$65,$6c,$6c,$6f,$20,$74,$68,$65,$72,$65,$00  ; string

        ldab   #0         ;initialize string length to 0
        ldx    #string    ;point to string
strlen  ldaa   ,x         ;get a byte
        beq    disp       ;if zero, end of string
        incb              ;not zero, increment "b" register
        inx               ;increment pointer "x" register
        bra    strlen     ;go again
disp    stab   size       ;store string length
        ldx    #string    ;point to string
        ldy    #$1000     ;point to display
loop    ldaa   ,x         ;get a byte from string
        staa   ,y         ;send it to display
        inx               ;increment source pointer
        iny               ;increment destination pointer
        dec    size       ;decrement "size" (string length)
        bne    loop       ;loop until size is zero, then done

Finally, when the assembly is assembled into binary (the actual machine code), you get this:

00                // "size", a variable stored in memory
48 65 6C 6C 6F 20 //
74 68 65 72 65 00 // this is the string
C6 00             // C6 = "Load the B register with the byte that follows (00)"
CE 00 01          // CE = "Load the X register with the 16 bit word that follows (0x0001 the string)"
A6 00             // A6, 00 = "Load the A register with the byte that X points to"
27 04             // 27 = "Branch forward 4 bytes if result is zero"
5C                // 5C = "Increment B, that is, B=B+1"
08                // 08 = "Increment X, that is, X=X+1"
20 F8             // 20 = "Branch backwards 8 bytes"
D7 00             // D7 = "Store the value of the B register in location 00"
CE 00 01          // CE = "Load X with the 16 bit value that follows (the string start)"
18 CE 10 00       // 18, CE = "Load Y with the 16 bit value that follows (the display address 0x1000)"
A6 00             // A6, 00 = "Load the A register with the byte that X points to"
18 A7 00          // 18, A7, 00 = "Store the A register contents to where Y points (the display)"
08                // 08 = Increment X (X=X+1)
18 08             // 18, 08 = Increment Y (Y=Y+1)
7A 00 00          // 7A = "Decrement what is at address 0x0000 (size)"
26 F3             // 26 = "Branch backwards 13 bytes (to loop) if size is not zero"

Now, another example. Imagine you are a microprocessor and you know that the hex number "0x10" means "Pick up something from somewhere", 0x20 means "Put down something somewhere", 0x30 means "decrement count", and "0x40" means "brick". You also know that if you see a "10" or "20" opcode, you will expect to see a "what" and a "where" immediately after (if not, you get confused and "crash"). However, if you see a "30" you know that only one thing follows... "what to decrement".

So I will take a piece of paper and write "144" on it and keep it by me. On another paper I will write this and hand it to you:

loop 10, 40, PILE
     20, 40, BOX
     30, COUNT
     IF COUNT IS NOT 0, go back to loop

So, with this "program" you will pick up one brick from the pile and put it in the box, count 144 down to 143, and since 143 is not zero you will move another brick. When you are done, you will have moved 144 bricks from the pile into the box and my piece of paper will have a "0" written on it (the paper is a "memory location" where "count" is stored).

Make sense?
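If it helps, here's that same brick-moving "program" written out as a tiny C sketch of the fetch/decode/execute loop (all the opcode values, addresses and names are invented for illustration):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    enum { PILE, BOX, COUNT };            /* three "pieces of paper" */
    uint8_t memory[3] = { 144, 0, 144 };  /* bricks in pile, bricks in box, count */

    /* the "program": 0x10 = pick up, 0x20 = put down, 0x30 = decrement.
       0x10 and 0x20 take a "what" (0x40 = brick) and a "where";
       0x30 takes only a "what to decrement". */
    const uint8_t program[] = { 0x10, 0x40, PILE,
                                0x20, 0x40, BOX,
                                0x30, COUNT };
    do {
        for (size_t pc = 0; pc < sizeof program; ) {
            uint8_t opcode = program[pc++];                  /* fetch */
            switch (opcode) {                                /* decode and execute */
                case 0x10: pc++; memory[program[pc++]]--; break; /* take a brick */
                case 0x20: pc++; memory[program[pc++]]++; break; /* drop a brick */
                case 0x30: memory[program[pc++]]--;       break; /* count down   */
            }
        }
    } while (memory[COUNT] != 0);       /* "IF COUNT IS NOT 0, go back to loop" */

    printf("box holds %d bricks\n", memory[BOX]);            /* prints 144 */
    return 0;
}

The switch is the "decode" step: the opcode tells the loop how many operand bytes follow, just like the paper instructions above.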

Thinking in the other direction, you might also wonder how a bunch of 1s and 0s stored in flash memory end up implementing something like a computer program. That's ... harder.

It's relatively easy to design a digital electronic circuit that implements a simple boolean logic operation like AND or OR (a "gate".)

And not too hard to figure out how to arrange some simple gates to implement a more complicated function like a storage cell (flipflop) or something that adds two binary numbers.

Those can be combined into counters, registers, or memories. And you can imagine an "arithmetic/logic unit" (ALU) that combines several of those more complicated units and performs some operation (add, subtract, invert, etc.) on two sets of inputs, selected by a third set of inputs. The inputs are selected from a bank of registers, by more logic bits.
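To make the "gates into an adder" step concrete, here's a small C sketch (not how any real chip is designed, just the logic): a 1-bit full adder built from nothing but AND, OR and XOR, chained into an 8-bit ripple-carry adder.

#include <stdio.h>
#include <stdint.h>

/* One bit of addition, using only gate operations. */
static void full_adder(uint8_t a, uint8_t b, uint8_t carry_in,
                       uint8_t *sum, uint8_t *carry_out) {
    *sum       = (a ^ b) ^ carry_in;                 /* two XOR gates */
    *carry_out = (a & b) | ((a ^ b) & carry_in);     /* two ANDs and an OR */
}

/* Eight full adders in a row: a ripple-carry adder. */
static uint8_t add8(uint8_t x, uint8_t y) {
    uint8_t result = 0, carry = 0;
    for (int bit = 0; bit < 8; bit++) {
        uint8_t sum;
        full_adder((x >> bit) & 1, (y >> bit) & 1, carry, &sum, &carry);
        result |= (uint8_t)(sum << bit);
    }
    return result;
}

int main(void) {
    printf("100 + 44 = %d\n", add8(100, 44));        /* prints 144 */
    return 0;
}

Chaining the carry from one adder into the next is exactly the kind of "combining simple gates into bigger units" being described here.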

And then you have a memory that contains all those bits that tell the ALU what to do, along with more logic to tell it which memory location to look at for the next operation. At that point, you're almost there!

An actual computer will contain thousands (or millions) of gates (not including the memories), so it's very difficult to imagine its operation at a very primitive level. But it's not TOO bad if you gradually build up a hierarchy...

I always thought there was a little man in there, maybe a relative of Maxwell's famous tiny little friend. And in the ATtiny chips the man would be even littler and tinier.

Thanks guys!

Your answers are awesome!

Oh, memory lane..

Little Computer People

westfw:
Thinking in the other direction, you might also wonder how a bunch of 1s and 0s stored in flash memory end up implementing something like a computer program. That's ... harder.

It's relatively easy to design a digital electronic circuit that implements a simple boolean logic operation like AND or OR (a "gate".)

Actually, the "ones and zeros" thing goes right along with my example above. For example, the first character of the string to print is "H" which is "0x48" or in binary "0b01001000".

All the microprocessor does (using either an internal state machine or microcode) is encounter the first opcode upon reset. That opcode tells the processor what to do, and the processor also knows if byte(s) immediately after the opcode are required to define the operation.

For example, an instruction like "DEX" (decrement X) is called an "inherent" instruction because it contains all the processor needs to know to carry out the instruction... that is, "Decrement X by one".

Other instructions such as "load immediate" (LDAA #$48) tell the processor "load the A register with the contents of memory directly following the opcode." So the instruction would be "0x86" (LDAA immediate) and the next byte would be "0x48", which tells the processor WHAT to copy into the A register.

Some processors (like the Motorola 68xx series) also use "prebytes". For example, a "Load X immediate" means "Load the X register with the next two bytes following the instruction" (since X is 16-bit), and as opcodes, it would look like this:

CE 10 00 // load X with the value "0x1000".

But if you instead wanted to load the Y register, the assembler would add a "prebyte" to the code, allowing the re-use of the original opcode. That is, the instruction "Load Y with the value 0x1000" would look like this:

18 CE 10 00 // load Y with the value "0x1000".

Upon seeing the "18" special opcode, the processor knows to "do whatever follows as if it were the X register, but do it to Y instead".

This avoids making the opcode map huge with special codes for EACH operation (ideally the map should have 256 entries or fewer, so that each opcode fits in a single byte).
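As a toy illustration of that decoding, using the two opcodes from the examples above (the loop is invented for illustration, not how the silicon actually does it):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* 0xCE = "load X immediate"; the 0x18 prebyte redirects it to Y */
    const uint8_t code[] = { 0xCE, 0x10, 0x00,          /* LDX #$1000 */
                             0x18, 0xCE, 0x20, 0x00 };  /* LDY #$2000 */
    uint16_t X = 0, Y = 0;

    for (size_t pc = 0; pc < sizeof code; ) {
        int use_Y = 0;
        if (code[pc] == 0x18) {          /* prebyte: same opcode, other register */
            use_Y = 1;
            pc++;
        }
        if (code[pc] == 0xCE) {          /* load immediate, 16-bit operand */
            pc++;
            uint16_t value = (uint16_t)((code[pc] << 8) | code[pc + 1]);
            pc += 2;
            if (use_Y) Y = value; else X = value;
        }
    }
    printf("X = 0x%04X, Y = 0x%04X\n", (unsigned)X, (unsigned)Y);  /* 0x1000, 0x2000 */
    return 0;
}

One opcode, one prebyte, two registers - that's the whole trick.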

It took me quite a long time to actually "get" what's going on inside a processor. Book after book I read talked about "opcodes" and "operands", etc... but what good is that if you don't know what an opcode or operand IS?

That's why I give such a long-winded and detailed explanation... so that the reader can actually "get it".

MAS3:
Oh, memory lane..

Little Computer People

Memory lane? The very first computer I ever had, my friend and I BUILT, using an 8008 processor, a VDG (video display generator) chip whose part number I can't recall, 256 bytes of SRAM and toggle switches for the address and data bus, plus a criss-cross NAND gate flip-flop debounced pushbutton to bring the WR line low (to enter a byte into memory) and a reset button.

It took us three weeks of wire wrapping (and re-wrapping mistakes) to build it, then three days to finally get it to work. The very first program initialized the stack pointer, interrupt vectors and the VDG chip, then sent the letter "A" to the monitor (a hacked 12 volt, 9 inch B&W TV set) and finally went into a "here jump here" loop.

For the thousandth time, we hit the reset button and hoped for the best.

The thrill of seeing the TV raster jump, vertically roll once as it locked onto the VDG sync pulses, and finally an overbright, bloomed letter "A" in the upper left corner, I will never forget.

Such were the "good old days" (when Radio Shack actually had wall-to-wall parts to buy, not tiny little understocked drawers).

@Krupski... I remember the first frequency counter I built... Eight VFD tubes (1/2 inch tall, 15 V devices) and a lot of CMOS CD4XXXAE devices. I used a 74S90 for 2 to 90 MHz and an 11C90 to go to 500 MHz.
It was a wonder, and you could always buy a Moccasin Kit for $5.00, too...

So Many Parts and so little time to learn.. (Sigh).

Doc

Re the lower-level explanations...

...something that shouldn't be left out of the discussion - which is rather important - is the role of the clock, and timing diagrams.

Basically, inside the processor, the main clock is used to generate multiple other clocks (which may be faster or slower) - which, when applied properly to the CPU/ALU/etc. circuitry, keep those pieces synchronized so that bits get read, pushed into a register, operated on, and pushed out to memory, then the address stepped to the "next" address as indicated (usually by another register).

If you look into "bare" CPUs (take a look at the datasheet for an 8-bit CPU like the 6805 or Z80), you'll find a "timing diagram" - a linear representation of square waves and their sequencing that indicates when and how memory is read and written, when address and data lines are active, etc. It can be a very, very complex dance.

I honestly don't know what such a diagram looks like for a modern processor (I'm not even sure if you can get a datasheet for modern 64-bit PC processors - at least easily - I haven't actually looked!) - but it is sure to be complex.

You can sometimes find timing diagrams on simpler parts that you might use with an Arduino - such as shift registers and similar. I don't think there is much of a timing diagram, though, for the ATmega328 or others - simply because everything is "on-board" and there isn't any way to add peripheral devices to the address/data bus of the CPU in the controller. There may be one for the ATmega2560 - since you can add SRAM to it - but I am not certain there...

I honestly don't know what such a diagram looks like for a modern processor

You can find some timing diagrams in an AVR datasheet. For example, those AVRs that support external data memory would have a familiar-looking timing diagram.

Some processors (like the Motorola 68xx series) also use "prebytes". ...
This avoids making the opcode map huge with special codes for EACH operation

It might have more to do with being an "upgrade" of a pre-existing architecture that didn't have a "Y register."
One of the big differences (real, rather than marketing) between a RISC CPU like the AVR and a CISC CPU like the 68xx is that a RISC CPU will tend to have a single (or very limited) set of instruction formats. No prebytes, no different-length instructions depending on the arguments, etc.

In the old days, a well-designed CPU architecture had a lot of elegance to it. You can look at the PDP-11 instruction format, and there's a set of bits for the operation, a set for the source, a set for the destination, some bits for modes... With a bit of training, you can almost see how the hardware has to work, but you can also see that it was designed for humans (assembly language programmers) and by humans. An AVR, by contrast, is "obviously" designed with the aid of CAD tools. The same bits might be there, but they're scattered throughout the instruction (presumably in ways that make the hardware smaller and more efficient...)
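To illustrate (field widths from memory, so treat these as approximate): a PDP-11 two-operand instruction packs each field into one contiguous group of bits, while an AVR arithmetic instruction scatters the register bits around:

PDP-11 MOV:  0001 SSSSSS DDDDDD    4-bit opcode, 6-bit source, 6-bit destination, each contiguous
AVR ADD:     0000 11rd dddd rrrr   the 5-bit source register is split: its high bit "r" sits at bit 9, its low four bits at bits 3..0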

The same bits might be there, but they're scattered throughout the instruction (presumably in ways that make the hardware smaller and more efficient...)

Or not; with microcode I think it can be totally random.

RISC machines tend not to use microcode.