What's in .trampolines? How does it get referenced?

From an exported assembly listing…

00000300 <__trampolines_end>:
__trampolines_start():
     300: 49 74         andi  r20, 0x49 ; 73
     302: 27 73         andi  r18, 0x37 ; 55
     304: 20 6e         ori r18, 0xE0 ; 224
     306: 6f 74         andi  r22, 0x4F ; 79
     308: 20 3a         cpi r18, 0xA0 ; 160
     30a: 2d 28         or  r2, r13
     30c: 20 4e         sbci  r18, 0xE0 ; 224
     30e: 6f 74         andi  r22, 0x4F ; 79
     310: 68 69         ori r22, 0x98 ; 152
     312: 6e 67         ori r22, 0x7E ; 126
     314: 20 77         andi  r18, 0x70 ; 112
     316: 65 20         and r6, r5
     318: 63 61         ori r22, 0x13 ; 19
     31a: 6e 20         and r6, r14
     31c: 64 6f         ori r22, 0xF4 ; 244
     31e: 20 68         ori r18, 0x80 ; 128
     320: 65 72         andi  r22, 0x25 ; 37
     322: 65 21         and r22, r5

Located immediately after the vector table.
If <__ctors_end> is above __trampolines_start(), you end up with this stuff instead (which makes sense: it’s the standard clearing of r1 and setting up of the stack pointer, called from .init2):

00000100 <__ctors_end>:
__trampolines_start():
../../../../crt1/gcrt1.S:230
 100: 11 24         eor r1, r1
../../../../crt1/gcrt1.S:231
 102: 1f be         out 0x3f, r1  ; 63
../../../../crt1/gcrt1.S:232
 104: cf ef         ldi r28, 0xFF ; 255
../../../../crt1/gcrt1.S:234
 106: cd bf         out 0x3d, r28 ; 61
../../../../crt1/gcrt1.S:236
 108: df e7         ldi r29, 0x7F ; 127
../../../../crt1/gcrt1.S:237
 10a: de bf         out 0x3e, r29 ; 62

But what the bloody hell is the first one? The thing that’s really weird is… that is nonsense code. If it made sense to execute it - okay, fine, it’s code that I don’t know where it came from, but okay, I can deal with that!

It’s things that would be run as part of a comparison or common bit math… but how do they get run? Does my code do those kinds of comparisons? Yeah, probably… but why are they stuffed off over there instead of being where they’re called from? And how does the rest of the sketch use them? It’s absolute nonsense to just run through them sequentially, because some of the registers they modify would be overwritten before any other code could see the result. And yet lots of sketches end up generating this when compiled. Not just on modern AVRs (I have a listing from a '328p kicking around with the same thing)…

  68: 00 00         nop
  6a: 00 09         sbc r16, r0
  6c: 00 03         mulsu r16, r16
  6e: 02 00         .word 0x0002  ; ????
  70: 00 04         cpc r0, r0
  72: 05 08         sbc r0, r5

How the hell are they used? There is no other reference to “trampoline”, no reference to any of these absolute flash addresses, and, unless I am forgetting something very important, there is no instruction that reaches out, runs one instruction, and then comes back on its own. Does anyone have any knowledge to impart here?

In any case, the “problem” is solved, but I would really like to know what the hell this section is and how it works. It IS NOT what the avr-gcc docs say .trampolines contains (jmps to >128k flash addresses that get called by ijmp/icall). These are not jumps, and the flash isn’t larger than 128k…
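One cheap sanity check for a listing like the first one is to ignore the mnemonics and read the raw byte column as ASCII instead. Below is a minimal host-side sketch (plain C for a PC, not AVR code) that does this for the 36 bytes shown at 0x300-0x323. The names `listing_bytes` and `bytes_to_ascii` are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte column copied from the first listing (0x300..0x323), in the order shown. */
static const uint8_t listing_bytes[] = {
    0x49, 0x74, 0x27, 0x73, 0x20, 0x6e, 0x6f, 0x74, 0x20, 0x3a, 0x2d, 0x28,
    0x20, 0x4e, 0x6f, 0x74, 0x68, 0x69, 0x6e, 0x67, 0x20, 0x77, 0x65, 0x20,
    0x63, 0x61, 0x6e, 0x20, 0x64, 0x6f, 0x20, 0x68, 0x65, 0x72, 0x65, 0x21
};

/* Copy n bytes into dst as text, replacing anything outside printable
   ASCII (0x20..0x7e) with '.'. dst must have room for n + 1 chars. */
static void bytes_to_ascii(const uint8_t *src, size_t n, char *dst)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (src[i] >= 0x20 && src[i] <= 0x7e) ? (char)src[i] : '.';
    }
    dst[n] = '\0';
}
```

Calling `bytes_to_ascii(listing_bytes, sizeof listing_bytes, buf)` with a 37-byte `buf` yields `It's not :-( Nothing we can do here!`, which suggests the disassembler is decoding a string constant as if it were instructions, rather than showing code that ever executes. Where that string comes from is not something the listing itself shows.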

The reason for this whole thing is that - if “unrestricted” (“unspecified” “it’s the user’s responsibility to ensure”) writes to flash from the application are to be permitted on a Dx, tinyAVR 0/1/2 or megaAVR 0-series (as opposed to a tools submenu to specify how much flash you want to make writable) the instruction has to execute from “boot” memory. Okay, no problem - you already need to designate at least one page as “boot” memory for ANY self programming to work, even just writing to APPDATA from APPCODE, because if BOOTEND/BOOTSIZE is 0, the whole flash is marked as BOOTCODE, and BOOTCODE can never be written except via UPDI…

So this is about the difference between Tools → Writable Flash options like “Above 112k”, “Above 96k”, and so on, and an “Everywhere” option. The former you’d do with the APPEND/CODESIZE fuses, and in that case code running from the app could write to the flash no problem. But it’s rather restrictive, since the IDE doesn’t provide a means to enter a number, only to choose items from a dropdown menu; one is left with poor granularity, an unreasonably long menu, or both. But if you have Tools → Writable Flash “Everywhere” (except the first 512b), it would be as powerful as writing the flash from within Optiboot, without any of the mess and dependence on exact, specific menu options.

(Experience has shown that tools submenus should be kept as few as possible, because people forget about them and they aren’t saved with the sketch, but rather globally per user. I’ve had week-long conversations with people where I reminded them 3-4 times to check a specific menu option before they finally did, realized that was their problem all along (as I kept saying I thought it was!), and set it correctly. So yeah, the fewer tools menus and the less important they are, the better.)

You can already call into Optiboot to erase or write arbitrary data to arbitrary locations, as long as they are not within the boot section and the write-protect or lock bits (as appropriate to the era of AVR you’re using) aren’t set to prevent that - there’s a hook in Optiboot for everything except Dx. Dx can’t spare the flash for the full magic routine - but there, only the SPM instruction itself is privileged; you can set everything else up from the app section. So in Optiboot I just stuck a 4-byte constant (0x950895F8UL) immediately before the Optiboot version, so you jump to that, and it’s an SPM Z+ (0x95F8) followed by a RETurn (0x9508). Then I just do the normal flash write routine, but instead of executing SPM Z+ myself, I call 0x1FA (0x200, minus 2 for the version, minus 4 for the 2 magic words).
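The arithmetic in that last step can be checked on a PC. This is a minimal sketch, assuming (as on AVR) that the 32-bit constant is stored little-endian, so its low 16 bits form the first instruction word. The helper names are hypothetical and this is not the Optiboot source.

```c
#include <stdint.h>

#define SPM_MAGIC 0x950895F8UL  /* the 4-byte constant quoted above */

/* Low half sits at the lower flash address, so it executes first. */
static uint16_t first_opcode(uint32_t magic)
{
    return (uint16_t)(magic & 0xFFFFu);
}

/* High half is the instruction word that follows it. */
static uint16_t second_opcode(uint32_t magic)
{
    return (uint16_t)(magic >> 16);
}

/* Byte address of the entry point: end of the boot page, minus the
   version bytes stored last, minus the bytes of the magic constant. */
static uint16_t spm_entry_address(uint16_t boot_end,
                                  uint16_t version_bytes,
                                  uint16_t magic_bytes)
{
    return (uint16_t)(boot_end - version_bytes - magic_bytes);
}
```

Here `first_opcode(SPM_MAGIC)` is 0x95F8 (SPM Z+), `second_opcode(SPM_MAGIC)` is 0x9508 (RET), and `spm_entry_address(0x200, 2, 4)` is 0x1FA, matching the numbers in the post.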

So in this configuration (actually, any time SPM from app is allowed on these, because without setting BOOTSIZE, all flash is unwritable BOOTCODE), without bootloader, the first page is BOOTCODE, but is used just like app - except somewhere in it, we need to stick those 2 instructions or that magic 4-byte constant that does the same thing. Unlike a true bootloader, you ensure that the CPUINT.CTRLA IVSEL bit is set before interrupts are enabled in init() so it jumps to (vector offset + 0x0000, rather than vector offset + 0x0200) on interrupt (the application code is compiled without the offset you need at the start of a modern-AVR part with a real bootloader). Acts just like part of the app from there on out.

Then you just need the magic entry point to be somewhere between the end of the vector table and the end of the first page. My first thought was to put it in init and precede it with an rjmp to jump over it… but it didn’t end up on the first page. Strings put in progmem with the F() macro generally get prioritized for lower addresses - they need to be in the low 64k - while .init0 doesn’t, so obviously a properly written compiler would put the things it knows have constraints first (and before the thing that I know has constraints but it doesn’t). I can see this in gcrt1.S (the crt source) in the toolchain source directory, prior to building the toolchain package.

This trampolines section is perfectly positioned, and it looks like the only way I could pull this off without modifying the linker script or packaging alternate gcrt1.o files to link against (which would let one stuff the entry point into the interrupt vector table - which was what I originally wanted to do). And indeed, I can add __attribute__ ((section (".trampolines"))) to my SPM entrypoint function, and it puts it right at the start!

#if (!defined(USING_OPTIBOOT) && defined(SPM_FROM_APP) && SPM_FROM_APP==-1)
void entrypoint (void) __attribute__ ((naked)) __attribute__((used)) __attribute__ ((section (".trampolines")));
void entrypoint (void)
{
    __asm__ __volatile__(
                "rjmp .+4"                "\n\t" // skip over these when this runs during startup
               "EntryPointSPM:"           "\n\t" // this is the label we call
                "spm z+"                  "\n\t" // write r0, r1 to location pointed to by r30,r31, increment r30
                "ret"::);                        // by 2, and then return.
}
#endif

And it does appear to work - so I suspect that whatever the hell this section does, the linker is smart enough to sort it out when I get some of my crap to go at the start. Is that rjmp .+4 unnecessary now, too? Because, as I have convinced myself at least, execution does not proceed normally through that section (since the results would be nonsensical). I just want to better understand what the heck this part does, because its usage seems inconsistent with everything I’ve been able to find about the trampolines section.

Anyway, sorry for the rambling question - if anyone has any wisdom here, it would set my mind at ease. Thanks!

I have heard about trampolines in the context of indirect calls to functions or computed-goto jumps: the linker generates "stubs", and the indirect call jumps to the stub, which contains a direct jump to the desired address.

Such stubs are thus "jump pads", and are sometimes figuratively referred to as 'trampolines' (you bounce off the jump pads).
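
As a loose illustration of that idea in plain C (a host-side analogy only: real trampolines are stubs emitted by the linker, and every name below is invented for the example), the caller holds a pointer to a small stub rather than to the real function, and the stub forwards directly to the target:

```c
/* Stands in for a function the indirect call cannot reach directly. */
static int far_target(int x)
{
    return 2 * x + 1;
}

/* The "trampoline": a stub at a reachable address that goes straight
   to the real target. */
static int far_target_stub(int x)
{
    return far_target(x);
}

typedef int (*handler_fn)(int);

/* The indirect call only ever sees the stub's address. */
static int call_indirect(handler_fn fn, int arg)
{
    return fn(arg);
}
```

With this, `call_indirect(far_target_stub, 20)` bounces through the stub and returns whatever `far_target(20)` returns; the caller never needs the target's own address.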
