Assembler Routine for ESP32 / ISR

Hello all,

The final goal of this project is to write an ISR in assembler. The current version of the program works (it is not part of today's question). However, the latency is too high (2.5 usec), and there is jitter of up to 2 usec on top, which really hurts. So the intention is to write this ISR in assembler, hoping to get below 2 usec.

On the way there I am struggling with such a routine in general. The first example should call an assembler routine from Arduino. For now the assembler routine does nothing except a nop and a return.

The ESP32 code compiles without warnings or errors and can be uploaded to the ESP, but the ESP (WROOM32) reports a "Guru Meditation Error: Core 1 panic'ed (InstrFetchProhibited). Exception was unhandled".

Please see the sample code below. Does anyone have an idea what is going wrong here?

Assemblertes.ino

#include "freertos/task.h"
#include "esp_system.h"
#include "esp_spi_flash.h"
#include "driver/gpio.h"
#include "esp_intr_alloc.h"

// load the variable from RAM and not from a storage register
volatile int my_a_number asm("a_number") __attribute__ ((used)) = 6;

//extern void Increment_a_Number(void);
extern "C" {
  void Increment_a_Number(void);
}

void setup() {
 Serial.begin(9600); 
}

void loop() {
 Serial.print("Initial number = " );
 Serial.println(my_a_number);
 Increment_a_Number();  // call assembler routine
 Serial.print("Incremented number = " );
 Serial.println(my_a_number);
 delay(1000);
}

Assemblertes.S (in the same directory)

#include <xtensa/coreasm.h>
#include <xtensa/corebits.h>
#include <xtensa/config/system.h>
#include "freertos/xtensa_context.h"
#include "esp_private/panic_reason.h"
#include "sdkconfig.h"
#include "soc/soc.h"
#include "soc/gpio_reg.h"
#include "soc/dport_reg.h"

#define L5_INTR_STACK_SIZE  12
#define LX_INTR_A15_OFFSET 0
#define LX_INTR_A14_OFFSET 4
#define LX_INTR_A13_OFFSET 8

.extern a_number
.global Increment_a_Number

Increment_a_Number:
  nop
  ret 

Is this a school assignment?

hoping to get below 2 usec

Hoping to get what below 2 us?

When the backtrace info is put into the ESP Exception decoder what error is indicated?

Thank you for your answer:
I hope to get the latency below 2 usec, i.e. the time between an interrupt event (e.g. a pin input signal) and the reaction to it (e.g. a pin output toggle). A device with a 240 MHz clock should do much, much better.

However, maybe I should have skipped this additional information, since it is not the problem I am facing at this point.

Hello Idahowalker,
thank you for your answer. But I think I am lost: I do not know how to get additional information. All I can see is the error log that gets printed, and then the message repeats over and over.
I think I am doing something wrong in general (setup of the device? missing libraries?), so I hope someone can give me a hint.

Do you need to put the asm in .text or another section?

Did you try looking up on the internet the words "ESP Exception Decoder"?

Have you looked at the disassembled current ISR code to see if it is the problem?

Latency is usually not caused by an instruction sequence. It's the CPU pushing registers onto the stack, fetching a vector, and all that stuff that takes time. That has nothing to do with the code, and you can't get around it.

Also, most C compilers nowadays are far better assembly language writers than most human programmers. How many years have you been writing assembly? Cumulative hours?

You have to place the ISR into IRAM.

IDK how that is done with assembler.

I dunno. The AVR "attachInterrupt" function adds quite a bit of latency compared to a bare ISR...

Right. Take care of that first and just write the ISR in C++. Then reevaluate the need for an assembly language version.

If latency getting into the ISR (once you've fixed the IRAM issue) is still a problem, you could abandon the Arduino attachInterrupt() paradigm and handle the interrupt vector yourself. However, I have a feeling you won't need to.

Thank you for your hint.

The high latency is a known issue with the ESP32, and part of it seems to be related to the RTOS running on it. The jitter seems to be related to the Arduino libraries or the Arduino environment: I implemented the same ISR in ESP-IDF. Result: a stable 2 usec delay. In the Arduino environment it is 2 usec plus up to approx. 2.5 usec.

All that stuff takes time? With a clock frequency of 240 MHz this would mean about 500 cycles to read and set a pin? That is incredibly slow! With a good micro this should be done in less than 10 cycles.
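The "about 500 cycles" figure can be checked with a few lines of arithmetic. This is only a sketch: `cycles_for` is a hypothetical helper name, and it assumes the core really runs at the full 240 MHz while the latency is measured.

```cpp
#include <cstdint>

// CPU cycles that elapse during latency_ns at a core clock of cpu_mhz.
// (ns * MHz / 1000 = cycles, because 1 MHz is one cycle per 1000 ns.)
constexpr uint64_t cycles_for(uint64_t latency_ns, uint64_t cpu_mhz) {
    return latency_ns * cpu_mhz / 1000;
}

// Figures quoted in this thread, at an assumed 240 MHz core clock.
static_assert(cycles_for(2000, 240) == 480, "2 usec latency");
static_assert(cycles_for(2500, 240) == 600, "2.5 usec jitter");
static_assert(cycles_for(4500, 240) == 1080, "latency plus worst-case jitter");
```

So 2 usec corresponds to 480 cycles, and the worst case of 2 usec plus 2.5 usec of jitter to a little over 1000 cycles.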

I have been working on this topic for a long time, and the most promising approach to reducing the latency might be this project:

https://haydendekker.medium.com/esp32-interrupts-can-only-do-200khz-56f8dbb6a61c

This requires implementing assembler code. I would like to give it a try.

I have to admit my assembly experience is a little rusty: I used it more than 40 years ago on my first "computer", a Sinclair ZX81 with a Z80 processor, and later on the PET and CBM64. I didn't count the hours.

This discussion is nice, but it doesn't answer the original question of how to implement assembler code for the ESP32 in Arduino. Any hints are appreciated.

Thank you for your answer. That is how I did it in the first place:

volatile unsigned long t_ISR_CS;
void IRAM_ATTR ISR_SCK (); 

/**************************************************
 * Clock Interrupt Service Routine
 * On every rising edge of the SPI clock, the
 * output bit is set according to the motor
 * setpoint (setpoint output).
 **************************************************/
void IRAM_ATTR ISR_SCK ()
{
    digitalWrite(Bitbang, bitRead(SollwertMotor, D0_index)); // MSB first
    D0_index--;
    if (D0_index < 0)
    {
        D0_index = 16;
    }
}

This routine works well, apart from a latency of 2 usec plus up to 2.5 usec of jitter.
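The bit order this ISR produces can be checked on a PC without the hardware. The sketch below is not the original code: `bit_read` is a stand-in with the same definition as Arduino's bitRead macro, `next_index` copies the countdown from ISR_SCK, and the 16-bit setpoint type and the test pattern 0x8001 are assumptions, since the declarations of SollwertMotor and D0_index are not shown in the thread.

```cpp
#include <cstdint>

// Host-side stand-in for Arduino's bitRead(value, bit).
constexpr int bit_read(uint32_t value, int bit) {
    return static_cast<int>((value >> bit) & 0x01u);
}

// Index update copied from ISR_SCK: count down, restart at 16 after 0.
constexpr int next_index(int index) {
    return (index - 1 < 0) ? 16 : index - 1;
}

// Assumption: the setpoint is a 16-bit value; 0x8001 is an arbitrary pattern.
constexpr uint16_t kSetpoint = 0x8001;

static_assert(bit_read(kSetpoint, 15) == 1, "MSB of 0x8001 is set");
static_assert(bit_read(kSetpoint, 0) == 1, "LSB of 0x8001 is set");
static_assert(bit_read(kSetpoint, 16) == 0, "bit 16 lies outside a 16-bit value");
static_assert(next_index(0) == 16, "the index restarts at 16, not 15");
```

If the setpoint really is 16 bits wide, restarting at 16 means each frame has 17 index positions (16 down to 0), and the first one always reads as 0. Whether that is intended depends on the protocol on the other end.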

Hello Idahowalker,
not yet, I did not know about it! This might be the right way to find the root cause of the problem. I will do that. Thank you!

Yes, this matches my experience. I did the test in the original ESP-IDF environment:

  • Latency stable at 2 usec (might be related to the RTOS)
  • Latency with AVR: 2 usec + jitter (1 to 2.5 usec)
    Conclusion: the AVR libraries are probably not ideal.
    I assume that in most use cases the AVR attachInterrupt is absolutely sufficient, but unfortunately not in mine. I should not go above 2 usec.

Why did you omit it for the assembly version?

Thank you for your answer.
Not yet! But yes, this would be the correct way to find the root cause. I am afraid this will be a very hard task, since I am looking for intermittent behavior and I do not have a debugger to catch the event.

I assume the problem is the "ret" in your assembly routine. If the caller (your main) used a callx8 to call your assembly routine, the callee must return with the RETW or RETW.N instruction. And the first thing the callee has to do is execute the ENTRY instruction, which rotates the register window (so that the return address ends up in the callee's a0) and sets up the stack frame. The ESP32 works with windowed registers. I would recommend reading the Xtensa ISA manual to understand the complexity of this instruction set.

Here is an example of how to call an assembly function from C in the Arduino environment:

C-Code:

#include <Arduino.h>

// put function declarations here:
extern "C" int assemblyAdd(int a, int b);

void setup() 
{}

void loop() {
  
  volatile uint32_t i = 99;
  volatile uint32_t e = 0;
  

  e = assemblyAdd(i, (uint32_t) 10);

  while(1);
}

Assembly Code:

                .global assemblyAdd

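assemblyAdd:     entry   a1, 48          /* windowed-ABI prologue: rotate the register window, reserve a 48-byte stack frame */
                add     a2, a3, a2      /* a2 = a + b; a2 holds the first argument and also the return value */
                retw.n                  /* windowed return (narrow encoding) that matches the entry above */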

what's the type of D0_index?
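The type matters because of the `D0_index < 0` check in ISR_SCK. The thread does not show the declaration, so the following is only an illustration with a hypothetical helper `step_index`, which repeats the ISR's index update for a signed and an unsigned 8-bit type (it needs C++14 or later for the multi-statement constexpr function):

```cpp
#include <cstdint>

// Index update from ISR_SCK, with the index type left as a template parameter.
template <typename Index>
constexpr Index step_index(Index i) {
    --i;
    if (i < 0) {   // never true when Index is an unsigned type
        i = 16;
    }
    return i;
}

static_assert(step_index<int8_t>(5) == 4, "normal countdown");
static_assert(step_index<int8_t>(0) == 16, "signed index restarts at 16");
static_assert(step_index<uint8_t>(0) == 255, "unsigned index wraps around instead");
```

With a signed type the index restarts as written. With an unsigned type the check never fires, the index wraps to the type's maximum value, and bitRead is then called with a bit position far outside the setpoint.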