Swapping arrays for an interrupt, quickly

This is somewhat related to my other question about arrays and pointers, but for a separate situation now. This time I really do want single-operation copying, but the arrays are guaranteed to be the same length and just need swapping over.

I'll have two sets of arrays which I want to be able to use in both main body code and within an ISR.

I have one array which I want to populate with values from within the ISR (writing only), and then use in the main code (reading only).

I'll have another where I want to populate it in the main code (writing only) and use it (for reading only) in the ISR.

But an ISR could trigger whilst copying is ongoing. The ISR could then find itself processing an array which the main code had only got halfway through filling with the correct values. Or the main code, while taking a non-volatile copy of the other array which the ISR modifies, could copy half of what that array used to say, then the ISR runs, and the rest of the copy reflects new data compared to the first half.

I understand that one can put cli() and SREG=OldSREG code around the copying, but this means that the interrupt is disabled for the whole time it takes to copy the arrays. I don't want this. I want the time for which the interrupt cannot trigger to be as short as possible.

The slow way would be:

uint8_t ArrayA_Main[5]={0,1,2,3,4};
uint8_t ArrayA_ISR[5]={0,1,2,3,4};

//array sets A and B don't have the same data or do the same things, they are 
//separate in every way and not guaranteed to be the same lengths

uint8_t ArrayB_Main[5]={0,1,2};
uint8_t ArrayB_ISR[5]={0,1,2};

//other variables and setup stuff

ISR (PCINT0_vect ){
 uint8_t variable=digitalRead(z);
  //use ArrayA_ISR (reading only)...
 uint8_t whatever=ArrayA_ISR[x];
 digitalWrite(n,ArrayA_ISR[y]);

 //fill ArrayB_ISR (writing only)...
 ArrayB_ISR[thing]=variable;
}
void setup (void){
    PCMSKX |= bit (PCINTX);  // pin 
    PCIFR  |= bit (PCIFX);   // clear any outstanding interrupts
    PCICR  |= bit (PCIEX);   // enable pin change interrupts for D8 to D13
}
void loop (void){
  //do stuff which fills ArrayA_Main(writing only)
  ArrayA_Main[i]=(uint8_t)measurement;

  //read from ArrayB_Main(reading only)
  uint8_t something=ArrayB_Main[k];

  uint8_t OldSREG=SREG;
  cli();//disable interrupts
  for(uint8_t a=0; a<sizeof(ArrayA_Main);a++){//loop counters must start at 0
    ArrayA_ISR[a]=ArrayA_Main[a];
  }
  for(uint8_t b=0; b<sizeof(ArrayB_ISR);b++){
    ArrayB_Main[b]=ArrayB_ISR[b];
  }
  SREG=OldSREG;//restore the previous interrupt state


}

But this means that interrupts are disabled for quite a long time, for the duration of two whole for loops. That is a problem if the arrays are long, as any interrupt event occurring then will have to wait potentially a rather long time before it can run. I want something which preserves the integrity of the arrays, so that neither the interrupt nor the main body ends up with an array where half is from a previous main body or interrupt iteration and half is from the latest one. And I want to do so with only a few clock cycles during which interrupts must be disabled, so that any interrupt arriving during this time can start running almost as fast as if there were no section of the main body where interrupts are disabled.

I understand there is a way with two pointers which swap references with each other (one set of two pointers for the ArrayA situation and another set of two for the ArrayB situation), giving an interrupts-disabled period of only two or three instructions while the array pointers get swapped via an extra temporary variable. I'm sure I've heard of this being done, but I can't seem to find the right keywords online. Most of what searches bring up for terms about swapping arrays and pointers contains for loops doing almost exactly what I show in the slow example, swapping item by item, except with pointers rather than value copies.

Can someone please show me an example of how to quickly swap the arrays between those used by the ISR and those used in the loop code? Needing to get a whole array intact into or out of an ISR without the long delay involved in for loops or memcpy must be a common problem, and this is a direct swapping scenario, not arbitrary copying between multiple possible places.

Thank you

TLDR, I'm afraid. But perhaps ping-pong buffering is what you need. Feed that term to your favorite search engine.

You basically need to define a pointer to the array element type and then point that pointer to the correct array once it is filled.

int a[100], b[100];
int* pForISR;

// fill up array a with data
pForISR = a;

// fill up array b with data while a has 'current data' then make b current
pForISR = b;

In the ISR you just reference pForISR[index] like you would for any array.

Note this is not safe unless you know the array length and restrict pForISR[] references to valid memory, as the compiler cannot check that you are within array bounds.
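A minimal sketch of the idea above for the main-to-ISR direction, under some assumptions: the names bufA, bufB, pForMain, publishToISR and readInISR are made up for illustration, kLen is a hypothetical length shared by both buffers, and the non-AVR branch is only a stand-in so the code can be compiled on a PC. Main fills the buffer the ISR is not using, then makes it current with one pointer store inside a very short critical section (a data pointer is two bytes on AVR, so the store is not a single instruction).

```cpp
#include <stdint.h>
#include <stddef.h>

#if defined(__AVR__)
#include <avr/interrupt.h>
static inline uint8_t irqSaveAndDisable() {
  uint8_t s = SREG;
  cli();
  __asm__ __volatile__("" ::: "memory");  // compiler barrier
  return s;
}
static inline void irqRestore(uint8_t s) {
  __asm__ __volatile__("" ::: "memory");  // compiler barrier
  SREG = s;
}
#else
// Host stand-ins (assumption): there are no interrupts to mask off-target.
static inline uint8_t irqSaveAndDisable() { return 0; }
static inline void irqRestore(uint8_t) {}
#endif

const size_t kLen = 5;                       // hypothetical common length
volatile uint8_t bufA[kLen], bufB[kLen];
volatile uint8_t* volatile pForISR  = bufA;  // buffer the ISR reads
volatile uint8_t*          pForMain = bufB;  // buffer main fills; the ISR never uses it

// Main code: call once the buffer behind pForMain is completely filled.
void publishToISR() {
  volatile uint8_t* filled = pForMain;
  volatile uint8_t* stale  = pForISR;   // only main writes pForISR, so reading it here is safe
  uint8_t oldSREG = irqSaveAndDisable();
  pForISR = filled;                     // the only statement run with interrupts off
  irqRestore(oldSREG);
  pForMain = stale;                     // main may now overwrite the old buffer
}

// ISR side: take the pointer once, then index it with a bounds check.
uint8_t readInISR(size_t index) {
  volatile uint8_t* current = pForISR;
  return (index < kLen) ? current[index] : 0;
}
```

After a swap, pForMain holds the set before last, so main should rewrite the whole buffer before the next publishToISR() call.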

Thanks marco_c
Where would the cli() and SREG=OldSREG commands sit relative to that copying method?

And for the array to be transferred from the interrupt to the main body, you'd run something like

//fill c with data
pForMain=c;

//fill d with data
pForMain=d;

during the interrupt, then have the main read from pForMain? Couldn't it be subject to change half way through in this direction though?

Is ensuring safety here just a matter of ensuring both arrays which are to be swapped are the same length, and knowing what that length is so that no function ever tries to write to an index beyond it? Just like normal array use, the same way as when feeding arrays into functions, where sizeof no longer works because it returns a pointer's size rather than the array's element count?

Thanks gfvalvo, but most of the ping pong buffer results I can find seem more focused on FPGAs and other situations where buffers are in physical hardware?

I learned a long time ago about semaphore registers and not to process data in an interrupt. I simply set a flag, then process it in the main loop. It makes debugging much easier and causes far fewer problems with what will work and what won't.

My problem isn't about how much the interrupt does, it is how fast I need the interrupt to run when triggered. I need to be able to take digital readings on several other pins very quickly indeed after a different pin has changed state. Hence I need to make sure nothing in the main body would cause a potentially long delay between an interrupt event occurring and when the interrupt code can start running. This is why I don't want a set of big for loops between cli() and SREG=OldSREG commands.

Since these are asynchronous events just keep track of the order in the interrupt and process in the main loop.

Yes, correct.

You should only need to disable interrupts around the pointer copy. Make sure that the pointer itself is declared volatile (not just the int it points to) to make it interrupt safe.

For the reverse direction (ISR to main code) I would just use a boolean flag for the main program to pick up that the array has been filled (i.e. set up a signal to the main from the ISR that data is ready). The main loop can then swap pointers (or whatever) and reset the boolean once it has processed the data. This is similar to the suggestion by gilshultz.
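The flag idea for the ISR-to-main direction could look roughly like the sketch below. It is only an illustration under assumptions: rxA, rxB, pIsrWrite, pMainRead, dataReady, isrBody and fetchFromISR are invented names, isrBody stands in for whatever the real ISR does, and the non-AVR branch exists only so the code compiles on a PC. The ISR always fills a complete buffer and sets the flag; main swaps the two pointers and clears the flag with interrupts off for just those few statements.

```cpp
#include <stdint.h>
#include <stddef.h>

#if defined(__AVR__)
#include <avr/interrupt.h>
static inline uint8_t irqSaveAndDisable() {
  uint8_t s = SREG;
  cli();
  __asm__ __volatile__("" ::: "memory");  // compiler barrier
  return s;
}
static inline void irqRestore(uint8_t s) {
  __asm__ __volatile__("" ::: "memory");  // compiler barrier
  SREG = s;
}
#else
// Host stand-ins (assumption): there are no interrupts to mask off-target.
static inline uint8_t irqSaveAndDisable() { return 0; }
static inline void irqRestore(uint8_t) {}
#endif

const size_t kLen = 5;                        // hypothetical common length
volatile uint8_t rxA[kLen], rxB[kLen];
volatile uint8_t* volatile pIsrWrite = rxA;   // buffer the ISR fills
volatile uint8_t*          pMainRead = rxB;   // buffer main reads at leisure
volatile bool dataReady = false;              // set by the ISR, cleared by main

// Stand-in for the ISR body: fill the whole buffer, then raise the flag.
void isrBody(uint8_t sample) {
  volatile uint8_t* w = pIsrWrite;            // take the pointer once
  for (size_t i = 0; i < kLen; i++) {
    w[i] = (uint8_t)(sample + i);
  }
  dataReady = true;
}

// Main code: returns true when pMainRead now points at a fresh, complete set.
bool fetchFromISR() {
  if (!dataReady) return false;               // single-byte read
  uint8_t oldSREG = irqSaveAndDisable();
  volatile uint8_t* fresh = pIsrWrite;
  pIsrWrite = pMainRead;                      // ISR gets the buffer main has finished with
  pMainRead = fresh;
  dataReady = false;
  irqRestore(oldSREG);
  return true;
}
```

If the ISR runs twice before main fetches, the second run simply overwrites the first inside the ISR's own buffer, so main still sees one complete set (the latest).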

You might want to sprinkle in some volatile qualifiers:

volatile int a[100], b[100];
volatile int* volatile pForISR = a;

Note that you need two volatiles on the pointer.
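To spell out what the two qualifiers do, here is a small sketch. pElementsOnly and makeCurrent are names invented for the illustration. The volatile to the left of the * applies to the data pointed at, and the one to the right applies to the pointer variable itself.

```cpp
volatile int a[100], b[100];

// Pointer to volatile int: element reads and writes are always performed,
// but the compiler may still keep the pointer value itself in a register.
volatile int* pElementsOnly = a;

// Volatile pointer to volatile int: the pointer is also re-read from memory
// on every use, so an ISR sees a new value as soon as main has stored it.
volatile int* volatile pForISR = a;

// Main-code side: make 'filled' the current array for the ISR.
// On AVR this store is two bytes wide, so it still needs interrupts
// disabled around it; volatile alone does not make it atomic.
void makeCurrent(volatile int* filled) {
  pForISR = filled;
}
```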

Not at all. Although I do see many of the Google hits point to hardware applications, ping-pong or double buffering is extremely common in software engineering. Basically, the writer part of the code writes to one buffer while the reader part processes the data from another buffer that was previously written. Then they swap. The swap only requires exchanging pointers, not moving data.

Also, as I pointed out in your other thread, what you're trying to do is not "copying" at all and you should probably stop calling it that. Copying means that you start with one instance of a thing and when you're done you have two identical but independent instances of the thing.

What you're trying to do is simply using pointers to access the same data. For example, this is NOT copying:

  int arrayData[10];

  int *ptr = arrayData;

because arrayData[2] and ptr[2] are accessing the exact same memory and the exact same data. It is simply being accessed through two different names.

To meet your requirement for a swap of two array variables, please take a look at the below example,

typedef struct {
  uint8_t buffs[5];
} object_t;

object_t objA;
object_t objB;

object_t  temp = objA;
objA = objB;
objB = temp;

I'm going to post an example of this, as I've written it for actual use, later, hopefully people can then advise if it is both properly volatile (I hear pointers must be made volatile as well as the things they point to? How?) and ISR safe, and also that it isn't going to memory leak or overwrite memory used by other variables.

For now, can I check about

uint8_t OldSREG=SREG;
cli();
//code here
SREG=OldSREG;

vs

ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
 //code here 
}

I had used the atomic block method at one point for one project long ago, but found it crashy (I can't remember exactly how, I think it was something like it blocked an interrupt which arrived at a certain time rather than delaying it until the atomic section was done?) and resolved to use the cli(); method instead.

But I've just seen this warning
https://www.nongnu.org/avr-libc/user-manual/optimization.html#optim_code_reorder
which seems to say that the compiler may choose to shift cli() and interrupt restoration around when optimising, potentially letting lines which should be between them escape to be outside them. Are those memory barrier methods used by default or not when calling cli and SREG?

Should I use atomic blocks again? Or are they genuinely more prone to causing problems, the way it seemed to be last time I tried them, and if so how?
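For comparison, a sketch of the manual save/disable/restore sequence with explicit compiler barriers, so that nothing relies on what cli() does or does not imply. The avr-libc documentation describes ATOMIC_BLOCK(ATOMIC_RESTORESTATE) as doing the same save and restore of SREG. enterCritical, exitCritical, setShared, shared and demoBuf are hypothetical names, and on a PC the sketch models SREG as a plain variable (an assumption made purely so it can be compiled and tried off-target).

```cpp
#include <stdint.h>

#if defined(__AVR__)
#include <avr/io.h>
#include <avr/interrupt.h>
#else
// Host stand-ins (assumption): SREG is modelled as a variable whose bit 7
// plays the role of the global interrupt enable flag.
static volatile uint8_t SREG = 0x80;
static inline void cli() { SREG = (uint8_t)(SREG & 0x7F); }
#endif

#define barrier() __asm__ __volatile__("" ::: "memory")

// Hypothetical helpers, not part of avr-libc.
static inline uint8_t enterCritical() {
  uint8_t old = SREG;   // remember whether interrupts were enabled
  cli();
  barrier();            // nothing below may be hoisted above the cli()
  return old;
}

static inline void exitCritical(uint8_t old) {
  barrier();            // nothing above may sink below the restore
  SREG = old;           // restores the previous state instead of forcing interrupts on
}

volatile uint8_t demoBuf[4];
volatile uint8_t* volatile shared = 0;

// Example of a critical section that is safe to call from inside another one.
void setShared(volatile uint8_t* p) {
  uint8_t old = enterCritical();
  shared = p;
  exitCritical(old);
}
```

Because the old SREG value is restored rather than interrupts being switched on unconditionally, a nested call such as setShared() inside an outer critical section leaves interrupts disabled until the outer section ends.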
Thanks

On the other hand, how would use of a single-byte flag variable be?

Something which, for the main-->interrupt array, tells the interrupt which copy is not currently being edited by the main at the instant the interrupt triggers, and then, for the interrupt-->main array, tells the main if an interrupt took place during copying, and does the copying again if so?

Would these be immune to that risk of compiler reordering?
Thanks

Any further thoughts on my questions in the last post of this thread? Thanks

I want to ask some basic questions:

What exact type of microcontroller are you using?

What would be the longest acceptable latency between logic-level state changes and ISR starts running?

as an example:
0:00.002000: main-code does a noInterrupts() and processes array-data
0:00.005000: logic state-change on interrupt-invoking IO-pin occurs
0:00.012000: main code has finished processing array-data and executes interrupts()
0:00.012070: ISR can start

In this example the latency between the point in time where the state-change on the ISR_trigger-pin occurs and when the ISR starts to run is 0.01207 - 0.005 = 0.00707
= 7.07 milliseconds.

These are just example-numbers.
Can you post the numbers that you have in your application?

If you can't post the example numbers, give an overview of your project.
What is the final purpose of all this?

I would like to clarify what maximum latency is acceptable.
It might turn out that

  • you need a much faster microcontroller because your acceptable latency is 300 nano-seconds
  • the copy-method is way fast enough because the acceptable latency is 20 milliseconds.

Another question is: what is the maximum time between two interrupts occurring?
Is this once every 5 seconds?
Is it after less than a millisecond?

Once these numbers are clear, the whole discussion may become moot, because either

  • the copy-method is way fast enough because the latency caused by the copy-method is 10 times shorter than the acceptable latency, or
  • you need a much faster microcontroller because your acceptable latency is too low

best regards Stefan

A possibility to investigate with memory barrier ➜ read about __sync_synchronize(); I think it's implemented in GCC

maybe consider something like this

volatile bool mutexFlag = false;

void acquireMutex() {
  while (true) {
    noInterrupts();               // Disable interrupts
    __sync_synchronize();         // Memory barrier
    if (!mutexFlag) {
      mutexFlag = true;
      __sync_synchronize();       // Memory barrier
      interrupts();               // Re-enable interrupts
      break;
    }
    __sync_synchronize();         // Memory barrier
    interrupts();                 // Re-enable interrupts and allow other interrupts to execute
    yield();
  }
}

void releaseMutex() {
  noInterrupts();                 // Disable interrupts
  __sync_synchronize();           // Memory barrier
  mutexFlag = false;
  __sync_synchronize();           // Memory barrier
  interrupts();                   // Re-enable interrupts
}

void setup() {
  // •••
}

void loop() {
  // Start of Critical section: Acquire the mutex before accessing shared resources
  acquireMutex();

  // Access shared resources here
  // •••

  // End of critical section: release the mutex after accessing shared resources
  releaseMutex();

  // Other non-critical code
}

This might be okay on AVR, since __sync_synchronize() is only a compile-time fence (in the absence of caches, out-of-order execution, multiple cores, etc.).
However, on non-AVR targets, you should use the standard library atomics and fences, e.g. std::atomic_signal_fence - cppreference.com.
This is because C++ has a well-defined memory model, and rigorously defines the behavior and synchronization and memory order guarantees for all atomic operations and fences.

A full memory fence like __sync_synchronize() is serious overkill: for interrupt handlers, you should only need compile-time barriers. (*)
Even for multithreaded applications, you only need atomic exchange/load-acquire and store-release operations to implement a mutex, not full barriers. See e.g. Correctly Implementing a Spinlock in C++ | Erik Rigtorp

In the context of interrupts, a mutex is essentially useless, because the ISR cannot possibly acquire it without deadlocking.


(*) Since both the access of the volatile variable mutexFlag and the call to (no)interrupts() (access of a volatile register) have observable side effects, you don't need any additional ordering constraints in this case.
However, volatile does not guarantee any memory order, so it cannot be used for multithreaded applications.
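As an illustration of the compile-time-only fences mentioned above, here is a rough sketch for non-AVR targets that have the C++ standard library. The names shared, ready, publish and tryRead and the length kLen are assumptions made up for the example. It guards a plain (non-volatile) buffer with a flag, and uses std::atomic_signal_fence so the compiler cannot move the buffer accesses across the flag accesses. It is meant for a handler running on the same core as the main code, not for multithreading.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLen = 8;       // hypothetical buffer length
uint8_t shared[kLen];                 // plain data, no volatile
std::atomic<bool> ready{false};       // true while 'shared' holds a complete set

// Main code: rewrite the buffer, with the flag lowered for the duration.
void publish(const uint8_t* src) {
  ready.store(false, std::memory_order_relaxed);
  std::atomic_signal_fence(std::memory_order_seq_cst);  // copy stays below this point
  for (std::size_t i = 0; i < kLen; i++) {
    shared[i] = src[i];
  }
  std::atomic_signal_fence(std::memory_order_seq_cst);  // copy stays above this point
  ready.store(true, std::memory_order_relaxed);
}

// Handler side: only read the buffer when the flag says it is complete.
bool tryRead(uint8_t* dst) {
  if (!ready.load(std::memory_order_relaxed)) return false;
  std::atomic_signal_fence(std::memory_order_seq_cst);  // reads stay below the flag check
  for (std::size_t i = 0; i < kLen; i++) {
    dst[i] = shared[i];
  }
  return true;
}
```

The fences emit no instructions; they only constrain the compiler, which is why they are cheaper than a full barrier like __sync_synchronize().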

I'm finding discussions online about how the compiler will often reorder instructions as compared to how they are shown in the code.

So

Flag=0;
//do array copying
Flag=1;

set up in a scenario where when the interrupt runs and wants to read (from the main code) that array it first checks and ensures Flag==1, otherwise it exits

could end up with the compiler deciding to set Flag=1; before doing the copying actions. Would making Flag AND the array variable volatile protect against this? What about memory barriers placed between Flag=0; and the start of the copying, and after the copying before Flag=1;? Do they actually work properly for ATMEGA and ATTINY?

Flag=0;
asm volatile ("" : : : "memory");
//do array copying
asm volatile ("" : : : "memory");
Flag=1;

How would this be?

As global variables we have


#include <util/atomic.h>

#define barrier()  asm volatile("" ::: "memory")

volatile uint8_t ArrayIntToMain[40]={0};
volatile uint8_t ArrayIntToMainMain[40]={0};

volatile uint8_t MtoIArray[40]={0};

volatile uint8_t Type1Array[]={0,0,0};
volatile uint8_t Type2Array[]={0,0,0,1,0,};

volatile uint8_t Type1ArrayNew[]={0,0,0}; 
volatile uint8_t Type2ArrayNew[]={0,0,0,1,0};

volatile uint8_t Type1ArrayOld[]={0,0,0}; 
volatile uint8_t Type2ArrayOld[]={0,0,0,1,0};

volatile uint8_t OldOrNewArrays=0; //0 for use Old, 1 for use new

volatile uint8_t NewIntDone=0;


then within the ISR we have, amongst other things

NewIntDone=1;
ArrayIntToMain[0]=thing;
ArrayIntToMain[1]=other;
//other filling of ArrayIntToMain

// and we have
if(condition){
  if(OldOrNewArrays==1){
    memmove(MtoIArray,Type1ArrayNew,sizeof(Type1Array));
  }else{
    memmove(MtoIArray,Type1ArrayOld,sizeof(Type1Array));
  }
}else{
  if(OldOrNewArrays==1){
    memmove(MtoIArray,Type2ArrayNew,sizeof(Type2Array));
  }else{
    memmove(MtoIArray,Type2ArrayOld,sizeof(Type2Array));
  }
}

//then we use MtoIArray within this ISR

and then within the main loop of the code we have

if (NewIntDone==1){

    barrier();
    NewIntDone=0;
    //now copy message from interrupt to main code
    //point A
    for (uint8_t i=0; i<sizeof(ArrayIntToMain) ; i++){
      ArrayIntToMainMain[i]=ArrayIntToMain[i];
    }
    barrier();
    if(NewIntDone!=0){//if another new one arrived while we were copying
      //point B
        for (uint8_t i=0; i<sizeof(ArrayIntToMain) ; i++){
          ArrayIntToMainMain[i]=ArrayIntToMain[i];
        }
      
    }
    barrier();
    NewIntDone=0;
    barrier();
    for (uint8_t i=0; i<sizeof(ArrayIntToMain) ; i++){
      ArrayIntToMain[i]=0;
    }
}

//make copies of arrays which will be fed from main code to interrupt
  OldOrNewArrays=0;
  barrier();
  for (uint8_t i=0; i<sizeof(Type1Array) ; i++){
      Type1ArrayNew[i]=Type1Array[i];
  }
  for (uint8_t i=0; i<sizeof(Type2Array) ; i++){
      Type2ArrayNew[i]=Type2Array[i];
  }
  barrier();
  OldOrNewArrays=1;
  barrier();
  for (uint8_t i=0; i<sizeof(Type1Array) ; i++){
      Type1ArrayOld[i]=Type1Array[i];
  }
  for (uint8_t i=0; i<sizeof(Type2Array) ; i++){
      Type2ArrayOld[i]=Type2Array[i];
  }
  barrier();

//do other stuff like processing data of ArrayIntToMainMain, and like filling up data for Type1Array and Type2Array

With the one exception of if we get three interrupts happening in very close succession, particularly the later two of the three, so the second arrived mid-way during the copying just after point A and the third arrived mid-way during the copying just after point B, is there any way this code can end up with a non-atomically-copied (partly from one iteration of filling it, partly from another) version of MtoIArray or a non-atomically-copied version of ArrayIntToMainMain?

Does this achieve the same as would be achieved by having sections in the main loop wrapped in atomic blocks which perform the copying both ways, except in this case there aren't any periods where an arriving interrupt would be delayed in executing due to an atomic block?

Thank you

Yes, both the flag and the array being volatile is sufficient to ensure correct ordering for interrupt handlers.

What about memory barriers placed between Flag=0; and the start of the copying, and after the copying before Flag=1;? Do they actually work properly for ATMEGA and ATTINY?

For single-core CPUs, you only need compile-time memory barriers in this case, e.g. atomic_signal_fence, but an inline assembly memory clobber should do the trick as well.

Synchronization with signal handlers is tricky, see std::signal - cppreference.com for details.

This code is not correct, it has data races, as you indicate in your post.

No, the version with atomic blocks would be race-free.
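If disabling interrupts really must be avoided for the interrupt-to-main copy, one alternative to the single re-check in the posted code is to retry until no interrupt has run during the copy, using a counter the ISR bumps after each complete fill (similar in spirit to a seqlock). This is only a sketch under assumptions: generation, mainCopy, isrBody and takeSnapshot are invented names, isrBody stands in for the real ISR, and it relies on the ISR never being interrupted by main code and on a one-byte counter being read in a single access on 8-bit AVR.

```cpp
#include <stdint.h>
#include <stddef.h>

const size_t kLen = 40;
volatile uint8_t ArrayIntToMain[kLen];  // written only by the ISR
uint8_t mainCopy[kLen];                 // hypothetical: main's private snapshot
volatile uint8_t generation = 0;        // hypothetical: bumped by the ISR after each fill

// Stand-in for the ISR body: fill the whole array, then bump the counter.
void isrBody(uint8_t first) {
  for (size_t i = 0; i < kLen; i++) {
    ArrayIntToMain[i] = (uint8_t)(first + i);
  }
  generation = (uint8_t)(generation + 1);
}

// Main code: returns true if a new, consistent snapshot is now in mainCopy.
bool takeSnapshot() {
  static uint8_t lastSeen = 0;
  if (generation == lastSeen) return false;   // nothing new since last time
  uint8_t seen;
  do {
    seen = generation;                        // counter value before copying
    for (size_t i = 0; i < kLen; i++) {
      mainCopy[i] = ArrayIntToMain[i];
    }
  } while (generation != seen);               // an ISR ran meanwhile: copy again
  lastSeen = seen;
  return true;
}
```

The trade-off is that the loop can spin for a while if interrupts arrive faster than one copy completes, and the counter wraps after 256 fills, so this only suits cases where interrupts are much rarer than that during a single copy.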