
Topic: Bit-band Cortex M3 - fast bit modifications in RAM explained

hubmartin

Hi, I've made a video explaining the bit-band functionality of Cortex M3 and M4 processors. I also compare the compiled assembler output for the AVR and ARM architectures. I hope someone will find this interesting. I also have other interesting content on my YouTube channel. Let me know what you think.

https://www.youtube.com/watch?v=h78DyF1NOio


ODwyerPW

Interesting, I was just reading about this today as I pored over the Cortex-M7 documents. Cortex-M7/M7F do not support bit-banding.

I thought that perhaps the Arduino core internal libraries might use that technique on the DUE or ZERO for reading/writing I/O. But considering the issues regarding interrupts occurring during the bit-band process (race conditions) and the fact that GPIO might lie above the bit-band permitted region, I imagine the coders decided to play it safe.


Thank you very much for sharing.
I want a simple life in Mexico... nothing more.

Paul Stoffregen

Quote
But considering the issues regarding interrupts occurring during the bit-band process (race conditions)
Bitband operations are atomic, not only at the instruction level, but also at the bus access level.  That means they're even atomic with respect to DMA.
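
To make the race under discussion concrete, here is a host-side model in plain C (not hardware code) of a flag byte that main code updates while an interrupt touches a different bit. The function names and the 0x01/0x02 bit assignments are made up for illustration. The first function models an ordinary load/OR/store sequence that is interrupted halfway; the second models the case where the main code's update is a single indivisible write, which is the property a bit-band store provides.

```c
#include <stdint.h>

/* Models "flags |= 0x01" done as load / OR / store, with an interrupt
 * that sets bit 1 arriving between the load and the store. */
uint8_t rmw_interrupted(uint8_t flags)
{
    uint8_t tmp = flags;   /* main: load the current value */
    flags |= 0x02;         /* interrupt: sets bit 1 in memory */
    tmp |= 0x01;           /* main: modifies its stale copy */
    flags = tmp;           /* main: store, overwriting the interrupt's change */
    return flags;
}

/* Models the same situation when main's update cannot be split:
 * the interrupt runs either before or after it, never in the middle. */
uint8_t single_write_interrupted(uint8_t flags)
{
    flags |= 0x02;         /* interrupt: sets bit 1 */
    flags |= 0x01;         /* main: one indivisible single-bit update */
    return flags;
}
```

Starting from 0, the first model ends with only bit 0 set (the interrupt's bit is lost), while the second keeps both bits.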



westfw

Mostly bit banding is not used for gpio because the gpio peripherals have their own equivalent features that are at least theoretically more powerful and easier to use.

westfw

Expanding on the previous comments, the GPIO ports on ARM chips usually have "set bit" and "clear bit" registers that operate similarly to bit-banding, but can (for example) set or clear multiple bits at one time.
I've always sorta wanted to write a bit-banding based implementation of digitalWrite/digitalRead; it's not immediately obvious whether it could be "better" than a GPIO based implementation in some way.  (Although frankly, the current Due implementation of digitalWrite/Read is not at all limited by the raw IO speed, but rather by other inefficiencies.  Sigh.)

(although since CM0 and CM0+ do not generally support bit-banding (and apparently not the M7 either?), the problem is less interesting.)

(as far as I can guess, bit-banding was introduced in CM3 specifically to compete with the fast bit-access instructions available on many 8bit CPUs.  But it's never really caught on.)
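
The "set bit" and "clear bit" registers mentioned above can be pictured with a small host-side model. This is only a sketch of the semantics, not real peripheral code: `gpio_model_t`, `gpio_write_set` and `gpio_write_clear` are hypothetical names, and on real hardware the OR/AND-NOT happens inside the peripheral when software writes a mask to the register, so software never performs a read-modify-write itself.

```c
#include <stdint.h>

/* Host-side model of a GPIO port's output state. */
typedef struct {
    uint32_t odsr;   /* current output levels, one bit per pin */
} gpio_model_t;

/* Models a write to a "set bit" register: every 1 in mask drives that pin high,
 * 0 bits leave their pins untouched. */
void gpio_write_set(gpio_model_t *port, uint32_t mask)
{
    port->odsr |= mask;
}

/* Models a write to a "clear bit" register: every 1 in mask drives that pin low,
 * 0 bits leave their pins untouched. */
void gpio_write_clear(gpio_model_t *port, uint32_t mask)
{
    port->odsr &= ~mask;
}
```

A single write with mask 0x05 changes pins 0 and 2 together, which a bit-band access (one bit per store) cannot do.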

julyjim

Quote
Hi, I've made a video explaining the bit-band functionality of Cortex M3 and M4 processors. I also compare the compiled assembler output for the AVR and ARM architectures. I hope someone will find this interesting. I also have other interesting content on my YouTube channel. Let me know what you think.

https://www.youtube.com/watch?v=h78DyF1NOio


Your article is very helpful and appreciated. However, can you supply a link to just the code?
I know it is in the video, but I have not figured out how to copy just the code text, sorry.

I am looking into implementing SPI on PIO and it was suggested that bit banging may work.

Thanks
Jim

joeblogs

- I'm about to set up a buffer in the bit-band region. How can I tell if another part of the code uses that address space?

- Is it safe to assume that if the addresses read as 0x0 at runtime, they are not used?

- Are all register addresses set to zero on reset and then set up as needed?




images from
Yiu, Joseph. The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors.

great book with details on most subjects

westfw

Quote
- I'm about to set up a buffer in the bit-band region. How can I tell if another part of the code uses that address space?
Huh?  It doesn't matter...

Quote
- Is it safe to assume that if the addresses read as 0x0 at runtime, they are not used?
Probably not.

Quote
- Are all register addresses set to zero on reset and then set up as needed?
Definitely not; register initial contents will be as described in the datasheet, and are NOT always zero.   There's a section in the datasheet (called "Register Mapping") for each peripheral showing the post-reset state.   For example, the RTT "mode register" resets to 0x8000.
 I'm not sure what this has to do with bit-banding, though.

You seem to misunderstand how bit-banding actually works - you don't specifically get a piece of memory that is to be accessed via bit-banding, but rather you get bit-wise access to pieces of NORMAL memory that can still be accessed via the normal address space as well.  So you need to allocate it via normal methods, and just ACCESS it via the bit-band alias region pointers.  The bit-band region typically includes ALL of the on-chip SRAM, so this is pretty easy:

Code: [Select]

/*
 * Allocate some chunks of memory, to be accessed as bits...
 */
uint32_t someBits[256/32];  // 256 bits.
uint32_t *lotsOfBits = (uint32_t *) malloc(320*240/8);  // screen-sized array of bits.

/*
 * Create pointers into the bit-band alias region for each chunk of memory.
 * Each 32-bit word in the alias region maps to a single bit, so the alias
 * address is 0x22000000 + (byte offset from 0x20000000) * 32.
 */
volatile uint32_t *someBits_bitp = (volatile uint32_t *) (0x22000000 + (((uint32_t)someBits) - 0x20000000) * 32);
volatile uint32_t *lotsOfBits_bitp = (volatile uint32_t *) (0x22000000 + (((uint32_t)lotsOfBits) - 0x20000000) * 32);

/*
 * Access single bits
 */

uint32_t aBit = someBits_bitp[12];   // get bit 12.
someBits_bitp[200] = 1;     // set bit 200.
lotsOfBits_bitp[y*320+x] = 1;  // set a bit on our "screen" (320 bits per row)


(this is NOT tested code, BTW...)
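
The address mapping behind those alias pointers is plain arithmetic and can be checked on a PC. The sketch below assumes the standard Cortex-M3/M4 SRAM layout (bit-band region starting at 0x20000000, alias region starting at 0x22000000); the macro and function names are illustrative, not from any particular vendor header. Each bit of the region gets its own 32-bit alias word, so a byte offset is scaled by 32 and a bit number by 4.

```c
#include <stdint.h>

#define BITBAND_SRAM_REF  0x20000000u  /* start of the SRAM bit-band region */
#define BITBAND_SRAM_BASE 0x22000000u  /* start of the SRAM alias region */

/* Returns the alias-region address of bit `bit` (0..7) of the byte at
 * `byte_addr`, which must lie inside the SRAM bit-band region. */
uint32_t bitband_sram_alias(uint32_t byte_addr, uint32_t bit)
{
    return BITBAND_SRAM_BASE + (byte_addr - BITBAND_SRAM_REF) * 32u + bit * 4u;
}
```

On the target, the result would be cast to `volatile uint32_t *` and dereferenced; writing 1 or 0 to that word sets or clears the single underlying bit. Indexing an alias pointer as an array of words, as in the code above, is the same calculation because consecutive bits are 4 bytes apart in the alias region.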


joeblogs

Maybe I should have asked: is there any way to tell if any part of the Arduino core uses the bit-banding memory addresses?

ard_newbie

There are at least two ways to get atomic access to a variable.

One of them is to use the atomic assembler instructions LDREX/STREX, which are designed for this purpose. If you are not familiar with inline assembler, use packaged atomic functions, e.g. here:

https://github.com/commaai/openpilot/blob/master/board/inc/core_cmInstr.h

The other one is to declare bit-field variables, e.g.:

Code: [Select]

struct foo_Type {
  uint8_t flip1: 1; // 1 bit
  uint8_t flip2: 4; // 4 bits
  uint8_t pad1: 3;   // 3 bits, total 8 bits
  uint8_t pad2;     // 1 byte, total 2 bytes
  uint16_t pad3;    // 2 bytes, total 4 bytes
  uint32_t pad4;   // 4 bytes, total 8 bytes
};

foo_Type foo;

void setup() {
  
  Serial.begin(250000);
  
  while (true) {

    foo.flip1++;
    Serial.print(foo.flip1);
    Serial.print("  ");
    foo.flip2++;
    Serial.println(foo.flip2);
    delay(2000);
  }
}

void loop() {
}
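
As a sketch of the first option (exclusive-access instructions) without hand-written assembler, standard C11 atomics can express the same idea. Compilers targeting Cortex-M3 typically lower these operations to LDREX/STREX retry loops, but that is a toolchain detail worth confirming in the generated assembly, and it assumes a compiler that ships `<stdatomic.h>`. The helper names `set_flag_bits` and `clear_flag_bits` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically ORs mask into *flags and returns the value held before the update. */
uint32_t set_flag_bits(_Atomic uint32_t *flags, uint32_t mask)
{
    return atomic_fetch_or(flags, mask);
}

/* Atomically clears the bits in mask and returns the value held before the update. */
uint32_t clear_flag_bits(_Atomic uint32_t *flags, uint32_t mask)
{
    return atomic_fetch_and(flags, ~mask);
}
```

Unlike a bit-band store, these can change several bits of a word in one atomic step, at the cost of a short retry loop.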



westfw

Quote
Maybe I should have asked: is there any way to tell if any part of the Arduino core uses the bit-banding memory addresses?
I'm pretty sure that none of the Arduino core uses bit banding.  It's ... relatively incompatible with C.

I still don't understand why it matters.  Bit-banding gives you convenient and atomic access to individual bits, but it's no faster or better than the access you already had to bytes, and ARM libraries are historically and consistently "not good" at being space-efficient (after all, even a tiny ARM chip has LOTS of RAM compared to the bit-addressable memory area of an AVR or 8051, and I always figured it was a feature designed to appease users of such "classic" micros more than something that was truly useful...)


joeblogs

Cheers, interesting link.

Could you explain this syntax for me, the part in quotes?

uint8_t flip1":" 1;

ard_newbie


AFAICT bit-fields are implemented with word- or byte-level mask and shift instructions. They give you the ability to declare structure fields smaller than a character width.
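
To illustrate the mask-and-shift description, here is roughly what incrementing a 4-bit field amounts to when written out by hand. This is a sketch under an assumed layout: it supposes the field occupies bits 1 to 4 of its byte (a 1-bit field below it), which is how GCC usually packs bit-fields on little-endian ARM, but bit-field placement is implementation-defined. `increment_flip2` and the two macros are made-up names.

```c
#include <stdint.h>

#define FLIP2_SHIFT 1u                     /* assumed position of the 4-bit field */
#define FLIP2_MASK  (0x0Fu << FLIP2_SHIFT) /* bits 1..4 of the byte */

/* Returns `byte` with its 4-bit field incremented (wrapping at 16) and all
 * other bits unchanged: a read, modify, write of the whole byte. */
uint8_t increment_flip2(uint8_t byte)
{
    uint8_t field = (uint8_t)((byte & FLIP2_MASK) >> FLIP2_SHIFT); /* extract */
    field = (uint8_t)((field + 1u) & 0x0Fu);                       /* increment, wrap */
    return (uint8_t)((byte & (uint8_t)~FLIP2_MASK) |               /* keep other bits */
                     (uint8_t)(field << FLIP2_SHIFT));             /* insert new field */
}
```

Written this way it is visible that the update is a multi-step read-modify-write of the containing byte.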

Coming back to the subject of your post: the concern is that one of your interrupts could change a variable while the main loop() is using that same variable to decide which branch to take.

You can prevent an interrupt from modifying this variable between two instructions as follows.

Use NVIC_SetPriority() at the beginning of your sketch to give the handler a low preemption priority (preemption_priority), then enclose the "atomic" code like this:

__set_BASEPRI(preemption_priority << (8 - __NVIC_PRIO_BITS)); // only IRQs with a higher preemption priority than preemption_priority are permitted
// Your code
__set_BASEPRI(0); // remove the BASEPRI masking
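
The shift in that expression exists because Cortex-M parts keep the priority in the top bits of an 8-bit field. A small worked example of the arithmetic, runnable on a PC, is sketched below; `basepri_value` is a hypothetical helper, and the number of implemented priority bits (4 in the example values) is an assumption that depends on the chip, so check `__NVIC_PRIO_BITS` in the device header.

```c
#include <stdint.h>

/* Computes the value to write to BASEPRI for a given priority level.
 * prio_bits is the number of priority bits the chip implements (1..8). */
uint32_t basepri_value(uint32_t preemption_priority, uint32_t prio_bits)
{
    /* Priority levels are left-aligned in an 8-bit field. */
    return (preemption_priority << (8u - prio_bits)) & 0xFFu;
}
```

With 4 priority bits, level 5 becomes 0x50. A non-zero BASEPRI blocks every interrupt whose priority value is numerically equal to or greater than it (that is, of the same or lower urgency), and writing 0 disables the masking.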


westfw

(I don't think that there are any C compilers that use bit-banding, and it's probably just as well.   The idea of being able to have individual bits as variables, even to the point of having pointers to bits, is interesting.  But not at all widespread (and "only" addresses 8 megabits, which is nothing compared to what C compilers try to support today.))


joeblogs

Cheers for the replies, guys.

This is how I was looking at setting it up, straight out of an Atmel note:
Code: [Select]

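#define BITBAND_SRAM_REF 0x20000000
#define BITBAND_SRAM_BASE 0x22000000
#define BITBAND_SRAM(a,b) (BITBAND_SRAM_BASE + ((a)-BITBAND_SRAM_REF)*32 + ((b)*4)) // Convert SRAM address

// MAILBOX (the SRAM address of the mailbox word) must be defined elsewhere
// Mailbox bit 0
#define MBX_B0 *((volatile unsigned int *)(BITBAND_SRAM(MAILBOX,0)))
// Mailbox bit 7
#define MBX_B7 *((volatile unsigned int *)(BITBAND_SRAM(MAILBOX,7)))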

Code: [Select]

unsigned int temp = 0;
MBX_B0 = 1; // Word write
temp = MBX_B7; // Word read


If you can get the job done in one atomic move, surely that has to be better than increasing interrupt latency.
