Hi, I've made a video explaining the bit-band functionality on Cortex-M3 and M4 processors. I also compare the compiled assembler output for the AVR and ARM architectures. I hope someone will find this interesting. I also have other interesting content on my YouTube channel. Let me know what you think.
Interesting, I was just reading about this today as I pored over Cortex-M7 documents. Cortex-M7/M7F do not support bit-banding.
I thought that perhaps the Arduino core internal libraries might use that technique for the DUE or ZERO for reading/writing I/O. But considering the issues regarding interrupts occurring during the bit-band process (race conditions) and the fact that GPIO might lie above the bit-band permitted region, I imagine the coders decided to play it safe.
ODwyerPW:
But considering the issues regarding interrupts occurring during the bit-band process (race conditions)
Bitband operations are atomic, not only at the instruction level, but also at the bus access level. That means they're even atomic with respect to DMA.
Mostly bit banding is not used for gpio because the gpio peripherals have their own equivalent features that are at least theoretically more powerful and easier to use.
Expanding on the previous comments, the GPIO ports on ARM chips usually have "set bit" and "clear bit" registers that operate similarly to bit-banding, but can (for example) set or clear multiple bits at one time.
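To make the difference concrete, here is a minimal host-side sketch of how such set/clear registers behave. It is only a software model, not a driver: the `port_model` struct and the `write_set`/`write_clear` helpers are hypothetical names, loosely modelled on set-output/clear-output register pairs such as PIO_SODR and PIO_CODR on the SAM3X. On real hardware each call would be a single store to a memory-mapped register.

```c
#include <stdint.h>

/* Hypothetical software model of a GPIO port with set/clear registers. */
typedef struct {
    uint32_t out;   /* current state of the output latch */
} port_model;

/* Writing a mask to the "set" register drives every pin whose bit is 1 high. */
static void write_set(port_model *p, uint32_t mask)
{
    p->out |= mask;
}

/* Writing a mask to the "clear" register drives every pin whose bit is 1 low. */
static void write_clear(port_model *p, uint32_t mask)
{
    p->out &= ~mask;
}
```

A call like `write_set(&p, (1u << 3) | (1u << 5))` changes two pins with one store and leaves the rest untouched, whereas a bit-band write changes exactly one bit per store.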
I've always sorta wanted to write a bit-banding based implementation of digitalWrite/digitalRead; it's not immediately obvious whether it could be "better" than a GPIO based implementation in some way. (although frankly, the current Due implementation of digitalWrite/Read is not-at-all limited by the raw IO speed, but rather by other inefficiencies. (sigh.))
(although since CM0 and CM0+ do not generally support bit-banding (and apparently not the M7 either?), the problem is less interesting.)
(as far as I can guess, bit-banding was introduced in CM3 specifically to compete with the fast bit-access instructions available on many 8bit CPUs. But it's never really caught on.)
hubmartin:
Hi, I've made a video explaining the bit-band functionality on Cortex-M3 and M4 processors. I also compare the compiled assembler output for the AVR and ARM architectures. I hope someone will find this interesting. I also have other interesting content on my YouTube channel. Let me know what you think.
Your article is very helpful and appreciated. However, can you supply a link to just the code?
I know it is in the video, but I have not figured out how to copy just the code text, sorry.
I am looking into implementing SPI on PIO and it was suggested that bit banging may work.
I'm about to set up a buffer in the bit-band region. How can I tell if another part of the code uses that address space?
Huh? It doesn't matter...
Is it safe to assume if during runtime the addresses are set to 0x0 then they are not used?
Probably not.
Are all register addresses set to zero on reset and then set up as needed?
Definitely not; register initial contents will be as described in the datasheet, and NOT only zero. There's a section in the datasheet (called "Register Mapping") for each peripheral showing the post-reset state. For example, the RTT "mode register" resets to 0x8000.
I'm not sure what this has to do with bit-banding, though.
You seem to misunderstand how bit-banding actually works - you don't specifically get a piece of memory that is to be accessed via bit-banding, but rather you get bit-wise access to pieces of NORMAL memory that can still be accessed via the normal address space as well. So you need to allocate it via normal methods, and just ACCESS it via the bit-band region pointers. The bit-band region typically includes ALL of the on-chip memory, so this is pretty easy:
/*
* Allocate some chunks of memory, to be accessed as bits...
*/
uint32_t someBits[256/32]; // 256 bits.
uint32_t *lotsOfBits = (uint32_t *) malloc(320*240/8); // screen-sized array of bits.
/*
* Create pointers into the bit-band alias region for each chunk of memory.
* Each 32-bit word in the alias region maps to a single bit, so the alias
* address is 0x22000000 + 32 * (byte offset from the SRAM base, 0x20000000).
* Simply OR-ing in 0x22000000 would NOT give the right address.
*/
#define BITBAND_ALIAS(p) ((volatile uint32_t *) (0x22000000 + (((uint32_t)(p)) - 0x20000000) * 32))
volatile uint32_t *someBits_bitp = BITBAND_ALIAS(someBits);
volatile uint32_t *lotsOfBits_bitp = BITBAND_ALIAS(lotsOfBits);
/*
* Access single bits
*/
aBit = someBits_bitp[12]; // get bit 12.
someBits_bitp[200] = 1; // set bit 200.
lotsOfBits_bitp[y*240+x] = 1; // set a bit on our "screen"
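For anyone who wants to check the address arithmetic on a PC first, here is a small sketch of the standard Cortex-M3/M4 mapping for the SRAM bit-band region. The function and macro names are my own (hypothetical); the base addresses are the architectural ones (1 MB region at 0x20000000, alias at 0x22000000).

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BITBAND_BASE  0x20000000u  /* start of the 1 MB SRAM bit-band region */
#define SRAM_BITBAND_ALIAS 0x22000000u  /* start of the 32 MB alias region */

/* Alias-word address for bit `bit` (0..7) of the byte at `byte_addr`. */
static uint32_t bitband_alias_addr(uint32_t byte_addr, unsigned bit)
{
    assert(byte_addr >= SRAM_BITBAND_BASE);
    assert(byte_addr < SRAM_BITBAND_BASE + 0x100000u);
    assert(bit < 8);
    /* Each byte takes 32 bytes of alias space; each bit takes one 4-byte word. */
    return SRAM_BITBAND_ALIAS + (byte_addr - SRAM_BITBAND_BASE) * 32u + bit * 4u;
}
```

For example, bit 7 of the last byte in the region (0x200FFFFF) maps to 0x23FFFFFC, the last word of the alias region. Indexing a word pointer at the alias base with `n` therefore reaches bit `n % 8` of byte `n / 8`, which is why the array-style access above works.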
You have got at least 2 possibilities to obtain an atomic access to a variable.
One of them is the use of atomic assembler instructions LDREX/STREX designed for this purpose. If you are not familiar with inline assembler, use packaged atomic functions, e.g. here:
Maybe I should have said: is there any way to tell if any part of the Arduino core uses the bit-banding memory addresses?
I'm pretty sure that none of the arduino core uses bit banding. It's ... relatively incompatible with C.
I still don't understand why it matters. Bit-banding gives you convenient and atomic access to individual bits, but it's no faster or better than the access you already had to bytes, and ARM libraries are historically and consistently "not good" at being space-efficient (after all, even a tiny ARM chip has LOTS of RAM compared to the bit-addressable memory area of an AVR or 8051, and I always figured it was a feature designed to appease users of such "classic" micros more than something that was truly useful...)
AFAICT Bitfields are implemented with word or byte-level mask and rotate instructions. They give you the ability to declare structure fields smaller than a character width.
Coming back to the subject of your post: you need to know whether one of your interrupts could change a variable while the main loop() is using that same variable to decide which branch it should follow.
You can prevent an interrupt from modifying this variable between two instructions as follows:
Use NVIC_SetPriority() at the beginning of your sketch to give the handler a low preemption priority (preemption_priority), then enclose the "atomic" code like this:
__set_BASEPRI( preemption_priority << (8 - __NVIC_PRIO_BITS)); //only IRQ with higher preemption priority than preemption_priority are permitted.
// Your code
__set_BASEPRI(0); // remove the BASEPRI masking
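The shift in that first line is there because the priority level lives in the top bits of the 8-bit BASEPRI field. A small host-side sketch of the arithmetic, where `basepri_value` is a hypothetical helper name and `prio_bits` stands in for the CMSIS `__NVIC_PRIO_BITS` constant (I believe it is 4 on the SAM3X, but check your device header):

```c
#include <assert.h>
#include <stdint.h>

/* Value to write to BASEPRI for a given priority level.
 * prio_bits is the number of implemented priority bits on the chip. */
static uint32_t basepri_value(uint32_t priority, unsigned prio_bits)
{
    assert(prio_bits >= 1 && prio_bits <= 8);
    assert(priority < (1u << prio_bits));
    /* Implemented bits are left-aligned within the 8-bit field. */
    return priority << (8u - prio_bits);
}
```

With 4 priority bits, level 5 becomes 0x50. Note that level 0 yields 0, which is the same value that removes the masking, so this technique cannot block a priority-0 handler.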
(I don't think that there are any C compilers that use bit-banding, and it's probably just as well. The idea of being able to have individual bits as variables, even to the point of having pointers to bits, is interesting. But not at all widespread (and "only" addresses 8 megabits, which is nothing compared to what C compilers try to support today.))
I appreciate everyone's efforts in dealing with bit-banding. I am looking at the datasheet of the Atmel SAM3X SoC, specifically Table 10-6 and Table 10-7 (pages 67-68) and the explanation of bit-banding below them. I have to admit to being a little confused: I was given to understand that the point of bit-banding is to allow atomic RMW, but am I correct in thinking that only one bit at a time is atomic? So many registers have multi-bit fields that one would like to RMW, but what if it's only doing 1 bit at a time?
I note that synchronization primitives exist and are in the HINTS class of instruction so when you mention other methods, would this be one of them? Sorry if it's me being slow but I'm dreadful at coding anything except assembly language. I am just a little disappointed because I had presumed that ANY arbitrary 32-bit mask could be chosen so that atomic RMWs would do everything.
It just looks very limited.
I would just like to thank westfw for his valuable assembly-language project for the Arduino Uno M0. I've got a 32kbit/sec ACELP & 64kbit/sec mono, fixed-point decoders working on the machine (JUST!) and I've learned a lot about optimizing code for Thumb.
You might be interested in the book "The Definitive Guide to ARM Cortex M3 and Cortex M4" by Joseph Yiu. It contains lots of examples written in assembler.
am I correct in thinking that only one bit at a time is atomic?
Yes, I'm pretty sure that ARM Cortex bit-banding is one-bit-at-a-time.
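For a multi-bit field, the exclusive-access route mentioned earlier in the thread is one option. A minimal sketch, assuming a C11 toolchain with `<stdatomic.h>`; `atomic_update_field` is a hypothetical helper name. On ARMv7-M targets, compilers generally turn this kind of loop into an LDREX/STREX retry loop, though that is worth confirming in the disassembly for your toolchain. It is meant for variables in RAM; whether exclusive accesses work on a given peripheral register depends on the chip.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically replace the bits selected by `mask` with the matching bits of
 * `value`. Returns the word as it was before the update. */
static uint32_t atomic_update_field(_Atomic uint32_t *word,
                                    uint32_t mask, uint32_t value)
{
    uint32_t old = atomic_load(word);
    uint32_t desired;
    do {
        desired = (old & ~mask) | (value & mask);
        /* On failure, `old` is refreshed with the current contents and we retry. */
    } while (!atomic_compare_exchange_weak(word, &old, desired));
    return old;
}
```

Unlike bit-banding, this handles any 32-bit mask, at the cost of a retry loop instead of a single store.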
It just looks very limited.
Yes, but there are similar single-bit-only features on 8-bit processors (8051 has a section of bit-addressable memory, and AVR has SBI and CBI (which people are constantly reminding us are SO MUCH FASTER THAN DIGITALWRITE()), plus SBRC, SBRS, SBIC, SBIS instructions). AFAICT, these features are mostly useful for conserving RAM and program space, which was a lot more important back when a "big" chip had 8k of ROM and 128 bytes of RAM, rather than a "small" chip having 16k of flash and 2k of RAM. (Theoretically, anyway. I observe that it's pretty common to use up that extra memory at a furious rate, when it's assumed to be available.)
Interestingly, NXP (Freescale) has a very similar "Bit Manipulation Engine" on some of their chips (Kinetis E02 series, for example) that does permit manipulation of multi-bit fields.
I would just like to thank westfw for his valuable assembly-language project for the Arduino Uno M0.