Topic: PORTD on Due?!

MartinL

#15
Oct 25, 2015, 10:28 pm Last Edit: Oct 25, 2015, 10:54 pm by MartinL
Hi Zemovski

Here's another example of how to use the ODSR register. The contents of this register are output directly onto the pins of the Due, just like your UNO/Mega example. This is in contrast to using the SAM3X8E's set (SODR) and clear (CODR) output registers.

Code:
void setup() {
  REG_PIOD_OWER = 0x0000000F;     // Enable writes to lowest 4-bits of PortD's 32-bit, Output Data Status (ODSR) register: (B00000000000000000000000000001111)
  REG_PIOD_OER =  0x0000000F;     // Set the lowest 4-bits of PortD to outputs: (B00000000000000000000000000001111)
}

void loop() {   
  // Using the register definition: REG_PIOD_ODSR, could also use pointer to register contents: PIOD->PIO_ODSR
  REG_PIOD_ODSR = PIO_ODSR_P0;   // Set the output on P0 (digital pin 25) high, other outputs low: (B00000000000000000000000000000001)
  REG_PIOD_ODSR = PIO_ODSR_P1;   // Set the output on P1 (digital pin 26) high, other outputs low: (B00000000000000000000000000000010)
  REG_PIOD_ODSR = PIO_ODSR_P2;   // Set the output on P2 (digital pin 27) high, other outputs low: (B00000000000000000000000000000100)
  REG_PIOD_ODSR = PIO_ODSR_P3;   // Set the output on P3 (digital pin 28) high, other outputs low: (B00000000000000000000000000001000)
}
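
To make the role of OWER concrete, here is a small host-side model (not Due code) of how a write to ODSR is filtered by the write-enable mask. `MockPio` and `writeOdsr` are hypothetical names invented for this sketch; on the real chip the masking is done by the PIO hardware itself.

```cpp
#include <cstdint>

// Hypothetical host-side model of one PIO port; this is not the real Pio struct.
struct MockPio {
  uint32_t ower = 0;  // bits that ODSR writes are allowed to change
  uint32_t odsr = 0;  // current output data status
};

// Model of a write to ODSR: only bits enabled in OWER take the new value,
// every other bit keeps its previous state.
inline void writeOdsr(MockPio& pio, uint32_t value) {
  pio.odsr = (pio.odsr & ~pio.ower) | (value & pio.ower);
}
```

With the mask set to 0x0000000F as in the sketch above, a write of any 32-bit value can only disturb P0 to P3, which is what keeps the other port D pins safe.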

The reason the Due has the SODR and CODR registers (which are absent from the Uno/Mega) is that they make it possible to set and clear outputs without having to logically OR your bit mask with the register to set, or logically AND the inverse bit mask with the register to clear. These logical operations require a read-modify-write to be performed for register bit manipulation. For example on the Mega:

Code:
PORTA |= _BV(PORTA3);        // Set bit by logical OR with bit mask
PORTA &= ~_BV(PORTA3);       // Clear bit by logical AND with inverse bit mask

On the Due:

Code:
REG_PIOD_SODR = PIO_SODR_P3;      // Set bit without logical OR with bit mask
REG_PIOD_CODR = PIO_CODR_P3;      // Clear bit without logical AND with inverse bit mask
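
As a sketch of how the set/clear style can be wrapped, the helpers below are templated on the port type so the same code can be exercised on a PC with a stand-in struct. `MockPort`, `setPins` and `clearPins` are hypothetical names; on a Due you would pass `*PIOD`, assuming the core's `Pio` struct exposes `PIO_SODR` and `PIO_CODR` as used elsewhere in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Due's Pio struct (only the two fields used here).
struct MockPort {
  volatile uint32_t PIO_SODR = 0;  // write 1s to drive pins high
  volatile uint32_t PIO_CODR = 0;  // write 1s to drive pins low
};

// Drive every pin whose bit is set in 'mask' high: a single store, no read.
template <typename Port>
inline void setPins(Port& port, uint32_t mask) {
  port.PIO_SODR = mask;
}

// Drive every pin whose bit is set in 'mask' low: a single store, no read.
template <typename Port>
inline void clearPins(Port& port, uint32_t mask) {
  port.PIO_CODR = mask;
}
```

On hardware the call would look something like `setPins(*PIOD, PIO_SODR_P3);`. Neither helper reads the port first, which is the point of having separate set and clear registers.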

The issue with using the whole of port D on the Due is that some of its pins are used by the Arduino core for other functions; for example, port D pins P4 and P5 are used for the Serial3 pins TX3 and RX3.

westfw

Quote
Can we write something like:
#define LEDBIT (1<<27, 28, 01, 02)
or
REG_PIOB_SODR = 0x1 << 27 << 28 << 01 << 02;
It would look like:
Code:
REG_PIOB_SODR = (1<<27) | (1<<28) | (1<<1) | (1<<2);
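
If the list of bits gets long, a small compile-time helper can build the mask from the bit numbers. This is only a sketch: `bitMask` is a hypothetical name, not something provided by the Arduino core. Note also that `01` and `02` in the question are octal literals in C, which happen to equal 1 and 2, whereas `08` would not compile.

```cpp
#include <cstdint>

// Base case: no bits listed gives an empty mask.
constexpr uint32_t bitMask() { return 0; }

// Hypothetical helper: OR together one bit per listed bit number (0..31).
// Written as a single return statement so it stays valid C++11 constexpr.
template <typename... Rest>
constexpr uint32_t bitMask(unsigned first, Rest... rest) {
  return (uint32_t{1} << first) | bitMask(rest...);
}

// Mask for bits 27, 28, 1 and 2, evaluated entirely at compile time.
constexpr uint32_t kLedBits = bitMask(27, 28, 1, 2);  // 0x18000006
```

The result could then be written in one go, for example `REG_PIOB_SODR = kLedBits;`.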

Code:
 for (float i=0; i<5000000; i++)
  {
  // digital 1, on port D pins 0:
  REG_PIOD_CODR = 0xFFFFFFFF;
  REG_PIOD_SODR = 0x00000001;
  }


"float"??  Really?   You're lucky that your loop terminates - at some point, adding 1 to a large floating point number doesn't change the number...

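The point about float counters can be checked on any machine. A 32-bit IEEE 754 float has a 24-bit significand, so from 16,777,216 (2^24) upward adding 1 no longer changes the value; the loop above only terminates because 5,000,000 is below that limit. The sketch below assumes `float` is IEEE 754 single precision, and `incrementStalls` is a hypothetical name.

```cpp
// Returns true if adding 1 to 'start' leaves a 32-bit float unchanged.
inline bool incrementStalls(float start) {
  volatile float next = start + 1.0f;  // volatile forces the result to be stored as a float
  return next == start;
}
```

`incrementStalls(5000000.0f)` is false, while `incrementStalls(16777216.0f)` is true, so a loop bound above 2^24 would never be reached. An integer counter such as `uint32_t` avoids the problem.
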
It should go slightly faster if you use the PIOD  pointer syntax:
Code:
PIOD->PIO_CODR = 0xFFFFFFFF;
PIOD->PIO_SODR = 0x00000001;


Note that you're setting four times as many bits in the Due example. (and yes, it matters.)


Quote
285 million commands divided by 17.99 seconds gives us 15,842,134 executed "PORTD=" commands per second!
I guess that's almost believable, for certain optimized cases.  The AVR "out" instruction takes one cycle, assuming you've managed to pre-load other registers with the values you are outputting.


Quote
[Each Due C output instruction] takes 4 CPU clock ticks, and each command separately takes 2 CPU clock ticks to execute!
Also about right, for predictable circumstances...

Don't be fooled into thinking that the Due is only 33% faster for "arbitrary" code.   Some things it will do much faster than an AVR.   Other things, it won't.  Pin IO is rarely highly optimized.  Once your pin toggle rate gets to about 10MHz, you get degraded signals anyway - pin drivers just don't go that fast...
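
The arithmetic behind the "only about a third faster" figure can be written out as follows. The cycle counts are the ones quoted in this thread (1 cycle per AVR "out", 4 cycles per Due port write), not independent measurements, and the names are hypothetical.

```cpp
// Writes per second given a CPU clock and a cost in clock cycles per write.
constexpr double writesPerSecond(double clockHz, double cyclesPerWrite) {
  return clockHz / cyclesPerWrite;
}

// Figures quoted in the thread, taken as assumptions.
constexpr double kAvrRate = writesPerSecond(16e6, 1.0);  // 16 million writes/s
constexpr double kDueRate = writesPerSecond(84e6, 4.0);  // 21 million writes/s
constexpr double kSpeedup = kDueRate / kAvrRate;         // about 1.31
```

This only describes back-to-back port writes; it says nothing about arbitrary code.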


Quote
the Uno might be a better choice for the project of a simple function generator.
Could be.   The only thing that isn't "simpler" on an AVR than an ARM is probably integer math with values larger than 8 bits...  (and, for a "function generator", you might run into RAM shortages creating waveform tables.)
(a non-simple function generator on Due would use DMA with the D2A peripheral, allowing it to do the background wave output in ... about ZERO cpu cycles...)

westfw

Quote
we have to investigate how to do it using the port register variables themselves which takes a lot more time and effort.
Compared to learning assembly language on PICs or 4bit CPUs?  And you wish there was a function "setport(D, val)" instead of "PORTD = val"?

Personally, I think the Arduino focus on individual "pins" instead of byte-wide (or larger) "ports" was one of the most important things that they did, because binary values are confusing, especially for beginners.  (And this conversation is a good example: Zemovski has experience with 4-bit assembly, much more experience than the Arduino target audience, but he couldn't figure out how to put together a multibit constant in C.)  "Advanced" boards like MEGA and Due have grouped similar-function "pins" (all PWM, all Serial, etc) on the board, which tends to be disruptive of port-wide access (since microcontrollers tend to scatter such peripherals across multiple ports.)  But it's not "wrong."
(I do wish that the "uncommitted" (right-edge connector) had grouped pins port-wise, though.  (although, see also: http://community.atmel.com/forum/samd10-pinout-venting ))


MrAl

#18
Oct 26, 2015, 03:20 am Last Edit: Oct 26, 2015, 04:02 am by MrAl
Hi,


Wow, i thought i was the only one old enough here to remember those days :-)

I learned my first assembler on the 8004, although it was all in theory with that and the 8008, until i actually started programming after building a board with the 8080 processor, which mimicked today's controller boards.
I then moved to the Z80 which i loved because it had so many instructions, and built a controller board around that with I/O and UVEPROM and the like.  That was a whole board about 4 inches by 6 inches that did about the same as the single chip Atmel 328 today used in the Uno.

So i guess i will have to investigate how to use the ports directly too then as i know for sure i will have to do that at some point.

I am happy to see digitalWrite() in the core library though, as that helps many people including myself get started on the Arduino boards.  On the 16MHz Uno I clocked it at 115kHz max though, and by direct port access on the Uno i could get better than 2MHz; i think i got up to 5MHz or something one time, just toggling the pins for the tests.  So we see the limitation right away there.
If there was also a port access instruction that would be nice too, although it would probably have to be considered 'safe' to run with the other instructions and initializations.  That could be what is holding it back.
Alternately, the instruction would have to be placed in an "Advanced" instruction category for advanced users.

I am still not sure i know how to do it on the Due yet so i'll have to read this thread over again :-)

[LATER]
I found a library called "DigitalWriteFast" that supposedly writes 20 times faster than digitalWrite().  I am not sure if it is meant for the Due yet however.
The download file is "DWF.zip" and found on the web after a search for "digitalwritefast".



westfw

digitalWriteFast() (on most CPUs that it is implemented on) relies on the arguments being constants, because on many 8-bit microcontrollers there is a special instruction for setting or clearing an IO pin ("sbi port, bitno" on AVR).  But typically, the port number and bit number are hard-coded into the clear or set instruction, and if you wanted either the port, bit number, or result value to be variable, you would be faced with much more complex code.  On top of that, the Arduino code has special cases for "what if the user had previously done an 'analogWrite()' on the pin that they now want to do a digitalWrite() on?"  All complexities included, the AVR digitalWrite function is about 50 instructions.

Now, the ARM isn't specifically designed to be a microcontroller, and it doesn't have ANY special instructions.   It doesn't even have the concept of "IO pins."   The peripherals are treated just like memory locations, and nothing actually operates directly on memory locations, so the way to change one bit involves loading the memory contents into a register, loading the bit into another register, doing a register/register operation, and storing the results.  That'd be some 4 or more instructions (some of them being double-length, or using data words as well.)  And there are atomicity problems, too.   Thus, on ARM chips, the peripherals are made smarter, with some bitwise smarts built into the peripheral instead of the cpu.  (thus the "bit set", "bit clear", and (sometimes) "bit toggle" registers, as well as the "actual bits".)   Even with the smarter peripherals, setting or clearing a single IO pin on ARM is a much more expensive operation than on an 8-bit microcontroller.  (usually still about 4 instructions, probably 6 clocks or so.)
BUT!  Remember that there were no special cases for bit number or port or value.  That means that expanding a function like digitalWrite() to accept variables is EASIER on an ARM than on an AVR.  In fact, digitalWrite() for Due is only about 30 instructions, instead of 50...
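
The atomicity problem with load/modify/store can be shown with a host-side model. In this sketch the `isr` argument stands in for an interrupt handler that fires in the middle of the sequence; `rmwSetBit` and `sodrSetBit` are hypothetical names, and the second function only models what a "bit set" register does inside the peripheral.

```cpp
#include <cstdint>

// Plain read-modify-write, with a hook that stands in for an interrupt
// firing between the load and the store.
template <typename Interrupt>
uint32_t rmwSetBit(uint32_t& reg, uint32_t mask, Interrupt isr) {
  uint32_t copy = reg;  // load
  isr();                // the handler runs here and may change reg
  copy |= mask;         // modify the now stale copy
  reg = copy;           // store: overwrites whatever the handler did
  return reg;
}

// Model of a "bit set" register: the CPU only stores the mask, and the
// peripheral ORs it into its own latch, so there is no stale copy to lose.
template <typename Interrupt>
uint32_t sodrSetBit(uint32_t& reg, uint32_t mask, Interrupt isr) {
  isr();        // an interrupt before the single store is harmless
  reg |= mask;  // done by the peripheral in one step
  return reg;
}
```

If the handler sets bit 1 while the main code sets bit 0, the read-modify-write version ends with only bit 0 set, while the set-register version ends with both.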

So... It's complicated.


MrAl

Hi,

It sounds like what you are saying is that the port is not a real port but an emulated port.
They do call it a microcontroller though.


The more i learn about the ARM used for the Due the more i don't like certain things about it.
The 84MHz clock seems like a scam, because the 328P has an equivalent 64MHz then, and 84 isn't a world apart from 64, it's more like only 32 percent bigger.  Wow, gee, whoopee :-)

I guess the main advantage then is the 32 bit operations functionality which includes lots of math functions, which is very nice.  The faster ADC is nice too, and of course who can complain about the 12 bit DAC.
I see it also has a built in RTC (Real Time Clock) which also has a calendar function, so i hope that's not too difficult to access through the Arduino IDE.



westfw

Quote
It sounds like what you are saying is that the port is not a real port but an emulated port.
Not really.  The IO ports just aren't as "tightly coupled" to the CPU as they are in an AVR.  The AVR has about 20 of the nominal 256 opcodes dedicated to dealing with the IO ports.  The ARM has none; it just treats the IO ports like any other memory location.

Quote
The 84MHz clock seems like a scam, because the 328P has an equivalent 64MHz then
Only if you're talking about IO.  Highly optimized IO.  digitalWrite() looks about twice as fast as on a 16MHz AVR; that's not great compared to what you might expect from the clock rate, but it's more than 32%.

The biggest advantage is memory size.  96k RAM on a Due, and 512k of program space.  To do that on an AVR takes a tremendous amount of overhead (even that "RAM-saving" use of PROGMEM and F(string) is computationally expensive.)  To use it all on the Due is "built in" to the overhead you're already experiencing.


Quote
lots of math functions
Careful, though.  Another one of the things that is NOT as much improved as you might expect is floating point performance.  For operations that would invoke highly-optimized 32bit functions on an AVR, you get 64-bit not-optimized functions on Due (and even worse on Zero.)  32bit integer math on a Due ought to scream, though.  32bit operations in a single cycle, including multiply; hardware divide takes a few cycles more (2 to 12 on the Cortex-M3).
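
One way to stay out of the 64-bit routines is to keep literals and variables in single precision. The sketch below assumes the usual toolchain defaults (`double` is 64 bits on the Due, while on AVR it is the same 32-bit type as `float`); the function names are hypothetical.

```cpp
#include <type_traits>

// Single-precision scaling: the 'f' suffix keeps the whole expression in float.
inline float halfSingle(float x) { return x * 0.5f; }

// Same arithmetic with an unsuffixed literal: x is promoted and the multiply
// is done in double before the result is converted back to float.
inline float halfViaDouble(float x) { return static_cast<float>(x * 0.5); }

static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float * float stays float");
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "float * double literal is promoted to double");
```

Both functions return the same value here; the difference is only in which software floating point routines get pulled in on a chip without an FPU.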


westfw

(Hmm.  It turns out Due is not a particularly good example, because digitalWrite() ends up (inefficiently) calling Atmel-provided functions.  I'd bet that Zero's digitalWrite() is faster (at 48MHz) than Due's (at 84MHz)...)

Zemovski

Quote
Hi,


Wow, i thought i was the only one old enough here to remember those days :-)

I learned my first assembler on the 8004, although it was all in theory with that and the 8008, until i actually started programming after building a board with the 8080 processor, which mimicked today's controller boards.
I then moved to the Z80 which i loved because it had so many instructions, and built a controller board around that with I/O and UVEPROM and the like.  That was a whole board about 4 inches by 6 inches that did about the same as the single chip Atmel 328 today used in the Uno.
Mr. Al,

I think I'm younger than you... I did not learn on the 8004.

I learned on Commodore 64, but learned Basic programming language, not assembler.

Then I learned other programming languages, like Java, but gave up on it because in the late '90s the Java language was changing and updating so fast that it was a nightmare...

Assembler came later, first with a simple learning tool, the "500 in 1 Electronic Lab" with its 4-bit MPU,
then on the 8-bit 8051 CPU on the Easy8051 v6 development system from MikroElektronika.

My opinion is that only assembler is the real language; the rest of them are less fortunate derivatives of assembler that can only limit, slow down, and cripple the powerful CPU.

Assembler is the only language that can use 100% of the power and possibilities the CPU provides.

Second to assembler is C language with inline assembly.

Zemovski

Quote
Hi Zemovski

Here's another example of how to use the ODSR register. The contents of this register are output directly onto the pins of the Due, just like your UNO/Mega example. This is in contrast to using the SAM3X8E's set (SODR) and clear (CODR) output registers.
Martin, thank you, will check this code on the weekend.

MorganS

Quote
The faster ADC is nice too, and of course who can complain about the 12 bit DAC.
I see it also has a built in RTC (Real Time Clock) which also has a calendar function, so i hope that's not too difficult to access through the Arduino IDE.
It wouldn't be difficult to access through the IDE but the RTC is impossible to access on the Due hardware. The necessary Vbatt pin is tied to the main power net.

12-bit analog read is also a little optimistic. Going through the Arduino headers usually picks up enough noise to make the last bit basically unusable.

If you really want to use those kinds of features and an analog out that's actually useful, have a look at the Teensy boards. Very easy to program with the Arduino IDE, in fact easier than the Arduino boards themselves. (It doesn't need to close the serial monitor to upload a new sketch.)
"The problem is in the code you didn't post."

westfw

Quote
the RTC is impossible to access on the Due hardware. The necessary Vbatt pin is tied to the main power net.
Just because the battery backup isn't implemented, doesn't make the RTC unusable.  If you wanted a high-effort low power design, you probably shouldn't be using a Due board.  If you want distant-time alarms and calendar access, it'll be fine without a backup power supply.
I don't see any official support for the Due RTC, but there is https://github.com/MarkusLange/Arduino-Due-RTC-Library
I've been experimenting some with SAMD10, and I find it convenient (and compact) to use the RTC for the core millis()/micros() functionality.



Quote
My opinion is that only assembler is the real language, the rest of them are less fortunate derivatives of assembler that can only limit, slow down, cripple the powerful CPU.
Well, go to it!   Here's some code that toggles an ARM pin in one instruction (after some initial setup.)   It's for an ST chip, but you could do something similar for a SAM3x...  https://github.com/WestfW/Minimal-ARM/blob/master/Blink/blink.S


MorganS

You're putting a 32kHz crystal on a Due board? I could not imagine using an RTC without setting it to the real time, but I have a poor imagination.

TheRevva

NB: I snipped out a LOT.
Quote
It should go slightly faster if you use the PIOD pointer syntax:
PIOD->PIO_CODR = 0xFFFFFFFF;
PIOD->PIO_SODR = 0x00000001;
I haven't actually checked the generated machine code, but I am fairly confident that the toolchain would make the PIOD pointer code run either SLOWER than, or at the exact same speed as, the REG_xxxxxx method.
REG_PIOD_CODR and REG_PIOD_SODR are simply memory address constants.  Assigning a value to a memory address is about as fast as you can get on pretty much ANY processor!
Using the PIOD->PIO_CODR and PIOD->PIO_SODR technically involves a bit more work to arrive at the SAME memory address.  PIOD is the base and PIO_?ODR is an offset (or index if you'd prefer) from the PIOD base address.
I would expect that the code generated would be identical in both cases after compiler optimisation and would thus run at the exact same speed, but if optimisation is switched off, then the pointer arithmetic code SHOULD run slower!
Please note that I am NOT trying to state that either method is 'more correct' than the other as that's an ENTIRELY new argument.  I'm simply trying to state that using the REG_* constants is VERY unlikely to be any slower and, if anything, might be a tiny fraction FASTER!
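
The "base plus offset" arithmetic can be sketched on a PC. The register layout and the port D base address below are assumptions taken from memory of the SAM3X8E datasheet and device header, so check them against the core's header before relying on them; `MockPioLayout`, `kPiodBase` and `sodrAddressViaPointer` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout of the first 16 words of the SAM3X PIO register block.
struct MockPioLayout {
  uint32_t PIO_PER, PIO_PDR, PIO_PSR, reserved1;
  uint32_t PIO_OER, PIO_ODR, PIO_OSR, reserved2;
  uint32_t PIO_IFER, PIO_IFDR, PIO_IFSR, reserved3;
  uint32_t PIO_SODR, PIO_CODR, PIO_ODSR, PIO_PDSR;
};

// Assumed base address of port D.
constexpr uint32_t kPiodBase = 0x400E1400u;

// Address reached by "base pointer plus member offset", i.e. PIOD->PIO_SODR.
constexpr uint32_t sodrAddressViaPointer() {
  return kPiodBase + static_cast<uint32_t>(offsetof(MockPioLayout, PIO_SODR));
}
```

Under these assumptions the result is 0x400E1430. Because both the base and the offset are compile-time constants, an optimising compiler can fold the pointer form down to the same single constant address that the REG_* form uses.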

TheRevva

Another 'observation' (since this thread has already wandered around some rather kewl tangential areas)....

I truly like the SODR / CODR methodology used within the SAM PIO subsystem, and with OWER + ODSR we can mimic the 'old-school' methods on other processors, so we have the best of both worlds.
The part that's missing (IMNSHO) is a TODR that would 'toggle/flip' the nominated bit(s) of the port.
It would have been fairly easy to implement on the silicon with simple XOR gates.  (The silicon obviously already possesses a latch holding the current status of the bits output to the pins.)
While we CAN achieve the same end result in our application code, it would have been 'gilding the lily' to have provided a native method to toggle selected pins...
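
One way to get a toggle in application code is sketched below, templated on the port type so it can be tried on a PC. `MockPort` and `togglePins` are hypothetical names; on a Due you would pass `*PIOD`, assuming the `Pio` struct fields `PIO_ODSR`, `PIO_SODR` and `PIO_CODR` used earlier in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Due's Pio struct (only the fields used here).
struct MockPort {
  volatile uint32_t PIO_ODSR = 0;  // current output latch
  volatile uint32_t PIO_SODR = 0;  // write 1s to drive pins high
  volatile uint32_t PIO_CODR = 0;  // write 1s to drive pins low
};

// Software toggle: read the output latch once, then set the selected pins
// that were low and clear the selected pins that were high.
template <typename Port>
inline void togglePins(Port& port, uint32_t mask) {
  const uint32_t current = port.PIO_ODSR;
  port.PIO_SODR = mask & ~current;
  port.PIO_CODR = mask & current;
}
```

Unlike a hardware toggle register this takes a read and two stores, so the selected pins do not all change on the same clock, and an interrupt that alters the same pins between the read and the stores can be overridden.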
