Why can't I read the state of an output port?

Hi,
I'm new here so please be gentle with me! I'm just starting a project based on a Leonardo. I'm new to the Arduino products but have had experience programming in C although as it is around 9 years since I retired I'm having to get back up to speed with a lot that I've forgotten.

Basically I've run into problems accessing a port as simple digital I/O. The port in question is F, although I'm guessing the same would be true of any of them. I'm using the top 4 bits of this port to drive a 7-segment display via a BCD-to-7-segment decoder and the bottom 2 bits to drive a couple of LEDs. Having initialised all the port bits as outputs and successfully written a value to PORTF I want to later be able to come back and read the state of the output pins so that I can mask off the bits that I don't want to change. From the port description in the ATmega32 manual it would appear that I should be able to do this by reading from PINF but when I do this I get completely random values back. I've #included avr/io.h obviously.

I realise I could work around this by simply creating a global variable to hold a copy of the last value written to the port but I'd rather read the port directly if it's possible so if anyone has any idea why this doesn't work I'd be most grateful.

Try reading PORTF instead of PINF. That's the output register.

Ok, that's something I hadn't thought of trying as the simplified logic diagram shows the output of the PORTF register being read via the PINF synchroniser and register. I'll give it a try though and report back.

Ok, I've tried that and it works!

Many thanks for your help Delta_G.

Are you using a Arduino Mega? The Uno doesn't have a PORTF.

I want to later be able to come back and read the state of the output pins so that I can mask off the bits that I don't want to change

Do you need to read the pin states in order to do this ?

Despite having found a working solution, I think you may be "doing it wrong". :slight_smile:

Why do you need to read the output state at all? You know which bits correspond to which devices don't you? Can't you do something like this?

#define LCD_MASK = 0xF0;
#define LED1_MASK = 0x01;
#define LED2_MASK = 0x02;

...

void loop () {
  ..
  // Disable both LEDs:
  PORTF &= ~(LED1_MASK | LED2_MASK);
  
  // Enable LED2
  PORTF |= LED2_MASK;
  
  // Write a character to the LCD
  PORTF = (PORTF & ~(LCD_MASK))  |  0xA0;
}

You may be clearing / setting bits redundantly, but you're writing entire bytes at a time, so it doesn't impose any penalty over selectively changing bits only when they need changing. Usually, it'll take far fewer cycles to just slam them with whatever value they need to be, regardless of whatever they are.

Hi Jiggy-Ninja, I mentioned in my OP that I'm using a Leonardo.

UKHeliBob, The way I'm doing this is to pass a number to a function that drives the port bits for the BCD display. This number is left-shifted 4 bits to the high nibble of the byte. I then OR in the states of the bottom 2 bits that drive the LEDs before writing the result back to PORTF. To preserve the states of the bottom 2 bits I first need to read the port and save them in a temporary variable with the top 4 bits masked off. Similarly to write the LEDs I need to save the BCD value in a temp variable with the LED bits masked off.

SirNickity, in order to slam the values in I need to know what their current state should be, either by maintaining a global value or reading the current state of the port.

For LEDs why mess with AVR controller specific port i/o?
Why not just use the portable digitalWrite() functions that will work
on any processor?
Yes it will be a bit slower and you can't set all the bits at once, but
neither of those seem necessary for this type of application.

--- bill

Delta_G:
Try reading PORTF instead of PINF. That’s the output register.

Reading the PINF register should do the same, except there is a two clock cycle synchroniser delay between changing the PORT register and the change becoming apparent on the PIN register.

I'll probably use digitalWrite() and digitalRead() where appropriate. Actually they may not be any slower as they probably make use of the SBI and CBI instructions.

I wondered if the synchroniser might be the cause of the problem. The manual advises inserting a NOP between writing to the port and reading it back via the synchroniser. However if this is the cause I would have expected it to return one of the previous values written to the port whereas the value returned is more or less random. In any case I'm not reading immediately after writing so there are other instructions in between.

I see now that there is a route to read the output register directly so reading from PORTF is clearly the correct option and since this works ok I'm happy with that.

Thanks again for everyone's contributions to this.

fenman:
I’ll probably use digitalWrite() and digitalRead() where appropriate. Actually they may not be any slower as they probably make use of the SBI and CBI instructions.

They don’t. They are both very slow as they have to first read from the flash the address and bitmask for each pin. Then because everything is now pointers they have to use the slow LDS and STS instructions.
Here is how horrible both functions look:

00000cf0 <digitalWrite>:
     cf0:	48 2f       	mov	r20, r24
     cf2:	50 e0       	ldi	r21, 0x00	; 0
     cf4:	ca 01       	movw	r24, r20
     cf6:	84 5b       	subi	r24, 0xB4	; 180
     cf8:	9f 4f       	sbci	r25, 0xFF	; 255
     cfa:	fc 01       	movw	r30, r24
     cfc:	24 91       	lpm	r18, Z+
     cfe:	ca 01       	movw	r24, r20
     d00:	80 5c       	subi	r24, 0xC0	; 192
     d02:	9f 4f       	sbci	r25, 0xFF	; 255
     d04:	fc 01       	movw	r30, r24
     d06:	34 91       	lpm	r19, Z+
     d08:	4c 5c       	subi	r20, 0xCC	; 204
     d0a:	5f 4f       	sbci	r21, 0xFF	; 255
     d0c:	fa 01       	movw	r30, r20
     d0e:	94 91       	lpm	r25, Z+
     d10:	99 23       	and	r25, r25
     d12:	09 f4       	brne	.+2      	; 0xd16 <digitalWrite+0x26>
     d14:	3b c0       	rjmp	.+118    	; 0xd8c <digitalWrite+0x9c>
     d16:	22 23       	and	r18, r18
     d18:	09 f1       	breq	.+66     	; 0xd5c <digitalWrite+0x6c>
     d1a:	21 30       	cpi	r18, 0x01	; 1
     d1c:	31 f4       	brne	.+12     	; 0xd2a <digitalWrite+0x3a>
     d1e:	80 b7       	in	r24, 0x30	; 48
     d20:	8f 77       	andi	r24, 0x7F	; 127
     d22:	80 bf       	out	0x30, r24	; 48
     d24:	80 b7       	in	r24, 0x30	; 48
     d26:	8f 7b       	andi	r24, 0xBF	; 191
     d28:	07 c0       	rjmp	.+14     	; 0xd38 <digitalWrite+0x48>
     d2a:	22 30       	cpi	r18, 0x02	; 2
     d2c:	39 f4       	brne	.+14     	; 0xd3c <digitalWrite+0x4c>
     d2e:	80 b7       	in	r24, 0x30	; 48
     d30:	8f 7d       	andi	r24, 0xDF	; 223
     d32:	80 bf       	out	0x30, r24	; 48
     d34:	80 b7       	in	r24, 0x30	; 48
     d36:	8f 7e       	andi	r24, 0xEF	; 239
     d38:	80 bf       	out	0x30, r24	; 48
     d3a:	10 c0       	rjmp	.+32     	; 0xd5c <digitalWrite+0x6c>
     d3c:	23 30       	cpi	r18, 0x03	; 3
     d3e:	31 f4       	brne	.+12     	; 0xd4c <digitalWrite+0x5c>
     d40:	8f b5       	in	r24, 0x2f	; 47
     d42:	8f 77       	andi	r24, 0x7F	; 127
     d44:	8f bd       	out	0x2f, r24	; 47
     d46:	8f b5       	in	r24, 0x2f	; 47
     d48:	8f 7b       	andi	r24, 0xBF	; 191
     d4a:	07 c0       	rjmp	.+14     	; 0xd5a <digitalWrite+0x6a>
     d4c:	24 30       	cpi	r18, 0x04	; 4
     d4e:	31 f4       	brne	.+12     	; 0xd5c <digitalWrite+0x6c>
     d50:	8f b5       	in	r24, 0x2f	; 47
     d52:	8f 7d       	andi	r24, 0xDF	; 223
     d54:	8f bd       	out	0x2f, r24	; 47
     d56:	8f b5       	in	r24, 0x2f	; 47
     d58:	8f 7e       	andi	r24, 0xEF	; 239
     d5a:	8f bd       	out	0x2f, r24	; 47
     d5c:	e9 2f       	mov	r30, r25
     d5e:	f0 e0       	ldi	r31, 0x00	; 0
     d60:	ee 0f       	add	r30, r30
     d62:	ff 1f       	adc	r31, r31
     d64:	e8 5d       	subi	r30, 0xD8	; 216
     d66:	ff 4f       	sbci	r31, 0xFF	; 255
     d68:	a5 91       	lpm	r26, Z+
     d6a:	b4 91       	lpm	r27, Z+
     d6c:	66 23       	and	r22, r22
     d6e:	41 f4       	brne	.+16     	; 0xd80 <digitalWrite+0x90>
     d70:	9f b7       	in	r25, 0x3f	; 63
     d72:	f8 94       	cli
     d74:	8c 91       	ld	r24, X
     d76:	30 95       	com	r19
     d78:	83 23       	and	r24, r19
     d7a:	8c 93       	st	X, r24
     d7c:	9f bf       	out	0x3f, r25	; 63
     d7e:	08 95       	ret
     d80:	9f b7       	in	r25, 0x3f	; 63
     d82:	f8 94       	cli
     d84:	8c 91       	ld	r24, X
     d86:	83 2b       	or	r24, r19
     d88:	8c 93       	st	X, r24
     d8a:	9f bf       	out	0x3f, r25	; 63
     d8c:	08 95       	ret


00000d8e <digitalRead>:
     d8e:	68 2f       	mov	r22, r24
     d90:	70 e0       	ldi	r23, 0x00	; 0
     d92:	cb 01       	movw	r24, r22
     d94:	84 5b       	subi	r24, 0xB4	; 180
     d96:	9f 4f       	sbci	r25, 0xFF	; 255
     d98:	fc 01       	movw	r30, r24
     d9a:	24 91       	lpm	r18, Z+
     d9c:	cb 01       	movw	r24, r22
     d9e:	80 5c       	subi	r24, 0xC0	; 192
     da0:	9f 4f       	sbci	r25, 0xFF	; 255
     da2:	fc 01       	movw	r30, r24
     da4:	44 91       	lpm	r20, Z+
     da6:	6c 5c       	subi	r22, 0xCC	; 204
     da8:	7f 4f       	sbci	r23, 0xFF	; 255
     daa:	fb 01       	movw	r30, r22
     dac:	94 91       	lpm	r25, Z+
     dae:	99 23       	and	r25, r25
     db0:	19 f4       	brne	.+6      	; 0xdb8 <digitalRead+0x2a>
     db2:	20 e0       	ldi	r18, 0x00	; 0
     db4:	30 e0       	ldi	r19, 0x00	; 0
     db6:	33 c0       	rjmp	.+102    	; 0xe1e <digitalRead+0x90>
     db8:	22 23       	and	r18, r18
     dba:	09 f1       	breq	.+66     	; 0xdfe <digitalRead+0x70>
     dbc:	21 30       	cpi	r18, 0x01	; 1
     dbe:	31 f4       	brne	.+12     	; 0xdcc <digitalRead+0x3e>
     dc0:	80 b7       	in	r24, 0x30	; 48
     dc2:	8f 77       	andi	r24, 0x7F	; 127
     dc4:	80 bf       	out	0x30, r24	; 48
     dc6:	80 b7       	in	r24, 0x30	; 48
     dc8:	8f 7b       	andi	r24, 0xBF	; 191
     dca:	07 c0       	rjmp	.+14     	; 0xdda <digitalRead+0x4c>
     dcc:	22 30       	cpi	r18, 0x02	; 2
     dce:	39 f4       	brne	.+14     	; 0xdde <digitalRead+0x50>
     dd0:	80 b7       	in	r24, 0x30	; 48
     dd2:	8f 7d       	andi	r24, 0xDF	; 223
     dd4:	80 bf       	out	0x30, r24	; 48
     dd6:	80 b7       	in	r24, 0x30	; 48
     dd8:	8f 7e       	andi	r24, 0xEF	; 239
     dda:	80 bf       	out	0x30, r24	; 48
     ddc:	10 c0       	rjmp	.+32     	; 0xdfe <digitalRead+0x70>
     dde:	23 30       	cpi	r18, 0x03	; 3
     de0:	31 f4       	brne	.+12     	; 0xdee <digitalRead+0x60>
     de2:	8f b5       	in	r24, 0x2f	; 47
     de4:	8f 77       	andi	r24, 0x7F	; 127
     de6:	8f bd       	out	0x2f, r24	; 47
     de8:	8f b5       	in	r24, 0x2f	; 47
     dea:	8f 7b       	andi	r24, 0xBF	; 191
     dec:	07 c0       	rjmp	.+14     	; 0xdfc <digitalRead+0x6e>
     dee:	24 30       	cpi	r18, 0x04	; 4
     df0:	31 f4       	brne	.+12     	; 0xdfe <digitalRead+0x70>
     df2:	8f b5       	in	r24, 0x2f	; 47
     df4:	8f 7d       	andi	r24, 0xDF	; 223
     df6:	8f bd       	out	0x2f, r24	; 47
     df8:	8f b5       	in	r24, 0x2f	; 47
     dfa:	8f 7e       	andi	r24, 0xEF	; 239
     dfc:	8f bd       	out	0x2f, r24	; 47
    //from here it starts actually reading the pin
     dfe:	89 2f       	mov	r24, r25
     e00:	90 e0       	ldi	r25, 0x00	; 0
     e02:	88 0f       	add	r24, r24
     e04:	99 1f       	adc	r25, r25
     e06:	82 5d       	subi	r24, 0xD2	; 210
     e08:	9f 4f       	sbci	r25, 0xFF	; 255
     e0a:	fc 01       	movw	r30, r24
     e0c:	a5 91       	lpm	r26, Z+
     e0e:	b4 91       	lpm	r27, Z+
     e10:	8c 91       	ld	r24, X
     e12:	20 e0       	ldi	r18, 0x00	; 0
     e14:	30 e0       	ldi	r19, 0x00	; 0
     e16:	84 23       	and	r24, r20
     e18:	11 f0       	breq	.+4      	; 0xe1e <digitalRead+0x90>
     e1a:	21 e0       	ldi	r18, 0x01	; 1
     e1c:	30 e0       	ldi	r19, 0x00	; 0
     e1e:	c9 01       	movw	r24, r18
     e20:	08 95       	ret

Yikes! :astonished:

fenman:
Yikes! :astonished:

That's why I gave up on digitalRead() and write a long time again. Direct port access wins hands down always.

If you still want the ease of converting the Arduino pin numbers to port numbers and still keep your code portable across platforms, you can do this:

//Declare your pin at the top.
const unsigned char myPin = 4; //e.g. digital pin 4
unsigned char myPort = digitalPinToPort(myPin);
volatile unsigned char* myOutput = portOutputRegister(myPort);
volatile unsigned char* myInput = portInputRegister(myPort);
volatile unsigned char* myDirection = portModeRegister(myPort);
unsigned char myHighMask = digitalPinToBitMask(myPin);
unsigned char myLowMask = ~myHighMask;

//Then to use it you can do things like:
*myOutput |= myHighMask //set output high
*myOutput &= myLowMask; //set output low
*myDirection &= myLowMask;//make an input
read = ((myInput & myHighMask) ? HIGH : LOW); //Read in the bit

Its not quite as efficient as direct port access, but it is a vast improvement on digitalRead/write in terms of speed. It is also useful when writing libraries that you want to remain easy to change which pin you are using or which processor.

Come on guys, "slow" is a relative term.
Yes there are cases where raw port i/o is necessary
(I have elaborate port mapping routines that I use in the openGLCD library for speed)
but in this case, fenman said he was controlling a few LEDs.
LEDs...
Those are for human consumption. And humans are REALLY slow.

A raw port i/o CBI/SBI read/write is 2 cycles. That is as fast as 125 ns. (on 16mhz clock)
The digitalWrite()/digitalRead() is about 5us which is about 40 times slower.
However for most uses, that speed reduction is simply not needed.

Consider this.
With persistence of vision the human eye can't detect changes beyond about 100hz
and many people start to lose detection below that.
That is .01 second. The digital write code assuming you had to set all 8 bits in the port
(which is a worst case scenario) would be 8 * 5us or 40us or .00004 second.
Or roughly 250 times faster than a human could detect.
And if the concern is "seeing" false BCD digits on the 7 segment display
while the bits are independently being updated,
the human eye simply won't be able to see it even using digitalWrite().

So if this project is about driving LEDs, particuarly if the number of LEDs is small
like it appears to be in this case, then using raw port i/o is simply not necessary and
not worth the portability and atomicity issues.
(Tom's examples don't deal with atomicity issues and would corrupt the register
if used from ISR routines. While it it can be fixed, by masking interrupts, it
does add some addtional complexity that is often forgotten.)

Here is my suggestion, write a routine that does it using the Arduino core code routines
digitalWrite() and digitalRead().
Get the code working. Then, after it is working, if using the standard i/o routines isn't good
enough, replace the routine with a macro or inline function that does raw port i/o.

--- bill

Hi Bill,

I feel I should point out that controlling LEDs isn't all I'm doing.

Writing directly to the port is working fine now and I'm not worried about portability so I prefer to KISS. :smiley:

I have since been experimenting with producing a fast PWM signal from Timer/Counter 0 and have uncovered at least one interesting discrepancy from the Atmel manual, but that's for another thread!

Any high speed stuff?
When doing direct port i/o if you do it at ISR level, you must
mask interrupts during the operation unless you are guaranteed to get
SBI/CBI instructions otherwise you will corrupt the register.
Just be careful.

If you want something that is faster than digitalWrite()/digitalRead() but
has the same interface, you might want to check out my avrio code.
You can find in on google code here:
http://code.google.com/p/mcu-io/
license is GPL, so you can't use it in closed source projects.

--- bill

bperrybap:
When doing direct port i/o if you do it at ISR level, you must
mask interrupts during the operation unless you are guaranteed to get
SBI/CBI instructions otherwise you will corrupt the register.
Just be careful.

Care to explain that one? When if you are inside an interrupt, nothing can corrupt what you are doing as the global interrupt flag is disabled.

That's what I was wondering. :~

OK. I’ll explain it.
I’ve seen many people fall into this trap over the past few years.
The Arduino team had the same issue their digitalWrite() code
as it does the same thing after all the run time lookups.
I found this bug on the first day I used Ardiuno.
Even after the full explanation, they didn’t seem to accept it
and wouldn’t fix the code.
Finally, like 9 months later, they agreed to fix it when it could easily be demonstrated
by using the servo code.

It is a combination and interaction between the AVR being RISC and the ISR and timing.
They key to understanding the issue is to understand that it isn’t
that the ISR code is interrupted or that the ISR is corrupting the register,
it is that the ISR code interrupts the foreground code which causes the
foreground code to corrupt the register.
It is related to the avr-gcc compiler optimization hack to use CBI/SBI instructions
when a AND or OR operation occurs using an compile time constant pointer address
when the address pointer points to one of the lower 32 i/o registers and the mask is a compile time constant.

In this case the pointer is not known at compile time, since the pointer
is filled in at run time. Therefore the SBI/CBI optimization hack in the compiler
won’t be able to optimize the |= or &= operation down into SBI/CBI instructions
and the sequence is non atomic since it becomes multiple interruptable instructions.

So for example the sequence

 *myOutput |= myHighMask; //set output high

is non atomic and is interruptable. Since it is multiple instructions.
While it looks the same as something like

    PORTF |=  MASK;

It isn’t.
It is only when the avr-gcc SBI/CBI optimization hack kicks in that that
it crushes down |= and &= operations into a single non interruptable sbi/cbi instruction.

From a psuedo code point of view that single line of C code is:

  • Read the port register into temporary storage
  • OR the mask into the temporary storage
  • Write the temporary storage back to the port register

If that code is running with interrupts enabled, then an interrupt could occur
and interrupt that sequence.

Now lets look at a more complex sequence of two operations:
In the foreground (code with interrupts enabled like in loop() )we have:

 *myOutput |= myHighMask; //set output high

Somewhere, else ( in a library like servo) there is an ISR function that does this:

 digitalWrite( SERVO(timer,Channel[timer]).Pin.nbr,HIGH); // its an active channel so pulse it high

And lets further assume that myOutput and the Servo library are sharing the same port register
and for simplicity lets also assume:

  • PORTF is used and has all bits clear (a value of 0) prior to this sequence.
  • MyHighMask is 0x1 (bit 0)
  • Servo is setting the pin the corresponds to bit 7 in the port register

Now lets throw in the timing of these operations and watch the corruption:

  • Forground reads PORTF into temporary storage (foreground temp storage now has a 0 in it)
    -----> Interrupt fires
  • Servo calls digitalWrite()
  • digitalWrite() (does all the lookups) then reads PORTF into temporary storage
  • digitalWrite() temp storage now has a zero in it.
  • digitalWrite() ORs bit 7 into temporary storage, temp storage now has 0x80 in it
  • digitalWrite() writes the temp storage, which is 0x80, to PORTF, PORTF now has 0x80 in it.
  • Servo ISR returns
    <----- ISR returns (foreground continues its |= operation)
  • Forground ORs myHighMask (which is 0x1) into its temporary storage read earlier
  • Forgrounds temporary storage now has 0x1 in it.
  • Forground writes its temporary storage into the PORTF register.
    BAM register corruption!
    The PORT register was just loaded with 0x1 which set the myHighMask but also
    cleared the bit that the servo library set.
    This is because the foreground |= operation is not atomic and was interrupted.
    The foreground didn’t see the other bit get set and therefore clobbered it.

In the ISR it doesn’t matter whether the code uses raw port i/o using sbi/cbi,
indirect pointers like Tom’s examaple sequences, or digitalWrite(), the result
will be same, if the foreground is using non atomic operations to modify port bits.
In fact, in this example, the ISR was doing everything correctly to ensure atomicity,
but the foreground was not and that is why the corruption occured.

The only way to ensure that this does not happen is that bit twiddling operations
must always be atomic and sequences like:

 *myOutput |= myHighMask; //set output high
*myOutput &= myLowMask; //set output low

are not atomic.

In fact you can look at the digitalWrite() code and see this sequence.
This is the code I pushed to get them to modify to be atomic and they finally did:

	out = portOutputRegister(port);

	uint8_t oldSREG = SREG;
	cli();

	if (val == LOW) {
		*out &= ~bit;
	} else {
		*out |= bit;
	}

	SREG = oldSREG;

This code is essentually the same as Tom’s (uising indirect raw port i/o)
other than they fetch the pointer every time vs up front.
Note that they mask interrupts during the |= and &= operations.
That is the only way to ensure atomicity. (That is what I pushed to get put into the code)
Yeah it sucks and slows down the operation, but there is no way around it on the AVR.
Other chips like the PIC have set and clear registers.
Those are MUCH better than depending on special instructions like sbi/cbi since
you can use run time pointers and masks and also atomically set/clear mutliple
bits at time.
Set and clear registers beat set and clear instructions every time!

If all that were not enough,
Even more scary is that even operations like

PORTx |= BITMASK;

are not guraranteed to be atomic and
can have problems with atomicity as well, since not all ports use addresses
that fall inside the range for sbi/cbi instructions.
Which is why I call the |= &= sbi/cbi optimization a “hack”.
It doesn’t work in all cases even if the address and mask is known
at compile time.

For example, on a mega if you use
PORTH, PORTJ, PORTK, PORTL the operation are not atomic
because the addresses for those ports fall outside the range
for the cbi/sbi instructions.

So when the optimizer is enabled,
this is atomic:

PORTF |= MASK;

yet this isn’t:

PORTL |= MASK;

because the address for PORTL is above the range
for cbi/sbi and must use multiple instructions.

Many people would unknowningly fall for that trap.

And that is why I say you have to be careful when you do raw port i/o
or even indirect raw port i/o using pointers.
It can be done but there are a few atomicity issues that can haunt you.
Atmocity issues like this can be very hard to track down so it
is best to avoid them.

— bill