DDRD Weirdness and asm

My apologies if this has been covered before, but I can find no reference to it.

I am developing yet another Arduino-based DSO, and it has a lot of associated hardware. To achieve capture speeds of 2MHz I am using all of PORTD and some asm code.
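As a quick sanity check on the 2MHz figure, here is a minimal sketch of the arithmetic, assuming the 16MHz clock and the 8-cycle capture loop described in the code comments further down. The helper names are hypothetical and not part of the ShedScope code.

```cpp
#include <stdint.h>

// Samples per second for a loop that spends `cyclesPerSample` CPU clock
// cycles on every sample. Hypothetical helper, for illustration only.
constexpr uint32_t sampleRateHz(uint32_t cpuHz, uint32_t cyclesPerSample) {
  return cpuHz / cyclesPerSample;
}

// Time between two consecutive samples, in nanoseconds.
constexpr uint32_t samplePeriodNs(uint32_t cpuHz, uint32_t cyclesPerSample) {
  return 1000000000UL / sampleRateHz(cpuHz, cyclesPerSample);
}

// 16 MHz clock and an 8-cycle loop give 2 MHz, i.e. one sample every 500 ns.
static_assert(sampleRateHz(16000000UL, 8) == 2000000UL, "8 cycles at 16 MHz is 2 MHz");
static_assert(samplePeriodNs(16000000UL, 8) == 500, "2 MHz is one sample per 500 ns");
```

So every instruction added to the loop body costs 62.5ns, which is why the loop is padded with nops rather than left to the compiler.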

The issue is this: if I set DDRD=B00000000; in C++ just before jumping into the asm, the port does not seem to be put into read mode - I have to actually do it in the asm.

Here is the method (I'm posting the whole thing for completeness - comments and disassembly follow):

/**
 * Use the 16MHz clock to capture at full speed
 * 
 * @return uint16_t The position in the buffer where the trigger was found
 * 
 **/
uint16_t ShedScope::captureRAMFastTrigA(uint16_t samples, uint8_t trigger, bool rising)
{
  uint8_t port = PORTC;
  uint8_t low  = USBEN ;
  uint8_t high = USBEN | EXTCLK;
  uint16_t start=0;
  
  uint8_t s0 = 0; //register to hold sample history so we can catch a rising/falling edge
  bool trig = false;
  
  cli();
  
  PORTC = USBEN | RDN | WRN;
  
  // Set the clock control up for full-speed
  setControl(RESET|STOP|CLKSEL0|CLKSEL1);
  // Taking EXTCLK high will enable the clock when we are ready
  setControl(CLKSEL0|CLKSEL1);
  pinMode(0,0);
  pinMode(1,0);
  DDRD = 0; // This doesn't work for some reason. No idea why. So we do it by hand in the assembler below
  
  __asm__  __volatile__ (
        "\n"
	"ldi %A[start], 0" "\n\t" // initialise the low byte of the output register start
	"ldi %B[start], 0" "\n\t" // initialise the high byte of the output register start
	"out 0x0a, r1" "\n\t" // Set DDRD by hand
	"out %[portc], %[high]" "\n\t" // start the clock
	"nop" "\n\t" // nops to pad out the first capture cycle
	"nop" "\n\t" // nops to pad out the first capture cycle
	"nop" "\n\t" // nops to pad out the first capture cycle
	"nop" "\n\t" // nops to pad out the first capture cycle
	"nop" "\n\t" // nops to pad out the first capture cycle
	"cpi %[rising], 0" "\n\t" // test the rising flag (also pads the first capture cycle)
	"breq 7f" "\n\t" // rising == 0: use the falling-edge loop at label 7
	"nop" "\n\t" // nops to pad out the first capture cycle
	
	// Trigger on rising edge
	
"1:" "\n\t" // Get even trigger sample
	
	// Next cycle we start to look for trigger rising
	"adiw %[start], 1"           "\n\t" // A=1, LA1  latched through
	""                          "\n\t" // A=2, ADC0 available
	"breq 6f"                       "\n\t" // A=3, ADC0 latched through
"2:" "\n\t"
	"in %[s0], %[pind]"              "\n\t" // A=4, ADC1 available
	"nop" "\n\t"
	"cp %[s0], %[trigger]"              "\n\t" // A=5, ADC1 latched through
	"brsh 1b"           "\n\t" // A=6, LA0  available
	"nop" "\n\t"
	"adiw %[start],1"            "\n\t" // A=1, LA1  latched through
	"breq 6f"                          "\n\t" // A=2, ADC0 available
	"in %[s0], %[pind]"              "\n\t" // A=4, ADC1 available
	"nop" "\n\t"
	"cp %[trigger], %[s0]"              "\n\t" // A=5, ADC1 latched through
	"brsh 1b"           "\n\t" // A=6, LA0  available
	"nop" "\n\t"
	"rjmp 4f" "\n\t"
	
	// Trigger on falling edge
"7:" "\n\t" // Get even trigger sample
	
	// Next cycle we start to look for trigger falling
	"adiw %[start], 1"           "\n\t" // A=1, LA1  latched through
	""                          "\n\t" // A=2, ADC0 available
	"breq 6f"                       "\n\t" // A=3, ADC0 latched through
"8:" "\n\t"
	"in %[s0], %[pind]"              "\n\t" // A=4, ADC1 available
	"nop" "\n\t"
	"cp %[trigger], %[s0]"              "\n\t" // A=5, ADC1 latched through
	"brsh 7b"           "\n\t" // A=6, LA0  available
	"nop" "\n\t"
	"adiw %[start],1"            "\n\t" // A=1, LA1  latched through
	"breq 6f"                          "\n\t" // A=2, ADC0 available
	"in %[s0], %[pind]"              "\n\t" // A=4, ADC1 available
	"nop" "\n\t"
	"cp %[s0], %[trigger]"              "\n\t" // A=5, ADC1 latched through
	"brsh 7b"           "\n\t" // A=6, LA0  available
	"nop" "\n\t"
	"rjmp 4f" "\n\t"
	
	

"3:" "\n\t" // Capture loop
	"nop" "\n\t" // nops to make a loop last 8 cycles
	"nop" "\n\t" // nops to make a loop last 8 cycles
"4:" "\n\t" // Entry point when we've already used a cycle
	"nop" "\n\t" // nops to make a loop last 8 cycles
	"nop" "\n\t" // nops to make a loop last 8 cycles
	"sbiw %[samples], 1" "\n\t" // 2 clocks
	"brne 3b" "\n\t" // 2 clocks if branched, 1 if not
	"ldi %[trig], 1" "\n\t"
	"rjmp 5f" "\n\t"
	
"6:\n\t" // Exit when no trigger was found
	"ldi %[trig],0" "\n\t"
"5:\n\t"
	"nop" "\n\t" // nops to make a loop last 8 cycles
	"nop" "\n\t" // nops to make a loop last 8 cycles
	"nop" "\n\t" // nops to make a loop last 8 cycles
	"out %[portc], %[low]" "\n\t" // disable the clock

	: [samples] "+&w" (samples), // read-write: sbiw reads samples before writing it
	  [start]   "=&w" (start),
	  [s0]      "=&r" (s0),
          [trig]    "=&d" (trig)
	  
        : [portc]   "I"   (_SFR_IO_ADDR(PORTC)), 
	  [pind]    "I"   (_SFR_IO_ADDR(PIND)), 
	  [high]    "r"   (high), 
	  [low]     "r"   (low),
	  [trigger] "r"   (trigger),
	  [rising]  "d"   (rising)
        );

  
  sampleBuffer[0] = s0;
  DDRD = B11111110;
  // Reset the address counters ready for call to readRam()
  setControl(RESET|STOP);
  setControl(0);
  
  triggered = trig;
  
  return start;
}
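For anyone who finds the trigger search hard to follow in asm, here is a plain C++ sketch of how I read the rising-edge loop (labels 1 and 2) above. It is not cycle-accurate and it is not the code that runs on the scope: it walks a buffer instead of reading PIND, and it gives up at the end of the buffer rather than when the 16-bit counter wraps. The function name is hypothetical.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Returns the number of samples consumed when the trigger fired (the value
// the asm leaves in `start`), or -1 if no trigger was seen in `n` samples.
long findRisingTrigger(const uint8_t* samples, size_t n, uint8_t trigger) {
  assert(samples != nullptr || n == 0);
  size_t i = 0;
  while (i < n) {
    // Phase 1: keep reading until a sample is strictly below the trigger
    // (asm: cp s0, trigger / brsh 1b).
    if (samples[i++] >= trigger) continue;
    if (i >= n) break;
    // Phase 2: the very next sample must be strictly above the trigger,
    // otherwise start again from phase 1 (asm: cp trigger, s0 / brsh 1b).
    if (samples[i++] > trigger) return static_cast<long>(i);
  }
  return -1;
}
```

For example, with a trigger level of 100 the samples {200, 10, 200} fire after three reads. Read this way, {10, 10, 200} does not fire, because the second low sample is consumed by phase 2 and the high sample then arrives in phase 1. The falling-edge loop at label 7 is the same with the two comparisons swapped.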

And here is a partial disassembly:

00000ebc <_ZN9ShedScope19captureRAMFastTrigAEjhb>:
 * Use the 16MHz clock to capture at full speed
 *
 * @return uint16_t The position in the buffer where the trigger was found
 *
 **/
uint16_t ShedScope::captureRAMFastTrigA(uint16_t samples, uint8_t trigger, bool rising)
     ebc:       df 92           push    r13
     ebe:       ef 92           push    r14
     ec0:       ff 92           push    r15
     ec2:       0f 93           push    r16
     ec4:       1f 93           push    r17
     ec6:       8c 01           movw    r16, r24
     ec8:       e4 2e           mov     r14, r20
     eca:       f2 2e           mov     r15, r18
{
  uint8_t port = PORTC;
     ecc:       88 b1           in      r24, 0x08       ; 8
 
  uint8_t s0 = 0; //pair of registers to hold sample history so we can catch a rising/falling edge
  uint8_t s1 = 0;
  bool trig = false;
 
  PORTC = USBEN | RDN | WRN;
     ece:       88 e3           ldi     r24, 0x38       ; 56
     ed0:       88 b9           out     0x08, r24       ; 8
 
  // Set the clock control up for full-speed
  setControl(RESET|STOP|CLKSEL0|CLKSEL1);
     ed2:       c8 01           movw    r24, r16
     ed4:       6f e0           ldi     r22, 0x0F       ; 15
     ed6:       70 e0           ldi     r23, 0x00       ; 0
     ed8:       0e 94 5d 06     call    0xcba   ; 0xcba <_ZN9ShedScope10setControlEi>
  // Taking EXTCLK high will enable the clock when we are ready
  setControl(CLKSEL0|CLKSEL1);
     edc:       c8 01           movw    r24, r16
     ede:       63 e0           ldi     r22, 0x03       ; 3
     ee0:       70 e0           ldi     r23, 0x00       ; 0
     ee2:       0e 94 5d 06     call    0xcba   ; 0xcba <_ZN9ShedScope10setControlEi>
  pinMode(0,0);
     ee6:       80 e0           ldi     r24, 0x00       ; 0
     ee8:       60 e0           ldi     r22, 0x00       ; 0
     eea:       0e 94 69 0c     call    0x18d2  ; 0x18d2 <pinMode>
  pinMode(1,0);
     eee:       81 e0           ldi     r24, 0x01       ; 1
     ef0:       60 e0           ldi     r22, 0x00       ; 0
     ef2:       0e 94 69 0c     call    0x18d2  ; 0x18d2 <pinMode>
  DDRD = 0;
     ef6:       1a b8           out     0x0a, r1        ; 10
          [pind]    "I"   (_SFR_IO_ADDR(PIND)),
          [high]    "r"   (high),
          [low]     "r"   (low),
          [trigger] "r"   (trigger),
          [rising]  "d"   (rising)
        );
     ef8:       98 e0           ldi     r25, 0x08       ; 8
     efa:       8c e0           ldi     r24, 0x0C       ; 12
     efc:       ac 01           movw    r20, r24
     efe:       3f 2d           mov     r19, r15
     f00:       a0 e0           ldi     r26, 0x00       ; 0
     f02:       b0 e0           ldi     r27, 0x00       ; 0
     f04:       1a b8           out     0x0a, r1        ; 10
     f06:       48 b9           out     0x08, r20       ; 8
     f08:       00 00           nop
     f0a:       00 00           nop
     f0c:       00 00           nop
     f0e:       00 00           nop
     f10:       00 00           nop

As you can see, the compiler has generated

     ef6:       1a b8           out     0x0a, r1        ; 10

For the C++ line

DDRD = 0;

and I have checked that at this point r1 does indeed contain zero (as it should). But without the later hand-coded

f04:       1a b8           out     0x0a, r1        ; 10

DDRD does not seem to go into input mode.

I've mucked about with disabling interrupts in case that has anything to do with it, but no joy.

Does anyone have any clue what is happening here?

Thanks,

Duncan

duncanFrance:
The issue is this: if I set DDRD=B00000000; in C++ just before jumping into the asm, the port does not seem to be put into read mode - I have to actually do it in the asm.

What is your basis for saying this? Can you demonstrate in a smaller example?

What Arduino are you using?

It's not quite an Arduino any more to be honest. I've taken the chip off an Uno and stuck it on my own board. That should not be relevant though, since the system works fine except for this problem.

The symptom is that without the line of assembler a fixed value (normally 0xff) always gets read back, but with the assembler I see expected values from the ADCs and capture RAM (i.e. the scope works properly).

Also, if I delete the C++ line and leave in the assembler line it also works. Very strange. Is there some sort of timing issue about when PORTD can be written? The method setControl() changes PORTD also as follows (note the apparently unnecessary pinMode() calls, which also seem to be needed):

void ShedScope::setControl(int bits) {
  // clear reads on portc
  PORTC = RDN | WRN | USBEN | EXTCLK;
  // Set the serial pins to outputs
  pinMode(0,1);
  pinMode(1,1);
  DDRD = B11111111;
  // put the data out on port d
  PORTD = bits;
  // toggle write-enable and the address bits
  PORTC = USBEN | RDN | EXTCLK;
  
  // This seems to be necessary for some reason
  delayMicroseconds(5);
  
  PORTC = USBEN | RDN | WRN | EXTCLK;
  // re-enable serial
  delayMicroseconds(5);
  
  DDRD = B11111110;
}

How are you compiling this? I can't help thinking that the define for PORTD is not matching your processor.

Are you using serial comms at all? Other functions can override the data direction register. See page 81 of the Atmega328 datasheet.

For example:

DDOE

Data Direction Override Enable

If this signal is set, the Output Driver Enable is controlled by the DDOV signal. If this signal is cleared, the Output driver is enabled by the DDxn Register bit.
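To make the override concrete, here is a small host-side model of what the datasheet describes for PD0 and PD1 on the ATmega328: while the USART receiver is enabled (RXEN0) PD0 is forced to input, and while the transmitter is enabled (TXEN0) PD1 is forced to output, whatever DDRD says. This is only a sketch of that rule, with hypothetical names, and it ignores the alternate functions on the other PORTD pins.

```cpp
#include <stdint.h>

// Bit positions in UCSR0B on the ATmega328.
constexpr uint8_t kRxEnBit = 4;  // RXEN0: receiver enable, overrides PD0
constexpr uint8_t kTxEnBit = 3;  // TXEN0: transmitter enable, overrides PD1

// Effective direction of the eight PORTD pins (1 = output driver enabled),
// given the DDRD value software wrote and the current UCSR0B value.
constexpr uint8_t effectiveDirectionD(uint8_t ddrd, uint8_t ucsr0b) {
  return static_cast<uint8_t>(
      (((ucsr0b >> kRxEnBit) & 1u) ? (ddrd & 0xFEu) : ddrd) |  // PD0 forced to input
      (((ucsr0b >> kTxEnBit) & 1u) ? 0x02u : 0x00u));          // PD1 forced to output
}

// With the USART off, DDRD = 0 really does mean "all inputs".
static_assert(effectiveDirectionD(0x00, 0x00) == 0x00, "no override");
// With RXEN0 and TXEN0 both set, PD1 stays an output even though DDRD = 0.
static_assert(effectiveDirectionD(0x00, 0x18) == 0x02, "TXD still driven");
```

In other words, if the transmitter is still enabled when DDRD is cleared, PD1 keeps driving the bus and DDRD alone cannot stop it.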

All good points. The thing is that using DDRD from C++ works perfectly everywhere else.

I do use serial comms, but all the methods which talk to the custom hardware are wrapped with Serial.end() and Serial.begin().

Anyway, I want to run some more tests to see if I can get any more useful data. Right now the board doesn't work properly. This will be either a result of my home-made SPI programmer breaking halfway through programming (a wire broke), or I have managed to properly stuff up the chip. If I can fix it today I'll post more data; otherwise I'll have to wait for a new chip.

The compiled code appears to write to 0xA which is correct for DDRD.

The only timing issue I know of is a write to PORTx followed immediately by a read from PINx: you have to insert a NOP to give the logic one clock cycle to reflect the change. In C you never see that, of course.
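A toy model of that read-back delay, for illustration only. It assumes PINx lags the pin level by exactly one clock tick, which is a simplification (the datasheet gives the synchronizer delay as between half and one and a half clock periods). The type and function names are hypothetical.

```cpp
#include <assert.h>
#include <stdint.h>

// Deliberately simplified model of one I/O port with all pins as outputs.
struct PortModel {
  uint8_t pad = 0;   // level on the physical pins
  uint8_t pinx = 0;  // what an `in rN, PINx` would return right now

  void outPort(uint8_t v) { pad = v; }    // models `out PORTx, v`
  void nop() { pinx = pad; }              // one clock: synchronizer catches up
  uint8_t inPin() const { return pinx; }  // models `in rN, PINx`
};

// Read back with no nop in between: returns the stale level.
uint8_t readBackImmediately(PortModel& p, uint8_t v) {
  p.outPort(v);
  return p.inPin();
}

// Read back with one nop in between: returns the value just written.
uint8_t readBackAfterNop(PortModel& p, uint8_t v) {
  p.outPort(v);
  p.nop();
  return p.inPin();
}

// Usage check on two fresh models.
void demoReadBack() {
  PortModel p;
  assert(readBackImmediately(p, 0xFF) == 0x00);  // still the old level
  PortModel q;
  assert(readBackAfterNop(q, 0xFF) == 0xFF);     // the nop lets PINx update
}
```

Note this concerns PORTx followed by PINx. It says nothing about a write to DDRx, which is the register in question in this thread.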

Can you reduce this to a 5-line program and still get the error?


Rob

duncanFrance:
All good points. The thing is that using DDRD from C++ works perfectly everywhere else.

I think you are getting bogged down in a C++/asm issue. Your disassembly proves that the generated code is the same, so this is hardly the problem.

It is more likely you have a timing/race condition.

How can you prove that after doing:

DDRD = 0;

... that the data direction register is not correctly set?
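One way to gather that evidence, as a rough sketch: write DDRD, read it straight back, and also look at whether the USART still owns PD0/PD1. The struct and function names are hypothetical. The `#else` branch only provides stand-in registers so the code compiles off-target; on the chip the real definitions come from `<avr/io.h>`.

```cpp
#include <assert.h>
#include <stdint.h>

#if defined(__AVR__)
#include <avr/io.h>
#else
// Host stand-ins (assumption: only used when not compiling for AVR).
static volatile uint8_t DDRD = 0xFF;
static volatile uint8_t UCSR0B = 0x00;
#define RXEN0 4
#define TXEN0 3
#endif

struct DdrdReport {
  uint8_t ddrd;        // value read back from DDRD after writing 0
  bool usartOverride;  // true if the USART still owns PD0 and/or PD1
};

// Write DDRD = 0, then report what the hardware registers say.
DdrdReport checkDdrdInput() {
  DDRD = 0;
  DdrdReport r;
  r.ddrd = DDRD;
  r.usartOverride = (UCSR0B & ((1 << RXEN0) | (1 << TXEN0))) != 0;
  return r;
}

// Stops (via assert) if PORTD is not fully in input mode.
void requirePortDInput() {
  DdrdReport r = checkDdrdInput();
  assert(r.ddrd == 0);        // the register latched the write
  assert(!r.usartOverride);   // and nothing is overriding PD0/PD1
}
```

Reading DDRD back only shows that the register holds the value. An override from the USART would not show up there, which is why UCSR0B is checked as well.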

You say yourself:

The thing is that using DDRD from C++ works perfectly everywhere else.

So, timing issue. Not C++ issue.

... note the apparently unnecessary pinMode() calls, which also seem to be needed ...

Again, strange.

I've taken the chip off an Uno and stuck it on my own board.

Decoupling capacitors? Can you show the circuit?

More good points! I agree with you that it looks like either a timing or a decoupling issue. Some of my electrolytics may be a bit iffy since they've been sitting in my parts box for cough several years. I'll stick some new ones in over the weekend and see if that helps. Still inclined to think it is some kind of timing problem though, since the rest of the code appears to work fine. But poor decoupling can cause strange problems.

I'm going to get the schematics cleaned up and stick them on my blog so you can have a laugh. In the meantime there's a picture of the board at www.ukmaker.co.uk. Wiring is done with Roadrunner for the digital bits and point-to-point for the analog bits. The decoupling caps are hidden in the IC sockets, but there is one for every chip.

Dang, talk about a board right out of the 70s. I haven't seen something like that since... well, since the 70s :)

No CAD packages were harmed in the making of that.


Rob

I'm particularly fond of the little daughter-boards with the crystals and caps on them :~

When I used to work at Linn the analog guys would prototype by soldering the parts together using just their leads. This produced beautiful, weird-looking sculptures suspended in the air above a ground-plane.

Of course, real old-timers wouldn't be dealing with this new-fangled stripboard malarkey. Tag-board. That's what real men use.

Schematics are now available at http://www.ukmaker.co.uk/shedscope/ for those who want to know just what the hell all those packages are for.