Setup()/loop() vs main()/while(1)

Here's one for the brain trust. I've been working on it for the last three days and quite frankly I'm stumped. I believe I've lost sight of the forest, or is it the trees?

In any event, I don't want to give away any spoilers so as not to bias any opinions. For the sake of not reinventing the wheel, however, I will cover some key ground that has already been trodden.

This is essentially an implementation of AVR304 with a couple of twists. It uses a transmit buffer and is interrupt driven (bit-wise) so there is no blocking.

The receive portion has not been sussed out yet, so look at that at your leisure; this strictly has to do with the transmit side.

A data analyzer was connected to the transmit pin so all framing has been checked and is accurate.

All testing has been done with a Uno as the master and a Mega2560 as the slave. The slave is set to use one of its USARTs to receive so there is no drama there.

The master is set up with an option (called side_track) that allows for bit-banging the output, selectable at run time via #/$.

A test state exists in the interrupt routine that pumps out 1ms worth of pulses to check timing, accessible via @.

Caveats.
I do not wish to enter into any philosophical debates. I come from an old school that believes in monolithic programming and for embedded systems especially, prefer to have all my code in one place. The exercise here is to have a single file that can be compiled anywhere and uploaded without concern as to whether the correct libraries are available. The first step is to exit from the Arduino IDE and go straight to the AVR-GCC compiler.

An undocumented reality is that if you swap out the setup and loop functions for main and an endless while, all the Arduino behind-the-scenes add-ons are not added on. You will notice that these lines are commented in/out to be able to switch back and forth.

Timer2 drives the ISR; Timer0 is there to offer millis() and micros() in the timing loop but is otherwise not needed. Timer1 is used for non-prescaled timing for the bit-banged alternative but otherwise it too is not needed.

Lastly in this bit, the intent of this code is to provide a workable test-bed to play with non-standard data transmission, specifically 9-bit data. My intent is to post this on my GitHub page when it is working, so feel free to pass it around now.

Issue.
With loop and setup invoked, everything works fine. If you switch over to main and while, the interrupt driven transmit only sends the first character. Whilst still in this mode, you can switch to bit-bang (#) and it works. Also, if you invoke the 1ms pulse stream (@), it works as well. There does not appear to be an issue with the interrupt handler except that it processes only the first character and only in the main/while configuration. Again, switching back to setup/loop everything works.

One might think that some Arduino add-on is missing in the main/while compilation when you look at the sketch size: 1998 in the main/while version and 2132 for setup/loop. So there seems to be something not there, but like I said, I've been staring at it for so long now that I just don't see it any more. Perhaps you all can spot it?

I can't post the code as it's a touch too long so I've attached it.

bare_metal_usart_rev3a.ino (18.5 KB)

Have you accounted for the actions performed in main()?
https://github.com/arduino/Arduino/blob/master/hardware/arduino/avr/cores/arduino/main.cpp

The only issues I can see are in init() and I believe I have taken that into account by including the timer0 configuration along with the overflow interrupt and millis and micros. Like I said though, at this stage it could be a typo and I'm just not seeing it.

DKWatson:
With loop and setup invoked, everything works fine. If you switch over to main and while, the interrupt driven transmit only sends the first character. Whilst still in this mode, you can switch to bit-bang (#) and it works.

Do you mean the first character is actually sent, or that the code only appears to do one character?
(i.e. it's not actually transmitted, so the interrupt is never called)

Had a quick look. Is this correct in unusartInit()?

    UBRR0 = ((F_CPU / 16) / brate) - 1;

Do you not have to write UBRR0H, UBRR0L?

Yours,
TonyWilk

The character actually gets sent and then nothing. Bear in mind that when it's compiled as setup and loop it works fine.

Your point with UBRR0 might be valid. As I know it you can write H and L individually or as a 16-bit value. As the numbers I've been dealing with are all under 255 I never really gave the high order byte much consideration.

In other code, specifically writing OCR values for timer1 that are almost always in the thousands, there has never been a problem. There wasn't even an assumption that the same would hold for UBRR, just blind action. Will test that now.

Cheers.

Well, it was worth a go. The datasheet confirms it as well: the compiler deals with the address. It is only necessary to deal with H and L separately if using assembly.
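For anyone following along, the UBRR arithmetic from the line Tony quoted can be checked off-target. This is only a sketch: the function names are made up for illustration, and the 16 MHz clock used in the usage note is an assumption (the stock Uno crystal), not something taken from the attached code.

```c
#include <stdint.h>

/* Same arithmetic as the quoted line: ((F_CPU / 16) / brate) - 1.
   Integer division truncates, exactly as it would in the sketch. */
uint16_t ubrr_for(uint32_t f_cpu, uint32_t baud)
{
    return (uint16_t)((f_cpu / 16UL) / baud - 1UL);
}

/* What would go into UBRR0H if the two halves were written by hand. */
uint8_t ubrr_high(uint16_t ubrr)
{
    return (uint8_t)(ubrr >> 8);
}

/* What would go into UBRR0L if the two halves were written by hand. */
uint8_t ubrr_low(uint16_t ubrr)
{
    return (uint8_t)(ubrr & 0xFFu);
}
```

At 16 MHz this gives 103 for 9600 baud and 0 for 1,000,000 baud, so the high byte is zero for both. It only becomes non-zero at slow rates, for example 300 baud, where the value is 3332 (high byte 0x0D, low byte 0x04).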

Well, one thing at a time :slight_smile:

Having looked at https://github.com/arduino/Arduino/blob/master/hardware/arduino/avr/cores/arduino/main.cpp
have you tried calling init() and/or initVariant() when you are in main/while?

init appears in: https://github.com/arduino/Arduino/blob/master/hardware/arduino/avr/cores/arduino/wiring.c

Yours,
TonyWilk

No I haven't. From what I can see with init(), it sets everything up for PWM. Which is fine if you're using PWM, otherwise most of it needs to be reset anyhow. I can't see anything in that code that would affect using timer2. Indeed, if you were to run the TEST_LOOP and view it on a scope as I have done, in both configurations, you see nice little pulses every 25.5 microseconds. The problem does not appear to have anything to do with the timer. Having said that, it's still not working so maybe it needs addressing.

There's nothing special about the code. I've written dozens of similar routines, which is why I thought there was some syntactical error that I just wasn't noticing. And why does it work in one and not the other? I've looked at the scope and the signals are beautifully framed. Using the character buffer and the interrupt means the framing gets reset every 9 bits so there's not even a chance to compound any error. I'm baffled, but until I sort it out I can't use the code, even though it works in Arduino mode.

DKWatson:
If you switch over to main and while, the interrupt driven transmit only sends the first character

Is that the USART that only transmits the first character, or the bit-banged timer-driven transmission?

Just try calling init() instead of your code that's intended to replace it. It will only take a couple minutes and then we can eliminate that as your problem. initVariant() isn't used for Uno or Mega so that shouldn't be the cause of the problem.

The other thing that you're missing is the call to serialEventRun(). Even if you aren't using the serialEvent() feature that call acts as a short delay on every loop. Maybe that delay is the difference between working and not working?
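To make the comparison concrete, the core's main() in the file linked above boils down to init(), initVariant(), setup(), and then loop() followed by a conditional serialEventRun() forever. The sketch below only mimics that call order so it can run on a PC. The stub_ functions, the trace buffer and the bounded pass count are stand-ins invented here, not part of the core.

```c
#include <stddef.h>
#include <string.h>

static char trace[32];      /* records the order of calls, one letter each */
static size_t trace_len;

static void mark(char c)
{
    if (trace_len < sizeof trace - 1) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

/* Stand-ins for the real functions; each one just logs that it ran. */
static void stub_init(void)        { mark('i'); }  /* core: timer and ADC setup */
static void stub_initVariant(void) { mark('v'); }  /* core: board-specific hook */
static void stub_setup(void)       { mark('s'); }  /* the sketch's setup()      */
static void stub_loop(void)        { mark('l'); }  /* the sketch's loop()       */
void stub_serialEventRun(void)     { mark('e'); }

/* Same call order as the core's main(), but bounded so it terminates.
   serial_event_run may be NULL, like the weak symbol in the core. */
const char *core_like_main(int passes, void (*serial_event_run)(void))
{
    memset(trace, 0, sizeof trace);
    trace_len = 0;

    stub_init();
    stub_initVariant();
    stub_setup();
    for (int n = 0; n < passes; n++) {
        stub_loop();
        if (serial_event_run) serial_event_run();
    }
    return trace;
}
```

Calling core_like_main(2, stub_serialEventRun) leaves "ivslele" in the trace. A hand-written main() with a bare while(1) only ever does the 's' and 'l' parts, so anything in the other three calls has to be replaced by hand or done without.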

Responses/results

westfw - this all has to do with the soft uart, the usart works just fine up to 1,000,000 bps.

pert - placing a call to init() after the other initialize functions caused repeating reset, before - no apparent effect. One other note, including Arduino.h had no impact on size or behavior.

A dummy call to serialEventRun() did however, have some interesting effects.

This was the line transmitted (4 times):
The corresponding interrupt is executed

1st receipt
xecute
ierrupt is execute
ixecute
irrupt is execute
iexecute

2nd receipt
Torresponding in
Tnding in
Trresponding in
xecuted

3rd receipt
i
in
in
in

4th receipt
interrupt is executed
T
interrupt is executed
T
interrupt is executed
T
interrupt is executed

Further transmissions revealed no consistency. It does seem that it's a timing issue.

I've been trying to wrap my head around this and probably going in circles.

You say "this all has to do with the soft uart" and earlier said "If you switch over to main and while, the interrupt driven transmit only sends the first character"

If I understand it correctly, txS() calls txC to buffer characters for the Timer2 ISR which clocks 'em out.

Meanwhile, the input interrupt can change the state machine of Timer2 to be "RX_START" (thereby scuppering any transmit in progress)

Since the code you posted only calls txS() in main() if there's a message waiting...

are you sending characters to this, receiving a string and then transmitting it back?

If so, what happens if there's a gap in receive so tx is started then aborted by further rx?
(which would be easily affected by any slight timing variations)

If that makes any sense to you, then you're a better man than I :slight_smile:

Yours,
TonyWilk

You're zeroing in on it Tony. As I was not concerned with the receive side I completely ignored it and its ISR. There are two macros (BTW the posted code is complete) enablePCI() and disablePCI() that should bracket the transmission, effectively blocking reception until complete. As this is intended for a half-duplex RS485 network, that's not an issue. Anyhow, these brackets were missing.

    if (!txing)
    {
        TCNT2 = 0;
        timer2on();
        Set(txing);
        tx_byte = c;
        run_state = START_BIT;
        disablePCI();   // *** was missing: block reception while transmitting
    }

    case STOP_BIT:
        timer2off();
        tx_buf_count--;
        tx_buf_current++;
        if (tx_buf_current == BUFLGTH) tx_buf_current = 0;
        if (tx_buf_count == 0)
        {
            Clear(txing);
            run_state = HANG_OUT;
            enablePCI();    // *** was missing: allow reception again
        }

Sticking them in where they were supposed to be anyhow seems to be 96% of the solution. There still appear to be some timing problems, which continue to confound me, as does the fact that it worked under setup/loop without the PCI brackets. ????
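The effect of the brackets can be modelled off-target. This is a simplified sketch, not the attached code: the soft_uart_t struct and the su_ functions are invented for illustration, pci_enabled stands in for enablePCI()/disablePCI(), and going back to START_BIT when more bytes are queued is my assumption about what the ISR does next.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { HANG_OUT, START_BIT, STOP_BIT, RX_START } run_state_t;

typedef struct {
    run_state_t run_state;
    bool txing;
    bool pci_enabled;       /* models enablePCI() / disablePCI() */
    uint8_t tx_buf_count;
} soft_uart_t;

void su_init(soft_uart_t *u)
{
    u->run_state = HANG_OUT;
    u->txing = false;
    u->pci_enabled = true;
    u->tx_buf_count = 0;
}

/* Foreground: queue one byte and start the transmitter if it is idle. */
void su_queue(soft_uart_t *u)
{
    u->tx_buf_count++;
    if (!u->txing) {
        u->txing = true;
        u->run_state = START_BIT;
        u->pci_enabled = false;     /* opening bracket */
    }
}

/* Timer ISR at the end of a stop bit. */
void su_stop_bit(soft_uart_t *u)
{
    u->tx_buf_count--;
    if (u->tx_buf_count == 0) {
        u->txing = false;
        u->run_state = HANG_OUT;
        u->pci_enabled = true;      /* closing bracket */
    } else {
        u->run_state = START_BIT;   /* assumed: next byte follows on */
    }
}

/* Pin-change ISR: returns true if the edge was allowed to take over. */
bool su_rx_edge(soft_uart_t *u)
{
    if (!u->pci_enabled) return false;
    u->run_state = RX_START;
    return true;
}
```

With two bytes queued, su_rx_edge() returns false until the second su_stop_bit() has run, so run_state cannot be switched to RX_START part-way through a frame. Without the brackets, any edge on the receive pin would take over the state machine mid-transmit.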

DKWatson:
Sticking them in where they were supposed to be anyhow seems to be 96% of the solution. There still appear to be some timing problems, which continue to confound me, as does the fact that it worked under setup/loop without the PCI brackets. ????

Getting there, slowly, by inches... although, probably like you, I suspect there is something really obvious which is the actual problem :confused:

If this processor has little else to do (i.e. you are not having to wring every cycle out of the ISRs), I'd be tempted to leave the timer interrupt always running and split the state machine:

    switch( tx_state )   // was: run_state
    {
        ...
        case TX_IDLE: break;
    }
    switch( rx_state )
    {
        ...
        case RX_IDLE: break;
    }

It would then allow full-duplex if you ever needed it and, probably more importantly, makes it a little simpler: fewer things to turn on and off, fewer ways one thing can interfere with another and so on.
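A fleshed-out version of the two-switch layout might look like the following, runnable off-target. It is a sketch under assumptions: 8 data bits sent LSB first with one stop bit, one tick per bit time, and the duplex_t struct plus every state name other than TX_IDLE and RX_IDLE are invented here. The receive side only counts bits; real sampling and its phase alignment are left out.

```c
#include <stdint.h>

typedef enum { TX_IDLE, TX_START, TX_DATA, TX_STOP } tx_state_t;
typedef enum { RX_IDLE, RX_SYNC, RX_SAMPLE } rx_state_t;

typedef struct {
    tx_state_t tx_state;
    rx_state_t rx_state;
    uint8_t tx_byte;
    uint8_t tx_bit;
    uint8_t rx_bit;
    uint8_t line;           /* level on the TX pin, idle high */
} duplex_t;

void duplex_init(duplex_t *d)
{
    d->tx_state = TX_IDLE;
    d->rx_state = RX_IDLE;
    d->tx_byte = 0;
    d->tx_bit = 0;
    d->rx_bit = 0;
    d->line = 1;
}

void duplex_send(duplex_t *d, uint8_t c)
{
    d->tx_byte = c;
    d->tx_bit = 0;
    d->tx_state = TX_START;
}

/* Called from the (hypothetical) pin-change ISR on a falling edge. */
void duplex_rx_edge(duplex_t *d)
{
    if (d->rx_state == RX_IDLE) {
        d->rx_bit = 0;
        d->rx_state = RX_SYNC;
    }
}

/* One timer tick = one bit time. Returns the TX line level. */
uint8_t duplex_tick(duplex_t *d)
{
    switch (d->tx_state) {
    case TX_START: d->line = 0; d->tx_state = TX_DATA; break;
    case TX_DATA:
        d->line = (uint8_t)((d->tx_byte >> d->tx_bit) & 1u);
        if (++d->tx_bit == 8) d->tx_state = TX_STOP;
        break;
    case TX_STOP:  d->line = 1; d->tx_state = TX_IDLE; break;
    case TX_IDLE:  break;
    }
    switch (d->rx_state) {
    case RX_SYNC:   d->rx_state = RX_SAMPLE; break;
    case RX_SAMPLE: if (++d->rx_bit == 8) d->rx_state = RX_IDLE; break;
    case RX_IDLE:   break;
    }
    return d->line;
}
```

Because each direction has its own state variable, a receive edge arriving in the middle of a frame changes rx_state only. Sending 0x41 still produces 0, 1,0,0,0,0,0,1,0, 1 on the line whether or not duplex_rx_edge() is called part-way through.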

Yours,
TonyWilk

Great idea. Unfortunately it's going to be the other way around. The processor will be very busy and from time to time will need to send/receive a 20 to 30 byte packet, hence the need for no load when not Tx/Rx. All transmissions will be query responses, which is why half duplex is enough.

I'm just now getting back into it for the night so I'll post results as they become available.

Tony,

Just to be sure all bases are covered I followed your advice and switched the timer on and left it running.

Same results. I also swapped boards just in case - no difference.

I also tried sending the start bit a touch earlier, with limited and not dissimilar results.

I've also attached the new code so that we're on the same page, just in case your interest is still piqued.

These are the results from transmitting

timer0_overflow_count++;

4 times in a row:

timer0_overflow_count++;

timer0_overflow_count++;

timer0_overflow_count++;

timer0_overflow_count++;

timer0_overflow_count++;
t
timer0_overflow_count++;
t
timer0_overflow_count++;
t
timer0_overflow_count++;
tf
w_count++;
toverf
wount++;
t0_overf
wnt++;
ter0_overf
w++;
timer0_overf
trflow_count++;
tflow_count++;
tflow_count++;

All spaces/linefeeds etc. are as transmitted/received. I have the analyzer connected again so I can verify the output.

The blank lines are in fact a series of nulls.

The attached csv file is output from the analyzer.

bare_metal_usart_rev3a.zip (6.74 KB)

For contrast and completeness, this second csv file is the analyzer results when compiled with setup/loop.

The only difference between the two is the commented statements at lines 387/388.

Sorry, it didn't like the csv file.

bare_metal_usart_rev3a.zip (7.15 KB)

Apologies in advance for spelling mistakes, logical inconsistencies etc. - just got back from the local, nice couple of pints of Old Speckled Hen, during which I suddenly realised I was talking absolute rubbish in my last post.

You can't just leave Timer2 ISR running for RX because you are receiving Async.

For receive, you need to detect the first falling edge and then time to the middle of each bit to sample the incoming data. So you have to delay 1/2 bit time, then start 1-bit-time interrupts to sample the data.
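The sampling schedule can be worked out in timer ticks. The following is a sketch with assumed numbers: the function names are invented, and the 16 MHz clock, /8 prescaler and 9600 baud in the usage note are examples only, since the thread does not state the bit rate.

```c
#include <stdint.h>

/* Timer ticks in one bit time, rounded to the nearest tick. */
uint32_t ticks_per_bit(uint32_t f_cpu, uint32_t prescaler, uint32_t baud)
{
    return (f_cpu / prescaler + baud / 2u) / baud;
}

/* Ticks from the falling edge of the start bit to the centre of
   data bit n (n = 0 is the first data bit): half a bit to reach the
   middle of the start bit, then one whole bit per step after that. */
uint32_t sample_tick(uint32_t bit_ticks, uint8_t data_bit)
{
    return bit_ticks / 2u + ((uint32_t)data_bit + 1u) * bit_ticks;
}
```

With those example numbers a bit is 208 ticks, which still fits an 8-bit timer. The first data bit is sampled 312 ticks after the edge and the eighth at 1768. In an ISR this would normally be done as one short delay of 104 ticks followed by a repeating 208-tick compare, rather than by computing absolute times.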

Ooh, think I need a little lie down now :slight_smile:

Yours,
TonyWilk

For testing I disable Rx. Anyhow, I took a break and watched a James Bond movie (Sean Connery of course) and got an inspiration, something about removing free radicals.

The approach until now has been that, given the size difference between the two compilations, the setup/loop build was including some obscure, undocumented function. A touch of reverse logic suggested that maybe the main/while compilation was instead missing something.

So, (drum roll) I added this little gem to the top of the code:

#pragma GCC optimize ("-O0")

and BINGO! Works like a charm. In the end we were staring down a blind alley as there was nothing wrong with the code (pat myself on the back here). There was/is a different set of directives for compiling one vs. the other and something got optimized out. I thought that going back and declaring everything volatile might do it, as most of the activity is in the interrupt routines anyhow, but that didn't work. Removing all optimization did. Maybe I'll go back and try to figure it out, maybe not. It's not necessary now, knowing there's nothing wrong with the code.
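What actually got optimized out is not pinned down in this thread, so the following is only a general illustration, not a diagnosis. It shows the usual pattern for a counter shared between an ISR and the foreground, plus GCC's per-function optimize attribute as a narrower alternative to the file-wide pragma. The names NO_OPT, fake_timer_isr and drain are invented, and the "ISR" is an ordinary function so the sketch runs off-target.

```c
#include <stdint.h>

/* GCC-only attribute that turns optimization off for one function.
   Other compilers get an empty macro so the code still builds. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_OPT __attribute__((optimize("O0")))
#else
#define NO_OPT
#endif

/* Shared between the "ISR" and the foreground, hence volatile:
   every read in the wait loop must go back to memory. */
static volatile uint8_t tx_buf_count;

/* Stand-in for the timer ISR finishing one queued byte. */
void fake_timer_isr(void)
{
    if (tx_buf_count) tx_buf_count--;
}

/* Foreground: wait until the queue has drained. On real hardware the
   loop body would be empty and the ISR would fire on its own; here it
   is called by hand so the function terminates on a PC. */
NO_OPT uint8_t drain(uint8_t queued)
{
    uint8_t ticks = 0;
    tx_buf_count = queued;
    while (tx_buf_count != 0) {
        fake_timer_isr();
        ticks++;
    }
    return ticks;
}
```

Applying the attribute to one function at a time is one way to narrow down which routine behaves differently under optimization. The GCC documentation describes the optimize attribute as intended for debugging rather than production code, so it is better suited to the hunt than to the final build.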

Go back to the Hen and have one (or two) for me.

Cheers.

DKWatson:
#pragma GCC optimize ("-O0")

Well done... I'll have to remember that one.

Have a 'Hen yourself !

Yours,
TonyWilk