NewSoftSerial Library: An AFSoftSerial update

Sigh. It does seem like something is broken in OSX interrupts.

I've been playing around with NewSoftSerial and it's been working pretty well for me, with only a few difficulties so far. Thanks for your great work packaging up a bunch of different ideas, mikalhart.

A few notes:

  • I've experienced some problems doing the same sort of serial pass-through, where you just echo characters between the softserial pins and the UART. If the device I'm speaking to on the softserial port is set to echo, no matter how slowly I type I am liable to get garbage characters interspersed with what I'm typing. Turning off echo and having everything run half duplex gives pretty good results. Could this be because pins 2 and 3 (used very often in the soft serial examples I've seen) share an interrupt with the UART on pins 0 and 1? Unfortunately, my device requires a specially designed shield so I can't move pin assignments around easily, but if someone who is having interrupt problems could try using, say, pins 8 and 9 for the soft serial and see if they continue to have problems, that would be an interesting experiment.

  • Another problem I've had is when I get a whole slew of characters sent back from the serial device, it quickly fills the buffer and I lose the rest of the data, even when I'm not doing hardly anything else in the main loop other than trying to handle the input. I'm looking into adding proper RTS/CTS handling as an option to NewSoftSerial to see if that helps. There's an attempt at CTS handling in the AF_XPort code, but it doesn't seem to work properly for me.

Could this be because pins 2 and 3 (used very often in the soft serial examples I've seen) share an interrupt with the UART on pins 0 and 1? Unfortunately, my device requires a specially designed shield so I can't move pin assignments around easily, but if someone who is having interrupt problems could try using, say, pins 8 and 9 for the soft serial and see if they continue to have problems, that would be an interesting experiment.

I was using pins 2 and 3 and have changed them to 8 and 9 as suggested - unfortunately there is no noticeable change in the lockup.

Really hope this can be resolved.. :slight_smile:

I have posted an update to NewSoftSerial (http://www.sundial.org/arduino/?page_id=61). The new version adds support for 300, 1200, 14400, and 28800 baud, although you should use caution with 300 and 1200 as the long interrupt times causes interrupt starvation and the millis() clock doesn't work correctly.

There is also a new self clearing overflow flag, so that you can see if you have lost any characters due to buffer overflow:

if (nss.overflow())
  Serial.print("Characters lost!");

The biggest improvement is in the tuning of higher baud rates. With this new version, I get 100% reliability on 38.4K, 57.6K, and 115.2K transmits, 99.9% on 38.4K receives and 99.7% on 57.6K receives. (115.2K is too fast for RX.) I'd love to see some testing with other high speed serial devices.

@rogermm: I think the prevailing theory is that there is something wrong with the OSX build process for interrupt type applications. I'd like to prove this though. Perhaps if you post or PM me your code, I can take a look at it, rebuild it here, and send you the binary to upload.

@jin: Would you please try AFSoftSerial as a control? At this point I am most interested in identifying and fixing any problems that I introduced with NewSoftSerial. Later I think we can attack anything that is latent in AFSoftSerial.

Thanks for the comments.

Mikal

The problem is definitely with avr-gcc 4.3 on OSX. Building the same source on Windows with Arduino 0012 or 0013 works just fine with no lockups.

Here's a work-around: Take the object file for the library from a Windows machine and copy it to the NewSoftWerial library folder on OSX. Since the NewSoftSerial.o file is already there, the code won't be recompiled. I tested this on my OSX machine and was able to compile my program on OSX and upload to the Arduino and everything ran perfectly.

mikalhart: Maybe you can build a separate download for OSX users that already has a NewSoftSerial.o file from Windows? Not an ideal solution by any means, but it's better then nothing. Also include a disclaimer as to what target chip the code was built for as a different target would need a recompile (and then it wouldn't work anymore).

If I'm not mistaken, object code is compiled for a specific microcontroller, and is recompiled if you select a board type that uses a different processor. This means your object code distribution won't work (more precisely, it would only work for one ATmega and would be broken if the user switched board types).

-j

Hence the reason I said:

Also include a disclaimer as to what target chip the code was built for as a different target would need a recompile (and then it wouldn't work anymore).

Hi Mikal

my code is for a wireless RFID reader. I would like to check battery levels and look for tags in my main loop. As has been reported in other threads - http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1201703850/9#9 - SoftwareSerial's read function blocks until there is a tag available to read, making it 'impossible' to perform both checks. The available function in NewSoftSerial would hopefully resolve this.

Much thanks for the support.
Roger :slight_smile:

#include <NewSoftSerial.h>
#include <avr/sleep.h>
#include <string.h>

#define BATTERY_CHECK_INTERVAL 60000
#define SIX_HOUR 952
#define FIVE_HOUR 945
#define FOUR_HOUR 936
#define THREE_HOUR 926
#define TWO_HOUR 910
#define ONE_HOUR 881
#define ZERO_HOUR 836


// RFID reader for Arduino
// Wiring version by BARRAGAN <http://people.interaction-ivrea.it/h.barragan>
// Modified for Arudino by djmatic
// Modified to use SoftwareSerial by Erik Sjodin <http://www.eriksjodin.net>
// Modified for use in TagMat by Roger Meintjes <http://www.doc.gold.ac.uk/~ma701rm/>
 
/*
 * RFID_Scratch03 (Arduino)
 * 
 * Tests communication between RFID, ArduinoXbee, Processing & Scratch.
 * Reads in RFID tag code from reader and writes it out through Xbee.
 * Code is written out as the String start_byte<10 char code>stop_byte.
 *
 * Battery is checked once every minute.
 * Battery charge written out as String start_byte<10 char charge in hours>stop_byte.
 *
 * Connect:
 * Arduino pin 3 to RFID SOUT
 * Arduino GND to RFID GND
 * Arduino Digital pin 2 to RFID /ENABLE
 * Arduino 5v to RFID VCC
 */

 
char code[13] = "";  
char charge[13] = "";
char baseString[11] = {72, 82, 95, 67, 72, 65, 82, 71, 69, 13, 0};   

unsigned long timeNow = 0;
int batteryLevel = 0;
int val = 0;
int bytesRead = 0;

int rx = 3;
int tx = 4;
int batteryPin = 5;
int rfidEnable = 2;

//set up software serial port
NewSoftSerial softSerial = NewSoftSerial(rx, tx);

void setup()
{
  timeNow = millis();
  pinMode(rx, INPUT);
  pinMode(tx, OUTPUT);
  pinMode(batteryPin, INPUT);
  pinMode(rfidEnable, OUTPUT); 
  
  softSerial.begin(2400);     //set software serial port to reader baud rate
  Serial.begin(9600);         //set hardware serial port to xbee baud rate
  set_sleep_mode(SLEEP_MODE_PWR_DOWN);   //set sleep_mode to most power saving   
  
  //initialize reader with reset cycle
  digitalWrite(rfidEnable, LOW);       
  delay(100);
  digitalWrite(rfidEnable, HIGH);
  delay(100);
  digitalWrite(rfidEnable, LOW);
  delay(100);
}

void loop() {
  if((millis() - timeNow) > BATTERY_CHECK_INTERVAL) {
    checkBattery();
    timeNow = millis();
  }
  else {
    checkReader();
  }
}

void checkReader() {
  digitalWrite(rfidEnable, LOW);
  if(softSerial.available() > 0) {
    //check for start byte
    if((val = softSerial.read()) == 10) {
      bytesRead = 0;
      code[bytesRead] = val;
      bytesRead++;
      //do nothing until all twelve bytes have arrived
      while(softSerial.available() < 12);
      //read in code and stop byte
      while(bytesRead < 12) {
        val = softSerial.read();
        code[bytesRead] = val;
        bytesRead++;
      }
      digitalWrite(rfidEnable, HIGH);
      //check for the stop byte
      if(val == 13) {
        Serial.print(code);
      }
      delay(250);      
    }
  }
}

void checkBattery() {
  batteryLevel = analogRead(batteryPin);
  charge[0] = 10;  //start byte
  charge[2] = 0;   //termination byte
  if(batteryLevel > SIX_HOUR) {
    charge[1] = 54;   //ASCII 6
  }
  else if(batteryLevel > FIVE_HOUR) {
    charge[1] = 53;   //ASCII 5
  }
  else if(batteryLevel > FOUR_HOUR) {
    charge[1] = 52;   //ASCII 4
  }
  else if(batteryLevel > THREE_HOUR) {
    charge[1] = 51;   //ASCII 3
  }
  else if(batteryLevel > TWO_HOUR) {
    charge[1] = 50;   //ASCII 2
  }
  else if(batteryLevel > ONE_HOUR) {
    charge[1] = 49;   //ASCII 1
  }
  else if(batteryLevel > ZERO_HOUR) {
    charge[1] = 48;   //ASCII 0
  }
  else {
    sleep_mode();
  }
  Serial.print(strcat(charge, baseString));  //create and write charge string
}

That's nice looking code, rogermm, and I think it should probably work. This is a perfect application for NewSoftSerial.

I wish someone could figure out how to circumvent what seems to be a problem with OSX/Arduino. Are you using Diecimila/Duemilanova? I'll assume so and post the Windows-compiled binary of NewSoftSerial.o. Hopefully that will resolve your problem.

Mikal

Thanks mikal :slight_smile:

I have checked both functions - checkBattery & checkReader - and each runs fine when it is the only function called in the loop when using SoftwareSerial. It is the combination of the two in the main loop using NewSoftSerial which causes the lockup (the code posted earlier today). I have also tried calling just a single function - either checkBattery or checkReader - using NSS and this also causes lockup. So for me the problem definitely originates from the NSS library.

I am using Diecimila.

roger

Right. We think that the compiler that comes with OSX is generating faulty interrupt code. If you take the same NSS library and compile it with Windows I bet it will work fine.

Mikal

Hm, if there's an interrupt-handling bug in OSX's build of avr-gcc (which is packaged with the Arduino IDE zips), that would explain my problems with the Nokia_lcd library demo.

What's the difference we see in the avr-objdump for the two compiler outputs?

Hi halley--

Thanks for your interest. I am very keen to get to the bottom of this problem. I posted a new version today. In the ZIP file there is the output from

avr-objdump -S NewSoftSerial.o

for whomever is interested. (Thanks to jin for a bug fix.)

@rogermm and @jin-- In the new release, I have included the .o file generated by the Windows 0013 compiler for the Diecimila/Duemilanove. Please try that out and see if your problem is resolved.

Mikal

Hi Mikal,

I have compiled NSS on a windows machine and this has resolved the hangups - so it does seem to be the OSX compiler that is/was causing the problems.

Thanks for all the hard and good work.
Roger

@rogermm--

Yes, I now have reports from 5 different people that the Windows build solves their problems. But I would still love to learn the specifics of the OSX failure. Perhaps there is a very easy workaround we could do in software? Knowing this would benefit me, ladyada, and probably a number of other library maintainers.

If anyone has a chance to run

avr-objdump -S NewSoftSerial.o

on an OSX-built .o file and post the results or compare them with the Windows dump.txt in the latest NewSoftSerial release, this would certainly be interesting.

Thanks for your testing.

Mikal

Okay. Here's the bug that the OSX version of avr-gcc 4.3 generates. Basically it fails to preserve all of the necessary registers before calling a function from an interrupt handler. The code then uses the registers which in effect corrupts them from the main program's perspective.

Here's a problem area in the object dump from OSX:

SIGNAL(SIG_PIN_CHANGE0) 
   0:      1f 92             push      r1
   2:      0f 92             push      r0
   4:      0f b6             in      r0, 0x3f      ; 63
   6:      0f 92             push      r0
   8:      11 24             eor      r1, r1
   a:      8f 93             push      r24
   c:      9f 93             push      r25
   e:      ef 93             push      r30
  10:      ff 93             push      r31
******************************************************************************/

/* static */
inline void NewSoftSerial::handle_interrupts(int lopin, int hipin)
{
  if (active_object && 
  12:      e0 91 00 00       lds      r30, 0x0000
  16:      f0 91 00 00       lds      r31, 0x0000
  1a:      30 97             sbiw      r30, 0x00      ; 0
  1c:      01 f0             breq      .+0            ; 0x1e <__vector_3+0x1e>
  1e:      82 81             ldd      r24, Z+2      ; 0x02
  20:      90 e0             ldi      r25, 0x00      ; 0
  22:      88 30             cpi      r24, 0x08      ; 8
  24:      91 05             cpc      r25, r1
  26:      04 f0             brlt      .+0            ; 0x28 <__vector_3+0x28>
  28:      0e 97             sbiw      r24, 0x0e      ; 14
  2a:      04 f4             brge      .+0            ; 0x2c <__vector_3+0x2c>
    active_object->_receivePin >= lopin && 
    active_object->_receivePin <= hipin)
  {
    active_object->recv();
  2c:      cf 01             movw      r24, r30
  2e:      0e 94 00 00       call      0      ; 0x0 <__vector_3>
}

SIGNAL(SIG_PIN_CHANGE0) 
{
  NewSoftSerial::handle_interrupts(8, 13);
}
  32:      ff 91             pop      r31
  34:      ef 91             pop      r30
  36:      9f 91             pop      r25
  38:      8f 91             pop      r24
  3a:      0f 90             pop      r0
  3c:      0f be             out      0x3f, r0      ; 63
  3e:      0f 90             pop      r0
  40:      1f 90             pop      r1
  42:      18 95             reti

Notice that only registers 0, 1, 24, 25, 30 and 31 are preserved on the stack.

Here's the Windows object dump of the same section:

SIGNAL(SIG_PIN_CHANGE0) 
   0:      1f 92             push      r1
   2:      0f 92             push      r0
   4:      0f b6             in      r0, 0x3f      ; 63
   6:      0f 92             push      r0
   8:      11 24             eor      r1, r1
   a:      2f 93             push      r18
   c:      3f 93             push      r19
   e:      4f 93             push      r20
  10:      5f 93             push      r21
  12:      6f 93             push      r22
  14:      7f 93             push      r23
  16:      8f 93             push      r24
  18:      9f 93             push      r25
  1a:      af 93             push      r26
  1c:      bf 93             push      r27
  1e:      ef 93             push      r30
  20:      ff 93             push      r31
******************************************************************************/

/* static */
inline void NewSoftSerial::handle_interrupts(int lopin, int hipin)
{
  if (active_object && 
  22:      e0 91 00 00       lds      r30, 0x0000
  26:      f0 91 00 00       lds      r31, 0x0000
  2a:      30 97             sbiw      r30, 0x00      ; 0
  2c:      01 f0             breq      .+0            ; 0x2e <__vector_3+0x2e>
  2e:      82 81             ldd      r24, Z+2      ; 0x02
  30:      90 e0             ldi      r25, 0x00      ; 0
  32:      88 30             cpi      r24, 0x08      ; 8
  34:      91 05             cpc      r25, r1
  36:      04 f0             brlt      .+0            ; 0x38 <__vector_3+0x38>
  38:      0e 97             sbiw      r24, 0x0e      ; 14
  3a:      04 f4             brge      .+0            ; 0x3c <__vector_3+0x3c>
    active_object->_receivePin >= lopin && 
    active_object->_receivePin <= hipin)
  {
    active_object->recv();
  3c:      cf 01             movw      r24, r30
  3e:      0e 94 00 00       call      0      ; 0x0 <__vector_3>
}

SIGNAL(SIG_PIN_CHANGE0) 
{
  NewSoftSerial::handle_interrupts(8, 13);
}
  42:      ff 91             pop      r31
  44:      ef 91             pop      r30
  46:      bf 91             pop      r27
  48:      af 91             pop      r26
  4a:      9f 91             pop      r25
  4c:      8f 91             pop      r24
  4e:      7f 91             pop      r23
  50:      6f 91             pop      r22
  52:      5f 91             pop      r21
  54:      4f 91             pop      r20
  56:      3f 91             pop      r19
  58:      2f 91             pop      r18
  5a:      0f 90             pop      r0
  5c:      0f be             out      0x3f, r0      ; 63
  5e:      0f 90             pop      r0
  60:      1f 90             pop      r1
  62:      18 95             reti

Notice that many more registers are preserved. The Windows version is preserving registers 0,1, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30 and 31.

Here's an updated NewSoftSerial recv() function that includes a work-around for the register problem.

void NewSoftSerial::recv() 
{

// Work-around for avr-gcc 4.3 OSX version bug
// Preserve the registers that the compiler misses
  asm volatile(
      "push r18 \n\t"
      "push r19 \n\t"
      "push r20 \n\t"
      "push r21 \n\t"
      "push r22 \n\t"
      "push r23 \n\t"
      "push r26 \n\t"
      "push r27 \n\t"
      ::);

  char i, d = 0; 

  // If RX line is high, then we don't see any start bit
  // so interrupt is probably not for us
  if (! digitalRead(_receivePin)) {

    // Wait approximately 1/2 of a bit width to "center" the sample
    tunedDelay(_bitDelay / 2 - 8);

    // Read each of the 8 bits
    for (i=0; i<8; i++) { 
      //PORTB |= _BV(5);
      tunedDelay(_bitDelay - 6);  // 6 - digitalread takes some time
      //PORTB &= ~_BV(5);
      if (digitalRead(_receivePin))
        d |= (1 << i); 
      else // else clause added to ensure function timing is ~balanced
        d &= ~(1 << i);
    } 

    // skip the stop bit
    tunedDelay(_bitDelay);

    // if buffer full, set the overflow flag and return
    if ((_receive_buffer_tail + 1) % _NewSS_MAX_RX_BUFF != _receive_buffer_head) {
      // save new data in buffer: tail points to where byte goes
      _receive_buffer[_receive_buffer_tail] = d; // save new byte 
      _receive_buffer_tail = (_receive_buffer_tail + 1) % _NewSS_MAX_RX_BUFF;
    } else {
      _buffer_overflow = true;
    }
  }

// Work-around for avr-gcc 4.3 OSX version bug
// Restore the registers that the compiler misses
  asm volatile(
      "pop r27 \n\t"
      "pop r26 \n\t"
      "pop r23 \n\t"
      "pop r22 \n\t"
      "pop r21 \n\t"
      "pop r20 \n\t"
      "pop r19 \n\t"
      "pop r18 \n\t"
      ::);
}

The main change is that I added some inline assembly to preserve the missing registers and then restore them on exit. Additionally I rearranged some of the conditions to eliminate the "returns". This ensures we don't skip the restoring assembly and corrupt the stack.

I've tested this and it works correctly with OSX and I get no lockups. I don't know if this will have any impact on the bit timing so more testing at different baud rates is probably needed. I doubt it will because we're basically just adding in the machine instructions missing from the OSX version (but running on the Windows version).

Although it won't hurt the Windows version to run this code (other than a few more bytes on the stack and a few extra instruction cycles and maybe impacted bit timing?), there's really no reason it needs to. Maybe you can include the assembly blocks inside of conditional compiler directives while testing for the platform and only include them for the OSX version?

It would technically be more correct to put the register preserving code in each interrupt handler before it calls any function. But that would mean adding the push/pop inline assembly 3 times (in each interrupt handler). In this case all of the handlers end up calling recv() before the contents of the "missed" registers are disturbed so it works to have the code only in recv().

Note that this work-around is specific to this code. It's very possible that if the recv() function is modified other registers might be affected that aren't preserved properly (particularly if recv() calls any new functions). So probably the only way to tell is to compare the object dumps between OSX and Windows to see if any more (or different) registers are missing.

etracer--

Fantastic work! This is just the analysis we were looking for and just the result I hoped for. I will generate a new official version of NewSoftSerial shortly incorporating this and some other improvements.

I agree that there is some chance that this will affect bit timings. I have a fairly elaborate test suite built here that tests all the supported baud rates from 300-115.2K, so I should be able to make some tweaks to compensate, if needed.

Ideally, we could conditionally compile this code, because putting it in all implementations means that the Windows interrupt handler will always be slightly slower than the OSX handler.

Thanks again, and congratulations on the fine work!

Mikal

Is this, or is this going to be, compatible with the 328?

I built a BT wireless RFID reader a year ago (why spend 1600 bucks on a commercial model!) and used 2 168's over i2c as I couldn't get past the SS hang.

I'm just about to order 5 328's, can it be assumed the NSS library is compatible?

As long as the clock is 16MHz, I think it's a fair bet that it will work, although I haven't tested anything but the Diecimila.

I've got a few unassembled RBBBs laying around with 16mhz crystals so il put in a 328, when they arrive, and give it a spin.