SoftwareSerial returning garbled data @1MHz

Hello,

I'm using Arduino MEGA to read data from a GPS module (http://www.nooelec.com/files/SKM53_Datasheet.pdf). I'm using SoftwareSerial to receive data from the module through pin 10, using a baud rate of 9600 (default for this module). Everything works fine at 16MHz.

I want to consume as little power as possible so that my battery lasts longer, so I want to reduce the clock speed. The problem is that if I use a prescaler of 16 to reduce the clock speed to 1MHz, I get garbled data with SoftwareSerial, while any of the hardware USART pins gives me correct data.

Can anybody help me understand what's going on?

My code is below:

#include <SoftwareSerial.h>
#include <avr/power.h>
SoftwareSerial GPS(10,11);

void setup(){
  Serial.begin(9600);
  clock_prescale_set (clock_div_16);
  GPS.begin(9600);
}
void loop(){
 if(GPS.available())
   Serial.print((char)GPS.read());
}

And here is an example of the garbled data:

$GP,2,,8,,2PIb‚’š)’b5ÔnÂbb‚b‚‚G¢‚,20,
$GhGGA,141546.000,6,,,0,2,,,M,,M,6
49

How is SoftwareSerial supposed to know that you are operating at a slower speed?

Thanks PaulS.

From my understanding, SoftwareSerial uses the F_CPU declared in boards.txt to know which speed I'm operating at.

I changed mega.build.f_cpu=16000000L to mega.build.f_cpu=1000000L, so I guess this is not the issue.

The problem is likely in the inherent limitations of SoftwareSerial. It disables interrupts for at least 95% of the time each character is being received. At 9600 baud the remaining 5% is about 50us. If there are no other interrupts it might still work. But timer 0 overflow interrupts require something over 50us to service with a 1MHz clock. If the next character detection is delayed long enough it will misread the character. It might work if you disabled timer 0 interrupts; but then you wouldn't be able to use delay() and millis().
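To put numbers on that estimate, here is a small host-side C++ sketch (not Arduino code) that computes how long one 8N1 character lasts and how much of it is left if interrupts are blocked for 95% of the time. The function names are made up for illustration, and the 95% figure is simply the estimate from the paragraph above.

```cpp
#include <cassert>

// Duration of one serial character in microseconds.
// 8N1 framing: 1 start bit + 8 data bits + 1 stop bit = 10 bits.
double charTimeUs(long baud) {
  assert(baud > 0);
  return 10.0 * 1e6 / baud;
}

// Time per character during which interrupts are enabled, given the
// fraction of the character time spent with interrupts blocked.
double freeTimeUs(long baud, double blockedFraction) {
  return charTimeUs(baud) * (1.0 - blockedFraction);
}

// Number of CPU cycles that fit into a time span at a given clock.
double cyclesIn(double us, long fCpu) {
  return us * fCpu / 1e6;
}
```

At 9600 baud, charTimeUs(9600) is about 1042us and freeTimeUs(9600, 0.95) is about 52us, which at 1MHz is only about 52 CPU cycles.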

I'm pretty sure you could still do it with a software serial implementation, but probably not with the SoftwareSerial library.

edit: One thing you might try is AltSoftSerial. It is far more resilient than SoftwareSerial. I don't know if it will work at 1MHz but it is designed to adjust to the processor clock. It does require you to use specific pins though (46 & 48 on a mega, I think).

I wrote some code to create a small software serial to work on an Attiny at 1MHz. It is not limited to the Attiny. It may or may not be useful.

...R

Thanks for the help guys.

@jboyton I tried disabling the timer 0 overflow interrupt as well as using the AltSoftSerial library, but neither worked. I tried changing the delays used in SoftwareSerial.h and got better results, but still couldn't make it work:

$GPGGA,213812.000,2525.7813,S,04919.2729,W,2,8,0.88,903.5,M,0.0,M,0000,000
P3,021,8.
,,612,092,$3172,,14,,7G,24,37$20A7S.,075$GPGGA,213813.000,2525.7813,S,04919.2729,W,2,8,0.88,903.5,M,0.0,M,0000,000BG3,021,..

@Robin2 I tried using the SmallSerial functions with a 9600 baud rate, but that didn't work either. Do I need to recalibrate the values? I guess not, since you wrote "Calibration is only needed if there is a change to the code in sendBits() or getBits()".

I find it odd that it is so hard to make SoftwareSerial work at 1 MHz, because one bit lasts approximately 104 clock cycles, which seems like a long interval. I may be misinterpreting something though.

hightek:
@Robin2 I tried using the SmallSerial functions with a 9600 baud rate, but that didn't work either. Do I need to recalibrate the values? I guess not, since you wrote "Calibration is only needed if there is a change to the code in sendBits() or getBits()".

It's a long time since I wrote the code. Calibration can't do any harm. And if you are using it on a Mega you might want to check the code to see if it needs updating for any differences on the Mega. I never tried it on a Mega.

Getting that sort of program to work requires patience and lots of trial and error.

...R

The table for 16MHz looks like this:

#if F_CPU == 16000000
static const DELAY_TABLE PROGMEM table[] = 
{
  //  baud    rxcenter   rxintra    rxstop    tx
  { 28800,    34,        77,        77,       74,    },
  { 19200,    54,        117,       117,      114,   },
  { 14400,    74,        156,       156,      153,   },
  { 9600,     114,       236,       236,      233,   },
  { 4800,     233,       474,       474,      471,   },
};

Dividing this table by 16 should give a first approximation of the values for 1MHz.
The problem is that you do not get integer values, which means the sample timing will drift.

However, I did some work on SoftwareSerial using formulas instead of lookup tables.
The advantage is that you can tweak the baud rate more easily,
e.g. instead of 9600 baud you use 9683 baud if that works better given the interrupt overhead etc.
I derived no formulas for 1MHz as there was no table available.
Details: SoftwareSerial magic numbers - Libraries - Arduino Forum

If I put the 1/16th values in Excel I get the following formulas (first approx)

rxcenter = 130049 * pow(baudrate, -1.071);
rxintra = 159927 * pow(baudrate, -1.014);
tx = 185672 * pow(baudrate, -1.032);
rxstop = rxintra;

These formulas would allow you to use any baud rate

as the formulas all use baudrate as base, they can be optimized to

float logbaud = log(baudrate);
rxcenter = 130049 * exp(logbaud * -1.071);
rxintra = 159927 * exp(logbaud * -1.014);
tx = 185672 * exp(logbaud * -1.032);
rxstop = rxintra;

Not tested (no time) so all disclaimers apply :)
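For anyone who wants to check the fits against the 16MHz table, here is a minimal host-side C++ sketch that evaluates the formulas above. The struct and function names are invented for this example; the coefficients are the ones quoted in this post.

```cpp
#include <cassert>
#include <cmath>

// First-approximation delay values for a 1 MHz clock, using the
// power-law fits quoted above (fitted to the 16 MHz table divided by 16).
struct Delays {
  double rxcenter;
  double rxintra;
  double rxstop;
  double tx;
};

Delays delaysAt1MHz(long baudrate) {
  assert(baudrate > 0);
  // Taking the logarithm once and reusing it, as in the optimized form.
  double logbaud = std::log(static_cast<double>(baudrate));
  Delays d;
  d.rxcenter = 130049.0 * std::exp(logbaud * -1.071);
  d.rxintra  = 159927.0 * std::exp(logbaud * -1.014);
  d.tx       = 185672.0 * std::exp(logbaud * -1.032);
  d.rxstop   = d.rxintra;
  return d;
}
```

For 9600 baud this gives roughly 7.1, 14.7 and 14.4 for rxcenter, rxintra and tx, close to the 16MHz row divided by 16 (7.125, 14.75, 14.56). None of these are whole numbers, so they still have to be rounded before use.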

1. You have 4 hardware serial ports on the Mega, so why are you using software serial at all?

2. Why do you think that keeping the CPU awake 16 times as long as it would be at 16MHz saves power?

Mark

@robtillaart

I read your post about the delay tables, but I think they modified the SoftwareSerial library since you made that post, because it uses formulas to calculate the different delays:

void SoftwareSerial::begin(long speed)
{
  _rx_delay_centering = _rx_delay_intrabit = _rx_delay_stopbit = _tx_delay = 0;

  // Precalculate the various delays, in number of 4-cycle delays
  uint16_t bit_delay = (F_CPU / speed) / 4;

  // 12 (gcc 4.8.2) or 13 (gcc 4.3.2) cycles from start bit to first bit,
  // 15 (gcc 4.8.2) or 16 (gcc 4.3.2) cycles between bits,
  // 12 (gcc 4.8.2) or 14 (gcc 4.3.2) cycles from last bit to stop bit
  // These are all close enough to just use 15 cycles, since the inter-bit
  // timings are the most critical (deviations stack 8 times)
  _tx_delay = subtract_cap(bit_delay, 15 / 4);

  // Only setup rx when we have a valid PCINT for this pin
  if (digitalPinToPCICR(_receivePin)) {
    #if GCC_VERSION > 40800
    // Timings counted from gcc 4.8.2 output. This works up to 115200 on
    // 16Mhz and 57600 on 8Mhz.
    //
    // When the start bit occurs, there are 3 or 4 cycles before the
    // interrupt flag is set, 4 cycles before the PC is set to the right
    // interrupt vector address and the old PC is pushed on the stack,
    // and then 75 cycles of instructions (including the RJMP in the
    // ISR vector table) until the first delay. After the delay, there
    // are 17 more cycles until the pin value is read (excluding the
    // delay in the loop).
    // We want to have a total delay of 1.5 bit time. Inside the loop,
    // we already wait for 1 bit time - 23 cycles, so here we wait for
    // 0.5 bit time - (71 + 18 - 22) cycles.
    _rx_delay_centering = subtract_cap(bit_delay / 2, (4 + 4 + 75 + 17 - 23) / 4);

    // There are 23 cycles in each loop iteration (excluding the delay)
    _rx_delay_intrabit = subtract_cap(bit_delay, 23 / 4);

    // There are 37 cycles from the last bit read to the start of
    // stopbit delay and 11 cycles from the delay until the interrupt
    // mask is enabled again (which _must_ happen during the stopbit).
    // This delay aims at 3/4 of a bit time, meaning the end of the
    // delay will be at 1/4th of the stopbit. This allows some extra
    // time for ISR cleanup, which makes 115200 baud at 16Mhz work more
    // reliably
    _rx_delay_stopbit = subtract_cap(bit_delay * 3 / 4, (37 + 11) / 4);
    #else // Timings counted from gcc 4.3.2 output
    // Note that this code is a _lot_ slower, mostly due to bad register
    // allocation choices of gcc. This works up to 57600 on 16Mhz and
    // 38400 on 8Mhz.
    _rx_delay_centering = subtract_cap(bit_delay / 2, (4 + 4 + 97 + 29 - 11) / 4);
    _rx_delay_intrabit = subtract_cap(bit_delay, 11 / 4);
    _rx_delay_stopbit = subtract_cap(bit_delay * 3 / 4, (44 + 17) / 4);
    #endif


    // Enable the PCINT for the entire port here, but never disable it
    // (others might also need it, so we disable the interrupt by using
    // the per-pin PCMSK register).
    *digitalPinToPCICR(_receivePin) |= _BV(digitalPinToPCICRbit(_receivePin));
    // Precalculate the pcint mask register and value, so setRxIntMask
    // can be used inside the ISR without costing too much time.
    _pcint_maskreg = digitalPinToPCMSK(_receivePin);
    _pcint_maskvalue = _BV(digitalPinToPCMSKbit(_receivePin));

    tunedDelay(_tx_delay); // if we were low this establishes the end
  }

#if _DEBUG
  pinMode(_DEBUG_PIN1, OUTPUT);
  pinMode(_DEBUG_PIN2, OUTPUT);
#endif

  listen();
}

But since it calculates the delays in units of 4-cycle delays, the results are somewhat imprecise. Another thing I noticed is that subtract_cap(bit_delay / 2, (4 + 4 + 75 + 17 - 23) / 4) returns 1, because the subtraction gives a negative number (about -25 cycles) at 1 MHz and 9600 baud. This made me think it would cause a large error in the recv() function (we would be at least 25 cycles past the center of the first bit).
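The effect can be reproduced on a PC with a small host-side C++ sketch that copies the arithmetic of the gcc 4.8.2 branch of begin() shown above. Two things here are assumptions: F_CPU is passed in as a parameter instead of being a compile-time macro, and subtract_cap is written so that it returns 1 whenever the result would be zero or negative, which matches the behaviour described above.

```cpp
#include <cassert>
#include <cstdint>

// Clamping helper: never returns less than 1 (assumed behaviour).
uint16_t subtract_cap(uint16_t num, uint16_t sub) {
  return (num > sub) ? static_cast<uint16_t>(num - sub) : 1;
}

struct RxDelays {
  uint16_t centering;
  uint16_t intrabit;
  uint16_t stopbit;
};

// Same formulas as the gcc 4.8.2 branch of SoftwareSerial::begin(),
// in units of 4-cycle delay loop iterations.
RxDelays rxDelays(uint32_t fCpu, uint32_t speed) {
  assert(speed > 0);
  uint16_t bit_delay = static_cast<uint16_t>((fCpu / speed) / 4);
  RxDelays d;
  d.centering = subtract_cap(bit_delay / 2, (4 + 4 + 75 + 17 - 23) / 4);
  d.intrabit  = subtract_cap(bit_delay, 23 / 4);
  d.stopbit   = subtract_cap(bit_delay * 3 / 4, (37 + 11) / 4);
  return d;
}
```

rxDelays(1000000, 9600) gives centering 1, intrabit 21 and stopbit 7, so the centering delay is clamped, while rxDelays(16000000, 9600) gives 189, 411 and 300.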

So my idea was to modify the recv() function from SoftwareSerial, so I would be always at the center of the data bits. I used fixed delay values rather than formulas and this is what gave me the best results (with the timer 0 overflow interrupt disabled):

void SoftwareSerial::recv()
{

#if GCC_VERSION < 40302
// Work-around for avr-gcc 4.3.0 OSX version bug
// Preserve the registers that the compiler misses
// (courtesy of Arduino forum user *etracer*)
  asm volatile(
    "push r18 \n\t"
    "push r19 \n\t"
    "push r20 \n\t"
    "push r21 \n\t"
    "push r22 \n\t"
    "push r23 \n\t"
    "push r26 \n\t"
    "push r27 \n\t"
    ::);
#endif  

  uint8_t d = 0;

  // If RX line is high, then we don't see any start bit
  // so interrupt is probably not for us
  if (_inverse_logic ? rx_pin_read() : !rx_pin_read())
  {
    // Disable further interrupts during reception, this prevents
    // triggering another interrupt directly after we return, which can
    // cause problems at higher baudrates.
    setRxIntMsk(false);

    // Wait approximately 1/2 of a bit width to "center" the sample
    tunedDelay(17); // Delay 73 cycles
    asm("NOP");

    DebugPulse(_DEBUG_PIN2, 1);

    // Read the first 7 bits, each followed by the inter-bit delay
    for (uint8_t i=7; i > 0; --i)
    {
      d >>= 1;
      DebugPulse(_DEBUG_PIN2, 1);
      if (rx_pin_read())
        d |= 0x80;
      tunedDelay(20);
      asm("NOP");
      asm("NOP");
      //asm("NOP");
    }

    // Read the 8th bit without a trailing inter-bit delay
    d >>= 1;
    DebugPulse(_DEBUG_PIN2, 1);
    if (rx_pin_read())
      d |= 0x80;

    if (_inverse_logic)
      d = ~d;

    // if buffer full, set the overflow flag and return
    uint8_t next = (_receive_buffer_tail + 1) % _SS_MAX_RX_BUFF;
    if (next != _receive_buffer_head)
    {
      // save new data in buffer: tail points to where byte goes
      _receive_buffer[_receive_buffer_tail] = d; // save new byte
      _receive_buffer_tail = next;
    } 
    else 
    {
      DebugPulse(_DEBUG_PIN1, 1);
      _buffer_overflow = true;
    }

    // skip the stop bit
    tunedDelay(15);
    asm("NOP");
    asm("NOP");
    //asm("NOP");
    DebugPulse(_DEBUG_PIN1, 1);

    // Re-enable interrupts when we're sure to be inside the stop bit
    setRxIntMsk(true);

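  }

#if GCC_VERSION < 40302
// Work-around for avr-gcc 4.3.0 OSX version bug
// Restore the registers that the compiler misses
  asm volatile(
    "pop r27 \n\t"
    "pop r26 \n\t"
    "pop r23 \n\t"
    "pop r22 \n\t"
    "pop r21 \n\t"
    "pop r20 \n\t"
    "pop r19 \n\t"
    "pop r18 \n\t"
    ::);
#endif
}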

I wanted to calculate the delay values manually, but I would have to know how many clock cycles each instruction takes. I would do it like this:

Each bit lasts 104 clock cycles, so:

  • Center of first data bit would be after 156 clock cycles.
  • Each subsequent data bit would be after 104 clock cycles.
  • 1/4th of stop bit would be 78 clock cycles away from center of last data bit.
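The three targets in the list above follow directly from the bit time. A small host-side C++ sketch, with function names made up for this example:

```cpp
#include <cassert>

// CPU cycles per serial bit, using integer division like F_CPU / baud.
long cyclesPerBit(long fCpu, long baud) {
  assert(fCpu > 0 && baud > 0);
  return fCpu / baud;
}

// Cycles from the start-bit edge to the center of the first data bit
// (one full start bit plus half a data bit = 1.5 bit times).
long cyclesToFirstSample(long fCpu, long baud) {
  return cyclesPerBit(fCpu, baud) * 3 / 2;
}

// Cycles between the centers of two consecutive data bits.
long cyclesBetweenSamples(long fCpu, long baud) {
  return cyclesPerBit(fCpu, baud);
}

// Cycles from the center of the last data bit to 1/4 of the stop bit
// (half a data bit plus a quarter of the stop bit = 0.75 bit times).
long cyclesToStopQuarter(long fCpu, long baud) {
  return cyclesPerBit(fCpu, baud) * 3 / 4;
}
```

These are targets for the total time between samples. The cycles spent executing the instructions in between would still have to be subtracted to arrive at the tunedDelay() arguments.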

hightek:
$GPGGA,213812.000,2525.7813,S,04919.2729,W,2,8,0.88,903.5,M,0.0,M,0000,000
P3,021,8.
,,612,092,$3172,,14,,7G,24,37$20A7S.,075$GPGGA,213813.000,2525.7813,S,04919.2729,W,2,8,0.88,903.5,M,0.0,M,0000,000BG3,021,..

That actually looks perfect for the first 74 characters of each report. According to the datasheet your GPS defaults to an output of GGA, GSA, GSV, and RMC. The first 74 characters of the GGA sentence are perfect. Then you get a lot of dropouts. But no garbage characters like you reported at first.

So at this point it's most likely that the RX buffer is filling up. I would try a couple things. First, increase the baud rate of the Serial output so that it's faster than the GPS input. I don't know what the limit is on a 1MHz processor, but try 19200. The other thing I'd do is increase the SoftwareSerial buffer size.

By default you're receiving four different NMEA strings. That might add up to a few hundred bytes of data for each report. If you have the RAM, make your SoftwareSerial RX buffer that big. You might also consider disabling the output of any NMEA sentences you don't care about.
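On the question of how fast the hardware Serial port can go at 1MHz: the ATmega datasheet gives the baud rate register value as UBRR = f_osc / (16 * baud) - 1 in normal mode and f_osc / (8 * baud) - 1 in double-speed (U2X) mode. The host-side C++ sketch below applies those formulas to see how far off the real baud rate ends up. The function names are invented for this example, and which mode the Arduino core actually selects is not checked here.

```cpp
#include <cassert>
#include <cmath>

// UBRR value for the AVR hardware USART, rounded to the nearest integer.
// divisor is 16 in normal mode and 8 in double-speed (U2X) mode.
long ubrrFor(long fCpu, long baud, int divisor) {
  assert(baud > 0 && (divisor == 8 || divisor == 16));
  double exact = static_cast<double>(fCpu) / (divisor * static_cast<double>(baud));
  return std::lround(exact) - 1;
}

// Baud rate actually produced by a given UBRR value.
double actualBaud(long fCpu, long ubrr, int divisor) {
  return static_cast<double>(fCpu) / (divisor * (ubrr + 1.0));
}

// Relative error of the actual baud rate, in percent.
double baudErrorPercent(long fCpu, long baud, int divisor) {
  long ubrr = ubrrFor(fCpu, baud, divisor);
  return (actualBaud(fCpu, ubrr, divisor) / baud - 1.0) * 100.0;
}
```

With a 1MHz clock, 9600 baud in double-speed mode comes out within about 0.2%, but 19200 is off by about -7% in double-speed mode and about +8.5% in normal mode, which is likely too much for a reliable link. So at 1MHz the Serial output may not be able to go faster than the GPS input.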

hightek:
I find it odd that it is so hard to make SoftwareSerial work at 1 MHz, because one bit lasts approximately 104 clock cycles, which seems like a long interval. I may be misinterpreting something though.

That's only 104 processor cycles at 1MHz, so something less than 100 instructions. And since SoftwareSerial monopolizes the processor when data is being received you really only have a few instructions per character to do anything else.

hightek:
I want to consume the least amount of power possible, so my battery will last longer. For this, I want to reduce the clock speed.

Another thought: You might save more current by leaving the processor at 16MHz and sleeping in between GPS reports.

Right now you're reducing the clock to 1MHz which reduces the current by a factor of about 16 (not quite). But if you programmed the GPS to only output GGA and RMC sentences and increased its baud rate to 38400 it would send a report in under 50ms. If you could sleep 950 out of every 1000ms you could reduce your processor's current consumption by a factor of 20, even running at 16MHz.

But this depends on what else you're doing.
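The numbers behind that estimate can be sketched in a few lines of host-side C++. The function names are made up for this example. The 164-byte report size used in the note below is an assumption: two NMEA sentences at the maximum length of 82 characters each.

```cpp
#include <cassert>

// Time in milliseconds to receive a number of bytes at a given baud
// rate, assuming 8N1 framing (10 bits on the wire per byte).
double receiveTimeMs(int bytes, long baud) {
  assert(bytes >= 0 && baud > 0);
  return bytes * 10.0 * 1000.0 / baud;
}

// Average current when the CPU is awake for awakeMs out of every
// periodMs and sleeping for the rest of the period.
double averageCurrent(double activeCurrent, double sleepCurrent,
                      double awakeMs, double periodMs) {
  assert(periodMs > 0 && awakeMs >= 0 && awakeMs <= periodMs);
  double duty = awakeMs / periodMs;
  return activeCurrent * duty + sleepCurrent * (1.0 - duty);
}
```

receiveTimeMs(164, 38400) is a little under 43ms, and averageCurrent(1.0, 0.0, 50.0, 1000.0) is 0.05, i.e. one twentieth of the active current if the sleep current is negligible.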

@holmes4

1 - My intention is to use the program with an ATMEGA328p that I have and I'm already using its hardware serial with another module. I'm using the Arduino MEGA just to test the program.

2 - I'm trying to save power based on Gammon Forum : Electronics : Microprocessors : Power saving techniques for microprocessors.
I am aware that if I use a 1 MHz clock the program will be 16 times slower, but with a slower clock source I'm also able to power the ATMEGA with a lower voltage (I'm using a 3.3V and 1200 mAh battery). This reduces the power consumption even further. I'm not 100% sure if it will consume less power overall, considering the CPU will have to be awake for more time.

Anyway, if I can't have a 9600 baud rate using 1MHz with software serial, I think I will just try to increase the clock frequency. But it looks mathematically possible to achieve that with 1 MHz.

hightek:
...with a slower clock source I'm also able to power the ATMEGA with a lower voltage (I'm using a 3.3V and 1200 mAh battery). This reduces the power consumption even further.

Then you want to run at 8MHz, not 1MHz. Running slower isn't necessarily an advantage when you have the option to sleep instead. Gammon mentions that in his web page.

Another thought: You might save more current by leaving the processor at 16MHz and sleeping in between GPS reports.

Right now you're reducing the clock to 1MHz which reduces the current by a factor of about 16 (not quite). But if you programmed the GPS to only output GGA and RMC sentences and increased its baud rate to 38400 it would send a report in under 50ms. If you could sleep 950 out of every 1000ms you could reduce your processor's current consumption by a factor of 20, even running at 16MHz.

But this depends on what else you're doing.

I'm already doing that :)

Let me explain the project. I'm making a tracker. It will get GPS data and send it to my server using a GPRS shield. I'm sending a position every 2 minutes (it is configurable via GSM message though). I only send the position if the tracker has moved a certain distance and if the position has good accuracy. To reduce battery consumption, I power off the modules I'm not using. After sending a position, I put the CPU to sleep for the remainder of the 2 minutes. I'm also using an SD card to store the positions that can't be sent when there isn't a data connection.

Then you want to run at 8MHz, not 1MHz. Running slower isn't necessarily an advantage when you have the option to sleep instead. Gammon mentions that in his web page.

Yeah, I read that. But to know for sure whether a lower frequency would be better, I would have to know how much power the CPU consumes at each frequency and how long it would be awake/sleeping. Considering that the CPU will be awake for different periods of time (it depends on whether there is a GPS signal, whether the accuracy is good enough and whether it will really send a position every 2 minutes), it's hard to measure that. I was using 1 MHz, because it seemed to consume a lot less power according to Gammon's experiments:

lfuse   Speed    Current
0xFF    16 MHz    15.15 mA
0xE2     8 MHz    11.05 mA
0x7F     2 MHz     7.21 mA
0x62     1 MHz     6.77 mA
0xE3   128 KHz     6.00 mA

But I guess you're right. With 8 MHz I could use higher baud rates, put the CPU to sleep for longer and keep the modules turned on for a shorter time. I guess I will just use 8 MHz as you mentioned. But I will still try to make software serial work at 1 MHz and 9600bps just out of curiosity. :)

Thanks for the help!

hightek:
But to know for sure whether a lower frequency would be better, I would have to know how much power the CPU consumes at each frequency and how long it would be awake/sleeping. Considering that the CPU will be awake for different periods of time (it depends on whether there is a GPS signal, whether the accuracy is good enough and whether it will really send a position every 2 minutes), it's hard to measure that. I was using 1 MHz, because it seemed to consume a lot less power according to Gammon's experiments:

lfuse   Speed    Current
0xFF    16 MHz    15.15 mA
0xE2     8 MHz    11.05 mA
0x7F     2 MHz     7.21 mA
0x62     1 MHz     6.77 mA
0xE3   128 KHz     6.00 mA

You can look up the current in the datasheet.

There's something weird about Gammon's numbers. He said he's using a bare bones board with just the 328p chip on it and yet all of his measurements are 5.7-5.9mA higher than what the datasheet says (for 5V). He's somehow using nearly 6mA of extra current, consistently, regardless of clock speed. Is there an LED lit up or something? I can't quite tell.

I'm working on a project with a 328p running at 8MHz and 3.3V. It draws a little over 3mA when active. That matches the datasheet quite well.

If I ran the processor at 1MHz and 3.3V it would draw about 600uA. But my code would take 8 times as long to run so I'd be using 8*600/3300 = 1.45 times as much energy while active each cycle. It's more energy efficient for my code to run at 8MHz and go to sleep sooner. Once sleeping (power save mode in this case) the processor uses 1uA.
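That comparison, written out as a tiny host-side C++ sketch (the function name is made up for this example, and the currents are the ones quoted above):

```cpp
#include <cassert>

// Relative energy for the same amount of work: current multiplied by
// run time. The supply voltage is the same in both cases and cancels.
// slowdown is how many times longer the code runs at the slow clock.
double energyRatio(double slowCurrentUa, double fastCurrentUa, double slowdown) {
  assert(fastCurrentUa > 0 && slowdown > 0);
  return slowCurrentUa * slowdown / fastCurrentUa;
}
```

energyRatio(600, 3300, 8) is about 1.45, so the 1MHz case uses roughly 45% more energy while active than the 8MHz case.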

For you, the code is stuck waiting for the serial GPS data to arrive. It's captive during that period. So it's worth the effort to speed that up as much as possible, by turning off NMEA sentences you don't care about and increasing the baud rate. That way you can get back to the important business of sleeping as soon as possible.

Thanks jboyton!

That's what I'm doing now. Just for the record, I was surprised that some bytes of GPS data were being lost with SoftwareSerial at 8MHz and 38400bps. :o

$GPGSA,A,3,28,0,,,,,4,0.B,3,2,4,1518,810,4,$GPS2,220,3078,,,8
G320,706,202,,,
$1.00A8S,0.2.7,25,6$GPGGA,182312.000,2525.7880,S,04919.2782,W,1,4,1.68,939.5,M,0.0,M,,68
$GPGSA,A,3,,,,,,,8,0
G,3,,4,158,2580,21$GP,2,2260,3108,,,

I guess I will try what you suggested before (changing the buffer size and the Serial baud rate) as well as changing the delays.

hightek:
I was surprised that some bytes of GPS data were being lost with SoftwareSerial at 8MHz and 38400bps.

I couldn't get SoftwareSerial to work with my GPS at anything above 9600 baud. And that was running at 16MHz. I ended up writing my own software serial code which runs at 9600, 19.2 and 38.4 at 16MHz. I modified it later to run at 8MHz but because of the approach I took with the code I was only able to get it to work at 19.2 at that clock speed.

I'm surprised that AltSoftSerial wouldn't work for you at 1MHz. I wonder if you did something incorrect and gave up too soon. It would probably be the best solution at 8MHz, provided you can dedicate those pins and timer 1 to the task.

I couldn't get SoftwareSerial to work with my GPS at anything above 9600 baud. And that was running at 16MHz. I ended up writing my own software serial code which runs at 9600, 19.2 and 38.4 at 16MHz.

Nice job!

I'm surprised that AltSoftSerial wouldn't work for you at 1MHz. I wonder if you did something incorrect and gave up too soon.

Hmmm that may be the case... I just changed the variable declaration to AltSoftSerial and used pins 46 and 48 of my Arduino MEGA. Is there anything else to it? I will try it again.

I ran more tests with the GPS. I changed the baud rate to 19200 and 38400 and used the UART3 of my MEGA to get the data. To my surprise, I was still losing data, but less frequently. The example below is for 19200:

GPS,,,40,539,50,3,2,126,70102C
GGV441,10,3,3,,4
$PM,94700,,5579,,41.62W00,0.6011,,8
$GPGGA,193458.000,2525.7791,S,04919.2652,W,1,8,0.96,929.6,M,0.0,M,,6C
$GPGSA,M,3,17,06,13,30,28,24,15,12,,,,,1.25,0.96,0.80
05
$GPGSV,4,1,14,13,68,310,22,17,66,120,16,15,43,249,26,28,28,138,22
77
$GPGSV,4,2,14,30,27,077,16,06,18,032,16,24,14,226,17,12,12,272
0
GPS,,,40,539,50,3,2,126,70,6,7
$PSV441,10,3,3,,4
$PM,94800,,5.71S041.62W00,0.6011,,*7

It works perfectly for 9600.

hightek:
I just changed the variable declaration to AltSoftSerial and used pins 46 and 48 of my Arduino MEGA. Is there anything else to it?

As far as I know that's it.

Did it work at 16MHz? It should have.