Poor serial performance when sending data from RaspberryPi to Arduino

Hello,

I wondered if anyone could give me some help on why my serial performance is very slow. I have an Arduino driving an LED matrix, receiving data from a Raspberry Pi using Python and pyserial. The serial data is sent over a USB cable plugged between the Pi and Arduino.

A sketch on the Arduino listens for bytes, one control byte, then 1536 pixel data bytes. (The LED matrix is 1024 RGB pixels at 4 bit colour. I pack two 4 bit values into each byte, so each RGB pixel is 1.5 bytes. Therefore an entire frame is sent in 1024 pixels x 1.5 bytes per pixel = 1536 bytes.)

If my maths is right... Running at 115200 baud, I make that roughly 14400 bytes/s - enough that I should see a maximum of 9 frames per second. However, I'm only seeing roughly two frames per second. Even at 230400 baud things are at 2 or 3 fps.

Any ideas appreciated!

Here is the Python send code:

import serial
from time import sleep

#setup serial port
port = "/dev/ttyACM0"
ser = serial.Serial(port, 230400, timeout=1)
sleep(1) #wait for serial 

#fill screen with white then black

loop=0
while (loop < 100):

    #draw totally white frame (all bytes value 11111111)
    
    #send control byte: 1 = draw frame 
    ser.write(chr(1)) 
   
    #send screen data - 2 pixels sent in 3 bytes. 1024 pixels in total = 512 loops of 3 bytes (1536 bytes total)
    c=0
    while (c < 512):
        ser.write(chr(255))
        ser.write(chr(255))
        ser.write(chr(255))
        c=c+1


    #draw totally black frame (all bytes value 00000000)
   
    #send control byte: 1 = draw frame
    ser.write(chr(1))
   
    #send screen data - 2 pixels sent over 3 bytes. 1024 pixels in total = 512 loops of 3 bytes (1536 bytes total)
    c=0
    while (c < 512):
        ser.write(chr(0))
        ser.write(chr(0))
        ser.write(chr(0))
        c=c+1

    loop=loop+1

And here is the Arduino receive code.

#include "RGBmatrixPanel.h" //3rd party matrix driver 

//define pins
#define A   5 
#define B   4 
#define C   3 
#define D   2 
#define CLK 10  
#define LAT 9 
#define OE  7 
RGBmatrixPanel matrix(A, B, C, D, CLK, LAT, OE, true);

void setup() {
  Serial.begin(230400);
  matrix.begin();
}

void loop() {

  byte incomingByte = 0;
 
  if (Serial.available() > 0) {  //if at least 1 byte is waiting, read it: the 1st byte is the control byte

    incomingByte = Serial.read();

    //if control byte is 1, draw frame
    switch ( int(incomingByte) ) {
    case 1:
      drawScreen();
      break;
    default:
      Serial.println("unknown cmd"); 
    }
  }
}



void drawScreen() {

  //Total bytes to receive is 1536
  byte x = 0;
  byte y = 0;
  byte r = 0;
  byte g = 0;
  byte b = 0;
  byte data = 0;

  while (y < 32 ) {  

    if (Serial.available() >= 3) { //read 3 bytes (2 pixels)
   
      //read 1st byte and split into nibbles
      data = byte (Serial.read());
      r = data & B00001111;
      g = data >> 4;  

      //read 2nd byte and split into nibbles
      data = byte (Serial.read()); 
      b = data & B00001111;

      //draw pixel1
      matrix.drawPixel(x, y, matrix.Color444(r,g,b));
      //next pixel
      x++;

      r = data >> 4;  
      //read 3rd byte and split into nibbles
      data = byte (Serial.read()); 
      g = data & B00001111;
      b = data >> 4; 

      //draw pixel2
      matrix.drawPixel(x, y, matrix.Color444(r,g,b));

      //next pixel
      if (x == 31){ 
        x=0; 
        y++;
      } 
      else {
        x++;
      }          
    }
  }
  matrix.swapBuffers(false); 
}

If my maths is right... Running at 115200 baud, I make that roughly 14400bytes/s

It isn't. Each byte requires sending 10 bits, not 8 (a start bit, 8 data bits and a stop bit).
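To put rough numbers on that, here is a small sketch of the arithmetic in Python. The function name is made up for illustration, and the 1537 bytes per frame assume one control byte plus the 1536 data bytes described in the first post.

```python
def max_frames_per_second(baud, frame_bytes=1537, bits_per_byte=10):
    """Upper bound on frame rate for a serial link using 8N1 framing."""
    bytes_per_second = baud / float(bits_per_byte)
    return bytes_per_second / frame_bytes


for baud in (115200, 230400):
    print(baud, round(max_frames_per_second(baud), 1))
```

That works out to roughly 7.5 frames per second at 115200 baud and about 15 at 230400, before any overhead on either side.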

    switch ( int(incomingByte) ) {

Why are you performing this cast? The switch statement is perfectly happy with a byte as the argument.

Your receiver is not waiting for three bytes. It is looping through the while loop many times before the third byte arrives. Are you monitoring what the Arduino sends back? I'm guessing that there are a lot of messages being sent back. That is why your receive rate is so low.

Hi Paul, thanks for taking the time to reply.

I was unaware serial required 10 bits per character, OK so that knocks the maximum frame rate down a little. The cast to Int in the switch statement was there as I didn't know switch could handle bytes, I'll take it out.

How would you suggest the receiver code waits for the 3 bytes? I wasn't sure what else it could do other than loop round waiting for data. Is that very inefficient?

As for monitoring what the receiver sends back, there is actually nothing being sent back. I stripped out all serialprint lines I had in there for debugging as I thought that might be the cause of the issue.

Cheers
Nick

How would you suggest the receiver code waits for the 3 bytes

while(Serial.available() < 3)
{
   // Do nothing
}

Is that very inefficient?

Yes, very. The alternative is to simply deal with data as it arrives, storing each piece until you have enough to proceed. Then use that data, and reset the array index to start storing again.
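The "store each piece until you have enough" idea can be sketched as follows. This is written in Python purely to show the logic (on the Arduino it would be a small byte array and an index); the class name is hypothetical, and the nibble layout is the one used in the receive code above.

```python
class PixelPairAssembler(object):
    """Collects bytes one at a time and emits two pixels per 3 bytes."""

    def __init__(self):
        self.buf = bytearray(3)  # storage for one 3-byte group
        self.index = 0           # next free slot in buf

    def feed(self, value):
        """Store one byte; return two (r, g, b) tuples once 3 bytes are in, else None."""
        self.buf[self.index] = value
        self.index += 1
        if self.index < 3:
            return None          # not enough data yet, go do something else
        self.index = 0           # reset the index to start storing again
        b0, b1, b2 = self.buf
        pixel1 = (b0 & 0x0F, b0 >> 4, b1 & 0x0F)  # r1, g1, b1
        pixel2 = (b1 >> 4, b2 & 0x0F, b2 >> 4)    # r2, g2, b2
        return pixel1, pixel2
```

The caller feeds in whatever has arrived and never blocks; a pixel pair only comes back on every third byte.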

Thanks again for the pointers Paul, I've tweaked my Arduino code (see below)

I've been experimenting with different baud rates and timing the Arduino function that receives the serial data from the Raspberry Pi. At both 115200 and 230400 baud there is no difference in the time it takes to receive and process a frame of data (1536 bytes). It's about 390 milliseconds either way.

This seems strange, so I wonder if something other than the baud rate is slowing things down.

Commenting out anything to do with the LED matrix drawing routines in the Arduino code made very little difference (10-20 milliseconds) so I know they're not the issue.

Running from the IDLE Python editor on the Raspberry Pi also seemed to slow things down. Running the Python sender program from the command prompt was quicker and got the 390ms results above.

I'd be interested to know if there are any other ways I could squeeze some more performance out of the Arduino code. Is there anything obvious I'm missing?

Thanks
Nick

Tweaked Arduino code:

#include "RGBmatrixPanel.h" //3rd party matrix driver 

//define pins
#define A   5 
#define B   4 
#define C   3 
#define D   2 
#define CLK 10  
#define LAT 9 
#define OE  7 
RGBmatrixPanel matrix(A, B, C, D, CLK, LAT, OE, true);

void setup() {
  Serial.begin(230400);
  matrix.begin();
}

void loop() {

  byte incomingByte = 0;
 
  if (Serial.available() > 0) {  //if data, get 1st byte which is a control byte that sets what to do

    incomingByte = Serial.read();

    //if control byte is 1, draw frame
    switch ( int(incomingByte) ) {
    case 1:
      drawScreen();
      break;
    default:
      Serial.println("unknown control command"); 
    }
  }
}



void drawScreen() {

  //Receiving 32x32 pixels = 1024 pixels. Each pixel is 3x4 bit values, so each pixel = 1.5 bytes. Total bytes to receive is 1536

  byte x = 0; //screen x
  byte y = 0; //screen y
  byte r = 0; //red
  byte g = 0; //green
  byte b = 0; //blue
  byte data = 0; //incoming byte

  unsigned startTime= millis();
  
  while (y < 32 ) {  
 
   //wait for 3 bytes of data
    while (Serial.available() < 3) {
       //do nothing
    } 
   
    //read 1st byte and split into 2 lots of 4 bits, (red and green)
    data = byte (Serial.read());
    r = data & B00001111;
    g = data >> 4;  

    //read 2nd byte - b and r of next pixel
    data = byte (Serial.read()); 
    b = data & B00001111;

    //draw pixel1
    matrix.drawPixel(x, y, matrix.Color444(r,g,b));
    
   //inc for next pixel
    x++;
    
    r = data >> 4;  

    //read 3rd byte, g and b
    data = byte (Serial.read()); 
    g = data & B00001111;
    b = data >> 4; 

    //draw pixel2
    matrix.drawPixel(x, y, matrix.Color444(r,g,b));

    //inc for next pixel or reset if x at end of row
    if (x == 31){ 
      x=0; 
      y++;
    } 
    else {
      x++;
    }          
  }
  
  //print time it took
  Serial.println(millis() - startTime);
  
  //when all data received, swap buffers and show screen
  matrix.swapBuffers(false);
}

At both 115200 and 230400 baud there is no difference in the time it takes to receive and process a frame of data (1536 bytes). It's about 390 milliseconds either way.

That means that the other code - on arduino or on raspberry side - is blocking performance.

Be aware that Python is an interpreted language.

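small bug in arduino code:
unsigned startTime = millis(); should be unsigned long, since millis() returns an unsigned long and a plain unsigned is only 16 bits on the Arduino, so the value wraps after about 65 seconds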

Can you post the Raspberry code?

Hi Rob, yes that was what I was thinking. The Raspberry Pi code is in the first post if you want to take a look. I've noticed it slows down considerably with some operations - e.g. reading numpy arrays for screen data.

Let me know what you think.

I missed the first post, sorry.

Looking at the python code it just pumps in all data and the arduino is probably overwhelmed with data.

You should add some sort of handshake so that the arduino confirms the receive after every 3 bytes (== 2 RGB pixels).
Once that ACKnowledge is received, the python program sends the next 3 bytes.

That way you at least synchronize the producer and consumer.
Furthermore there should be some kind of synchronization at the start of the stream.

why not make a tuple to allow you to set a random pixel on the Arduino screen?

protocol would look like
RP -> A: <x,y,r,g,b>
A -> RP:
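A minimal sketch of what the sending side of such a handshake might look like in Python. The ACK byte "A", the chunk size and the function name are all assumptions for illustration; `port` is anything with pyserial-style write() and read() methods, and FakePort only exists so the sketch runs without hardware.

```python
def send_with_ack(port, payload, chunk_size=3, ack=b"A"):
    """Send payload in chunks, waiting for a one-byte ACK after each chunk.

    Returns the number of chunks sent; raises IOError if an ACK is missing.
    """
    chunks_sent = 0
    for start in range(0, len(payload), chunk_size):
        port.write(bytes(payload[start:start + chunk_size]))
        reply = port.read(1)  # pyserial returns b"" if the timeout expires
        if reply != ack:
            raise IOError("no ACK after chunk %d" % chunks_sent)
        chunks_sent += 1
    return chunks_sent


class FakePort(object):
    """Stand-in for serial.Serial so the sketch can run without an Arduino."""

    def __init__(self):
        self.received = bytearray()

    def write(self, data):
        self.received.extend(data)
        return len(data)

    def read(self, size=1):
        return b"A"  # pretend the Arduino acknowledges every chunk


port = FakePort()
print(send_with_ack(port, bytearray([255] * 1536)))
```

Every ACK costs a round trip, so a larger chunk_size (kept below the Arduino's serial receive buffer) should mean fewer waits per frame.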

Hey Rob,

Thanks for the ideas. Yeah it does just pump a full frame of data out, which is what I need to draw a whole screen of pixels at once (e.g. to display a series of frames).

When you say the Arduino is possibly being overwhelmed with data... I don't see any data missing on the frames, so it's not like bytes are being lost. I'd be interested to know how some kind of data overload would manifest itself as slow performance with no data getting lost in the process.

I haven't written in any ACK logic as yet, as I thought that by adding more traffic things would slow down even more.

I do already have another routine as you suggest that just sets certain pixels and this does run much faster. It's just these full frames I'm having difficulty with.

You could also implement a horizontal line of the same color to speed things up
e.g. <x,y, len, r,g,b> it is a sort of run length compression.

as this introduces new types of datapackets / tuples there would be a need for an id as first byte in the packet
<1,x,y,r,g,b> = setpixel
<2,x1,y1,x2,y2,r,g,b> = draw any line => look for bresenham
<3,x,y,radius,r,g, b> = circle
<4,x,y,x,y,r,g,b> = square
<5,r,g,b> = floodfill whole screen
etc
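As a rough illustration of the run-length idea, the sketch below turns one row of pixels into horizontal-line packets of the form <id,x,y,len,r,g,b>. The packet id 6 and the function name are hypothetical; the list above does not assign an id to the horizontal line.

```python
HLINE = 6  # hypothetical packet id, not one of the ids listed above


def encode_row(y, row):
    """Run-length encode one row of (r, g, b) tuples into hline packets."""
    packets = []
    x = 0
    while x < len(row):
        run = 1
        # extend the run while the next pixel has the same colour
        while x + run < len(row) and row[x + run] == row[x]:
            run += 1
        r, g, b = row[x]
        packets.append(bytearray([HLINE, x, y, run, r, g, b]))
        x += run
    return packets


row = [(15, 0, 0)] * 20 + [(0, 0, 15)] * 12
print(len(encode_row(3, row)))
```

A row of 20 red and 12 blue pixels becomes two 7-byte packets instead of 48 bytes of raw pixel data, though a row where every pixel differs would get larger rather than smaller.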

You didn't mention it so I want to make sure, did you modify /etc/inittab and /boot/cmdline.txt to prevent other processes using /dev/ttyAMA0?
edit: sorry, I see you're using a USB port. Never mind.

robtillaart's suggestions are good, but may I make a few more?

The Serial class only takes strings and buffers as input. Strings, like tuples, are immutable, so building arbitrary data packets would be done by concatenation rather than insertion. A bytearray might make more sense as you can write to indices and slices of a bytearray and Serial treats it like a buffer.

I added the concept of bytearrays to your code:

import serial
from time import sleep



#setup serial port
port = "/dev/ttyACM0"
ser = serial.Serial(port, 230400, timeout=1)
sleep(1) #wait for serial 

#fill screen with white then black

for l in range(100):
    #draw totally white frame (all bytes value 11111111)
    ser.write(b"\x01")   #send control byte: 1 = draw frame

    #send screen data - 2 pixels sent in 3 bytes.
    #1024 pixels in total = 512 loops of 3 bytes (1536 bytes total)
    white = bytearray([255, 255, 255])
    for c in range(512):
        ser.write(white)

    #draw totally black frame (all bytes value 00000000)
    ser.write(b"\x01")   #send control byte: 1 = draw frame

    #send screen data - 2 pixels sent over 3 bytes.
    #1024 pixels in total = 512 loops of 3 bytes (1536 bytes total)
    black = bytearray([0, 0, 0])
    for c in range(512):
        ser.write(black)

Then, to create a custom packet like robtillaart suggested:

h = 1 #packet header
x = 0x13 #x coordinate
y = 0x1f #y coordinate (0-31 on a 32x32 matrix)
r = 0xc #red value
g = 0x4 #green value
b = 0xe #blue value
packet = bytearray([h, x, y, r, g, b]) #build packet

packet[3:6] = [12, 10, 3] #write new r, g, b values
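Taking the bytearray idea one step further, a whole frame could be assembled in memory and handed to pyserial in a single write() call instead of 512 small ones. This is only a sketch: pack_frame is a made-up helper, the nibble order follows the Arduino receive code earlier in the thread, and whether it helps the frame rate has not been measured here.

```python
def pack_frame(pixels, control=1):
    """Pack (r, g, b) 4-bit pixels into a control byte plus 1.5 bytes per pixel."""
    if len(pixels) % 2:
        raise ValueError("need an even number of pixels")
    frame = bytearray([control])
    for i in range(0, len(pixels), 2):
        r1, g1, b1 = pixels[i]
        r2, g2, b2 = pixels[i + 1]
        frame.append((g1 << 4) | r1)  # byte 1: g1 in high nibble, r1 in low
        frame.append((r2 << 4) | b1)  # byte 2: r2 in high nibble, b1 in low
        frame.append((b2 << 4) | g2)  # byte 3: b2 in high nibble, g2 in low
    return frame


white = pack_frame([(15, 15, 15)] * 1024)
print(len(white))
# ser.write(white)  # one call per frame, with ser opened as in the code above
```

Colour values have to stay in the 0-15 range, otherwise the packed byte no longer fits and bytearray.append raises a ValueError.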

The serial data rate is not maintained when going through a USB converter. The USB transfers data in batches or frames.
Also try upping the thread priority in the python code to give it more time before Linux steals some.

Thanks for all the suggestions Rob and PapaG... implementing a packet structure seems like a nice way to go.

Mike, I'll try upping the priority and see how it goes.

At the moment running functions that plot LED by LED in a packet of (x,y,r,g,b) seems to overwhelm the Arduino's serial buffer more often than not. Sending an ACK per packet slows things down too much. E.g. when plotting 100 LEDs (less than 1/4 of the screen) with an ACK per packet it drops the frame rate to 3 or 4 frames per second. Not sure of the best way around this. I think I might try and set up something that sends a 'pause' command if the buffer gets near 128 bytes full.

I think I might try and set up something that sends a 'pause' command if the buffer gets near 128 bytes full.

That is known as Xon / Xoff handshaking.
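For reference, XON and XOFF are the ASCII control characters DC1 (0x11) and DC3 (0x13). pyserial can be asked to honour them with the xonxoff=True argument to serial.Serial; as far as I know the stock Arduino Serial class does not send them for you, so the sketch on the Arduino would have to write them itself when its buffer fills and drains. The helper below is a hypothetical illustration of the sender-side bookkeeping only.

```python
XON = 0x11   # DC1: receiver is ready for more data
XOFF = 0x13  # DC3: receiver asks the sender to pause


def update_paused(paused, incoming):
    """Return the new paused state after scanning bytes received from the Arduino."""
    for value in bytearray(incoming):
        if value == XOFF:
            paused = True
        elif value == XON:
            paused = False
    return paused


print(update_paused(False, b"\x13"))
```

The sender would call this on whatever bytes have come back from the Arduino and hold off writing while the result is True.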

Hello Arduino hivemind. I'm still struggling with getting decent speed sending serial data between my Raspberry Pi and Arduino.

To recap:

I'm driving an Arduino Mega from a Raspberry Pi. The mega runs code from Adafruit to drive an LED Matrix. I want to pump 'screens' of pixel data from the Pi over the serial link to the mega to display. Each 'screen' is 32x32 RGB pixels.

Pixels are encoded as 4 bits per colour, so 12 bits make up one pixel. I can therefore send 2 pixels P1 and P2 in 3 bytes as: R1G1, B1R2, G2B2.

The matrix is 32x32 = 1024 pixels. Each pixel is 1.5 bytes, so a whole frame takes 1536 bytes. At 115200 baud using 8N1 I should get something like 92160 bits/s of payload (80% of 115200), or 11520 bytes/s (about 11.5 KB/s).

With a rate of 11.5KB/s I reckon I should be able to send 7 or so frames per second (11520 / 1536 = 7.5).

So... I tried using Python on the Pi as the sending code but it was too slow (using things like arrays) so I've switched to C.

The test script below sends a control byte to choose "draw screen mode", then 1536 bytes of data to show a screen of red pixels, then repeats with a screen of green pixels, then blue. However I'm only seeing 1 or so FPS (with the delays in the sender code). If I remove the delays my output gets garbled - a mess of RGB pixels.

I first tried connecting the two boards over the USB serial link, but Grumpy_Mike mentioned the converter chips encapsulate my data in frames of their own so the data rate isn't maintained. So now I'm using the secondary UART on the Pi (/dev/ttyAMA0), and that is going direct into the Arduino Mega (Serial1). (I've commented this out in /etc/inittab.)

I've tried increasing the buffer on the arduino by changing the value in HardwareSerial.cpp but it makes no difference.

When I was connected via serial over USB I had the Arduino ACKing (by printing an "A") after receiving each byte and then waiting for the ACK in the sendData function before continuing. This worked but was also slow. 1 frame a sec or so.

I'm not sure what else to try, other than using something like SPI instead.

Any thoughts appreciated...

Here is the C send code that runs on the Pi:

#include <stdio.h>   // printf
#include <string.h>  // memset
#include <unistd.h>  // usleep, sleep
#include <sched.h>
#include "rs232.h"

#define COMPORT         0           // this is '/dev/ttyAMA0' (for raspberry pi UART pins) 
#define BAUDRATE        115200   // or whatever baudrate you want
#define RECEIVE_CHARS 4096      // or whatever size of buffer you want to receive
#define SEND_CHARS      1          // or whatever size of buffer you want to send



//Set high Priority - Need to be root to do this.
void setHighPri (void)
{
  struct sched_param sched ;

  memset (&sched, 0, sizeof(sched)) ;

  sched.sched_priority = 10 ;
  if (sched_setscheduler (0, SCHED_RR, &sched))
    printf ("Warning: Unable to set high priority\n") ;
}



void sendData(unsigned char send_byte){

   SendByte(COMPORT, send_byte); 
    usleep(250);      

}



void drawScreen(int r1, int g1, int b1){ //pass in rgb colours for all pixels - this is a test so dont need to set individual pixels

	sendData(3); //3 = enter drawScreen mode
		
	int x=0;
	int y=0;	
	int r2,g2,b2; //pixels
	int byt1,byt2,byt3; //3 bytes = 2 pixels

	 //2 pixels per 3 bytes. 1024 pixels = 1536 bytes to send 
	 while (y < 32){
	 
               //r1=0 //Pixel 1 to send - these would normally be set individually via an array but passed in for this test
               //g1=0
               //b1=0

	 	x++; //goto next pixel
	 	
	 	r2 = r1; //Pixel 2 - again would normally set via array but - just set to same as 1st pixel for this test
	 	g2 = g1;
	 	b2 = b1; 
	 
	 	x++; //goto next pixel
	 	
        if (x==32){
    		x=0;
            y=y+1;
        }
        
    	//send 3 bytes now we have our 2 pixels, pack nibbles into byte
        byt1 = 0xff & (((0xff & g1) << 4) | (0xff & r1)); //pack g in high nibble, r in low nibble
    	sendData(byt1);
        
        byt2 = 0xff & (((0xff & r2) << 4) | (0xff & b1));
        sendData(byt2);
        
        byt3 = 0xff & (((0xff & b2) << 4) | (0xff & g2));
        sendData(byt3);
        
	 }
}


int main (int argc, char **argv) {

	setHighPri () ; //run as high priority - doesnt seem to make any odds

	OpenComport(COMPORT, BAUDRATE);	
	sleep(2);

	while(1) {
		drawScreen(5,0,0); //draw screen of red
		usleep(1000);
		drawScreen(0,5,0); //draw screen of green
		usleep(1000);
		drawScreen(0,0,5); //draw screen of blue
		usleep(1000);
	}
	CloseComport(COMPORT);
}

Here is the arduino receive code:

#include "RGBmatrixPanel.h"

#define A   5 //A3
#define B   4 //A2
#define C   3 //A1
#define D   2 //A0
#define CLK 10  // was 8 MUST be on PORTB!
#define LAT 9 
#define OE  7 
RGBmatrixPanel matrix(A, B, C, D, CLK, LAT, OE, true);

void setup() {
  Serial1.begin(115200);
  matrix.begin();
  //Serial.println("matrix routine started");
}

void loop() {

  byte incomingByte = 0;

  if (Serial1.available() > 0) {  //if at least 1 byte is waiting, read it: the 1st byte is the control byte
    
    // read the incoming byte:
    incomingByte = Serial1.read();

    switch (incomingByte) {
    case 0:
      //Serial.println("clear screen");
      clearDisp();  
      break;
    case 1:
      //Serial.println("swap buffers");
      matrix.swapBuffers(false);
      break;
    case 3:
      //Serial.println("Drawing screen of 1024 pixels"); 
      drawScreen();
      break;
    case 49:
      Serial.println("light a test led"); 
      testLed();
      break;
      //default:
      //Serial.println("unknown cmd"); 
    }  
  }
}



void drawScreen() {

  //Receiving 32x32 pixels = 1024 pixels. Each pixel is 3x4 bit values, so each pixel = 1.5 bytes. Total bytes to receive is 1536

  byte x = 0; //screen x
  byte y = 0; //screen y
  byte r = 0; //red
  byte g = 0; //green
  byte b = 0; //blue
  byte data = 0; //incoming byte

  //unsigned long startTime= millis();

  while (y < 32 ) {  

    //wait for 1st byte of data
    while (Serial1.available() < 1) {
      //do nothing
    } 
    
    //ack
    //Serial.print("A");
    
    //read 1st byte and split into 2 lots of 4 bits, (red and green)
    data = byte (Serial1.read());
    r = data & B00001111;
    g = data >> 4;  
    
    //wait for 2nd byte
    while (Serial1.available() < 1) {
      //do nothing
    } 
    
    //ack
    //Serial.print("A");
    
    //read 2nd byte - b and r of next pixel
    data = byte (Serial1.read()); 
    b = data & B00001111;
    
    //draw pixel 1
    matrix.drawPixel(x, y, matrix.Color444(r,g,b));

    //inc x for next pixel along
    x++;
    
    //get red val from 2nd byte for 2nd pixel
    r = data >> 4;  

    //wait for 3rd byte
    while (Serial1.available() < 1) {
      //do nothing
    } 
    
    //ack
    //Serial.print("A");
    
    //read 3rd byte, g and b
    data = byte (Serial1.read()); 
    g = data & B00001111;
    b = data >> 4; 

    //draw pixel 2
    matrix.drawPixel(x, y, matrix.Color444(r,g,b));

    //inc for next pixel or reset if x at end of row
    if (x == 31){ 
      x=0; 
      y++;
    } 
    else {
      x++;
    }          
  }

  //print time it took
  //Serial.println(millis() - startTime);

  //when all data received, swap buffers to show screen
  matrix.swapBuffers(false);

}

void clearDisp() {
  matrix.fill(matrix.Color333(0, 0, 0));
}

void testLed() {
  matrix.drawPixel(0, 0, matrix.Color444(15,0,0)); 
  matrix.swapBuffers(false);
  delay(500);
  matrix.drawPixel(0, 0, matrix.Color444(0,0,0));
  matrix.swapBuffers(false); 
}

Do you see the same problems at a really slow baud rate like 9600?

If you send fast enough to get ahead of the arduino's read/draw loop by more than the size of the serial buffer on the arduino there will be lost characters, right?

You may have to transmit that big packet in smaller chunks to avoid the pig-in-the-anaconda effect.

-br
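A small sketch of the chunking idea in Python. The 48-byte chunk size is an assumption (something comfortably below the Arduino's receive buffer), and both function names are made up for illustration.

```python
def chunks(payload, size):
    """Yield successive slices of payload, each at most `size` bytes long."""
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


def seconds_on_wire(num_bytes, baud, bits_per_byte=10):
    """Time the UART needs to shift out num_bytes with 8N1 framing."""
    return num_bytes * bits_per_byte / float(baud)


frame = bytearray(1537)  # one control byte plus 1536 data bytes
parts = list(chunks(frame, 48))
print(len(parts), seconds_on_wire(48, 115200))
# for part in parts:
#     ser.write(part)                               # ser: an open serial.Serial
#     time.sleep(seconds_on_wire(len(part), 115200))  # needs "import time"
```

Pausing for at least the wire time of each chunk keeps the sender from queueing data faster than the link can carry it; whether the Arduino's draw loop then keeps up is a separate question.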

OK, I think you might be right. I realised I adjusted the serial buffer on the wrong Arduino version so it didn't take effect. I've pushed it all the way up to 512 bytes and I can now get away with removing all the pauses apart from a usleep(1) after sending every byte (in the sendData function).

Pushing the buffer to 1024, as mentioned in a few posts (e.g. "Random movements during printing"), seems to hang the Arduino somewhere, plus I'm sure it's not the most elegant solution.

It seems a little faster to the eye through the secondary UART rather than the USB. I'm getting maybe 2 or 3 FPS now. Still not fast enough for what I'd like though.

Pushing the buffer to 1024, as mentioned in a few posts (e.g. "Random movements during printing"), seems to hang the Arduino somewhere, plus I'm sure it's not the most elegant solution.

Since the buffer size affects both the incoming and outgoing buffers for all the instances of HardwareSerial, of which there are 4 on the Mega, setting the buffer size to 1/8 of the available SRAM will cause every bit of SRAM to be used just for the Serial functions. That doesn't seem like a good idea to me.
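The arithmetic behind that, as a quick sketch. It assumes, as described above, one receive and one transmit buffer per HardwareSerial instance and four instances on the Mega; the function name is just for illustration.

```python
MEGA_SRAM = 8 * 1024  # the Mega's microcontroller has 8 KB of SRAM


def serial_buffer_bytes(buffer_size, ports=4, buffers_per_port=2):
    """Total SRAM taken by the serial buffers (one RX and one TX per port)."""
    return buffer_size * ports * buffers_per_port


for size in (64, 128, 512, 1024):
    used = serial_buffer_bytes(size)
    print(size, used, round(100.0 * used / MEGA_SRAM, 1))
```

At 1024 bytes per buffer the total is 8192 bytes, which is all of the SRAM, and even the 512-byte setting takes half of it before the matrix library has allocated anything.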

Hi, sorry for my bad English.
I have a similar problem with a program in C#. I open the COM port at 250,000 bps on both the PC and the Arduino Leonardo.

But in practice the port always opens at 9600 bps; through the virtual COM port the fastest speed I get is 9,600 bps.

I think it may be possible through a DLL.

With an FT232 I had the same problem: the virtual COM port gave 9,600 bps at most, but using d2xx.dll I could get up to 3,000,000 bps.

I need to know how to do the same with the Arduino Leonardo (through a DLL).

If you know, please tell me.

Thanks