Arduino & HD44780 timings


This is my first post on the forum.
I wonder if it's possible to measure the time needed to display characters on a 2x16 LCD using parallel or serial communication (with a PCF-based converter).
For parallel communication, my first idea was to measure the time from the start of sending data until the Busy Flag reads zero. But when I started reading about this, many people said it is useless because changing the Arduino pins from output to input takes more time than the HD44780 needs to do its work. Is that true?

Have you got any other ideas to measure and compare this?

Install Bill Perry's hd44780.h library via the IDE Library Manager.

There are example diagnostic sketches that compare the speed of different LCD libraries.

LCDs are slow. If you change pixels too fast it looks horrible and smeary. Since a human can't read very quickly it is pretty pointless to worry too much about speed.


The hd44780 library includes sketches that will measure the byte write time to the display for all the various interfaces supported.
Install the library using the library manager.
Then you can select the appropriate i/o class for whatever h/w you have.
The sketch you will want to run is the LCDiSpeed sketch - you will need to select the appropriate one depending on hardware (i/o class) you are testing.
The library can also run timings for other libraries, such as the LiquidCrystal library.

The LCDiSpeed sketch measures the time it takes to write a character to the display. It includes all the overhead of the library.
This is important since, depending on how the library is written, the time to write a character to the display can vary substantially even on the same hardware.
The time reported is the time a sketch will see when writing characters to the display.

As for BUSY polling: yes, it is not worth it for most instructions.
I have spent MANY MANY hours (days/weeks) looking at the hd44780 timing and the various timings of hardware interfaces used to communicate with these types of displays, using a logic analyzer on many different LCD libraries.

For clear and home instructions, it might be faster, depending on the library, but likely not.
For all others, it will slow things down quite a bit given the way Arduino defined the API for the digital i/o routines (digitalWrite(), digitalRead(), pinMode()). This is particularly true on the AVR platforms, given the combination of the way the AVR chip does its i/o and the suboptimal coding of the digital i/o routines provided by the AVR core library.

For example, the hd44780 instruction to write a character to the screen takes no more than 37us to execute inside the chip.
Now let's look at the timing on a 16 MHz AVR (UNO-type board).
It can vary depending on the version of the compiler, but these numbers should be close.

Each digital i/o API call like digitalWrite(), pinMode(), digitalRead() takes 4.5-6us depending on which call.
So suppose you are wanting to read the BUSY status.
You have to flip all the data pins from output to input, change R/W to high, then strobe E appropriately, while you read the DB7 pin.

In 4 bit mode:
pinMode() 4 times, 1 for each data line to input mode
digitalWrite() - RS, RW high, E high, E low
digitalRead() - DB7
digitalWrite() E high, E low (for second/low nibble)
pinMode() 4 times, 1 for each data line to output mode
digitalWrite() R/W low

4 * 5 + 4 * 6 + 6 + 2 * 6 + 4 * 5 + 5 = 87us

Even if you look only at the time needed to read the DB7 pin (ignoring the clean-up needed after reading the pin),
you are at 4 * 5 + 4 * 6 + 6 = 50us
It is already longer than the instruction time so you would never even see a BUSY status for anything but clear and home.

So you can see it doesn't make sense to read BUSY, since it takes longer just to get to the point of being able to read the DB7 pin than the LCD instruction takes to execute.
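
The 4-bit tally above can be reproduced with a little arithmetic. The sketch below is only a model: the per-call costs are the rounded figures used in this post (5us for pinMode(), 6us for digitalWrite() and digitalRead(), 5us for the closing R/W write), and the constant and function names are made up for illustration.

```cpp
// Rounded per-call costs in microseconds on a 16 MHz AVR, taken from the
// figures in this post. They are approximations, not measured constants.
constexpr unsigned kPinModeUs      = 5;
constexpr unsigned kDigitalWriteUs = 6;
constexpr unsigned kDigitalReadUs  = 6;
constexpr unsigned kFinalRwLowUs   = 5;   // the post counts the closing R/W write as 5

constexpr unsigned kLcdInstructionUs = 37; // HD44780 execution time quoted above

// Time until DB7 can first be sampled in 4-bit mode.
constexpr unsigned busyReadUntilDb7Us() {
    return 4 * kPinModeUs        // four data pins -> input
         + 4 * kDigitalWriteUs   // RS, R/W, E high, E low
         + kDigitalReadUs;       // sample DB7
}

// Full 4-bit busy read, including the clean-up afterwards.
constexpr unsigned busyReadTotalUs() {
    return busyReadUntilDb7Us()
         + 2 * kDigitalWriteUs   // E high, E low for the second (low) nibble
         + 4 * kPinModeUs        // four data pins -> output
         + kFinalRwLowUs;        // R/W back low
}

// By the time DB7 is sampled, the instruction has already finished.
static_assert(busyReadUntilDb7Us() > kLcdInstructionUs,
              "busy read setup exceeds the LCD instruction time");
```

Changing the four constants to values measured on a particular board and compiler version shows quickly whether polling could ever pay off there.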

In 8 bit mode, counterintuitively, it takes even longer.
It changes to
pinMode() - 8 times, 1 for each data line input mode
digitalWrite() - RS, R/W, E high, E low
digitalRead() - DB7
pinMode() - 8 times, 1 for each data line output mode
digitalWrite() - R/W low

8 * 5 + 4 * 6 + 6 + 8 * 5 + 5 = 115us

Even in 8 bit mode, it still takes longer to read BUSY than the instruction time.

Can reading BUSY ever be faster? Yes, but not when using Arduino and its APIs.
Yes you could write some faster code if the code was hard coded to certain ports/pins so that you could use the AVR specific bit set and bit clear instructions.
But that would no longer be using the Arduino API i/o functions and more importantly would no longer allow the user to configure the pins being used.
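
To illustrate the point about hard-coded ports: with the pins fixed at compile time, turning the data bus around becomes a mask operation on one register value instead of several pinMode() calls. The sketch below models this as pure functions on 8-bit register values so it can be checked off-target. The wiring (DB4..DB7 on bits 4..7 of a single port) and all names are hypothetical; on a real AVR the results would be written to that port's DDRx register and the busy bit read from its PINx register.

```cpp
#include <cstdint>

// Hypothetical wiring: LCD DB4..DB7 on bits 4..7 of one 8-bit port.
constexpr std::uint8_t kDataMask = 0xF0;
constexpr std::uint8_t kDb7Mask  = 0x80;

// New direction-register value with the four data lines as inputs
// (a 0 bit means input on AVR). The other bits are left untouched.
constexpr std::uint8_t dataBusToInput(std::uint8_t ddr) {
    return static_cast<std::uint8_t>(ddr & ~kDataMask);
}

// New direction-register value with the four data lines as outputs.
constexpr std::uint8_t dataBusToOutput(std::uint8_t ddr) {
    return static_cast<std::uint8_t>(ddr | kDataMask);
}

// The HD44780 reports BUSY on DB7 when the status is read.
constexpr bool busyFlagSet(std::uint8_t pinValue) {
    return (pinValue & kDb7Mask) != 0;
}
```

With this assumed wiring on port D, the AVR equivalent would be along the lines of `DDRD = dataBusToInput(DDRD);`. The price, as noted above, is that the user can no longer choose the pins.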

---- bill

Thank you for the long and interesting answer!

I will try to use the LCDiSpeed sketch. Currently I'm trying to understand how it works.
Should I use the example from hd44780 -> ioClass -> hd44780_pinIO -> hd44780examples -> LCDiSpeed for a standard 2x16 LCD with a 4-bit connection?

If you have a PCF8574-based backpack, use the hd44780_I2Cexp i/o class.

The code uses micros() to do the actual timing.
It gets the current time using micros(), sends a bunch of characters, then uses micros() to get the current time again when it finishes.
It subtracts the first reading from the last one to get the elapsed time.
It then divides the elapsed time by the number of byte transfers to get the byte transfer time.
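
The arithmetic behind that measurement can be sketched as follows. This is not the LCDiSpeed source, just a minimal model of the steps described above, and the function names are invented. It treats the micros() reading as a 32-bit unsigned count (as it is on AVR boards), so the subtraction still gives the right elapsed time if the counter wraps between the two readings.

```cpp
#include <cstdint>

// Elapsed microseconds between two micros() readings. Unsigned 32-bit
// subtraction wraps modulo 2^32, so a counter rollover between the two
// readings still yields the correct difference.
constexpr std::uint32_t elapsedUs(std::uint32_t startUs, std::uint32_t endUs) {
    return endUs - startUs;
}

// Average time per byte transfer, given the number of bytes sent between
// the two readings (characters and instructions each count as one byte).
constexpr double perByteUs(std::uint32_t startUs, std::uint32_t endUs,
                           std::uint32_t byteCount) {
    return static_cast<double>(elapsedUs(startUs, endUs)) / byteCount;
}
```

In a sketch, startUs and endUs would be the two micros() readings taken immediately before and after the block of LCD writes.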

--- bill

While waiting for your answer, I wrote something like this to be sure:

void loop()
{
  TimeFromStart = micros();

  lcd.setCursor(0, 0);

  lcd.setCursor(0, 1);

  TimeDiffrence = TimeFromStart - TimeSaved;
  TimeSaved = TimeFromStart;

  Serial.println(TimeDiffrence - 2000000);
}



What do you think about this?
I wonder if this works similarly to your example?

My results:

Results from your sketch:
ByteXfer: 93us
16x2 FPS: 314.86
Ftime: 3.18ms

And I2C results:

ByteXfer: 555us
16x2 FPS: 53.01
Ftime: 18.86ms

Tomorrow I will try to draw some conclusions from the results; today it's too late.

Well it prints numbers.
But it is not printing the elapsed time for the LCD operations.


--- bill

Please read the Forum Guide.

Please do not put "sketch code" into a PDF.

It is always wise to copy-paste text into your message, preferably into a "code-window". If the code is very big, you can attach it as an INO or TXT file, or post a link to a GitHub repository.
If you have several related files, you can attach as a ZIP file.
The Forum does not permit some extensions like .BMP. They can be placed in a ZIP file.

Regarding HD44780 timing. Every library can print faster than you can read the LCD.
Bill Perry's diagnostic will show the actual times.
Study his code to see how he does it.


Well, I thought it was wrong.

But I'm trying to analyze Bill's code, and it also uses micros() to measure the time to fill the LCD screen and then does some maths to get these three values, right?

Edit: I also have another question. Where can I find information about
"Each digital i/o API call like digitalWrite(), pinMode(), digitalRead() takes 4.5-6us depending on which call."?

It isn't that complicated.
Like I said:
call micros() to get the current/start time
do some stuff
call micros() again to get the current/end time
Subtract the start time from the end time and that is the elapsed time for the operations between the two micros() calls.
For byte transfer times, divide that elapsed time by the number of transfers you did to get the time for a single transfer.
Each character written is a single byte, and setCursor() is also a single byte.
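
With that counting rule, the frame figures follow from the byte transfer time. The sketch below is an illustration, not the LCDiSpeed code: it assumes a full refresh is one setCursor() plus one write per column for every row, which is 34 transfers on a 16x2 display, and the function names are made up. At the 93us reported earlier in the thread this gives about 3.16ms per frame, in line with the reported 3.18ms given that the byte time is rounded.

```cpp
#include <cstdint>

// Byte transfers for one full-screen refresh, assuming one setCursor()
// (1 byte) plus `cols` character writes (1 byte each) for every row.
constexpr std::uint32_t bytesPerFrame(std::uint32_t cols, std::uint32_t rows) {
    return rows * (cols + 1);
}

// Time for one full refresh in microseconds.
constexpr std::uint32_t frameTimeUs(std::uint32_t byteXferUs,
                                    std::uint32_t cols, std::uint32_t rows) {
    return byteXferUs * bytesPerFrame(cols, rows);
}

// Full refreshes per second.
constexpr double framesPerSecond(std::uint32_t byteXferUs,
                                 std::uint32_t cols, std::uint32_t rows) {
    return 1e6 / frameTimeUs(byteXferUs, cols, rows);
}
```

The same functions applied to the 555us I2C figure give a frame time close to the 18.86ms reported for that interface.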

To get the exact timings you can do it with h/w or s/w.

  • you can use something like a logic analyzer.
  • You can use the same method to time those digital i/o operations.
    Put a bunch of them in a row (the same digitalWrite(), or pinMode(), or digitalRead()) and use micros() to time them, then divide by the number of API calls you did to get the time for a single one.
    I would do like 20-50 of them (not in a loop) to get an accurate average time.
    Just make sure that all the API calls are the same if you want to get the time for that API call.

--- bill

Adafruit has a library for their MCP23017 i2c backpack that takes like 3-4ms to transfer each character.
You can definitely see the characters showing up on the display.

--- bill

You have to check micros() after your last LCD print and then calculate the difference:

TimeFromStart = micros();
lcd.setCursor(0, 0);
lcd.setCursor(0, 1);
timeAfterEnd = micros();
timeTotal = timeAfterEnd - TimeFromStart ;


Thank you for fixing the bugs in my attempt.

So when the "stuff" is
lcd.setCursor(0, 0);
followed by printing 16 characters, there are 17 bytes, right?

  1. About the timings of switching ports: I will do some tests like you said, but is it possible to find information about it in any Arduino documentation to compare the results?

  2. "For example, the hd44780 instruction to write a character to the screen takes no more than 37us to execute inside the chip."
    I can also find this time in the documentation, but I have a question. If there are 32 characters, is the time to execute in the chip 37 * 32 = 1184us? Anyway, it is impossible to read BUSY because we can't write and read at the same time.

Read the HD44780 datasheet carefully. The 37us is a typical time that is dependent on the HD44780 RC clock running at a typical 270kHz speed.

You either check BUSY or you allow sufficient time for a "slower than usual" RC clock. Personally I allow 50us e.g. for fOSC=200kHz.

As Bill Perry has explained, Arduino libraries need to be portable.
You are never going to achieve 37us or even 50us with digitalWrite().

I can assure you that you can drive HD44780 at maximum speed with careful AVR coding. But does it matter ?

Oh, if you want to time things to the microsecond level buy a Logic Analyser. Or use the MCU Timers and a little care.


So when the "stuff" is
lcd.setCursor(0, 0);
followed by printing 16 characters, there are 17 bytes, right?

Yes, the setCursor() call sends a Set DDRAM address instruction, which is a single byte, and then each printed character counts as a single byte.

The information you got from me is about as good as you will find.
From Arduino, nothing exists.
The development team has never documented this.
In fact, the early team has throughout the years appeared to be fairly clueless about certain aspects of h/w and s/w.
This includes things like atomicity, API semantics, and how to write code that can execute faster and still maintain/preserve an existing API.

In terms of actual timing, of the digital i/o API functions, it can vary substantially depending on the arduino board used, the speed of the processor, and the platform core.
Even the version of gcc can have a significant effect on the timing overhead.
e.g. setting a pin with digitalWrite() can vary from around 6us on an UNO down to a few 10s of ns on something like an ESP part.
And even on an UNO it can vary by 1us or so depending on the version of gcc.
Another example is that the Teensy products from Paul Stoffregen use the 3rd party core that he provides. His AVR core is 40x faster for digitalWrite() than the one from Arduino. (He offered it to them multiple times, but they refused to accept it.)

But keep in mind that there is more overhead than just the digital i/o routines and the LCD execution time.
There is also the overhead of getting the byte to the LCD which can be significant depending on the i/o interface.
Even on a parallel interface you have the overhead of setting up the control lines and strobing E.

And when using an i2c backpack, you must send two nibbles, and each nibble has to transfer multiple bytes across the i2c bus, which is clocked at 100 kHz.
And then there are multiple transfers per nibble.
So it takes 100s of us to send a single byte to the LCD.
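
A rough lower bound for the bus time can be worked out from the clock rate alone. On I2C every byte on the wire costs 9 clock periods (8 data bits plus the acknowledge bit), which is 90us at 100 kHz. How many bus bytes a library spends per nibble depends on the library; the figure of 3 used in the note below (address byte plus two port writes to pulse E) is purely an assumption for illustration, as are the function names. START/STOP conditions and software overhead are ignored.

```cpp
#include <cstdint>

// Bus time in microseconds for `busBytes` bytes at `clockHz`.
// Each byte takes 9 clock periods: 8 data bits plus the ACK bit.
constexpr std::uint32_t i2cBusTimeUs(std::uint32_t busBytes, std::uint32_t clockHz) {
    return static_cast<std::uint32_t>(busBytes * 9ULL * 1000000ULL / clockHz);
}

// Bus time to deliver one LCD byte as two nibbles, where each nibble
// costs `busBytesPerNibble` bytes on the wire (library dependent).
constexpr std::uint32_t lcdByteBusTimeUs(std::uint32_t busBytesPerNibble,
                                         std::uint32_t clockHz) {
    return i2cBusTimeUs(2 * busBytesPerNibble, clockHz);
}
```

With the assumed 3 bus bytes per nibble at 100 kHz, lcdByteBusTimeUs(3, 100000) comes to 540us, the same order as the 555us measured earlier in the thread.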

And then there is overhead of the Print class which does the i/o output (printing)
i.e. lcd.print() etc...

The transfer times from LCDiSpeed include all of that as it times the byte transfers going through the Print class using the print() method.

Just curious. Where are you going with this?
I get the fascination if you are interested in writing some of your own code, but if you are just looking for LCD byte transfer timings, you can already get that today from the hd44780 library package.

--- bill

Actually it is possible.
On some platforms it is close to 20 us using digitalWrite() to transfer a byte - in 4 bit mode.
But that is not on an AVR based board which is using the bundled AVR core.

Boards using the chipKIT and the ESPxx platforms are the fastest, with the Teensy platform right up there as well. (I haven't tested Teensy in a while on the newer ARM-based boards.)

--- bill


Well, I'm doing this for my studies, which is why I'm asking for documentation in PDF.
I'm just looking for transfer timings. Thank you for the long, precise answers. I think I know everything I wanted to, and I have also deepened my knowledge of AVR!

Keep in mind that the AVR is quite fast. It can set a pin state in just a couple of clock cycles.
The big overhead of the Arduino digital i/o API functions comes from the combination of the Arduino API semantics and a code implementation that precludes the use of the limited AVR bit set/bit clear capabilities. Together they push a call out to as much as 6us.

That said, other platforms that use much faster processors can be as fast as 40ns using the same API functions.

--- bill

Yes, I see that Arduino is slow but also more user-friendly than bare AVR.

I still don't know why your code shows 93us for parallel while @noiasca's code, after dividing, gives about 288us. Is it because of the library used to drive the display?

No clue what your code does since we can't see it, but like I said, the overhead to write to the display can vary substantially depending on the version of the IDE & compiler, the library used, and the i/o interface to the display,
e.g. pin control, I2C, SPI, etc.

The hd44780 library handles the i/o to the display differently than any other library and because of this, it is often quite a bit faster than other libraries even on the identical hardware.

BTW, the hd44780 library comes with sketches to test the byte transfer timing for some other libraries including the bundled LiquidCrystal library.
Using the LCDiSpeed test sketch you can directly compare the timings for say hd44780_pinIO class to the bundled LiquidCrystal library on the identical h/w to see the difference between the two different libraries.

My recommendation is that if you want a feature rich library that is actively maintained, has capabilities not in other libraries, and is faster than most other libraries, use the hd44780 library.

--- bill