Controlling LCD via MCP23017/I2C

I know there's been a dozen posts on this, I've scoured them all and could find nothing to fix the problem; but I apologize in advance if this seems redundant.

I have a 20x4 parallel lcd (from adafruit) connected to a MCP23017. Port A is controlling an 8 switch relay, and works perfect (I can control each relay independently, works completely as expected). Port B is supposed to control the lcd, but I'm having problems. I haven't used a pre-built 'backpack' or library mostly because I want to learn.

I've wired it as follows:
GPB0 -> RS
GPB1 -> R/W
GPB2 -> E
GPB4-GPB7 -> D4-D7

Backlight turns on, potentiometer works, contrast is adjusted as expected. Yet try as I might I can't seem to send a command to the lcd or display anything. Currently what I have is:

void WriteMCP(uint8_t addr, uint8_t reg, uint8_t value) {
  Wire.beginTransmission(addr);
  Wire.write(reg);
  Wire.write(value);
  Wire.endTransmission();
}

void LCDPulse(uint8_t cmd) {
  delay(5);
  WriteMCP(0x20, GPIOB, cmd);
  delay(5);
  WriteMCP(0x20, GPIOB, cmd | 0x04);
  delay(5);
  WriteMCP(0x20, GPIOB, cmd);
  delay(5);
}

void LCDInit() {

  WriteMCP(0x20, IODIRB, 0x00);            // set ports to output
  WriteMCP(0x20, GPIOB, 0x00);             // zero outputs

  // power on
  delay(20);
  WriteMCP(0x20, GPIOB, 0x30);
  delay(20);
  WriteMCP(0x20, GPIOB, 0x30);
  delay(20);
  WriteMCP(0x20, GPIOB, 0x30);

  WriteMCP(0x20, GPIOB, 0x20);    // 4 bit mode

  // function set
  LCDPulse(0x20);
  LCDPulse(0x00);
  delay(2000);

  // display on
  LCDPulse(0x00);
  LCDPulse(0xF0);
  delay(2000);
  
  // display clear
  LCDPulse(0x00);
  LCDPulse(0x10);
  delay(2000);

  // write text
  LCDPulse(0xF1);
  LCDPulse(0xF0);
}

I've read through the HD44780U datasheet multiple times, I've looked at numerous websites with examples, I've even downloaded other lcd/i2c libraries and looked at the source code to see where I went wrong/what I'm missing but to no avail.

I've checked each connection between the mcp and lcd repeatedly with a multimeter to ensure each pin/solder connection is correct. The relay connected to Port A works fine.

The only thing I saw differently between my attempts and others was when writing the MCP, the register was never set. For example in LiquidCrystal_I2C they use:

#define printIIC(args)	Wire.write(args)

// ...

void LiquidCrystal_I2C::expanderWrite(uint8_t _data){
	Wire.beginTransmission(_Addr);
	printIIC((int)(_data) | _backlightval);
	Wire.endTransmission();   
}

I'm not sure how this works at all, from what I've seen in examples for the MCP23017, and what I understand of the MCP23017's datasheet, each beginTransmission() must specify the register before writing data to it. Or maybe not? I'm not too sure at this point.

Any suggestions/ideas would be helpful.

I have not read through your code and don't have time to do it right now. Are you sure that your basic LCD programming techniques are correct? In other words do your techniques work when your LCD is connected directly to the Arduino without using the I2C interface?

If not then you have to get that part done first. You can find some examples here.

I haven't used an MCP23017 in several years but I do have some code around here that may be helpful to you. As I recall it is written for an 8-bit interface so it uses both ports of the MCP23017, but the real catch is that it is written in assembly language. The code is heavily commented so if you have any programming experience at all it may be of help.

Let me know if you think this code will be useful and I will dig it up. I may not be around for a day or so as I am scheduled for some minor surgery later this morning.

Don

Kaisha:
I have a 20x4 parallel lcd (from adafruit) connected to a MCP23017. Port A is controlling an 8 switch relay, and works perfect (I can control each relay independently, works completely as expected). Port B is supposed to control the lcd, but I'm having problems. I haven't used a pre-built 'backpack' or library mostly because I want to learn.

Most of the "i2c LCD backpack" libraries out there are for a backpack that uses a PCF8574 not a MCP23008 or MCP23017.
PCF8574 chips work differently than the microchip expanders.
Adafruit does have a library for their backpack (#292 board) which uses a MCP23008 and then they have a LCD keypad that uses a MCP23017.
Those libraries should offer some insight. I will say that while they do work, they are DOG slow because of the way they control the pins.
My hd44780 library supports both PCF8574 and MCP23008.
The MCP23008 and the MCP23017 work basically the same other than the MCP23017 has a duplicate set of registers for the other 8 bit port.

The only thing I saw differently between my attempts and others was when writing the MCP, the register was never set. For example in LiquidCrystal_I2C they use:

#define printIIC(args) Wire.write(args)

// ...

void LiquidCrystal_I2C::expanderWrite(uint8_t _data){
Wire.beginTransmission(_Addr);
printIIC((int)(_data) | _backlightval);
Wire.endTransmission();  
}




I'm not sure how this works at all, from what I've seen in examples for the MCP23017, and what I understand of the MCP23017's datasheet, each beginTransmission() must specify the register before writing data to it. Or maybe not? I'm not too sure at this point.

You are correct in your understanding of how to address the MCP23017 registers. The code you are looking at is for a PCF8574, which neither uses nor requires specifying the target register as there is only a single register for the output port. It has no configuration/control registers.

From glancing at your code, one issue that you definitely have is that in order to use a single 8 bit port to control the LCD you must put the LCD into 4 bit mode, but your layering of how to get the 8 bit data to the LCD is incomplete.
You are trying to send a full byte to the LCD by calling LCDPulse().
That won't work, and it isn't the way the code in all the other libraries works.
They have separate layers (functions) for sending the data to the LCD vs sending a byte to the expander port.
The two operations are not the same. When sending a byte to the LCD you must break the byte into nibbles, and then for each nibble create the bytes that the expander needs in order to present that nibble to the LCD. (Sending a single nibble to the LCD takes more than one byte written to the i/o expander output port.)
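
As a rough sketch of that layering, assuming the wiring from the first post (RS on GPB0, R/W on GPB1, E on GPB2, D4-D7 on GPB4-GPB7): the helper below turns one LCD byte into the six port values that would be written to the expander. The names lcdByteToPortWrites, strobeNibble, PIN_RS, and PIN_E are made up for illustration, and the actual I2C write is left out so the logic can be checked on its own.

```cpp
#include <array>
#include <cstdint>

// Bit positions on the expander port, taken from the wiring in the first post.
constexpr uint8_t PIN_RS = 0x01;  // GPB0 -> RS
constexpr uint8_t PIN_E  = 0x04;  // GPB2 -> E

// Expand one nibble (already placed in bits 4..7) into three port writes:
// data with E low, data with E high, data with E low again.
// The LCD latches the nibble on the falling edge of E.
static void strobeNibble(uint8_t dataBits, uint8_t ctrlBits, uint8_t* out) {
  out[0] = dataBits | ctrlBits;
  out[1] = dataBits | ctrlBits | PIN_E;
  out[2] = dataBits | ctrlBits;
}

// Hypothetical helper: split an LCD byte into high and low nibbles and
// return the six port values to send. RS is held at the same level for
// both nibbles (true = data, false = instruction); R/W stays low (write).
std::array<uint8_t, 6> lcdByteToPortWrites(uint8_t value, bool isData) {
  const uint8_t ctrl = isData ? PIN_RS : 0x00;
  std::array<uint8_t, 6> out{};
  strobeNibble(value & 0xF0, ctrl, &out[0]);         // high nibble first
  strobeNibble((value << 4) & 0xF0, ctrl, &out[3]);  // then low nibble
  return out;
}
```

For the "display on" instruction 0x0F this gives 0x00, 0x04, 0x00, 0xF0, 0xF4, 0xF0, each of which would be written to GPIOB in turn.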

In the case of the Microchip i/o expanders, you always have to send an extra byte first to specify the starting target register. That is why a MCP23008/MCP23017 will always be slower than a PCF8574 - It simply has more i2c overhead.
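
To make the difference concrete, here is a small sketch of the bytes that follow the I2C device address in each case. mcpWritePayload and pcfWritePayload are hypothetical names; 0x13 is GPIOB on the MCP23017 in its power-up (bank 0) register layout.

```cpp
#include <cstdint>
#include <vector>

// MCP23017 GPIOB register address in the power-up (IOCON.BANK = 0) layout.
constexpr uint8_t MCP23017_GPIOB = 0x13;

// MCP230xx: the first data byte selects the register, the second is the value.
std::vector<uint8_t> mcpWritePayload(uint8_t reg, uint8_t value) {
  return {reg, value};
}

// PCF8574: there is only the port itself, so the value is the whole payload.
std::vector<uint8_t> pcfWritePayload(uint8_t value) {
  return {value};
}
```

Every port change on the MCP part therefore costs one more byte on the bus than the same change on a PCF8574.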

For examples of working code, you could look at the two adafruit libraries, or my hd44780 library.
That said, my hd44780 library is pretty complicated as it is a multi layer/class library that supports multiple communications interfaces not just i2c and for i2c expanders it supports both PCF8574 and MCP23008.

I will also say that creating a robust fully functional LCD library is a non trivial task that requires quite a bit of knowledge of both the HD44780 interface as well as the communication interface (data over i2c using a MCP23017 in your case)

Having used the MCP23008 and the MCP23017, I don't care for them. And I really dislike the register layout of the MCP23017. They totally f'd it up IMO. There are multiple modes (bank 1 vs bank 0 mode) and while you can set it to make PORTA look/work like a MCP23008, which I find useful, I found no way to reliably detect which bank mode the chip is in.
If you want or need to use bank 1 mode, a reliable initialization is critical,
as the device powers up in bank 0 mode and you need to reliably get the part into bank 1 mode.

Anyway, just my little personal rant on that chip.

--- bill

I really appreciate the responses.

floresta: if you have a link to your assembly I'd love to see it.

bperrybap:

You are correct in your understanding of how to address the MCP23017 registers. The code you are looking at is for a PCF8574, which neither uses nor requires specifying the target register as there is only a single register for the output port. It has no configuration/control registers.

Ok that explains a lot.

From glancing at your code, one issue that you definitely have is that in order to use a single 8 bit port to control the LCD you must put the LCD into 4 bit mode, but your layering of how to get the 8 bit data to the LCD is incomplete.
You are trying to send a full byte to the LCD by calling LCDPulse()

LCDPulse() sends a byte to the mcp, but only 4 bits to the lcd, which is why it's called in pairs. For example the 'display on' function is the byte 0x0F (display on, cursor on, blink on); the high nibble is 0x0 and the low 0xF. So I send 0x00 and 0xF0 to the mcp, and the bottom nibble (pins 0, 1, 2) controls the rs, rw, and e pins respectively. I actually had a larger library I wrote that did things properly (i.e. didn't hard code everything, allowed pin reassignments, etc...) but it wasn't working, so I stripped out the bare minimum of it and hardcoded the values for my particular setup, just to see if I could get anything to show on the screen.

For examples of working code, you could look at the two adafruit libraries, or my hd44780 library.
That said, my hd44780 library is pretty complicated as it is a multi layer/class library that supports multiple communications interfaces not just i2c and for i2c expanders it supports both PCF8574 and MCP23008.

If you have a link to the lcd library that'd be wonderful.

Having used the MCP23008 and the MCP23017, I don't care for them. And I really dislike the register layout of the MCP23017. They totally f'd it up IMO. There are multiple modes (bank 1 vs bank 0 mode) and while you can set it to make PORTA look/work like a MCP23008, which I find useful, I found no way to reliably detect which bank mode the chip is in. If you want or need to use bank 1 mode, a reliable initialization is critical, as the device powers up in bank 0 mode and you need to reliably get the part into bank 1 mode.

Other than the fact that IOCON should be the same register address in both banks (allowing someone to trivially change banks on the fly) I found the bank-thing to be rather straightforward. Either way, I can set the values on the mcp pins from the MCU without problem on both ports. Checked all manually with a multimeter.

Anyway, just my little personal rant on that chip.

If you were to use a chip for GPIO expander purposes in the future, what would you use?

Kaisha:
LCDPulse() sends a byte to the mcp, but only 4 bits to the lcd, which is why it's called in pairs. For example the 'display on' function is the byte 0x0F (display on, cursor on, blink on); the high nibble is 0x0 and the low 0xF. So I send 0x00 and 0xF0 to the mcp, and the bottom nibble (pins 0, 1, 2) controls the rs, rw, and e pins respectively. I actually had a larger library I wrote that did things properly (i.e. didn't hard code everything, allowed pin reassignments, etc...) but it wasn't working, so I stripped out the bare minimum of it and hardcoded the values for my particular setup, just to see if I could get anything to show on the screen.

But this looks very strange to me.

// write text
  LCDPulse(0xF1);
  LCDPulse(0xF0);

It looks like you set 0xf on the upper data lines with the r/s pin high
Then you set 0xf on the upper data lines with the r/s pin to low
I don't know how the internal hd44780 module latches the RS signal (on first vs 2nd nibble) to know if that will perform a write data instruction or a set DDRAM address instruction.

I'd still recommend sticking with/using a more layered approach as it avoids these types of issues/errors.

If you have a link to the lcd library that'd be wonderful.

You can find the adafruit libraries on their site.
My library is here: GitHub - duinoWitchery/hd44780: Extensible hd44780 LCD library

Other than the fact that IOCON should be the same register address in both banks (allowing someone to trivially change banks on the fly) I found the bank-thing to be rather straightforward.

It isn't that it is difficult. It is that the way they did it, makes it impossible to reliably identify the part and what mode it is in. i.e. trying to identify the part between a MCP23008 vs MCP23017 especially when the MCP23017 part is in byte mode.
I use byte mode, as it can significantly speed things up as you don't have to let go of the i2c bus and re acquire it between each byte sent to the output port. This is a significant performance gain.
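
A sketch of what that can look like for one LCD strobe, under assumptions that should be checked against the MCP23017 datasheet: the chip has been configured with IOCON.BANK = 1 and IOCON.SEQOP = 1 so the address pointer stays on one register, and GPIOB sits at 0x19 in that layout. strobePayload is a hypothetical helper for illustration, not code taken from hd44780.

```cpp
#include <cstdint>
#include <vector>

// Assumed GPIOB address when IOCON.BANK = 1 (check the datasheet register map).
constexpr uint8_t GPIOB_BANK1 = 0x19;
constexpr uint8_t PIN_E = 0x04;  // GPB2 -> E, per the wiring in the first post

// Build one I2C payload that sets up the port value, raises E, and lowers E.
// With the address pointer held on GPIOB, all three data bytes land on the
// same port inside a single bus acquisition.
std::vector<uint8_t> strobePayload(uint8_t portValue) {
  const uint8_t low = portValue & static_cast<uint8_t>(~PIN_E);
  return {GPIOB_BANK1, low, static_cast<uint8_t>(low | PIN_E), low};
}
```

With the Wire library this payload would go out inside one beginTransmission()/endTransmission() pair instead of three.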

Keep in mind I'm looking at all this from the perspective of being backwards compatible with the MCP23008 and being able to automatically auto-identify the part types.
In these two areas, the MCP23017 kind of falls on its face.
Don't get me wrong the part works and is a decent chip but they could have at least made it powerup in a way that would work with existing MCP23008 s/w and allow s/w to probe to be able to reliably detect which mode it is in.

If you were to use a chip for GPIO expander purposes in the future, what would you use?

Depends on the application and its requirements.

The MCP parts can be clocked MUCH faster and can run SPI mode which is even faster, so even with the extra overhead it can still be as fast or faster than the PCF parts.
But if all that is needed is a simple i2c interface for a LCD, I'd use a PCF8574 backpack. They are so cheap and s/w is readily available. They are about $1 USD shipped to your door.
You can even use the lcd backpacks for other applications than a LCD backpack.
i.e. one for the LCD and another one for other I/O.

The MCP parts have much more drive capability than the PCF parts. The PCF parts can't drive anything, they can only sink current. For the HIGH signals they use internal weak pullups.
The MCP parts can drive and sink current.

I'm looking at providing MCP23017 support for my openGLCD library.
The KS0108 parts need about 13 pins so dual ports are needed.
But for speed it will be in byte sequential mode so I can slam out the bytes to both ports in back to back bytes on the i2c bus all in the same bus acquisition. (speed really matters on a glcd since there is so much data moving to the display)

What to use really just depends on the application.

Got it working :)

Thank-you bperrybap very much. The error you spotted was a real error, but fixing it didn't change anything (the lcd wasn't even getting that far). That said, the library you linked/wrote had the answer.

Your explanation of the lcd initialization made it clear. The docs didn't mention that the 0x30 had to be strobed in (pulsed, I'm not sure what the official term is, but commands have to be sent by pulling enable high, then low). I wasn't doing that, nor were any of the other libraries (about 5 different ones) that I looked through. Of course they were all using 8-bit interfaces, so doing it wrong probably had no effect. But once I saw what you did, and sent the three 0x30's and the following 0x20 using a 'pulse/strobe/whatever', it worked perfectly.

So I REALLY REALLY appreciate you spending the time to help a total stranger.

You are right about the MCP/LCD combo being slow. I could send all the bytes in a single i2c transmission, but doesn't there have to be pauses between the commands being sent? Doesn't that require a begin/end transmission? Since nothing is passed until end, a send() delay() send() wouldn't actually work? Do you use a i2c restart command (endTransmission(false); I think it is), or is the i2c/mcp just so slow that pushing the commands into the lcd at max speed is still far slower than it needs?

Also on a tangent, I can't seem to find anywhere where it states what C++ features are supported. From what I understand Arduino uses gcc under the hood, but clearly there are some limitations for using a MCU. Does it support C++11? 14? 17?? Does it support templates, rvalues, lambdas, multiple inheritance? Is there a page somewhere that lists/explains what it does and does not support?

Kaisha:
Got it working :)

Thank-you bperrybap very much. The error you spotted was a real error, but fixing it didn't change anything (the lcd wasn't even getting that far). That said, the library you linked/wrote had the answer.

Your explanation of the lcd initialization made it clear. The docs didn't mention that the 0x30 had to be strobed in (pulsed, I'm not sure what the official term is, but commands have to be sent by pulling enable high, then low).

The hd44780 spec is actually very clear on how to send instructions to the LCD or how to read data or status from it.
There are timing diagrams that show in detail how the electrical interface signals are supposed to behave and the required timing.

In terms of the "0x30" being sent, during initialization, a 0x30 really isn't being sent.
A 0x3 is put on the upper 4 bits of the 8 bit interface and clocked in.
Doing this 3 times (with appropriate timing) will ensure that the display is put back into 8 bit mode regardless of which mode it was in when the sequence started and regardless of whether the lower 4 bits of the interface can be driven by the host.
This is how you can reliably get the LCD back to 8 bit mode and then subsequently get it into 4 bit mode when the host can only control 4 of the 8 data pins. (A full description of this is in the hd44780 library code)
i.e. if you have a 4 bit interface you only clock in the upper nibble of the 0x30 not both nibbles during this part of the initialization.
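
A minimal sketch of that sequence as data, assuming the wiring from the first post (D4-D7 on GPB4-GPB7). InitStep, kFourBitInit, and portValueFor are illustrative names; the waits are the HD44780U datasheet minimums (more than 4.1 ms after the first nibble, more than 100 us after the second) rounded up. The power-on wait before the first step (the datasheet gives 15 ms or 40 ms depending on supply voltage) is not included.

```cpp
#include <array>
#include <cstdint>

// One step of the reset sequence: a nibble for D7..D4 and the wait after it.
struct InitStep {
  uint8_t nibble;        // value for D7..D4 (upper nibble only)
  uint32_t waitAfterUs;  // minimum wait before the next step, in microseconds
};

// Reset-by-instruction sequence for a 4-bit interface. Each nibble is
// clocked in once with E; the low nibble is never sent during these steps.
constexpr std::array<InitStep, 4> kFourBitInit = {{
  {0x3, 4500},  // 1st 0x3: wait more than 4.1 ms
  {0x3, 150},   // 2nd 0x3: wait more than 100 us
  {0x3, 150},   // 3rd 0x3: display is now known to be in 8-bit mode
  {0x2, 150},   // 0x2: switch to 4-bit mode
}};

// Port value for a step, with the data nibble shifted onto GPB4..GPB7.
constexpr uint8_t portValueFor(const InitStep& step) {
  return static_cast<uint8_t>(step.nibble << 4);
}
```

Each port value (0x30, 0x30, 0x30, 0x20) still has to be strobed in with E; only after the last step do instructions go out as two nibbles each.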

I wasn't doing that, nor were any of the other libraries (about 5 different ones) that I looked through.

Yes, all the other libraries (if they are working) must be doing this, as the only way to clock in instructions is to present the data on the bus while E is high and then lower E. The LCD clocks in the data on the falling edge of E.
If they were only using 8 bit mode, it is possible that they didn't do the sequence to put the device back into 8 bit mode, as they would not need to since the lcd powers up in 8 bit mode.
But they would always have to present the data on the data bus and clock it in using the E signal with the appropriate timing.

You are right about the MCP/LCD combo being slow. I could send all the bytes in a single i2c transmission, but doesn't there have to be pauses between the commands being sent?

"slow" is relative. The code you have written will be extremely slow, it may even be slower than the adafruit code as your code is doing delays that are much longer than is needed for the h/w timing.
But then your code isn't dealing with the instruction execution timing at all.

Doesn't that require a begin/end transmission? Since nothing is passed until end, a send() delay() send() wouldn't actually work? Do you use a i2c restart command (endTransmission(false); I think it is), or is the i2c/mcp just so slow that pushing the commands into the lcd at max speed is still far slower than it needs?

All transfers require a begin/end.
The hd44780 spec describes the timing for both the h/w interface byte transfers as well as the instruction processing/execution times. The timing shown is when the LCD is operating at the reference clock speed.
Nearly all LCDs will work with the reference timing but not all of them, some are slower.

While the i2c interface is a bit slow in terms of computer timing speeds, its overhead is not always long enough to cover the LCD instruction execution time. The speed of the Arduino processor can affect the timing as well.
Most libraries (all the lcd libraries I've seen, including your code) do blind delays to try to account for the timing.
hd44780 is very smart about how it handles the timing (both types of timing) and allows the arduino to continue to run in parallel with the LCD during LCD instruction execution.

Like I said, it is a fairly large task to write a robust feature rich library.

Also on a tangent, I can't seem to find anywhere where it states what C++ features are supported. From what I understand Arduino uses gcc under the hood, but clearly there are some limitations for using a MCU. Does it support C++11? 14? 17?? Does it support templates, rvalues, lambdas, multiple inheritance? Is there a page somewhere that lists/explains what it does and does not support?

The Arduino IDE uses gcc/g++; the version can vary depending on the version of the IDE.
But then there are also core add-ons which also use gcc, so each core can use a different version of gcc since they are separate toolsets.

The AVR is a bit of a mess because it is an 8 bit processor with very limited memory and no ability to directly access constant data. This creates many issues including the messy AVR proprietary PROGMEM crap and xxprintf() code with limited functionality that are used by default.

The simple bottom line is the version of the gcc used and its capabilities will vary.
The only way to know is to see what version of gcc you have for the particular core you have and then go read about it.

But keep in mind that the C language and the C libraries are separate.
So while the "language" may describe certain capabilities, not all of it is part of the actual language so they may not be available because they are implemented in a library and certain types of functionality are typically omitted or not possible when using an embedded environment. This is not an Arduino thing or limitation, but rather a limitation of running in an embedded environment vs on a full fledged computer with an operating system.

--- bill

The Arduino IDE uses gcc/g++; the version can vary depending on the version of the IDE.
But then there are also core add-ons which also use gcc, so each core can use a different version of gcc since they are separate toolsets.

So if I'm reading this correctly, the latest version of gcc supported is 4.9.2? So I can use MI with virtual dispatch? Does it support lambdas?

Does it support lambdas?

I think so. I was tinkering around with OTA programming of some esp8266 modules when I ran into them.

Check out the BasicOTA.ino example here to see what i mean.

Don

From reply#3

floresta: if you have a link to your assembly I'd love to see it.

I have attached the code (I think), I hope it helps.

Don

EDIT: I had to add the .txt to allow it to upload

I2C-LCD-v03.asm.txt (30 KB)

floresta:
I think so. I was tinkering around with OTA programming of some esp8266 modules when I ran into them.

Check out the BasicOTA.ino example here to see what i mean.

Don

He's using std::function? Doesn't that require a heap (ie. new/delete)?

I was playing around a bit yesterday and from what I can tell most of the template functionality is there. Haven't yet tested variadic templates, but I was able to pretty much finish my HD44780U library, abstracting out the MCP23017 communication code via compile time polymorphism (so I could easily add new chips/communication methods). That allows me to configure the lcd as I see fit, and then the compiler can optimize out almost everything; it also removed all the run-time variables (which saves on memory :) ).

While I really appreciate the links/response, as far as controlling the HD44780U, I've pretty much got it up and working, all bells and whistles. The only thing I'm unsure of is how other libraries are handling timing. There are times specified in the datasheet, which I've used (with an optional scaling factor for slower or faster clocked LCDs), but the documentation still mentions the need to check the 'busy flag' in addition to the timings. Thing is, I don't see a lot of libraries using this busy flag check. Particularly in the case of a 4-bit bus, not connecting the r/w pin saves a pin, and so most don't even connect it. It seems to be working without the busy flag check, but I'm wondering if I should add some additional waiting overhead 'just in case'? Ignore it and hope for the best? Spend the time to switch the MCP23017 into read mode (at least on the busy pin), read the busy flag (perhaps multiple times), switch back to output mode, then write the next command/char? This seems like a lot of unnecessary bus traffic for single command/character.

How did you approach this?

The datasheet shows two types of timing: one for the byte transfers and one for the instruction execution. If the LCD uses the default reference clock (most do, or are faster than the reference clock) you don't need to check busy as long as you don't send another instruction before the execution time of the previous instruction has elapsed.
And if you look closely at the instruction timing, you will see that most are the same, with the exception of clear & home, which are significantly longer.
Then you have to look at how fast the instructions can be sent to the LCD. This will vary depending on the speed of the interface to the LCD and the speed of the processor.

Most libraries (in fact all the LCD libraries I've looked at other than one) are really dumb when it comes to how to handle the instruction timing.
They simply do blind delays for the LCD instruction time right after sending the instruction.
The overhead of the interface and/or the speed of the processor can be used to reduce these blind delays, but the only LCD library I've seen do it is fm's new LiquidCrystal library.
But fm's library does it using scaling based on some rough calculations of the processor speed and interface speed, and these don't always work.
hd44780 does things in a completely different way that needs no scaling and yet also allows the LCD to run in parallel with the Arduino processor.

I will say that while using the BUSY status has the potential to speed things up, as the code will then only wait the absolute minimum necessary, in reality in the Arduino environment it won't. Clearly this will be the case for something like an i2c port expander, particularly for the MCP parts, which have significantly more overhead than direct pin i/o.
Do the math and see what the actual timings are for transferring a byte and reading a byte, then take into consideration how many bytes have to be sent just to flip the port mode and read a byte from the LCD.
Even when using direct pin control from the processor, using BUSY in Arduino won't be faster because of the way the i/o code (digitalRead()/digitalWrite()) works.
While it sounds counterintuitive, it will actually be slower than not using BUSY, since it will take more time just to flip the port around from output to input and back again than the default timing of 37us for most LCD instructions. i.e. the time to set up the port to do the BUSY read will take longer than the default instruction time.
While it is core and processor speed dependent, the digitalRead()/digitalWrite() implementation has TONs of needless overhead, particularly on the AVR, because of the way the code is implemented and written. (i.e. the code can be written better - the Teensy core is over 50x faster than the bundled AVR core for pin manipulation).
This is the kind of stuff that breaks the scaling in fm's library but doesn't break hd44780.
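
For the i2c side, the math can be roughed out as below. This is a back-of-the-envelope sketch: i2cWriteTimeUs is a made-up helper, it counts 9 clock periods per byte (8 data bits plus ACK) including the address byte, and it ignores START/STOP and all software overhead, so real transfers take longer.

```cpp
#include <cstdint>

// Approximate time on the wire for one I2C write transaction, in microseconds.
// payloadBytes counts the bytes after the device address; the address byte
// itself is added here. Each byte costs 9 clock periods on the bus.
constexpr uint32_t i2cWriteTimeUs(uint32_t payloadBytes, uint32_t busHz) {
  return (1 + payloadBytes) * 9u * 1000000u / busHz;
}
```

At 100 kHz, i2cWriteTimeUs(2, 100000) comes to 270 us for a single MCP23017 register write (register byte plus value). If every port change is its own transaction, strobing one nibble takes three of those, so the 37us instruction time has long passed before the next instruction can arrive, and a BUSY read would only add more transactions on top.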

Using BUSY is only a potential win on clear & home instructions as those times are quite lengthy.
If you want to see the performance of different libraries, you should run the LCDiSpeed sketch that is included in my hd44780 library. It will tell you the timing for sending a single byte as well as the timing for filling the LCD.
It is a very useful tool to compare libraries.
You should be able to get it to work on your library.

bperrybap: I had the same feeling. Standard speed communication via i2c is 100 kbit/s, so there's no way that the MCP could saturate the lcd on anything but the clear/return commands.

because of the way the i/o code (digitalRead()/digitalWrite()) works

Is it really that bad? I had just assumed under the hood it just called an intrinsic function which mapped to one or two asm instruction(s). What more could they be doing?

Another thing that I found interesting, is that the execution time of 'clear display' is actually omitted. I just went with 1.52ms because other libraries had been using it. And the 'note' on page 24 left me a bit confused on the timing:

Be sure the HD44780U is not in the busy state (BF = 0) before sending an instruction from the
MPU to the HD44780U. If an instruction is sent without checking the busy flag, the time between
the first instruction and next instruction will take much longer than the instruction time itself. Refer
to Table 6 for the list of each instruction execution time.

This left me unsure as to whether the BF check was required as well as the timings. Really it's just poorly written.

Anyways thanks for the info. I'm not sure if its worth my time at this point to add some sort of predictive time scaling, but it is something to consider...

Kaisha:
Is it really that bad? I had just assumed under the hood it just called an intrinsic function which mapped to one or two asm instruction(s). What more could they be doing?

You have obviously not looked at how this is implemented and all the extremely painful hoops you have to jump through on the AVR to support the Arduino digital pin i/o API.
It isn't that the code is ugly or "bad". It is very readable.
It is that its execution time is quite slow vs what it could be.
And in my view (which is from a person with 35+ years of embedded & realtime programming experience) it is not good code.

The issue is partly related to the API semantics
"set this pin to the state specified by this value" vs set this pin to a specific state.
But then it is combined with some very poor internal design decisions in the AVR h/w architecture and an arduino s/w library implementation that has quite a bit of overhead vs what it could be.
The AVR has two things that make doing an API with the existing API semantics difficult.

  1. the AVR must use specific bit set/clear instructions to atomically set bits in ports to control the pins
  2. the AVR cannot directly access data stored in the flash.

Both of these are AVR h/w design decisions. The h/w designers took too simplistic of an approach and the result is that the AVR h/w in these two important areas are fundamentally incompatible with the C language.
IMO, this was a very dumb design decision. No other processor has made this decision.

Arduino was created by and initially implemented by individuals with very little embedded programming experience, and it shows.

In this case, there are ways to write the code to be much more efficient and faster than the way they did it.
What they did is a classical table lookup for multiple entries using multiple tables and then using that information to figure out which port and bit to set/clear. This is very costly on the AVR.
There are other ways this could be done, including ways that do not require so many table lookups (which are very slow on the AVR) since the processor and the port to pin mappings are all known at compile time. But it takes more knowledge of how the compiler generates its code and how gcc works to be able to write that kind of code.
This level of programming is well above many if not most programmers including the original Arduino core development team.

The atomicity issue is a real pain. While many other processors have bit set and bit clear registers, the AVR has bit set and clear instructions. The AVR gcc developers put hacks into gcc to be able to use them but they only work in very specific situations - which will NEVER occur when using the Arduino API semantics of digitalWrite(pin, val) and the existing code implementation.
As a result, the Arduino library code has to manually work around it. This requires

  • save the interrupt mask state
  • mask interrupts
  • read the port in to temp variable
  • set the bit in the temp variable (which requires testing the runtime value of "val")
  • write the temp variable to the port
  • restore the interrupt mask state
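
The steps listed above can be sketched as follows. This is a desktop simulation, not the Arduino core's source: fakeStatusReg and fakePort are stand-ins for the AVR status register and an output port register, and writePortBit is a made-up name.

```cpp
#include <cstdint>

// Stand-ins so the sequence can run on a desktop compiler. On an AVR these
// would be the real status register and an output port register.
static uint8_t fakeStatusReg = 0x80;  // bit 7 models the global interrupt enable
static volatile uint8_t fakePort = 0x00;

static void disableInterrupts() { fakeStatusReg &= 0x7F; }

// Read-modify-write of one port bit with interrupts masked around it.
void writePortBit(uint8_t bitMask, bool high) {
  const uint8_t saved = fakeStatusReg;  // save the interrupt mask state
  disableInterrupts();                  // mask interrupts
  uint8_t tmp = fakePort;               // read the port into a temp variable
  if (high) {                           // test the runtime value
    tmp |= bitMask;
  } else {
    tmp &= static_cast<uint8_t>(~bitMask);
  }
  fakePort = tmp;                       // write the temp variable to the port
  fakeStatusReg = saved;                // restore the interrupt mask state
}
```

Masking interrupts matters because an interrupt handler that changed another bit of the same port between the read and the write would have its change overwritten.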

And that is after all the rigamarole of doing multiple table lookups to figure out the address of the port and bit in the port for the specified pin. Data table lookups to data in flash are horrendously slow and painful on the AVR since the AVR can not directly access data stored in the flash.

The net result is that at 16Mhz it takes close to 6us to manipulate a pin vs 140ns if the special instructions can be used - which they won't be when using the Arduino supplied AVR core.

All other processors have h/w that is much better suited to C and does not have the above 2 issues.

In particular the ESP8266 has an Arduino core that can take advantage of the better internal h/w architecture to provide very fast pin control as that core does not have to do any table lookups and can atomically manipulate port bits for the pins directly from C using simple OR and AND operations which compile down to very few machine instructions.

--- bill

From reply#12 and the HD44780U instruction sheet:

If an instruction is sent without checking the busy flag, the time between the first instruction and next instruction will take much longer than the instruction time itself.

I'm pretty sure this is just a poor translation, although better than I could do. Basically it could be rewritten to say "If an instruction is sent without checking the busy flag, the time between the first instruction and next instruction must be longer than the execution time for the first instruction."

I would just say that if you don't check the busy flag before sending an instruction then you must instead make sure that the previous instruction has had time to fully execute.
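
A minimal sketch of that "wait instead of polling" approach, assuming the 37us figure mentioned in this thread for ordinary instructions and the 1.52ms figure the HD44780U datasheet lists for return home (here also applied to clear display, which is an assumption). The function name and constants are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed execution times; both depend on the controller's oscillator,
// so real code would normally add some margin.
constexpr uint32_t kShortWaitUs = 37;    // most instructions
constexpr uint32_t kLongWaitUs  = 1520;  // clear display / return home

// Returns how long to wait after sending `instruction` (RS low) before
// sending the next one, when the busy flag is not being checked.
uint32_t waitAfterInstructionUs(uint8_t instruction) {
    // 0x01 is clear display; 0x02 and 0x03 both decode as return home.
    const bool slow = (instruction >= 0x01 && instruction <= 0x03);
    return slow ? kLongWaitUs : kShortWaitUs;
}
```

On an Arduino the result would typically be passed to `delayMicroseconds()` after each instruction is sent.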

Don

bperrybap: Thank-you for the info, I never thought it would be that complicated. That seems ridiculous to have to do all that for something so simple... That said, I'm actually using an esp8266 (adafruit huzzah) not an avr chip, just using the arduino IDE. So yeah for me I guess?

floresta: Ya I agree.

Well I'm pretty much done with this part of the project. The library is done, I really like how it turned out, everything seems to be working well. So thank-you all for the help and subsequent discussion :slight_smile:

Kaisha:
bperrybap: Thank-you for the info, I never thought it would be that complicated. That seems ridiculous to have to do all that for something so simple... That said, I'm actually using an esp8266 (adafruit huzzah) not an avr chip, just using the arduino IDE. So yeah for me I guess?

On the esp part with direct pin control BUSY would be a win for sure on the clear & home and likely not hurt on the other instructions, but when using an I2C interface it will take more than 37us to do a single poll of the status register.

Well I'm pretty much done with this part of the project. The library is done, I really like how it turned out, everything seems to be working well. So thank-you all for the help and subsequent discussion :slight_smile:

I'd be curious what your byte xfer time is to see how it compares with the hd44780 library.
You can get this from the LCDiSpeed sketch in the hd44780 library.

I have some esp boards but none were ready to be used so I used an AVR board.
The esp boards would be a bit faster.

For the PCF8574 on a 16Mhz AVR:
550us with default/100Khz clock
199us with a 400Khz i2c clock.

For the MCP23008 (which should be the same for a MCP23017)
649us with default/100Khz clock
230us with a 400Khz i2c clock.

You can see how the extra overhead required for the MCP parts impacts the byte transfer time vs the PCF parts.
And if not using byte mode on the MCP parts, the transfer times will be at least double probably close to triple these times.
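
A back-of-the-envelope estimate helps show where these numbers come from. The sketch assumes each I2C byte costs 9 SCL clocks (8 bits plus ACK) and that one LCD byte is sent as a single transaction holding the address byte, a register byte on the MCP parts only, and four data bytes (each nibble with E high, then E low). That framing is an assumption for illustration, not something confirmed in this thread, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bytes on the wire per LCD byte (see the framing assumption above).
constexpr uint32_t kPcfBytesPerLcdByte = 5;          // address + 4 data
constexpr uint32_t kMcpBytesPerLcdByte = 6;          // address + register + 4 data
constexpr uint32_t kMcpBytesSeparateWrites = 4 * 3;  // 4 transactions of address + register + data

// Time on the wire in whole microseconds (truncated) for `bytes` I2C bytes
// at an SCL frequency of `sclHz`. START/STOP conditions and software
// overhead are ignored, so measured times will be higher.
uint32_t wireTimeUs(uint32_t bytes, uint32_t sclHz) {
    assert(sclHz > 0);
    const uint64_t clocks = static_cast<uint64_t>(bytes) * 9u;
    return static_cast<uint32_t>(clocks * 1000000u / sclHz);
}
```

Under these assumptions the wire time at 100Khz is 450us for the PCF8574 and 540us for the MCP parts, which sits about 100us under the measured 550us and 649us; the one extra register byte accounts for the gap between the two chips. Sending every E transition as its own transaction comes to 1080us at 100Khz before any overhead, consistent with the "at least double" estimate.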

--- bill

I'd be curious what your byte xfer time is to see how it compares with the hd44780 library.
You can get this from the LCDiSpeed sketch in the hd44780 library.

When I get a chance I will try it with my library and see what happens.

And if not using byte mode on the MCP parts, the transfer times will be at least double probably close to triple these times.

I don't think that would matter. You have to do a full Begin/Write/End transfer for each nibble (on a 4-bit interface). You can't combine multiple writes in a single transfer (that I know of). So byte or sequential mode, you're still going to have to set the register and value (2 bytes + any begin/end overhead) for every nibble (4 bit) or byte (8 bit) sent. And of course each has to be done twice for a 'pulse' (enable high/low).

Kaisha:
I don't think that would matter. You have to do a full Begin/Write/End transfer for each nibble (on a 4-bit interface). You can't combine multiple writes in a single transfer (that I know of). So byte or sequential mode, you're still going to have to set the register and value (2 bytes + any begin/end overhead) for every nibble (4 bit) or byte (8 bit) sent. And of course each has to be done twice for a 'pulse' (enable high/low).

Your understanding is incorrect.
--- bill

Ok so right now I have:

static void WriteMCP(uint8_t reg, uint8_t value) {
   Wire.beginTransmission(mcp_addr);
   Wire.write(reg);
   Wire.write(value);
   Wire.endTransmission();
   }

static void PulseMCP(uint8_t value) {
   WriteMCP(mcp_gpio, value | pin_e);
   delayMicroseconds(1);
   WriteMCP(mcp_gpio, value);
   }

PulseMCP() sends a full byte over a 4 bit interface. You're saying I could ignore the delay and simply write (assuming byte mode of course):

static void PulseMCP(uint8_t value) {
   Wire.beginTransmission(mcp_addr);
   Wire.write(mcp_gpio);
   Wire.write(value | pin_e);
   Wire.write(value);
   Wire.endTransmission();
   }

I feel the latter would only work if we know for a fact that the mcp23017 internally runs slower than the hd44780U. Maybe it's deep somewhere in a datasheet, but off the top of my head it's not something I remember reading. Or were you talking about something like:

static void PulseMCP(uint8_t value) {
   Wire.beginTransmission(mcp_addr);
   Wire.write(mcp_gpio);
   Wire.write(value | pin_e);
   Wire.endTransmission(false);
   delayMicroseconds(1);
   Wire.write(value);
   Wire.endTransmission();
   }

I haven't actually tried the above code... maybe it's time to.

Or are you talking about something else?
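
On the timing worry, a quick calculation may help. It assumes that two data bytes in the same I2C transaction are at least 9 SCL clocks apart, that the expander updates its outputs once per data byte, and that the HD44780U needs an enable pulse of at least 450ns (the datasheet figure for the lower supply voltage range). The names are hypothetical. Whether the second data byte lands on the same register depends on the MCP23017's IOCON (BANK/SEQOP) settings, which are worth checking in its datasheet.

```cpp
#include <cassert>
#include <cstdint>

// Assumed minimum HD44780U enable pulse width, in nanoseconds.
constexpr uint32_t kMinEnablePulseNs = 450;

// Time for one I2C byte (8 data bits + ACK) at the given SCL frequency.
constexpr uint32_t byteTimeNs(uint32_t sclHz) {
    return 9u * (1000000000u / sclHz);
}

// True if the spacing between back-to-back data bytes alone is long
// enough to serve as the enable pulse, with no explicit delay.
constexpr bool busPacingCoversEnablePulse(uint32_t sclHz) {
    return byteTimeNs(sclHz) >= kMinEnablePulseNs;
}
```

At a 400Khz clock one byte takes 22500ns, around fifty times the assumed minimum pulse width, so under these assumptions the bus itself provides the delay between E high and E low.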