LCD 1602 (and similar) databus sniffer

This is a basic sketch to demonstrate sniffing the data bus of an LCD 1602 display to interpret the contents of the display. This could be useful if, for instance, you are using a 3rd party product for which you have no source code and which incorporates one of these displays, and you need to extract some information from it.

Actually, I started this project as an experiment to get familiar with a new logic analyser and have no current plans to develop it further. I have, however, seen a couple of cases where such a tool could be useful, so maybe this helps someone.

The first thing to say is that this is a basic sketch which demonstrates the main points of capturing and interpreting the data. However, the chip family on which these displays are based (Hitachi HD44780) offers quite a rich feature set, including scrolling support, custom character definition etc. Depending on how the developer has used these features, and which library they have used, it may be quite challenging to interpret the data. In the worst case, it would be necessary to more or less emulate the functionality of the entire chip. Fortunately, you are more likely to encounter a simpler use case.

The HD44780 data sheet is attached as are the demonstration sketches.

The demonstration environment is extremely simple. It consists of two main parts. Part 1 is an Arduino Nano connected to a 5 volt LCD 1602 with an I2C backpack. It simply writes to the LCD 1602 to emulate the behaviour of, say, a third party application. I used an I2C backpack for simplicity of wiring. The I2C protocol is otherwise irrelevant here. The second part consists of an Arduino Uno connected to the data bus and control pins of the LCD 1602 (seven pins marked RS, RW, E, D7, D6, D5, D4 plus a common ground). This second part runs the sniffer sketch. That is it.

Before proceeding with the details of the sniffer sketch, here is a picture from the logic analyser showing the state of the pins under some test conditions. The first part of the data stream is the initialisation of the LCD 1602, followed by the first message "hello world!". The initialisation, which mainly forces the display into 4 bit mode and clears it, is interesting, but you don't have to understand it.

See post #2 below for an improved picture with protocol specific analysis data

Reading example: The seven main pins of the LCD 1602 are represented by channels on the left margin (E, RS, RW, D7 etc.). A simultaneous trace for each channel is shown. The enable pin of the LCD 1602 is pin 'E'. The falling edge of the enable signal indicates that the data on the remaining channels is valid. So, if we take, say, the second visible pulse on channel E (the one under the 'm' of +20 ms in the header) and look vertically downwards, we see that RS is low (0), as are RW, D7 and D6. However, both D5 and D4 are high (1). Since the display is, in this case, driven in 4 bit mode, the LCD pins D3, D2, D1 and D0 have no connections to them. The display still needs a full byte, however, so the bus is loaded twice. That is, you see two pulses of E, one for the four high order bits and one for the four low order bits. The trick in the sniffer program is to join these up.
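The joining step can be sketched roughly as follows. This is plain C++ rather than an extract from the attached sketch, and the names BusSample, packNibble and joinNibbles are made up for illustration. It assumes the four data pins have already been read on a falling edge of E.

```cpp
#include <cstdint>

// One sample of the bus, taken on a falling edge of E.
// (Hypothetical structure, not taken from the attached sniffer sketch.)
struct BusSample {
    bool rs;         // register select: 0 = instruction, 1 = data
    bool rw;         // 0 = write, 1 = read
    uint8_t nibble;  // D7..D4 packed into bits 3..0
};

// Pack the four sampled data pins into a nibble; D7 is the most significant bit.
inline uint8_t packNibble(bool d7, bool d6, bool d5, bool d4) {
    return static_cast<uint8_t>((d7 << 3) | (d6 << 2) | (d5 << 1) | (d4 << 0));
}

// Join the sample from the first E pulse (high order bits) with the
// sample from the second E pulse (low order bits).
inline uint8_t joinNibbles(const BusSample& high, const BusSample& low) {
    return static_cast<uint8_t>((high.nibble << 4) | (low.nibble & 0x0F));
}
```

For the pulse described above (D7=0, D6=0, D5=1, D4=1) packNibble() gives 0x3, and two data samples carrying 0x6 and 0x8 join to 0x68, which is 'h'.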

The data presented in the trace matches the sample sniffer output below.

Of course, all the information appearing above can also be obtained from the HD44780 data sheet or by reverse engineering the Arduino LiquidCrystal libraries, but the diagram should help to make it clearer.

The sketch sending the test data is quite trivial. It simply does the standard initialisation, clears the screen and writes "hello world!". Then, at one second intervals, it writes a counter to a specific part of the display. It is attached below: LCD1602_TestWrite_V0_02.ino

This is a sample of the output from the sniffer when the above mentioned test program is dumping data to the display. It has been annotated to help explain what is going on.

10:46:32.517 -> starting sniffer ...

 
//                            0xMM  0xNN   MM bit 7 is error bit , bit 1 is RS, bit 0 is R/W
//                            ----  ----   NN is LCD DB7 through to DB0

                                           // Start sequence . . .
10:46:35.873 -> control  data 0x00  0x33   // 0x03 sent twice (presented as combined)
10:46:35.873 -> control  data 0x00  0x32   // 0x03 then 0x02   ( 0x02 on D7 to D4 is read as 0x20 = Function Set, 4 bit )
10:46:35.873 -> control  data 0x00  0x28   // Function Set 4bit, 2 lines, 5x8 dots
10:46:35.873 -> control  data 0x00  0x0C   // Display on, cursor off, blink off
10:46:35.873 -> control  data 0x00  0x01   // Display Clear
10:46:35.873 -> control  data 0x00  0x06   // Entry mode set - cursor increment mode, no display shift ?
10:46:35.873 -> control  data 0x00  0x02   // Return Home


10:46:36.391 -> control  data 0x00  0x01   // Display Clear
10:46:36.391 -> control  data 0x02  0x68   // 'h' 0x02 (RS=1,RW=0 - write data )  0x68 = 'h' 
10:46:36.391 -> control  data 0x02  0x65   // 'e'
10:46:36.391 -> control  data 0x02  0x6C   // 'l' 
10:46:36.391 -> control  data 0x02  0x6C   // 'l' 
10:46:36.391 -> control  data 0x02  0x6F   // 'o' 
10:46:36.391 -> control  data 0x02  0x20   // ' ' 
10:46:36.391 -> control  data 0x02  0x77   // 'w' 
10:46:36.391 -> control  data 0x02  0x6F   // 'o' 
10:46:36.391 -> control  data 0x02  0x72   // 'r' 
10:46:36.391 -> control  data 0x02  0x6C   // 'l' 
10:46:36.391 -> control  data 0x02  0x64   // 'd' 
10:46:36.391 -> control  data 0x02  0x21   // '!' 

10:46:36.391 -> hello world!   // The following data item 0xC5 is not detected as a character
   // so the sniffer prints out the accumulated printable characters.

10:46:36.391 -> control  data 0x00  0xC5   // 0xC5 ( setCursor( 5, 1 ) ) Sets DDRAM address
10:46:36.391 -> control  data 0x02  0x30   // '0'
10:46:36.391 -> control  data 0x02  0x30   // '0'
10:46:36.391 -> control  data 0x02  0x30   // '0'
10:46:36.391 -> control  data 0x02  0x30   // '0'
10:46:36.438 -> control  data 0x02  0x30   // '0'


10:46:37.418 -> 00000   // again, the sniffer recognises the end of a character stream and
   // prints out the accumulated printable characters, the counter 
   // value in this case.


10:46:37.418 -> control  data 0x00  0xC5 
10:46:37.418 -> control  data 0x02  0x30 
10:46:37.418 -> control  data 0x02  0x30 
10:46:37.418 -> control  data 0x02  0x30 
10:46:37.418 -> control  data 0x02  0x30 
10:46:37.418 -> control  data 0x02  0x31 

10:46:37.511 -> errorCount= 0   // The sniffer prints an error total every few seconds. Errors are 
   // Queue overflows and indicate that the loop() processing is too slow.

10:46:38.397 -> 00001

10:46:38.397 -> control  data 0x00  0xC5 
10:46:38.397 -> control  data 0x02  0x30 
10:46:38.397 -> control  data 0x02  0x30 
10:46:38.397 -> control  data 0x02  0x30 
10:46:38.397 -> control  data 0x02  0x30 
10:46:38.444 -> control  data 0x02  0x32 
10:46:39.423 -> 00002

// etc.

General description of the sniffer. The sniffer sketch simply watches the enable pin, captures the data on the LCD bus, combines data from 2 consecutive reads of the bus (4 bit mode) and puts the data in a queue. The queue is read in the loop and formatted output is written to the serial console.

The queue is necessary because a synchronous operation of reading the data bus in a data burst and writing it to the serial console overwhelms the serial port, leading to data loss. The sketch can be easily adapted to handle 8 bit mode, if necessary.
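For readers who have not opened the zip file, a queue of this kind can be sketched as a small ring buffer. This is not the queue library from the attachment, just a minimal illustration in plain C++. ByteQueue is a hypothetical name, and packing the control byte and the data byte into one 16 bit item is an assumption. On a real AVR the multi-byte indices would also need protecting against interruption.

```cpp
#include <cstddef>
#include <cstdint>

// Fixed-size ring buffer with one producer (the capture side) and one
// consumer (loop()). It holds at most N - 1 items.
template <size_t N>
class ByteQueue {
public:
    // Producer side. Returns false and counts an overflow when the queue is full.
    bool push(uint16_t item) {
        size_t next = (head_ + 1) % N;
        if (next == tail_) {
            overflowCount_ = overflowCount_ + 1;
            return false;
        }
        buf_[head_] = item;
        head_ = next;
        return true;
    }

    // Consumer side. Returns false when there is nothing to read.
    bool pop(uint16_t& item) {
        if (tail_ == head_) return false;
        item = buf_[tail_];
        tail_ = (tail_ + 1) % N;
        return true;
    }

    unsigned long overflowCount() const { return overflowCount_; }

private:
    volatile size_t head_ = 0;               // next slot to write
    volatile size_t tail_ = 0;               // next slot to read
    volatile unsigned long overflowCount_ = 0;
    uint16_t buf_[N] = {};
};
```

The overflow counter plays the same role as the errorCount line in the sample output: if it is not zero, loop() is not draining the queue fast enough.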

I've interpreted the display characters as ASCII. However, there may be some inconsistencies between the standard ASCII table and the font table of the LCD. The sniffer sketch appears in the attached zip file together with its queue library: LCD1602_sniffer_V0_03.zip

If you are attempting to interpret complex patterns, it may help to define an array which matches the display canvas, in this case 2 x 16, and to keep track of screen control commands such as clear, cursor movements etc. On each change to the display, or at regular intervals, check whether the part of interest has changed in that iteration and process such changes accordingly.
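A rough sketch of such a canvas, in plain C++, might look like the following. ShadowDisplay is a hypothetical name and the model is deliberately incomplete: it handles only Clear Display, Return Home, Set DDRAM Address and character writes, assumes a 2 x 16 display in increment mode, and ignores shifting and the wrap from the end of one line to the next.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

class ShadowDisplay {
public:
    static constexpr int ROWS = 2;
    static constexpr int COLS = 16;

    ShadowDisplay() { clear(); }

    // Feed one assembled byte together with the state of RS at the time.
    void feed(bool rs, uint8_t value) {
        if (rs) {
            writeChar(value);
        } else if (value & 0x80) {
            addr_ = value & 0x7F;            // Set DDRAM address
        } else if (value == 0x01) {
            clear();                         // Clear display
        } else if ((value & 0xFE) == 0x02) {
            addr_ = 0;                       // Return home
        }
        // All other instructions are ignored in this simplified model.
    }

    // Current content of one display row as a 16 character string.
    std::string row(int r) const { return std::string(cells_[r], COLS); }

private:
    void clear() {
        std::memset(cells_, ' ', sizeof(cells_));
        addr_ = 0;
    }

    void writeChar(uint8_t c) {
        int r = (addr_ >= 0x40) ? 1 : 0;     // second line starts at 0x40
        int col = addr_ & 0x3F;
        if (col < COLS) cells_[r][col] = static_cast<char>(c);
        addr_ = (addr_ + 1) & 0x7F;          // simplification: no line wrap handling
    }

    char cells_[ROWS][COLS];
    uint8_t addr_ = 0;
};
```

Feeding it the bytes from the sample output (clear, "hello world!", 0xC5, then the counter digits) leaves the greeting in row 0 and the counter at column 5 of row 1, so a change in the counter can be detected by comparing row(1) between iterations.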

Edit:

  1. LCD1602_sniffer_V0_04.zip has been added to correct the issues described later in this thread. I've retained the older version LCD1602_sniffer_V0_03.zip for those who prefer a simpler version and understand its limitations.
  2. There is an even newer version of the sniffer attached to post #10. When it all stabilises, I'll put the definitive version here and clean up.

HD44780.pdf (322 KB)

LCD1602_sniffer_V0_03.zip (2.78 KB)

LCD1602_TestWrite_V0_02.ino (684 Bytes)

LCD1602_sniffer_V0_04.zip (4.05 KB)

Note: Post #41 contains the latest version of the sniffer.

Your Saleae software can decode the HD44780 code. And display it on the display.
From memory it might even check your initialisation sequence.

Following the I2C backpack is harder. Common backpacks have the data pins on contiguous bits. e.g. DB4-7 on expander P4-7
So the character '6' from 6v6gt, which is 0x36, will appear as 0x3- on the I2C high bytes and 0x6- on the low bytes.
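To illustrate the point about the nibble positions, here is a small sketch in plain C++. It assumes the mapping mentioned above (DB4-7 on expander P4-7) and further assumes that the low four expander bits carry control signals such as RS, RW, E and the backlight, which varies between backpacks. ExpanderPair, splitForExpander and joinFromExpander are made-up names.

```cpp
#include <cstdint>

// The two expander port values that carry one LCD byte.
struct ExpanderPair {
    uint8_t first;   // carries the high nibble of the LCD byte
    uint8_t second;  // carries the low nibble of the LCD byte
};

// Split an LCD byte into two port values. The data nibble goes to P7..P4;
// controlBits (assumed to sit on P3..P0) are copied into both values.
inline ExpanderPair splitForExpander(uint8_t value, uint8_t controlBits) {
    ExpanderPair p;
    p.first  = static_cast<uint8_t>((value & 0xF0) | (controlBits & 0x0F));
    p.second = static_cast<uint8_t>(((value << 4) & 0xF0) | (controlBits & 0x0F));
    return p;
}

// Recover the LCD byte from two sniffed port values, ignoring the control bits.
inline uint8_t joinFromExpander(uint8_t first, uint8_t second) {
    return static_cast<uint8_t>((first & 0xF0) | (second >> 4));
}
```

With value 0x36 and no control bits set, the two port values are 0x30 and 0x60, matching the 0x3- and 0x6- pattern described above.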

The Saleae software can help you understand how the logic signals work.

David.

Many thanks for that. The whole thing was a learning exercise and now I have learned something more. I could even use the 44780 protocol analyser retrospectively on a saved trace. I'm becoming more impressed with this logic analyser.

It is clear that the I2C side is less useful for a real sniffing task because of the variety of chips and wiring permutations. The advantage is that there are only 2 lines to tap.

Trace with 44780 protocol analysis:

Very cool project.

I'd be curious if this works with the hd44780 library with the hd44780_I2Cexp i/o class.

I'd be concerned about two things.
When using auto configuration, it probes the i2c expander chip to detect the i/o expander chip type.
And then it probes the LCD through the expander to determine the pin mapping between the LCD and the expander chip.
These probes may throw off the host to LCD nibble synchronization.
The way to ensure the sniffer is always in nibble sync would be to look at the commands being sent to the LCD and run a state machine and process the function set commands used during LCD initialization.
Those function set commands are very carefully chosen to first put the LCD into 8 bit mode, then put it into 4 bit mode.
The sequence of commands will always work no matter what state the LCD is in and work even when the host can only control the upper 4 data pins.
So when the function set command sequence completes the LCD will be 4 bit mode and the host and the LCD will always be in nibble sync.
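The claim that the sequence works from any starting state can be checked with a small model that runs on a PC. The sketch below is plain C++ and is only an approximation of the controller: it assumes that DB3 to DB0 read as zero in 8 bit mode and that Function Set is the only instruction which changes the interface state. The nibble sequence 0x3, 0x3, 0x3, 0x2 is the one visible in the sample sniffer output in the first post. LcdInterfaceModel and applyStartSequence are hypothetical names.

```cpp
#include <cstdint>

struct LcdInterfaceModel {
    bool eightBit;   // current interface width
    bool haveHigh;   // 4 bit mode only: a high nibble is waiting for its partner
    uint8_t high;    // the waiting high nibble

    // One falling edge of E with `nibble` on DB7..DB4.
    void strobe(uint8_t nibble) {
        nibble &= 0x0F;
        if (eightBit) {
            // DB3..DB0 are assumed to read as zero.
            execute(static_cast<uint8_t>(nibble << 4));
        } else if (!haveHigh) {
            high = nibble;
            haveHigh = true;
        } else {
            haveHigh = false;
            execute(static_cast<uint8_t>((high << 4) | nibble));
        }
    }

    // Only Function Set (001x xxxx) matters here; bit 4 is DL.
    void execute(uint8_t instruction) {
        if ((instruction & 0xE0) == 0x20) {
            eightBit = (instruction & 0x10) != 0;
            haveHigh = false;
        }
    }
};

// Apply the start sequence as four strobes of the upper data pins.
inline void applyStartSequence(LcdInterfaceModel& lcd) {
    const uint8_t seq[] = {0x3, 0x3, 0x3, 0x2};
    for (uint8_t n : seq) lcd.strobe(n);
}
```

Starting the model in 8 bit mode, in 4 bit mode at a byte boundary, or in 4 bit mode with any of the 16 possible pending high nibbles, it ends in 4 bit mode waiting for a high nibble, which is the nibble sync described above.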

The other thing that would be interesting is whether this sniffer can keep up when using the hd44780 library and hd44780_I2Cexp i/o class.
The hd44780 library is quite a bit faster than other libraries like the LiquidCrystal_I2C.
A quick test could be done by not using the auto location and auto configuration to avoid the potential issue above.
While undocumented, the hd44780_I2Cexp i/o class supports the same constructor arguments as LiquidCrystal_I2C and you can use the same init() call to initialize the LCD rather than calling begin().
So the changes to LCD1602_TestWrite_V0_02.ino would just be changing the header file names and the lcd object class name.

--- bill

Thanks. You are correct that LCD1602_sniffer_V0_03 does not respect the LCD's special initialisation sequence to force the mode from the initial unknown state, and would have exactly the synchronisation problem you described if the enable pin were toggled an odd number of times before that start up sequence. I've corrected this and also made the sniffer handle 8 bit mode, but 8 bit mode is, as yet, untested. I'll add the new version to the OP (now LCD1602_sniffer_V0_04) but I'll leave the original version as well, because it is substantially simpler, as long as its limitations are understood.

I'll do some speed trials, as suggested, when I get a chance sometime later.

I haven't gone through the state code in full detail, but my initial impression is that it could have issues.
The 4 bit/8 bit mode selection using function set commands is not about seeing or processing 3 function set commands.
While many people mistakenly think that the 3 function sets are retries or a specific 3 command sequence that the LCD detects, that isn't how it really works.
All LCD instructions are always internally processed 8 bits at a time.
The only difference between 8 bit mode and 4 bit mode is how the 8 bits are constructed before processing the 8 bit instructions.
And the function set commands are to be processed whenever they are seen, with no concern about whether the host and the LCD are in nibble sync, just like any other LCD instruction.

The magic is that the bit position of the DL bit for 8 bit and 4 bit mode was very carefully chosen to ensure that when processing the 8 bit instructions, even if the host and the LCD are out of sync with each other, the instructions will eventually put the LCD back into 8 bit mode then into 4 bit mode.
The LCD can end up processing a garbage/unknown instruction during this process if there is a synchronization issue when the host and the LCD are in different 8/4 modes or out of nibble sync in 4 bit mode.
I have a very long description of this process in the hd44780 library file hd44780.cpp in the begin() function.
It is definitely worth a read if you want to deep dive into how the function set 4/8 bit initialization really works.

I would recommend keeping collect() very simple.
Its job should be only to queue up 8 bit data transferred to the LCD. This keeps it simple as well as fast.
collect() should not be concerned about getting into nibble sync with the host.
LCD Instructions can be processed by code in loop()
You pop off a bus transfer byte to always get an 8 bit instruction to process.
You process the instructions you know how to process and toss the ones you don't.
If you see a function set instruction with DL set then you set something that tells collect() to process in 8 bit mode.
If you see a function set instruction with DL clear, then you set something that tells collect() to process in 4 bit mode and the nibble state is for the first nibble.
It really can be that simple.

Also by handling instruction processing in loop, you can spit out messages about processing the function set messages and the 8/4 bit mode changes.
The instruction processing could easily be expanded to process more instructions as needed/desired.
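As a rough illustration of how small that instruction processing can be, here is a sketch in plain C++. It classifies a byte by its highest set bit, following the instruction table in the HD44780 data sheet, and updates a width flag on Function Set. BusWidth and processInstruction are hypothetical names, and how the flag is handed back to collect() is left out.

```cpp
#include <cstdint>
#include <string>

enum class BusWidth { Four, Eight };

// Name the instruction and, for Function Set, record the interface width
// that the capture side should use from now on.
inline std::string processInstruction(uint8_t ins, BusWidth& width) {
    if (ins & 0x80) return "Set DDRAM address";
    if (ins & 0x40) return "Set CGRAM address";
    if (ins & 0x20) {
        bool dl = (ins & 0x10) != 0;                 // DL: 1 = 8 bit, 0 = 4 bit
        width = dl ? BusWidth::Eight : BusWidth::Four;
        return dl ? "Function Set (8 bit)" : "Function Set (4 bit)";
    }
    if (ins & 0x10) return "Cursor or display shift";
    if (ins & 0x08) return "Display on/off control";
    if (ins & 0x04) return "Entry mode set";
    if (ins & 0x02) return "Return home";
    if (ins & 0x01) return "Clear display";
    return "Unknown";
}
```

In loop() this would be called for every queued byte that had RS low; the returned text could be printed as a debug message and anything unrecognised simply tossed.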

For development, tracking, and potentially others contributing, it could be useful to put the project into a git repository and up on GitHub, Bitbucket, etc.

--- bill

Great project, and the timing could not have been better as I am battling a similar issue.
I need to sniff data from an 8x2 OLED made by WinStar that uses the WS0010 driver - WEH000802ALPP5N00000

And while the datasheet references the standard fonts/characters that decode to easy-to-read hex, it seems that the manufacturer might be driving the display in the graphical mode?

I have attached some snaps of the activity on those pins and it seems to be quite a bit more complicated.
Using your sketch as is, I could not detect a single number that matched, so I think I am overflowing the queue.

I am not sure if I should try to pursue this route, or try to decode the gun's RX/TX communications - I need to get the amount of ammo and health out for a project of mine.

6v6gt,
There is one very important detail I left out when handling the sniffing related to 4 bit mode and 8 bit vs the number of data pins being sampled.
The hd44780 ensures that unconnected data pins (at least DB0 to DB3) are read as zero.
The sniffer must emulate the same behavior when using 4 data pin sniffing.

i.e. 8bit vs 4bit LCD mode is separate from sampling all 8 data pins vs sampling only 4 data pins.

The obvious limitation is that in order to support a host that uses 8 bit mode, the sniffer must do 8 data pin sampling.
But for hosts that use 4 bit mode, the sniffer can "cheat" and only sniff the upper 4 data pins which can make wiring up the sniffer simpler/easier.

Since most Arduino libraries run in 4 bit mode, I'd favor what is easiest for the user when using 4 bit mode which can allow doing 4 data pin sampling.
So, IMO, the easiest way to do this would be to still handle the instruction queuing and 8 bit processing as I mentioned earlier where the instruction processing is done in loop()
However, supporting 4 data pin sampling requires either that the user adds pulldowns to the Arduino pins used to sniff DB3, DB2, DB1, and DB0, so that the sniffer reads them as low when they are not connected to the LCD, or that there is a conditional which indicates that only 4 data pins are being sampled, so that the missing bits get forced to zero in s/w.
Remember the sniffer must still support 8 bit mode even if only doing 4 data pin sniffing, to be able to properly handle the function set instructions.

Adding pulldowns is more work than connecting wires to all 8 data pins, so if there is no conditional to handle this in s/w, pretty much everyone will end up wiring up all 8 data pins for sniffing.

Since most people will be using 4 bit mode, IMO, a conditional macro that defaults to 4 data pins sampling is simpler.

The sniffer could have a macro say SNIFF_8DATAPINS that indicates that all 8 data pins are being sampled and is not set by default.
The SNIFF_8DATAPINS conditional could then be used so that when the LCD is in 8 bit mode and SNIFF_8DATAPINS is set, DB3, DB2, DB1, and DB0 are sampled and used for the lower 4 bits of the byte, but when the LCD is in 8 bit mode and SNIFF_8DATAPINS is not set, the lower 4 bits of the byte are set to zero.

--- bill

karlisakis,
While it is possible that they are using the graphic mode and/or some of the extended font capabilities not in "normal" hd44780 displays, based on what I can see on the display, there isn't anything on the display that would require graphic mode or things not available in the hd44780 instruction set.

They have extended the function set instruction for some new capabilities.
I have seen this in other LCD modules. While it is a convenient way to add functionality, it breaks the robustness of being able to reliably get the host and LCD in nibble sync when using 4 bit mode.
i.e. there can be cases where it is possible that the host and the LCD can get out of nibble sync and never be able to recover.

The traces are not useful since they are zoomed out too far to be able to see the individual instruction transactions.
There is also no timing information to be able to tell how fast operations are happening.

One thing that I can see in the traces is that it appears that the host is testing the BF flag.
But since the traces are zoomed out so far it is impossible to tell what is happening.

I'm guessing that if we could see the instructions, it wouldn't be that difficult to determine what is being sent to the display.
But figuring out exactly what is on the display is more difficult than just seeing what is being sent to the display.
It involves tracking and internally emulating several instruction types to keep track of things like custom characters and cursor positions. That requires keeping track of the DDRAM address counter, updating it for each character written to the display and for any set DDRAM address instructions (setCursor()) that alter it.
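For the custom character part of that emulation, a minimal sketch in plain C++ could look like this. It assumes the standard HD44780 arrangement of 64 bytes of CGRAM holding 5x8 patterns, with the character code in the upper three address bits and the pattern row in the lower three. CgramTracker and patternRow are hypothetical names, and reads from the display are not handled.

```cpp
#include <cstdint>

class CgramTracker {
public:
    // Feed one assembled byte together with the state of RS at the time.
    void feed(bool rs, uint8_t value) {
        if (!rs) {
            if (value & 0x80) {
                inCgram_ = false;                 // Set DDRAM address
            } else if (value & 0x40) {
                inCgram_ = true;                  // Set CGRAM address
                addr_ = value & 0x3F;
            } else if (value == 0x01 || (value & 0xFE) == 0x02) {
                inCgram_ = false;                 // Clear / Return home select DDRAM
            }
            return;
        }
        if (inCgram_) {
            pattern_[addr_] = value & 0x1F;       // five pixel columns per row
            addr_ = (addr_ + 1) & 0x3F;           // assumes increment mode
        }
    }

    // Pattern row `line` (0..7) of custom character `code` (0..7).
    uint8_t patternRow(uint8_t code, uint8_t line) const {
        return pattern_[((code & 0x07) << 3) | (line & 0x07)];
    }

private:
    bool inCgram_ = false;
    uint8_t addr_ = 0;
    uint8_t pattern_[64] = {};
};
```

When a data byte below 0x08 is later written to DDRAM, the tracker holds the bitmap that was loaded for that code at that moment, which is what is needed when the same code is reused for different pictures over time.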

--- bill

@bperrybap.
I've now tested the sniffer against your HD44780 library as well, including in auto configure mode. It appears to work OK, but the sniffer does not currently support a reversion from 4 bit mode to 8 bit mode, so it could get out of phase if it sees multiple start up sequences. The cleanest use case is that the sniffer is running, then the LCD goes through a complete initialisation from power reset onwards. Incidentally, the Saleae 44780 protocol analyser, at least with its standard configuration, could not recognise the output of this library.

I've attached a new version LCD1602_sniffer_V0_06 to this post. I've restructured it to transfer the main activity from the ISR to the loop() so, for one thing, it is easier to add custom debugging statements. When it gets that far, I will put a definitive version in the OP. That could be after it is tested running in 8 bit mode and after some other tidying up.

I just want to emphasize here, if it is not already clear, that the target audience for this tool is the relatively experienced user. New users will have difficulties with the complex analysis and integration activities this tool is intended to support. Further, it is fully expected that the user will have to make coding changes to it to match the target environment, the format of the data to be extracted etc., so it is really to be regarded as a sort of framework and not a ready made solution.

I'm not so worried about the sniffer having problems with floating pins (I hope I have understood you correctly here). If the target application uses 4 bit mode (pins D7 to D4 only) and, during the initialisation sequence, puts the 44780 cleanly into 4 bit mode, then the sniffer would not look at the unused pins. However, if the raw data is presented to the user, I agree that the unused pins should be masked to avoid confusion.

@karlisakis
You have an interesting project there. Especially with a game, you may well have a problem relating the data sent to the display with the current image on the display, as @bperrybap has said. Just as an example, those custom characters could easily be created dynamically and constantly reloaded. All you will see is their position in the internal RAM of the device, and there are only 8 positions (at least in the standard case assuming no custom font table). If multiple special characters share the same position at different times during the progress of the game, you can't do a simple look up to see which character is being referred to.

Under Windows "Photos" I can magnify the traces, but I am wondering what I am seeing. I guess the display leaves D0 to D3 unconnected, since you are showing only 7 traces. Further, it looks like the 'E' (enable) pin is on DIO 2. DIO 0 looks rather wild. Anyway, you should say what your pin mapping is.

LCD1602_sniffer_V0_06.zip (4.9 KB)

LCD1602_write_V0_03.ino (860 Bytes)

Note: Post #41 contains the latest version of the sniffer.

Thanks a ton for your replies.
I forgot to take more photos, the game uses a lot of custom graphics during the game - skull faces, masks, explosions, etc etc.

The display is wired as per the screenshots, that's why it seemed weird, so much activity.
This definitely was a learning experience, but I did find a way around this - I found a debug port that does not use a custom protocol and does update the StatServer whenever a bullet is shot, a player is killed etc.
Thus all I need to do is parse those strings and boom, it works.

I will bookmark this thread as it's super interesting and will definitely become useful for some other projects.

6v6gt:
@bperrybap.
Incidentally, the Saleae 44780 protocol analyzer, at least with its standard configuration, could not recognize the output of this library.

This likely indicates that they are not properly understanding how to handle LCD instructions, in particular function set instructions, and are out of nibble sync with the host.
I'm not surprised.
I'm assuming that, like many others, they have made incorrect assumptions about how the 3 instruction sequence used to get the host and the LCD into nibble sync really works.

I'll see if I can talk to them about it.
Ironically, the way many people handle processing of LCD instructions, particularly the function set instructions, is WAY more complicated than it should be and that is what creates problems.

I'm not so worried about the sniffer having problems with floating pins (I hope I have understood you correctly here). If the target application uses 4 bit mode (pins D7 to D4 only) and, during the initialisation sequence, puts the 44780 cleanly into 4 bit mode, then the sniffer would not look at the unused pins. However, if the raw data is presented to the user, I agree that the unused pins should be masked to avoid confusion.

I'm afraid you have not understood how the 3 instruction "initialization" sequence really works and how simple it really is.
It isn't actually an initialization sequence. The LCD does not need any initialization to process instructions.
It is simply a reliable way to get the host and the LCD in sync with each other with respect to 8 bit vs 4 bit mode and if in 4 bit mode into nibble synchronization.

You very much need to be worried about floating Arduino pins if doing LCD data pin sampling when only 4 Arduino pins are hooked up.
It is a requirement that unconnected LCD DB pins read low and THAT is what actually makes the 3 instruction sequence work.
And the sniffer must work accordingly.
i.e. if those other 4 Arduino pins were hooked up to the LCD DB0 to DB3 pins they would read low if those LCD pins were not connected to the host controlling the LCD.

There is a distinct difference between LCD 4 bit mode vs LCD 8 bit mode and sniffer 4 Arduino data pin sampling vs 8 Arduino data pin sampling.
It is possible to use only 4 Arduino pins for LCD data pin sampling if the host is using LCD 4 bit mode, but in order to do that, the sniffer must know that only 4 Arduino pins are being used instead of the full 8 Arduino pins.

From looking at the latest sniffer code it appears the code is still trying to detect an initialization state.
It isn't needed because there is no such thing.
The LCD is always able to process instructions, in either 8 bit mode or 4 bit mode.
The only state the LCD has is the nibble state when in 4 bit mode. (high nibble vs low nibble)
There is no such thing as an initialization state that involves the need for keeping track of the number of function sets or the order of them or anything else related to 4 bit vs 8 bit modes.

If you look at my comments in the hd44780.cpp code or the Wikipedia page you can see that ensuring DB0, DB1, DB2, DB3 are read as low internally by the LCD when not connected is vital to making LCD "initialization" / synchronization work.
This is true regardless of whether the host is using 8 bit or 4 bit mode.

In fact it is unfortunate that the 3 instruction sequence used was ever called "initialization".
Because it really isn't about initialization.
The real purpose is to provide a foolproof way for the host to get the LCD into 8 bit mode, from which it can then be put into the desired 8 bit mode or 4 bit mode.
The first two instructions reliably get the LCD into 8 bit mode.
The 3rd puts the LCD into the desired 8 bit mode or 4 bit mode.
But in order to do that, DB0, DB1, DB2, DB3 must read as low by the internal LCD controller when not connected.

The processing of LCD instructions is actually brain dead simple since it does not involve any LCD instruction state information.
There is no state information needed to process LCD instructions.

Think about how the LCD works.
It was originally based on really old technology and likely had h/w assist to latch the incoming instruction.
For example, I'm guessing that there is a h/w front end that will latch the instruction when E falls.
The h/w also has a 4 bit mode that can construct a byte from two nibbles.
The controller can then always process instructions a byte at a time.
It doesn't need any state information. It simply waits for a byte (instruction) to show up and then looks at the byte to see what to do.

The sniffer could (should, IMO) work similarly.
i.e. readLcdBus() performs the h/w front end function of the LCD.
It samples the LCD DB signals and queues up full 8 bit byte instructions that can be statelessly processed by code up in loop().

All readLcdBus() should do is queue up 8 bit bytes of each instruction it reads from the LCD DB pins.
If in 4 bit mode it would queue up a byte instruction once it has seen both nibbles.
It does not have to worry about queuing up "garbage" byte instructions should the host and the LCD be out of nibble sync if the host is using 4 bit mode.

Remember, the sniffer should be mimicking the way the LCD works so it is processing the same instructions as the LCD.

The code in loop() that processes instructions can be really simple as well since all it needs to do is process each 8 bit LCD instruction that was queued up.
If it is a command instruction, it processes the 8 bit byte exactly as it was queued.
There is no state tracking of any kind needed to process Function sets.
The code that processes LCD instructions also does not have to worry about host and the LCD being out of nibble sync.

This is what makes things so simple.
The low level just queues up the 8 bit byte instructions as it sees them.
The high level processes the 8 bit byte instructions exactly as they are queued.
There is no LCD instruction state information of any kind needed to process instructions.

To process the LCD instructions, pull the queued byte instruction from the queue.
If it is a Function set, and DL is high, it sets something to tell the low level queuing code it is in 8bit byte mode using 8 Arduino pin sampling mode.
If it is a Function set and DL is low, it sets something to tell the low level it is in 4 bit mode using 4 Arduino pin sampling mode, and it is now looking for the first / upper nibble of the byte.

The only tricky part is how to handle using only 4 Arduino pins vs 8 Arduino pins to read/sample the LCD DB pins when in 8 bit mode.
i.e. what does the low level queuing code need to do to properly handle things when only 4 Arduino pins are wired up to the LCD data pins?

If the sniffer wants to allow using only 4 Arduino pins, then it must be told that only 4 Arduino pins are being used, because the queuing code must ensure that the bits in the lower nibble are set to zero if the Arduino pins used for reading DB0, DB1, DB2, and DB3 are not actually hooked up to the LCD.
THIS IS CRITICAL.
If those other 4 Arduino pins were hooked up to the LCD DB0 to DB3 pins they would read low if those LCD pins were not connected to the host. So the sniffer must fake things out to ensure that the low nibble is all zeros, as that is what it would be if the other 4 Arduino pins were hooked to the LCD.
The 3 LCD instruction sequence used to get the host and the LCD into nibble synchronization depends on this.
Having those as zeros ensures that you don't accidentally see another function set command or options bits which would break the 3 instruction synchronization sequence.

It is all pretty simple with 8 bit byte instructions being queued and 8 bit byte instructions being processed.
There is no LCD instruction state information. Just bytes queued and bytes processed.

The only tricky part is how to support sniffing when using only 4 Arduino pins for data bus sampling.
And that can be handled using a conditional macro to tell the low level code in readLcdBus() to set the lower nibble of the byte to zeros when in 8 bit mode.
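One possible shape for that conditional macro is sketched below. The macro name `SNIFFER_FOUR_DATA_PINS` and the helper `busByteIn8BitMode()` are assumptions made for illustration; the real readLcdBus() samples pins directly, whereas here the sampled value is passed in as `raw` (DB7..DB0 in bits 7..0) so the masking logic can be shown on its own.

```cpp
#include <cstdint>

// Assumption: set to 1 when only DB4..DB7 are wired to the Arduino,
// 0 when all eight data pins are wired.
#define SNIFFER_FOUR_DATA_PINS 1

// Hypothetical helper, not the original readLcdBus().
// Returns the byte to queue for one E strobe while in 8 bit mode.
inline uint8_t busByteIn8BitMode(uint8_t raw) {
#if SNIFFER_FOUR_DATA_PINS
    // The unwired Arduino pins may float, so force DB3..DB0 to zero.
    return static_cast<uint8_t>(raw & 0xF0);
#else
    return raw;
#endif
}
```

With the macro set to 1, a sample such as 0x3B is queued as 0x30, so the low nibble can never contribute stray option bits.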

--- bill

There is a lot of information there. Thank you for that. But just to help my understanding, let's focus on the issue of looking at the stream of 4 bit nibbles and assembling a byte from each pair, and limit it to the case where D3 through to D0 on the LCD module remain unconnected.
This is as I see it. At some point, early in this stream, it is necessary to determine if the current nibble is a high order or a low order nibble, because we cannot assume that the system starts in a known state. Let's call that the synchronisation point. Once that has been determined, the task of assembling bytes out of matching pairs of nibbles is clearly trivial. This is a stateful activity because you cannot look at a single nibble in isolation and determine if it is a high order or a low order nibble.
The synchronisation point I have used is the end of the standard 3 instruction "initialization" sequence.

Can you then describe in a few words what failure I have demonstrated, if any, to understand the situation?

6v6gt:
There is a lot of information there. Thank you for that. But just to help my understanding, let's focus on the issue of looking at the stream of 4 bit nibbles and assembling a byte from each pair, and limit it to the case where D3 through to D0 on the LCD module remain unconnected.
This is as I see it. At some point, early in this stream, it is necessary to determine if the current nibble is a high order or a low order nibble, because we cannot assume that the system starts in a known state. Let's call that the synchronisation point. Once that has been determined, the task of assembling bytes out of matching pairs of nibbles is clearly trivial. This is a stateful activity because you cannot look at a single nibble in isolation and determine if it is a high order or a low order nibble.
The synchronisation point I have used is the end of the standard 3 instruction "initialization" sequence.

Can you then describe in a few words what failure I have demonstrated, if any, to understand the situation?

Right here:

At some point, early in this stream, it is necessary to determine if the current nibble is a high order or a low order nibble

This is incorrect.
I believe that this common belief comes from not fully understanding the purpose of the 3 function set sequence and how it works.
Again, there is no state information needed as there is no need to ever try to determine which nibble is high vs low.
The key takeaway is that 4 bit mode nibble synchronization is never determined, but rather nibble synchronization is established by simply blindly processing function set instructions.
i.e. the host and the LCD are guaranteed to be in nibble sync after that 3 function set sequence - no state information is ever needed since the sequence works even if the host and the LCD are in different modes when the sequence starts.

All LCD instructions are always processed 8 bits at a time.
If in 8 bit mode, DB0 - DB7 compose the 8 bit instruction.
If in 4 bit mode a high nibble is grabbed followed by a low nibble using nibbles from DB4-DB7.
When composing the 8 bit byte from nibbles, there is no concern if it is in nibble sync with the host nor is there ever any need to try to determine if the nibble being grabbed is really the high vs the low nibble with respect to the host.

The host gets into nibble sync with LCD using the 3 function set instruction sequence.
The host sends all 3 function set instructions as 8 bit mode instructions with no regard as to what mode the LCD may happen to be in and regardless of whether the host is controlling all 8 data pins or just 4.
In order for the sequence to work, LCD data pins that are not connected to the host must be read as low by the LCD. i.e. DB0 to DB3 must be read as low if not connected to the host.
This is critical.
If the host and the LCD/sniffer are in different modes (8 bit vs 4 bit, or 4 bit but out of nibble sync with each other), then after the first 2 function set instructions both will be in 8 bit mode.
The 3rd function set establishes the final running mode, 8 bit or 4 bit, and it too is always sent as an 8 bit mode instruction, even if the host is only connected to DB4-DB7.
If the final mode was 4 bit mode, then at that point, the host and the LCD/sniffer will be in nibble synchronization and the host can start using 4 bit mode.

All this can happen with no state information or any attempt by the LCD/sniffer to "guess" which nibble should be high or low.
When in 4 bit mode, 8 bit instructions are composed from two nibbles.
The LCD/sniffer simply processes 8 bit instructions.
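The claim that sync is established rather than determined can be checked with a small model. The sketch below is a simplified, hypothetical stand-in for the LCD/sniffer (`LcdModel` and `endsInSync()` are made-up names), and it assumes the host is wired to DB4-DB7 only and strobes 0x3 three times followed by 0x2, which is the 4 bit initialisation sequence as I read the HD44780 data sheet; how those strobes map onto the "function sets" counted in the text is my own interpretation. It also assumes DB0-DB3 read as zero, as the text requires.

```cpp
#include <cstdint>

// Hypothetical model of the LCD/sniffer side; names are illustrative only.
struct LcdModel {
    bool eightBit;   // current interface mode
    bool haveHigh;   // 4 bit mode: a high nibble is already latched
    uint8_t high;    // the latched high nibble

    // Only Function set (0 0 1 DL N F x x) matters for this experiment.
    void execute(uint8_t instr) {
        if ((instr & 0xE0) != 0x20) {
            return;                       // any other (possibly garbage) instruction
        }
        eightBit = (instr & 0x10) != 0;   // DL bit selects the interface width
        haveHigh = false;                 // next nibble, if any, is a high nibble
    }

    // Called once per E strobe with the value seen on DB7..DB4.
    void strobe(uint8_t nibble) {
        nibble &= 0x0F;
        if (eightBit) {
            // DB3..DB0 are assumed to read as zero when unconnected.
            execute(static_cast<uint8_t>(nibble << 4));
        } else if (!haveHigh) {
            high = nibble;
            haveHigh = true;
        } else {
            haveHigh = false;
            execute(static_cast<uint8_t>(((high & 0x0F) << 4) | nibble));
        }
    }
};

// Feed the assumed host sequence and report whether the model ends in
// 4 bit mode waiting for a high nibble, i.e. in nibble sync with the host.
bool endsInSync(LcdModel m) {
    const uint8_t sequence[] = {0x3, 0x3, 0x3, 0x2};
    for (uint8_t nibble : sequence) {
        m.strobe(nibble);
    }
    return !m.eightBit && !m.haveHigh;
}
```

Calling `endsInSync()` with the model started in 8 bit mode, in 4 bit mode waiting for a high nibble, or in 4 bit mode with any stale high nibble already latched should return true in every case, without the model ever inspecting which nibble it "really" has.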

Where it gets tricky is allowing the sniffer to only use 4 Arduino pins to sniff the LCD data pins since it still has to support 8 bit mode at least for a brief period until 4 bit mode is established.

If only using 4 Arduino pins and in 8 bit mode, the sniffer code needs to ensure that the unused Arduino pins always read zero or at least force the lower nibble of the queued up instruction to be zero.
This is what the LCD will see in 8 bit mode and if the Arduino pins are left floating, the sniffer will no longer be seeing the same thing as the LCD.
It is critical that these unconnected pins read as zero as the 3 function sequence takes advantage of that to get things back in sync.
If it isn't done by the sniffer, then under certain circumstances it could cause the 3 function set sequence to not be seen correctly by the sniffer (the LCD would see it correctly) as the sniffer can potentially see different instructions including additional and/or different function set instructions.

--- bill

Maybe I should have phrased it more explicitly like this, although I would have thought it was clear enough:

At some point, early in this stream, it is necessary [for the sniffer] to determine if the current nibble is a high order or a low order nibble, because we cannot assume that the system starts in a known state.

6v6gt:
Maybe I should have phrased it more explicitly like this, although I would have thought it was clear enough:

At some point, early in this stream, it is necessary [for the sniffer] to determine if the current nibble is a high order or a low order nibble, because we cannot assume that the system starts in a known state.

And again, as I said before, that is incorrect.
Yes, things can start out in an unknown state, and yes, the host can be in a different mode or the nibble order can be out of sync between the host and LCD/sniffer, but it does not matter.
Note that in my previous response I explicitly used "LCD/sniffer" in several places.
I don't know how to more clearly say this:

The key takeaway is that 4 bit mode nibble synchronization is never determined, but rather nibble synchronization is established by simply blindly processing function set instructions.

The sniffer and the LCD should work the same.
They both need to process LCD instructions coming from the host the same way.

If there is a belief that there is a need for states and some kind of logic to determine nibble sync (high vs low nibble), then the purpose of the 3 function set instructions and how they work is not yet understood.

There are no states, no counting, or a need for any logic to determine where the high nibble is.
The nibble synchronization "just happens" and happens automatically by simply blindly processing 8 bit LCD instructions.
The 8 bit instructions are either read directly as 8 bits or constructed from reading two nibbles with no concern if there is nibble synchronization.
And yes when things are out of nibble sync a garbage/random instruction will likely happen during the first two function sets, but that is ok as the 3 function set sequence is designed to let that happen. It is part of how 8 bit mode is guaranteed to be initially established from the first 2 function set instructions.

This is talked about on the Wikipedia page, and there is much more detail in the begin() function of the hd44780.cpp file in my hd44780 library, including text that describes what happens in each of the possible modes or nibble orders that the host and LCD can be in.

All of this directly applies to the sniffer since the sniffer must process and interpret LCD instructions the same way as the LCD.

--- bill

Hello, congratulations on your good work.

Have you made any further progress on the sniffer?

Thank you.