MAX7219 display errors

I am using a MAX7219 on my new Master Clock; see the forum post here Master Clock Mk2 - #2 by oliverb - Exhibition / Gallery - Arduino Forum and the clock site here http://home.btconnect.com/brettoliver1/Master_Clock_MK2/Master_Clock_MK2.htm .

The Clock and Displays run perfectly fine for a week or so then suddenly some LED segments show incorrect or corrupted digits. Some segments of the display carry on displaying the time but once a digit is wrong it stays wrong. The LCD display is always correct and never gets any errors.

I am wondering if it could be a hardware problem. Do the control wires from the Arduino to the MAX7219 need to be screened or run away from other wires in my clock?
At present the Arduino is mounted on a separate vero board with the control wires run in 0.5 solid copper wire to the MAX7219 chip, which is mounted under the LED display vero board.

The only non standard wiring is that I have used the MAX7219 auto dimming to control my LCD LED backlight. I have used the DP connection from the MAX7219 to switch a transistor to drive my LCD backlight. The DP is not connected to my 7 segment display at all.

This has happened 2 times now and always overnight. The clock has auto dimming so the display is on minimum overnight. It also has daytime motion detection so the display goes on during the day only when motion is detected.

I have tried MAX7219s from Ebay and from Rapid so it's not the chips themselves.

edit..... I have a 100µF and 0.1µF cap close to the 7219. The datasheet recommends 10µF but I have read 100µF is better.

Has anyone had this problem before or if you have any ideas it would be great to know.

Thanks.

Not a solution, but my Wordclock, which uses 2x MAX7219s to drive a 16x8 LED matrix, sometimes (about 2-3 times a year) has half the matrix crash/freeze as you describe. I wondered if it's power spikes, as I had seen it happen once while plugging a phone charger into another gang of the mains socket the clock is in. It does not happen often enough for me to go to the hassle of adding extra decoupling to see if it fixes it.

Thanks for the info Riva.

When your displays crash is there a way of resetting them while leaving the clock running? My clock drives other slave clocks so a power down would mean all other clocks would need correcting.

The chip does not have a reset pin/command so the only way would probably be to power cycle. It would be interesting to know if the chip's SPI interface still works when it locks up.
What decoupling do you currently have? I use a 10uF & 0.1uF shared between both chips but it should really have been per chip.
Could you control the MAX7219 power through an Arduino digital pin & a transistor and periodically reset it that way, and then bring it out of shutdown to continue displaying?

I have a 100µF and 0.1µF cap close to the 7219. The datasheet recommends 10µF but I have read 100µF is better.

I have read that the distance between 2 Max7219 data and clk pins should be less than 10cm. The distance between my Arduino and MAX7219 is probably 15cm. I may try extending the wiring of these 2 pins to see if I can force it to corrupt just to prove if it is the problem.

I don't have any spare pins on my Arduino so I won't be able to do an auto reset. I could add a manual power switch just to see what happens on a power cycle.

I will post back once I have tried some experiments.

I wouldn't think the chip is actually crashing.
My bet would be that the chip is receiving a corrupted message
(a message with incorrect bits set or cleared)
that is corrupting one of the upper registers, which could cause
all kinds of odd things, from changing the pixel decoding, to
turning the pixels off, to disabling columns, to changing the intensity, etc.
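
To illustrate how a single flipped bit could do this, here is a minimal sketch in plain C++ (it runs on a PC, no Arduino needed). It assumes the 16-bit serial format from the MAX7219 datasheet, with the register address in bits D11..D8 and the data in D7..D0. The helper names makeFrame, flipBit and registerName are made up for this example and are not part of any library.

```cpp
#include <cstdint>
#include <string>

// MAX7219 register addresses (datasheet register address map).
constexpr uint8_t REG_DIGIT0       = 0x01;  // digits 0..7 live at 0x01..0x08
constexpr uint8_t REG_DECODE_MODE  = 0x09;
constexpr uint8_t REG_INTENSITY    = 0x0A;
constexpr uint8_t REG_SCAN_LIMIT   = 0x0B;
constexpr uint8_t REG_SHUTDOWN     = 0x0C;
constexpr uint8_t REG_DISPLAY_TEST = 0x0F;

// Build the 16-bit word: address in D11..D8, data in D7..D0.
constexpr uint16_t makeFrame(uint8_t reg, uint8_t data) {
    return static_cast<uint16_t>(((reg & 0x0F) << 8) | data);
}

// Simulate a glitch that inverts one bit of the word on the wire.
constexpr uint16_t flipBit(uint16_t frame, unsigned bit) {
    return static_cast<uint16_t>(frame ^ (1u << bit));
}

// Name the register a given word would be written to.
std::string registerName(uint16_t frame) {
    const unsigned reg = (frame >> 8) & 0x0Fu;
    if (reg >= 0x01 && reg <= 0x08) return "digit " + std::to_string(reg - 1);
    switch (reg) {
        case 0x00:             return "no-op";
        case REG_DECODE_MODE:  return "decode mode";
        case REG_INTENSITY:    return "intensity";
        case REG_SCAN_LIMIT:   return "scan limit";
        case REG_SHUTDOWN:     return "shutdown";
        case REG_DISPLAY_TEST: return "display test";
        default:               return "unused";
    }
}

// Example: a write of segment pattern 0x7E to digit 0 is the word 0x017E.
// If bit 11 is flipped in transit it becomes 0x097E, which the chip treats
// as a write of 0x7E to the decode mode register instead.
```

The point is that a later digit write never repairs this: the digit registers get refreshed every update, but the decode mode, intensity, scan limit, shutdown and display test registers keep the bad value until something writes them again.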

This corruption could be caused by either s/w or h/w.
A h/w issue would be something like a glitch on the clock line due
to something like ground bounce on the power, or bit corruption
in the data bit, due to EMI etc..

A s/w issue could be due to the code or library creating an incorrect message
under certain specific conditions that rarely occur.
It could also be due to memory corruption, due to some subtle bug clobbering
a piece of memory that happens to affect a message sent to the 7219.

In some cases these types of corruption might be occurring more often than observed, since
they might clear themselves when the next message is sent, which might be
a very short period of time.

These types of issues can be hard to track down.


I recently found an issue in the Parola max7219 library.
The observed effect was that after a download, random LED matrix displays in a multi-display chain
might fail to function properly. Pixels on a given matrix display might appear random, or dimmer
than they should be.
Even a hard reset of the AVR would not clear it up, only a power cycle would clear it up.
Everyone was convinced it was a 7219 "crash" issue and there were lots of accusations
of it being an issue with "cheap" Chinese 7219 "fake" parts.
This ended up being a s/w issue in the library.
The s/w initialization of the 7219 was not properly setting all the necessary registers.
The issue would happen if there was a message in process when the AVR reset occurred.
That incomplete message would send a garbage message when the AVR reset and
library started up.
When that garbage message corrupted a register that was not being initialized,
one or more displays would fail to function properly.
For this case, I used a logic analyzer to catch the bogus commands being sent.
Luckily for me, the condition was very repeatable in a matter of seconds in my
setup which made it much easier to track down.
The solution was to initialize all the 7219 registers, so while the AVR reset would still
cause a register to get corrupted (which is unavoidable), the proper initialization sequence
would configure the 7219 to work properly every time.
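
As a rough illustration of "initialize all the registers" (this is not the actual Parola fix, just a sketch under assumptions), the plain C++ below builds one write for every MAX7219 configuration register. The register addresses are from the datasheet; the Frame struct and fullInitSequence name are hypothetical, and actually clocking the words out to the chip is left to whatever SPI or bit-bang routine the sketch already uses.

```cpp
#include <array>
#include <cstdint>

// One MAX7219 write: register address plus data byte.
struct Frame {
    uint8_t reg;
    uint8_t data;
};

// Configuration registers (datasheet register address map).
constexpr uint8_t kDecodeMode  = 0x09;
constexpr uint8_t kIntensity   = 0x0A;
constexpr uint8_t kScanLimit   = 0x0B;
constexpr uint8_t kShutdown    = 0x0C;
constexpr uint8_t kDisplayTest = 0x0F;

// Returns a write for EVERY configuration register, so none is left holding
// whatever a garbage message may have put there.
// digits: number of digits/rows in use (1..8); intensity: 0..15.
// Out-of-range arguments are clamped.
std::array<Frame, 5> fullInitSequence(uint8_t digits, uint8_t intensity) {
    if (digits < 1) digits = 1;
    if (digits > 8) digits = 8;
    if (intensity > 15) intensity = 15;
    return {{
        {kDisplayTest, 0x00},                            // display test off
        {kDecodeMode,  0x00},                            // no Code B decode
        {kScanLimit,   static_cast<uint8_t>(digits - 1)},// scan digits 0..n-1
        {kIntensity,   intensity},                       // brightness step
        {kShutdown,    0x01},                            // normal operation
    }};
}
```

After sending these, the digit registers (0x01 to 0x08) should be rewritten with the current display data as well, since a garbage message could have landed there too.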


Your situation will be more difficult to catch.
Is the AVR ever being reset? (like perhaps a watchdog kicking in?)

If you had a scope, it would be useful to look at the power and clock signals to see if there
were any issues.

You could reinitialize the 7219 which should clear up any issues, but that is
really just masking the issue.

--- bill

Hi bperrybap.
Thanks for the reply.

I have tried powering up and down the MAX7219 several times while it was running normally and it just restarted as normal.

I seem to remember the first time it happened I let the clock run into daytime mode where the clock displays go into standby mode when the PIR detects no movement. When the display was reactivated by movement (it is also reset) the display was still corrupted. Maybe the Arduino is at fault?

Can you explain more of what you mean by:

When the display was reactivated by movement (it is also reset) the display was still corrupted

Are you implying that the display was corrupted prior to the display being reactivated? and then
stayed corrupted?

From what I have seen working with LED matrices (not actually 7 segment displays),
corrupted displays are due to corruption of 7219 registers.
And that comes from either glitches on the data or clock lines, or s/w screwing up and
not sending the proper bit stream.

Arduino at fault?
In what way?

What code is being used to drive the 7219?
Is it a library or your own code?

Are you implying that the display was corrupted prior to the display being reactivated? and then
stayed corrupted?

Yes that's correct. A PIR shuts down the display during the day and starts it up when movement is detected. Part of the startup is
"lc.clearDisplay(0);" .
When it was reactivated the display stayed corrupted.

I am using the "LedControl" library to drive the display.

Arduino at fault?
In what way?

I was thinking if the display is being reset and yet is still corrupted maybe the source is wrong?

My full test code can be found at the bottom of the page here http://home.btconnect.com/brettoliver1/Master_Clock_MK2/Master_Clock_MK2.htm

oliverb:

Are you implying that the display was corrupted prior to the display being reactivated? and then
stayed corrupted?

Yes that's correct. A PIR shuts down the display during the day and starts it up when movement is detected. Part of the startup is
"lc.clearDisplay(0);" .
When it was reactivated the display stayed corrupted.

So you are saying you see a corrupted display before the display is "shutdown", then it remains
corrupted when it is turned back on again?

I am using the "LedControl" library to drive the display.

Arduino at fault?
In what way?

I was thinking if the display is being reset and yet is still corrupted maybe the source is wrong?

I don't understand. There isn't any sort of reset that is happening on the 7219 chip or the LED display.
note: to me "reset" implies a h/w function that restores everything to a known state.

My full test code can be found at the bottom of the page here http://home.btconnect.com/brettoliver1/Master_Clock_MK2/Master_Clock_MK2.htm

From a s/w perspective:
There isn't anything odd looking in the sketch code other than maybe the code
that constantly sets the scan limit, and clears the display before updating it.
and then setting the data for the two digits that are not being scanned.
It doesn't seem necessary but shouldn't hurt.

Assuming you are using IDE 1.x or later, the digitalRead()/digitalWrite() calls in the sample routine
should be ISR safe.
There isn't any buffer management type stuff in the sketch.

The only other thing might be to check the RAM usage to see how close to the limit
you are getting, particularly when all the debug stuff is turned on.

From a h/w perspective:
A scope would be useful to look at the power, ground, data, and clock signals.
You might try leaving the LCD & LCD backlight on, to see if that transitioning is creating a ground
bounce condition.

I think this may be a tough one to track down.

BTW, I'm just about to start bringing up a word clock.
So I'll be looking at the 7219 stability in great detail over the next few weeks.

I'm tracking time differently.
I'm using the Time library for main clock functions to keep the unix epoch time offset.
That time is synced by a DS3231.
The TimeZone library is used to correct for time zone and potentially DST corrections.
Then a WWVB receiver will sync the DS3231 and will be used for US daylight savings
corrections.

--- bill

There isn't anything odd looking in the sketch code other than maybe the code
that constantly sets the scan limit, and clears the display before updating it.
and then setting the data for the two digits that are not being scanned.
It doesn't seem necessary but shouldn't hurt.

That's an error on my part. I must change that !

The only other thing might be to check the RAM usage to see how close to the limit
you are getting, particularly when all the debug stuff is turned on.

I am running very close to the max sketch size. My sketch is 28418 bytes; a few bytes more and it won't compile. Is this normal or a bug?

From a h/w perspective:
A scope would be useful to look at the power, ground, data, and clock signals.
You might try leaving the LCD & LCD backlight on, to see if that transitioning is creating a ground
bounce condition.

Maybe stopping the display from resetting every second will help. I also dim the LCD using the LED DP O/P from the MAX7219 via a transistor so this is also reset every second.

BTW, I'm just about to start bringing up a word clock.
So I'll be looking at the 7219 stability in great detail over the next few weeks.

Would be very interested to read about your word clock I have one planned for the future.

I'm tracking time differently.
I'm using the Time library for main clock functions to keep the unix epoch time offset.
That time is synced by a DS3231.
The TimeZone library is used to correct for time zone and potentially DST corrections.
Then a WWVB receiver will sync the DS3231 and will be used for US daylight savings
corrections.

I used the Timezone library in my previous clock along with the DCF77 library. The DCF77 library in this clock has time zones built in, although they are not so comprehensive. It has other timekeeping advantages for Master Clocks, though: because it predicts what the DCF77 code should be, it does not drift when signal reception is very bad. Instead of using an RTC, it auto-adjusts the quartz crystal to the DCF77 signal to get within 1 Hz after a few days. I am trialling it now and after just 1 day my crystal is just 4 Hz out.

There is 1 other hardware thing that may or may not be causing my problem. My 1" LED displays use 2 LEDs per segment and have a voltage drop of 4.4 V to 5 V. I have my PSU set to 5 V; I could try tweaking it down just to see if the display errors.

oliverb:

The only other thing might be to check the RAM usage to see how close to the limit
you are getting, particularly when all the debug stuff is turned on.

I am running very close to the max sketch size. My sketch is 28418 bytes; a few bytes more and it won't compile. Is this normal or a bug?

The 28418 is the code size, i.e. the amount of flash used.
I'm referring to RAM size.
The m328 only has 2k of RAM.
Variables, string literals, and the stack all consume RAM.
If you overflow RAM, weird things happen as the memory gets corrupted.

From a h/w perspective:
A scope would be useful to look at the power, ground, data, and clock signals.
You might try leaving the LCD & LCD backlight on, to see if that transitioning is creating a ground
bounce condition.

Maybe stopping the display from resetting every second will help. I also dim the LCD using the LED DP O/P from the MAX7219 via a transistor so this is also reset every second.

RESET? still not sure what you mean by "reset".
I don't see anything in the code that could cause a reset.
hmm.. on the dimming of the LCD backlight.
I saw what looked like a wire going to the backlight, now I see what that is for;
however, the actual wiring in the photo doesn't seem to match the schematic.
I'm assuming that the photos are not of the final working project.
For example, the SDA and SCK LCD pins are not hooked up.
I would be concerned about ground loops on the LCD backlight cathode,
but maybe you cut the PCB trace on the LCD backpack between the backlight cathode
and the backpack on board transistor?

There is 1 other hardware thing that may or may not be causing my problem. My 1" LED displays use 2 LEDs per segment and have a voltage drop of 4.4 V to 5 V. I have my PSU set to 5 V; I could try tweaking it down just to see if the display errors.

Is the 7219 getting hot?
What are you using for iset?
I don't see anything hooked up to pin 18 of the max7219 on the schematic
although it looks like there is something in the photo.

Also noticed that the code is using lcd.backlight() and lcd.noBacklight() to control the backlight
but the h/w also seems to have a 7219 wired up to control the backlight cathode.

So it would seem that there are two ways to turn on the backlight?

Those ground signals may be fighting each other if the grounds are not equal
due to a ground loop.

You could check them with a meter.

--- bill

the 28418 is code size or the amount of flash used.
I'm referring to RAM size.
The m328 only has 2k of RAM.
Variables, string literals, stack, all consume RAM.
If you overflow RAM weird things happen as the memory gets corrupted

How do I check how much RAM is being used?

RESET? still not sure what you mean by "reset".
I don't see anything in the code that could cause a reset.
hmm.. on the dimming of the LCD backlight.
I saw what looked like a wire going to the backlight, now I see what that is for;
however, the actual wiring in the photo doesn't seem to match the schematic.
I'm assuming that photos are not of final working project
For example, The SDA and SCK LCD pins are not hooked up.

I would be concerned about ground loops on the LCD backlight cathode,
but maybe you cut the PCB trace on the LCD backpack between the backlight cathode
and the backpack on board transistor?

By "reset" I mean resetting the display with lc.clearDisplay(0), although it probably just blanks the display.

The photos will not show all wiring as it follows the work in progress on my clock. There is a photo showing all the boards connected including the SDA and SCK pins.

Yes I have cut the pin from the backpack to the LCD display board and I should have left out lcd.backlight() and lcd.noBacklight() commands as the MAX7219 controls the backlight LED via the transistor. I will change the code.

Is the 7219 getting hot?
What are you using for iset?
I don't see anything hooked up to pin 18 of the max7219 on the schematic
although it looks like there is something in the photo.

The 7219 does not even get warm.
I have missed pin 18 connections off the schematic. I have checked my veroboard layout program and it says 20K for Iset but I will have to confirm this from my circuit.

Thanks for your help so far. I will mod the code Tues am as I am still testing out the quartz auto tune function.

oliverb:
How do I check how much RAM is being used?

The easiest way to get an idea is to build the sketch with IDE 1.5.x.
While it isn't the total RAM used when running, it
will tell you how much RAM all your data is using.
It reports it when the build completes.

I have removed the unneeded looped code lc.clearDisplay(0) etc and after a few days the corrupted display is back.

I have now removed the long Clk, Data In and Load wires and added much shorter plug in wires from the Atmega 328 to the MAX7219. I have also added a 100nF capacitor across the Atmega328 power pins. I had 3 already on the nearby 555 timers and MAX7219 but it won't hurt to add another.

I can now unplug wires to the MAX7219 for testing as needed.

Is there some way of hard resetting the MAX7219?
By disconnecting and reconnecting the wires to the MAX7219 I can sometimes get the display to corrupt. Once corrupted I can't get it back. If I remove the power and all data wires and leave it for a while when I reconnect them I can get most of the display back correctly bar the hours.

I will leave it running for a week or so to see how stable the display is with these mods.

oliverb:
Is there some way of hard resetting the MAX7219?
No.
In my dealings with it, which haven't yet been over prolonged periods, I've never seen a 7219 issue.
The issues I've seen have all been external to the 7219,
like not fully initializing the part, data corruption sending bad commands to the part,
and clock line glitches that corrupt internal registers.
To "reset" it, you should be able to just re-initialize all the configuration registers.
The 7219 libraries only initialize the registers once at startup. So if there is ever corruption of one
of those registers, it never clears up until the next power cycle, which restarts
the AVR, which then calls setup(), which calls the library, which then re-initializes
the configuration registers. The power cycle had nothing to do with clearing up the
problem, but is often given the credit.
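
A minimal sketch of what periodic re-initialization could look like, written as plain C++ so it can be tried on a PC. It assumes a display object with LedControl-style shutdown(), setScanLimit() and setIntensity() calls; reapplyConfig, refreshDue and RecordingDisplay are hypothetical names for this example. As far as I know, LedControl only writes the decode mode and display test registers in its constructor and does not expose them publicly, so this would not repair those two registers.

```cpp
#include <cstdint>
#include <vector>

// Re-send the settings a sketch normally sends only once in setup().
// Display is any type with LedControl-style member functions.
template <typename Display>
void reapplyConfig(Display& lc, int device, int scanLimit, int intensity) {
    lc.shutdown(device, false);           // false = normal operation
    lc.setScanLimit(device, scanLimit);   // 0..7
    lc.setIntensity(device, intensity);   // 0..15
}

// True once intervalMs has elapsed since lastMs. Unsigned subtraction keeps
// this correct when a millis()-style counter wraps around.
bool refreshDue(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(nowMs - lastMs) >= intervalMs;
}

// Stand-in display for testing off-target: records which register each call
// would write (address in the high nibble, value in the low byte).
struct RecordingDisplay {
    std::vector<int> calls;
    void shutdown(int, bool off)      { calls.push_back(off ? 0xC00 : 0xC01); }
    void setScanLimit(int, int limit) { calls.push_back(0xB00 | limit); }
    void setIntensity(int, int level) { calls.push_back(0xA00 | level); }
};

// On the Arduino, loop() could do something like (values are placeholders):
//   if (refreshDue(millis(), lastRefresh, 60000UL)) {
//       reapplyConfig(lc, 0, 5, currentIntensity);
//       lastRefresh = millis();
//   }
// followed by the normal redraw of every digit.
```

This needs no spare pins, but it only hides the symptom; it does not say where the corruption comes from.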

By disconnecting and reconnecting the wires to the MAX7219 I can sometimes get the display to corrupt. Once corrupted I can't get it back. If I remove the power and all data wires and leave it for a while when I reconnect them I can get most of the display back correctly bar the hours.

But you can't just yank the power and data/control connections and then re-connect them to
the running AVR that is continually talking to the part.
The part should be fully re-initialized to ensure that all the
configuration registers are properly configured.
It would be more interesting to see if the corruption goes away when you hard reset
the AVR after the 7219 display output is corrupted.
Does it clear up then?

Which exact library are you using? Give me a link
and I'll look at the code to see if there is anything odd looking or missing
in the initialization sequence.

Thanks bperrybap
I use the LedControl library described here: Arduino Playground - LedControl

downloaded from here: Releases · wayoda/LedControl · GitHub

How do I carry out a hard reset?

When I reset the Arduino using the reset button the display returns to normal.

Edit: Just measured the voltage from the Arduino Gnd to the MAX7219 Gnd and it is 10 mV.
Edit 2: Ran a new pair of 5 V and Gnd wires. The Arduino Gnd to MAX7219 Gnd voltage is now 1.8 mV.

A quick update on my clock since replacing the 5v and Gnd wires and using much shorter Clk, Data In and Load wires.

The clock has been running now for 2 weeks without any display problems.

I will reply back in a few months with an update.

Thanks everyone for your comments/help so far.

It's been a couple of months now and my clock is running perfectly.

DCF Master Clock by Brett Oliver, on Flickr

Thanks everyone for your help.

I have made a web page for the clock if anyone is interested http://home.btconnect.com/brettoliver1/Master_Clock_MK2/Master_Clock_MK2.htm