
Topic: Occasional error reading from AM2130/IDT7132 Dual Port SRAM

monte_carlo_ecm

Hi Arduino enthusiasts,

I imagine my problem/question will be difficult to comment on without poring over my wiring and code, but in case there is something fundamental I am missing, I thought I would ask.  I am very comfortable with programming, having written code for the better part of 30 years, but electronics is not my strong suit at all.

First I should summarize my project.  I am using my Arduino Mega2560 R3 to read from one port of a Dual Port SRAM chip - in particular an AM2130 (I have tried an IDT7132 with the same result).  The other port is written to by an old 6801 based car computer at the command of an interrupt triggered by the Arduino - I modified the car computer code to add the interrupt routine.  The interrupt routine simply copies the car computer RAM to the SRAM chip so the Arduino can read the data and process it further.  The car computer adds a checksum byte and a request counter to the end of the RAM values it copies so the Arduino can test the data for validity.  Here's the data sheet for the SRAM chip:

AM2130 Datasheet

5 times a second, the Arduino flips a pin connected to the IRQ of the car computer, waits a suitable amount of time for the data to be copied by the car computer (10us - I've tried longer values, doesn't help my issue), then uses an adaptation of this sketch to read the values back (224 bytes total) from the SRAM:

eeprom_read.pde

Let me refer to the 224 bytes as a Packet.  My issue is that about 0.5% of my packets contain at least one "bad" byte.  I know it must be the Arduino side of the equation, because I wrote a retry routine and sure enough, second pass the Arduino gets the correct value.  Here's an example of my Serial output:


Error - Checksum mismatch!
Checksum error - 3C vs 7C
Retrying and dumping differences:
0 : EE/EE 1/1 76/76 80/80 80/80 66/66 66/66 0/0 0/0 0/0 1/1 CD/CD 4/4 74/74 DE/DE 0/0
10 : 41/41 C0/C0 91/91 1E/1E 2/2 0/0 60/60 0/0 0/0 5F/5F 3F/3F FD/FD 2/2 6D/6D 3F/3F A2/A2
20 : 83/83 59/59 8B/8B 8E/8E 8E/8E 0/0 ED/ED 1/1 14/14 88/88 8E/8E 4B/4B 4B/4B 20/20 7D/7D 47/47
30 : 4B/4B 4B/4B 4B/4B 4B/4B 8C/8C 96/D6 96/96 76/76 86/86 ED/ED 2F/2F 14/14 14/14 0/0 14/14 9D/9D
40 : 46/46 85/85 42/42 0/0 2D/2D 2A/2A 3D/3D AE/AE 4E/4E 63/63 1A/1A D6/D6 55/55 1/1 45/45 0/0
50 : 15/15 83/83 6E/6E 30/30 0/0 6E/6E 1/1 BE/BE 0/0 3/3 0/0 0/0 0/0 0/0 2/2 6D/6D
60 : 0/0 5/5 1/1 43/43 50/50 0/0 39/39 0/0 3F/3F 0/0 0/0 50/50 0/0 39/39 3D/3D FF/FF
70 : 58/58 0/0 0/0 50/50 0/0 0/0 3D/3D 0/0 5/5 FF/FF 58/58 1/1 43/43 0/0 0/0 3/3
80 : 2D/2D 3/3 C/C 0/0 32/32 0/0 0/0 A/A 0/0 0/0 0/0 18/18 0/0 0/0 44/44 6/6
90 : 0/0 0/0 B2/B2 A5/A5 80/80 80/80 40/40 9C/9C 4D/4D 4D/4D 40/40 40/40 40/40 40/40 40/40 40/40
A0 : 40/40 40/40 9A/9A EF/EF 9A/9A 5/5 0/0 2/2 4/4 0/0 0/0 9C/9C 98/98 0/0 29/29 91/91
B0 : 20/20 FF/FF FF/FF FF/FF 0/0 0/0 0/0 88/88 0/0 0/0 0/0 FF/FF 0/0 2/2 0/0 0/0
C0 : 3/3 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0
D0 : 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 C4/C4 43/43 6D/6D 3C/3C
E0 : EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE
F0 : EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE
Differences at bytes: 35


This dump shows each byte read in the first (error) pass, followed by a slash, followed by the byte read in the retry pass.  If you look at the byte at offset 0x35, the two reads differ by 0x40 (96 vs D6), which is exactly the difference between the two checksums (3C vs 7C), so the second pass got the right answer.  My request counter is at offset 0xDD/0xDE (16 bits so it doesn't wrap around too fast).  It didn't change between the two reads the Arduino made from the SRAM, so I know that the car computer didn't touch the data due to the interrupt line floating around or something else strange.

What I find really confusing is that, even though 99.5% of the packets are clean, when it does error out, it's just as frequent to have many bad bytes as just one.  Here's another packet with lots of differences:


Checksum error - 78 vs FC
Retrying and dumping differences:
0 : EE/EE 1/1 0/0 80/80 80/80 66/66 66/66 0/0 0/0 0/0 1/1 CD/CD 0/0 60/60 DE/DE 0/0
10 : 41/41 60/60 45/45 2E/2E 2/2 0/0 20/20 0/0 0/0 49/48 34/34 D1/D0 2/2 EF/E9 34/34 0/1
20 : 78/78 D8/5C 8B/8B 9B/9B 9B/9B 0/0 ED/ED 1/1 14/14 95/95 9B/9B 2E/2E 2E/2E 2E/2E 7D/7D 46/46
30 : 2E/2E 2E/2E 2E/2E 2E/2E 89/89 DE/DE DE/DE 37/37 47/47 ED/ED 2F/2F D7/D6 D8/D5 0/2 0/86 C6/C6
40 : 61/61 D7/D7 9D/A7 0/1F 92/54 8D/CC BB/92 3C/8D F5/BB 84/3C 7/7 99/99 18/18 0/0 66/66 0/0
50 : 0/26 34/78 0/36 80/30 0/0 0/CA 0/8 0/9A 0/0 0/2 0/0 0/0 0/0 0/0 2/2 ED/F2
60 : 0/0 8/4 1/1 50/4A 50/50 0/0 39/39 0/0 3F/3F 0/0 0/0 50/50 0/0 39/39 3D/3D FF/FF
70 : 3D/3D 0/0 0/0 50/50 0/0 0/0 3D/3D 0/0 5/5 FF/FF 3D/3D 1/1 50/4A 0/0 0/0 3/3
80 : 2D/2D 3/3 E/E 0/0 32/32 0/0 0/0 A/A 0/0 0/0 0/0 18/18 0/0 0/0 3/3 6/6
90 : 0/0 0/0 67/67 80/80 80/80 80/80 40/40 4D/4D 4D/4D 4D/4D 40/40 40/40 40/40 40/40 40/40 40/40
A0 : 40/40 40/40 81/81 0/0 81/81 1/1 0/0 0/0 4/4 0/0 0/0 80/80 0/0 0/0 0/0 2F/2F
B0 : 20/20 FF/FF 0/0 0/0 1E/1E 0/0 0/0 95/95 0/0 9/9 B4/B4 0/0 0/0 2/2 0/0 0/0
C0 : 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0
D0 : 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 B/B A5/A5 78/78
E0 : EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE
F0 : EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE
Differences at bytes: 19 1B 1D 1F 21 3B 3C 3D 3E 42 43 44 45 46 47 48 49 50 51 52 53 55 56 57 59 5F 61 63 7C


So 29 bytes different between the reads there.  What's funky about this one is that many, but not all, of the error bytes are 0, not just a single bad bit but the entire byte is coming back low.  Regardless of how "severe" the error is in terms of number of bytes trashed, a fraction of a second later it goes back to getting clean packets for a while until the next error.   :o
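The "Differences at bytes" count and the observation about bytes coming back as 0 can be reproduced with a small helper like the one below. It is a sketch only: `DiffSummary` and `summarizeDifferences` are hypothetical names, and the buckets (read as zero, single flipped bit) are just one way to slice the errors. The two buckets can overlap, for example a 0/1 pair counts in both.

```cpp
#include <stddef.h>
#include <stdint.h>

struct DiffSummary {
  size_t total;       // bytes that differ between the two reads
  size_t readAsZero;  // differing bytes that were 0x00 in the first read
  size_t singleBit;   // differing bytes where exactly one bit flipped
};

// Compare the first (error) read against the retry read, byte by byte.
DiffSummary summarizeDifferences(const uint8_t* first, const uint8_t* retry,
                                 size_t len) {
  DiffSummary s = {0, 0, 0};
  for (size_t i = 0; i < len; ++i) {
    uint8_t x = (uint8_t)(first[i] ^ retry[i]);  // set bits mark flipped bits
    if (x == 0) continue;                        // byte matched
    ++s.total;
    if (first[i] == 0) ++s.readAsZero;
    if ((x & (x - 1)) == 0) ++s.singleBit;       // power of two: one bit set
  }
  return s;
}
```

Run over the first dump, this would report a single difference (offset 0x35, 96 vs D6), which is one flipped bit.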

I can and will code around the issue by enhancing this retry logic to just try reading a few times before giving up on a particular iteration.  That said, there is still a measurable chance that an 8-bit checksum is "fooled" by two or more offsetting bad bytes so I'm still wishing I could find a more robust answer.
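One way to make the retry logic lean less on the 8-bit checksum is to accept a packet only when two consecutive reads are byte-for-byte identical and the checksum also passes. The sketch below is a rough illustration under assumptions: `ReadPacketFn` (fills a 224-byte buffer from the SRAM) and `ChecksumFn` are hypothetical callbacks, not part of the original sketch. It would not catch the same wrong value being read twice in a row.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

const size_t kPacketLen = 224;

// Hypothetical callbacks: one reads a full packet from the SRAM into dest,
// the other applies whatever checksum test the packet format uses.
typedef void (*ReadPacketFn)(uint8_t* dest);
typedef bool (*ChecksumFn)(const uint8_t* packet);

// Returns true, with the packet left in `out`, once two consecutive reads
// match exactly and the checksum passes. Gives up after maxReads reads.
bool readStablePacket(ReadPacketFn readPacket, ChecksumFn checksumOk,
                      uint8_t* out, int maxReads) {
  uint8_t previous[kPacketLen];
  bool havePrevious = false;
  for (int n = 0; n < maxReads; ++n) {
    readPacket(out);
    if (havePrevious && memcmp(out, previous, kPacketLen) == 0 &&
        checksumOk(out)) {
      return true;
    }
    memcpy(previous, out, kPacketLen);  // remember this read for the next pass
    havePrevious = true;
  }
  return false;
}
```

With a 5 Hz request rate there is plenty of time for three or four reads per request, so a small `maxReads` should be enough before skipping that iteration.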

I used oshpark.com to make a really nice circuit board for this interface.  I'm using ribbon cables with crimped on connectors (like the old style "IDE hard drive cables").  It looks like this (with the IDT7132 chip installed rather than the AM2130, but like I said, both behave the same):



I was careful to use pull-down resistors on both ports for the address lines I don't use (I'm only addressing the lower 256 bytes of the SRAM chip), so the extra address lines aren't floating around.  So I can't see anything sketchy about my wiring or design, it's all put together quite solidly compared to my breadboard version which had the same issue (I thought soldering it all down would help - Argh!)

The one piece that's a little suspicious is that I had to make the ribbon cable that goes between the Arduino and SRAM board about 2.5 feet long because the Arduino can't live very close to the car computer just due to space issues.  I figured that the Arduino isn't going very fast though relatively speaking, and comparing this to running an old IDE hard drive with the same style cable, I remember cables of that length working fine in those applications (did they have to cope with high amounts of error checking/correction in those applications too?)  My breadboard version used a much shorter cable when I had the car computer, breadboard, and Arduino laid side-by-side on the car floor while I was designing this.  I had the same problem then with the occasional errors as I do now.

I guess that sums it up.  Thanks for reading, and any thoughts you might have!


MarkT

Do you have ceramic decoupling capacitor(s), not just electrolytic?

monte_carlo_ecm

Hi MarkT,

Thanks for your response.

The only capacitor I have employed is at the 5V Vcc power for the AM2130 chip (you can see it there in the photo above).  If you have a look at the picture I posted above - I run a ribbon cable straight from the shorter of the two header blocks to the Arduino Mega digital pins 22 through 45 - the ones at the "back end" of the Mega.  Traces on that board shown go straight to the AM2130 address/data/select pins.


I know just enough about electronics to be dangerous - almost nothing really.  Can you give me a few hints on where/why/how I would use capacitors to solve the issue?

Thanks again!


outofoptions

Parallel communication is susceptible to losing data, which is why hard drives now use serial.  If all bits don't arrive in the allotted time it doesn't work.  The reliability of serial ends up making it faster.  To lose a whole byte though I'd suspect loss of clock so the byte failed to 'latch'.  I doubt this is happening but given the intermittent nature I thought I'd muddy the waters with it. ;)

monte_carlo_ecm

Hi outofoptions,

Thanks for the thoughts.  I'm hoping the problem isn't as complex as what you are aiming at.  The communication in my application is essentially serial.  It's a simple archaic address bus I'm working with, nothing fancy.  The SRAM chip essentially behaves like an old style EPROM, like say a 27C512; all address bits are requested independently by a separate pin, and one data bit is fetched at a time from separate pins.

I'm really hoping to hear more about MarkT's thoughts.  I looked up some info about decoupling capacitors and something smells right about that idea, I just don't know enough about electronics to know how to apply the concept.  Here's why I think he's on to something there:

One thing I forgot to mention in my original question/post:

I can turn the car ignition on but not start the car and the Arduino will consume data error free from the SRAM chip indefinitely.  The data is very boring without the engine started, but it's data, and never corrupt with the engine off.  The issue only occurs with the engine running.

So with that tidbit of additional info... Help?  ???

Paul__B

I can turn the car ignition on but not start the car and the Arduino will consume data error free from the SRAM chip indefinitely.  The data is very boring without the engine started, but it's data, and never corrupt with the engine off.  The issue only occurs with the engine running.
Well, that rather says it all, doesn't it?

monte_carlo_ecm

Jan 23, 2016, 08:49 pm Last Edit: Jan 24, 2016, 05:27 am by monte_carlo_ecm
Hi Paul__B,

Probably that says it all, yeah.  I regret not mentioning that detail in the original post, it wasn't until the questions started coming about decoupling capacitors that my train of thought shifted to it.  I've been a software guy all my life, hardware and electronics is not a strong part of my background.  My brain just wanted to park that detail as irrelevant because the car computer isn't doing very much without the engine running, much of the data is 0's and hardly anything changes over time.  But I can see now that's just my software based perspective being wrong.

Engine noise certainly seems like a reasonable suspect.  We're not at all talking about a quiet/modern engine here.  It's a heavy old Chevy 305 8 cylinder 5 liter with an aftermarket ignition module and medium camshaft - so muscle car stuff basically.

So all that said, I'd still be super grateful for some hints/details on what capacitors to choose and where to locate them.  If the location answer is right near the Vcc for the SRAM chip - do I even need that 10uF electrolytic capacitor, or can I reuse that spot for a different capacitor or perhaps pair of capacitors?  I had no particular reason to choose the 10uF electrolytic; I just read something that says capacitors should be used at the Vcc of all integrated circuits, so I stuck it there, kind of randomly chosen.

Just so everything is out on the table - I power the SRAM chip from the +5V coming out of the port on the car computer; these old GM car computers had unused edge connectors on them (probably intended for factory diagnostics) that expose the entire address/data bus, interrupts, etc. and of course a few +5V and ground pins, so that's what I'm interfacing with.  I built one of these (image below) to power the Arduino Mega (I replaced 7805 with 7809 to satisfy the Mega's need for 7-12V, all other components are the same as in the schematic) taking the input power for that circuit from a spare ignition power on the car fuse box:



The fuse box is on the left side of the car and the car computer/Arduino on the right, so I had to run a few feet of cable from the fuse box to this power rectifier circuit, which I hide behind the radio in the middle of the car, then another few feet to run from the power rectifier output to the Arduino.

I'm hoping I can find a way to fix this issue without redesigning my SRAM circuit board, but I shouldn't get ahead of myself here...  Anyone willing to help me select component(s) and location(s) to hopefully smooth out the noise issues?

Thanks for everything so far folks!

monte_carlo_ecm

Anyone able/willing to help me take at least a decent guess at a value and location for a decoupling capacitor that might help my noise issues with this circuit?

I'm about ready to hook up my Saleae Logic Analyzer to try to capture the analog signal on a glitched data line, but even if I do manage to catch it in the act and find the point in time, I'll still not have enough electrical knowledge to apply a fix.

Thanks!


monte_carlo_ecm

I'm trying to collect as much data as possible here in hopes that someone with some background in electronics will be enticed to help me close up this glitching issue likely coming from engine noise.

It turns out that the Saleae knock off that I bought from ebay some time ago and never tried is not the analog capable version, just the old Logic 16 digital only version.  Regardless, I figured I'd give it a go and try it out.  I hooked it up at the SRAM chip, put in a "trigger" from the Arduino so I could find errored packets and went for a test drive.

Here's an errored packet:


Error - Checksum mismatch!
Checksum error - 9C vs 30
Retrying and dumping differences:
0 : EE/EE 6/6 9A/9A A2/A2 80/80 AD/AD 66/66 0/0 0/0 0/0 2/2 36/36 4/4 70/70 DE/DE 0/0
10 : 41/41 86/86 81/81 3E/3E 2/2 0/0 62/62 0/0 0/0 57/57 3C/3C EE/EF 2/2 94/93 3C/3C 0/0
20 : 78/78 A3/2A 8C/8C 9E/9E 9E/9E 0/0 F0/F0 1/1 17/17 95/95 9E/9E 4B/49 4B/49 20/20 7D/7D 47/46
30 : 4B/4B 4B/49 49/49 49/49 8C/8C 67/67 67/67 9A/9A AA/AA EE/EE 2C/2C 35/35 1C/1E 0/0 5B/3F AE/AE
40 : AF/AF C1/C1 C5/C5 4/1 FF/FF EF/EF 12/12 10/10 23/23 29/29 19/19 52/52 50/50 3/3 E2/E2 0/80
50 : 16/FF 78/2A 8D/52 30/20 0/0 73/C7 0/0 B6/BE 0/0 2/3 0/0 0/0 0/0 0/0 2/2 94/92
60 : 0/0 1/3 1/1 41/43 50/50 0/0 39/39 0/0 3F/3F 0/0 0/0 50/50 0/0 39/39 3D/3D FF/FF
70 : 51/51 0/0 0/0 50/50 0/0 0/0 3D/3D 0/0 5/5 FF/FF 51/51 1/1 41/43 0/0 0/0 6/6
80 : 9B/9B 6/6 1C/1C 0/0 32/32 0/0 0/0 A/A 0/0 0/0 0/0 18/18 0/0 0/0 D2/D2 B9/B9
90 : 0/0 0/0 BF/BF BF/BF 80/80 80/80 9C/9C AC/AC 4D/4D 4D/4D 77/77 7C/7C 40/40 40/40 77/77 77/77
A0 : 40/40 40/40 B6/B6 2F/2F B6/B6 5/5 0/0 2/2 2/2 AD/AD 17/17 AC/AC 66/66 0/0 29/29 91/91
B0 : 20/20 FF/FF FD/FD FF/FF F/F 0/0 0/0 95/95 0/0 1/1 0/0 FF/FF 0/0 2/2 0/0 1/1
C0 : 5/5 C9/C9 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0
D0 : 0/0 0/0 0/0 0/0 0/0 0/0 1/1 0/0 0/0 0/0 0/0 0/0 0/0 75/75 F2/F2 9C/9C
E0 : EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE
F0 : EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE EE/EE
Differences at bytes: 1B 1D 21 2B 2C 2F 31 3C 3E 43 4F 50 51 52 53 55 57 59 5F 61 63 7C


With 16 pins on the logic analyzer I only have enough for the lower 4 data bits after the trigger, chip select, read/write, enable, and address lines - without those I can't find the point in time where the data is "bad" at the Arduino, nor can I prove that the inputs to the SRAM are good.  Regardless, that's good enough to prove the point.  I especially wanted to have a look at byte 0x50.  It should be 0xFF as the retry read shows, but the first read gets 0x16.  Looking at the logic analyzer at the packet just before my trigger pin fired because of the error I see this:



So D0-D3 are all 1 for the entire read period of that byte.  D0 and D3 would be 0 at some point if the answer was really 0x16.  So the Arduino is getting the wrong answer.  My first question is - would the logic analyzer have seen something different if I hooked it up AFTER the 2.5 feet of cable beyond the SRAM chip?  Do I need to hook it up at the Arduino instead of the SRAM chip?

I checked a few other error bytes as well, same thing, logic analyzer doesn't see what Arduino sees, logic analyzer sees the right answer for the entire period the output enable pin is set to read data.

Here's the only thing I saw that's suspicious.  Take a look at this overall picture:



The Arduino requests a data transfer 5 times a second and that corresponds to the little rectangular blocks of activity.  But then during what should be quiet time, the logic analyzer sees a little blip on some of the pins, designated by the thin lines scattered through the timeline occasionally.  Zooming in on one of them, it's very brief and affects what seems to be a random set of pins:



Does this help shed any light on what needs to be done to eliminate these occasional data glitch issues I'm getting at the Arduino side?  Any thoughts are appreciated!


monte_carlo_ecm

Taking a bit of a shot in the dark, I replaced the 10uF electrolytic capacitor I had on my SRAM board running between VCC and ground with a 0.1uF ceramic disc capacitor - based on what I was reading online about similar SRAM chips and ICs in general, the 0.1uF ceramic disc seemed to be the best general purpose decoupling capacitor for such an application.

I took the car for a test drive.  Very similar results - an errored packet at random times, on average about one per 800-1000 queries.  Sometimes just one byte, sometimes a few, and other times 20-30 bytes.

I still don't get why the logic analyzer doesn't see it but the Arduino does.  :o

At the risk of focusing on the wrong thing, I'm going to drop one more question here - can the Arduino Mega (+ EasyVR voice recognition shield + SparkFun MP3 shield [doing nothing but waiting] + 16x2 LCD display) run for any reasonable length of time on a 9V battery?  I'd need about 15 minutes to really prove things as the same or different.  Would driving it with a battery source prove anything in terms of targeting or eliminating the power supply I built (schematic above) for the Arduino from the car fuse box?

Whatever it is I have been saying or not saying here hasn't caused much in the way of repeat engagement.  So I'll probably just rely on my checksum/retry logic for the time being, but if at some point anyone has any ideas or similar experiences, I will be listening with a burning curiosity!

Thanks for reading!

outofoptions

I haven't verified it but the Atmega chips aren't rated for use in automobiles according to a post someone made.  They supposedly took the info from the data sheet.  Could be they know something.

monte_carlo_ecm

That's a good thought outofoptions.  At some point I may need to accept that the Atmega (Arduino) isn't entirely happy in a car.

Please disregard the question about the battery above.  I wasn't thinking straight.  I ran out and pulled the plug out of the power port on the Arduino and ran it off Laptop USB power.  Same result.

But I have a new suspect.  While I thought I had done this before, it may not have been with the engine running.  I wiggled the ribbon cable I have on my SRAM board going to my Arduino a bit while the engine was running.  Sure enough, I could clock dozens of errors that way.  Took apart the cable and found that I must have slightly bent some of the crimping pins on the ribbon cable connector at the SRAM end.  I imagine this could have led to a breach of the wires on the cable too.  Just as a quick experiment, I tried to clean up the pins a bit with a jeweler's screwdriver and a magnifying glass and then reinstalled it for a test.  While I seem to get the same number of errors, on its own it now happens on only one byte in each errored packet; I didn't get a single errored packet with > 1 byte bad.  And I can still make it error violently by wiggling the cable at the SRAM end.

That cable is very likely toast, I need to get another cable.  I'm trying to remember why I didn't just use an old 26-pin floppy cable from the electronics surplus place instead of building one from cable and connectors myself - the pre-built cable should be a heck of a lot more robust and reliable than my hand built one if I can get one in the right length, gender, and without that weird twist they put in those old cables as I remember!  I need to go by the surplus place near me and see if they have an old floppy cable that would work.

Consider it in my hands for now until I can prove my cabling is good.  Thanks to everyone for reading thus far and helping talk this through with me, I really appreciate it!  I feel pretty lame that I posted prior to being 200% sure my cables were rock solid.  My apologies for that!
 

monte_carlo_ecm

I didn't use a floppy cable because they were 34-pin not 26-pin.  Been so long since I've touched a floppy drive I had forgotten how many pins they were.  I might need to search for another source of a pre-made 26-pin cable...

Paul__B

That's a good thought outofoptions.  At some point I may need to accept that the Atmega (Arduino) isn't entirely happy in a car.
Nonsense!  A microcontroller is a microcontroller.  Silicon is very similar, one family to another.

Rating for automotive use focuses on temperature specifications for modules in the engine compartment and whether the manufacturer feels like warranting this specification.  If it isn't in the engine compartment, that is generally not a concern.

It is your business as the designer to condition the supply voltage and all the inputs (and outputs) to their specified values.  If you allow those to stray, then no microcontroller (or clearly, any other device) will be warranted to function.

monte_carlo_ecm

Thanks for the encouragement Paul__B.  I'm actually hoping I can get it performing satisfactorily with a new cable.  This won't be the first time in my project where I've been reminded that something which isn't completely rock solid may seem to work okay on the "bench" but then act strangely in the field because of the more hostile conditions.

