SD Card Connection problems

Hi. Has anyone experienced a Leonardo and Ethernet shield intermittently failing to communicate with an SD card and locking up? By intermittently I mean it could be once a year, or twice in quick succession.

I have a temperature controller cum data logger cum web server running on a Leonardo, using a standard Ethernet shield and an 8 GB SD card. There is a lot of code and I don't need the bootloader or the serial interface, so I've created a new board in 'boards.txt' to give the maximum space, and I'm programming the Leonardo with a USBasp and overwriting the bootloader. I've also turned off the ability for programming to wipe the EEPROM, and I preload that with temperature control points and IP addresses.

The code uses interrupts to set flags that decide which bit of control code runs next in the loop. So every second the analogue inputs read a series of PT1000 sensors and an LED winks on/off, and every 15 seconds the controls are turned on or off if needed. Then every 10 minutes data is written to a file on the SD card. Once a day we get UTC time from the web, just to make sure the one-second clock is not getting too far out of step, and correct it if need be.
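
Roughly, the flag pattern looks like the following. This is a simplified sketch and not the actual project code: the names (`onSecondTick`, `serviceFlags`, the counters) are made up for illustration, and the counters simply stand in for the real sensor, control and SD routines.

```cpp
#include <stdint.h>

// Flags set from the 1 Hz timer interrupt and cleared by the main loop.
volatile bool sensorsDue = false;  // every second
volatile bool controlDue = false;  // every 15 seconds
volatile bool logDue     = false;  // every 10 minutes (600 s)

// Position within the 10-minute cycle, 0..599. Only touched by the tick handler.
static volatile uint16_t secondInCycle = 0;

// Counters standing in for the real sensor/control/SD routines (illustrative only).
uint32_t sensorReads = 0, controlRuns = 0, logWrites = 0;

// Call this from the 1 Hz timer ISR; it only sets flags, so it stays short.
void onSecondTick() {
    uint16_t s = secondInCycle + 1;
    if (s >= 600) s = 0;
    secondInCycle = s;

    sensorsDue = true;
    if (s % 15 == 0) controlDue = true;
    if (s == 0)      logDue = true;
}

// Call this on every pass through loop(); each flag is cleared before its job runs.
void serviceFlags() {
    if (sensorsDue) { sensorsDue = false; ++sensorReads; }
    if (controlDue) { controlDue = false; ++controlRuns; }
    if (logDue)     { logDue = false;     ++logWrites;  }
}
```

On the board, `onSecondTick()` would be called from the timer interrupt and `serviceFlags()` from `loop()`, so the slow jobs (the SD write in particular) never run inside the interrupt itself.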

The web server part can run on every loop and allows the current state of play to be viewed on a PC by reading and sending an HTML file from the SD card and filling in the numbers. It also allows the break points for the temperatures to be adjusted and stored back in the EEPROM. Likewise I can change the IP addresses if I really need to. It also allows me to download the 'CSV' record files from the data logger part of the code.

When the unit hangs, no more records are written to the log file and the web pages will not download.

This code has run continuously for years, but...

At first I thought the problem was the mains power supply or brownouts, but the system now includes a battery and it still happens.

I did postulate there might be an environmental temperature problem, as the working unit is in a 25 °C environment, but test units sitting in the office at 19 °C have also exhibited the problem.

I know it's not a memory conflict, as I have recorded the free RAM at various places in the code and the lowest it drops to is 640 bytes.

I've looked at all the 'while' loops, including those in the SPI, Ethernet and SdFat libraries, and I don't think the problem is there, as it would hang more often.

Given that pressing the reset button brings it all back to life, the only other thing I can think of is how snugly the SD card fits in its holder. The problem happens with both the newer shields that have active level translation to the SD card and the older ones that use simple resistive dividers.

Unfortunately the system is supposed to run unattended, and I've not been able to achieve that yet. Any thoughts on how to find out where the code is stopping would be helpful.

I won't fill this message up with the source code, as there is about 170 KB worth of .ino/.h/.c/.html files, but I'm happy to put them out if there are any specific suggestions.

If pushing the reset button gets everything working again, it seems unlikely to be a connection issue. But a couple questions:

Why do you think it locks up in the process of communicating with the SD card? From your description, all you really know is that it's locking up. Why do you suspect the SD card?

If it has the same problem with different SD shields, does it also happen with different cards? With different Leonardos? Different cables? Different PCs?

Are you using FAT32, not FAT16?

If you substitute all the hardware, and it still locks up, then it almost has to be a software problem of some sort.

Well, I don't know how to solve this without adding some kind of debugging. Suppose you turn on an LED as you begin the SD activity, and turn it off when that's complete. Then when it locks up - is the LED on or off?

It just seems that with code the size of yours, not to mention whatever libraries you're using, you almost have to be able to pinpoint when this is happening to have a chance of fixing it. So it would be nice to have a way to indicate what it was doing when it locked up. Maybe even use the WDT to help with that.

Or, maybe you could just use the WDT to automatically reset the Leonardo when it locks up. That's what it's for. Not sure that would reset the shield or the SD card though.
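
To show what I mean by using the WDT to find out what it was doing, here is a rough sketch. It assumes an AVR board with no bootloader clearing the reset flags first, and every name in it (`markCheckpoint`, `lastCheckpointBeforeReset`, the `CP_...` values) is made up for illustration. The idea is to keep a "breadcrumb" in RAM that is not re-initialised at startup, update it before each risky step, and read it back after a watchdog reset.

```cpp
#include <stdint.h>

#if defined(__AVR__)
  #include <avr/io.h>
  #include <avr/wdt.h>
  // Variables in .noinit are not cleared by the startup code, so they survive a reset.
  #define NOINIT __attribute__((section(".noinit")))
#else
  #define NOINIT  // host build: plain global, only useful for trying out the logic
#endif

// Hypothetical checkpoint IDs; use whatever steps matter in your sketch.
enum Checkpoint : uint8_t {
    CP_IDLE = 0, CP_SENSORS = 1, CP_CONTROL = 2,
    CP_SD_OPEN = 3, CP_SD_WRITE = 4, CP_SD_CLOSE = 5, CP_WEB = 6
};

// Magic value marks the breadcrumb as valid (RAM is random after power-on).
const uint16_t CRUMB_MAGIC = 0xC0DE;

struct Breadcrumb { uint16_t magic; uint8_t last; };
NOINIT Breadcrumb crumb;

// Call just before each step you want to be able to identify later.
void markCheckpoint(Checkpoint cp) {
    crumb.magic = CRUMB_MAGIC;
    crumb.last  = cp;
}

// Returns the last checkpoint recorded before the reset, or -1 if none is valid.
int lastCheckpointBeforeReset() {
    return (crumb.magic == CRUMB_MAGIC) ? crumb.last : -1;
}

#if defined(__AVR__)
// True if the previous reset came from the watchdog. Call early in setup().
// Note: a bootloader may clear MCUSR before the sketch gets to see it.
bool resetWasWatchdog() {
    bool wd = (MCUSR & _BV(WDRF)) != 0;
    MCUSR = 0;
    return wd;
}
#endif
```

In `setup()` you would check `resetWasWatchdog()` and `lastCheckpointBeforeReset()` and save the result somewhere you can read it later (EEPROM or the log file), then call `wdt_enable(WDTO_8S)` and `wdt_reset()` once per pass through `loop()`.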

Sorry, I should have said, but the post was a bit long anyway.

1 There are a couple of LEDs in use: one winks on the one-second cycle, and the other is on/off for SD card access, as you suggested. That is why I suspect the problem has something to do with the SD card.

2 The problem occurs with different Leonardos, shields and SD cards, but one SD card seems to be more prone than the others.

3 I'm using FAT32, as there could ultimately be over 1000 record files and I don't want to run into FAT directory limits.

4 I've already tried the WDT which, as you say, resets the processor but not the shield.

Sounds like you're way ahead of me. Other things I can think of:

SD cards can use a good bit of power when writing, and some brands use more than others. If the 3.3V regulator isn't up to the task and the power sags, that can cause problems.

Some cards go into a kind of sleep mode on inactivity. I don't know how long they take to wake up. I don't really know much about this.

When writing, the card's controller may need to erase blocks and move things around, and that can take a fair amount of time. But I assume your library knows to wait. To test if this is involved, you could pre-erase the card using the SdFormatter example in SdFat. That would eliminate the need to erase when writing.

How are the four SPI lines connected to the SD card holder? I believe the Leonardo is a 5V device, which means its SPI lines output 5V. But all SD cards are 3.3V. Does the shield have a voltage translator to accommodate this? If this might be an issue, you could try one of the microSD modules that include a translator chip. But you would need to modify it to make it work with the Ethernet shield (the module doesn't properly release MISO when it's not selected).

You said you still have RAM available when this happens, but what about the stack? Is there any chance it's filling up?

It's just hard for me to see how it could be a bad connection problem if it happens with everything substituted out.

Well, this is all just speculation. Maybe someone else can give you more specific advice.

The SD card is not completely empty at start-up; it contains the HTML files.

As to the availability of RAM, I was under the impression that the standard routine that uses '__brkval' and '__bss_end' to tell how much RAM is free took account of the stack. Have I been misinformed?

I don't know. Perhaps someone else can answer. I was just thinking you might be able to include the current value of the stack pointer in the HTML format, and see if it tends to migrate downwards over time.
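
Something like this is what I had in mind, as a rough sketch only. The function names are made up, and it assumes an AVR where `<avr/io.h>` provides the `SP` stack pointer register and the stack grows downwards. As far as I understand it, a free-RAM reading taken at one place in the code only reflects the stack depth at that moment, so recording the lowest value ever seen may tell you more.

```cpp
#include <stdint.h>

#if defined(__AVR__)
  #include <avr/io.h>  // provides the SP register macro
#endif

// Lowest stack pointer value seen so far; starts at the highest possible address.
static uintptr_t lowestSp = (uintptr_t)-1;

// Record a stack pointer sample, keeping only the lowest (deepest) one.
void noteStackPointer(uintptr_t sp) {
    if (sp < lowestSp) lowestSp = sp;
}

// Value to print in the HTML status page or the log file.
uintptr_t lowestStackPointer() {
    return lowestSp;
}

#if defined(__AVR__)
// Call from the deepest places you can reach: inside the SD write path,
// the web page handler, and the timer interrupt.
void sampleStackPointer() {
    noteStackPointer(SP);
}
#endif
```

If the lowest value keeps creeping down over days, or gets close to the top of the heap, that would point at the stack rather than the card.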

Solved -- I think!!
I've got 3 of these systems running at the moment, and on one combination of PSU/Leonardo/Ethernet shield/SD card (16 GB) I can get the program to hang every time the system is supposed to write data to the card. The other two setups march on happily.

On the one that hangs, the 5 V rail runs at 4.7 V until the write operation, when it drops to 3 V, despite the PSU giving 12 V at the jack socket and supposedly being rated at 2 A; there is no apparent drop in the PSU output voltage. If I swap its PSU for an iPhone-type USB charger rated at 1.2 A and supply the Leonardo via the USB port, the 5 V rail runs at between 4.7 V and 5.09 V and the unit does not seem to have any problems.

If I swap the other systems to the 12 V PSU they hang occasionally, but not every time. I need to get a brand-new Leonardo and Ethernet shield and do some carefully planned swap-arounds, but it looks as if it's a voltage regulation problem on the Arduinos while driving the SD card/LEDs/Ethernet, even though I was careful in the sketch to make sure that each activity was complete before going on to the next.
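
A rough way to see why the 12 V supply might matter, assuming the on-board 5 V regulator is a linear type (I believe it is, but check the schematic): a linear regulator turns the whole input-to-output voltage difference into heat. The load current below is a made-up figure for illustration; I have not measured it.

```cpp
// Power (in watts) turned into heat by a linear regulator: (Vin - Vout) * Iload.
double regulatorHeatWatts(double vin, double vout, double loadAmps) {
    return (vin - vout) * loadAmps;
}
```

With a hypothetical 0.3 A load, `regulatorHeatWatts(12.0, 5.0, 0.3)` gives about 2.1 W, against about 0.6 W for `regulatorHeatWatts(7.0, 5.0, 0.3)`. Fed from USB, the 5 V rail does not go through that regulator at all, which would be consistent with the charger behaving better, though that is a guess until the swap tests are done.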

By the way, the kit is all genuine, no fleabay stuff here. The SD cards are all from Kingston via my local photographic shop, and the Arduinos/shields all came from Maplins, albeit a while ago.

Has anybody else had or resolved a similar PSU problem?