SD Card filesystem problem

Hello! I have a nasty issue with my SD card storage. My application uses a DS1307 RTC to generate 1-second interrupts; loop() then checks the interrupt flag and, if it is set, resets it and logs some data from 2 sensors to an SD card. It works pretty normally. I also have error logging with numeric error codes on the same card, but in a different file, which also works normally. Every 5 minutes (seconds counted from the interrupt % 300) I launch another part of the application, which transmits the data collected on the SD card to a remote server via GPRS. Sounds like too much, but this also works perfectly fine (as much as possible with the dumb SIM900...). The part of the program which handles the GPRS init and transfer is state based... so to say... it doesn't hold the processor's control, it just advances state by state with one big switch/case sequence, to let the application continue logging while the SIM900 inits. When the GPRS module is ready to send data, the logger starts writing to a new file, so that the GPRS can read the old one and delete it after it's all sent. All this access to the SD happens in a decent manner: every time the logger takes control it opens its file, writes down stuff and closes it to sync data on the card (once a second); the GPRS opens the other file, reads around 1 KB byte by byte and transmits it, then closes the file and releases the processor control, and repeats this operation as soon as possible until the entire file is sent. After that the file is deleted, and only the current working file of the logger remains on the card. And so on and so on... I think you got the idea... All this works pretty fine, until at a random moment, between 5 and 20 mins of uptime, the device encounters a file read error and suspends itself. When I mount the card on my laptop, I don't see 1 working file and 1 error logging file on the card as expected...
I find a shitload of strangely named files (sort of $@(#*(@%$^#&) with random sizes between 5 bytes and 16 MB (lol) that contain either no valid data or very, very old data deleted a long time ago. The card becomes read-only and cannot be used normally until I format it again... Does anyone have any idea how to solve this problem? My device runs on an adapter, so it shouldn't be a power issue... I tried with li-ion batteries also... same result.

P.S.: Forgot to mention that I was using the SD library from arduino.cc but then tried the latest SdFat, with nearly the same results. SdFat "lasted" longer and didn't create as much corrupted stuff, just one file and one directory.

Some paragraphs and a summary might make your post easier to read. The ellipses do not.

It sounds, vaguely, though, like you have a code problem, but you didn't bother posting any code.

Yeah, sorry, I know I chunter on quite a bit. Here are the snippets of code in my application that work with the SD:

//declaration
SdFat sd;
SdFile file;

this is being executed only at startup

//setup()
if (!sd.begin(SDCSPIN, SPI_HALF_SPEED))
{
    setupOk = false;

    error = 200;
}

...

errorLog();

if (!setupOk)
{
    powerOff();
}

this is being executed every second, but operates with SD only if error is > 0, which is not very often

void errorLog()
{  
    if (error > 0)
    {      
        if (file.open(SD_FILE_ERROR, O_CREAT | O_WRITE | O_APPEND)) 
        {          
              file.println(error);
              file.close();
              
              delay(50); //don't know if this is necessary, probably not, just getting paranoid
        }
      
        error = 0;
    }
}

this is being executed each second

//in the sensor data logging function

if (file.open(filename, O_CREAT | O_WRITE | O_APPEND)) 
{          
      file.print(packet);
      file.close();
              
      delay(50);
}
else
{
      error = 210;
}

this is being executed multiple times every 5 minutes, iterated with "delay" of between 0.5 and 2 seconds, depending on the cellular network conditions...

//some dirty code in the gprs handler.. reads the file and sends
//this code is being iterated until all the file is being read and sent, then the file is being deleted

if (file.open(filename, O_READ))
{
    if (transferFileOffset > 0)
        file.seekSet(transferFileOffset);

    char header[100];

    buildReportHeader(header);

    Serial.print(header);

    int bytes = 0;

    boolean stopTransfer = false;

    int data = 0;

    while (((data = file.read()) > 0) && !stopTransfer)
    {
        Serial.write(data);

        bytes++;

        if (data == '\n' && bytes > 890) stopTransfer = true; //stop after the 14th package (~1kb total)
    }

    transferFileOffset += bytes;

    if (file.fileSize() - transferFileOffset < 1)
    {
        file.close();

        sd.remove(filename);

        transferDone = true;

        transferFileOffset = 0;
    }
    else
        file.close();
}
else
{
    error = 220;
}
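For readers following the transfer logic above, here is a minimal host-side C++ sketch of the same chunked read. It is not the poster's code: `FakeFile` is a hypothetical stand-in for the open SdFile, and `sendChunk` and `transferComplete` are invented names. It differs from the snippet in two ways that may or may not matter for this data: only a negative return from read() is treated as end of file (so a 0x00 data byte would not end the chunk early), and the offset is compared with the file size directly instead of subtracting, which could wrap around if the size is an unsigned type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an open file: read() returns the next byte,
// or -1 at end of file (the convention SdFile::read() also follows).
struct FakeFile {
    std::string data;
    std::size_t pos = 0;
    int read() { return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1; }
};

// Send one chunk starting at `offset`. The chunk ends at the first newline
// seen after more than `minBytes` bytes, or at end of file.
// Returns the offset at which the next chunk should start.
uint32_t sendChunk(FakeFile& file, uint32_t offset, uint32_t minBytes, std::string& out) {
    assert(offset <= file.data.size());
    file.pos = offset;                    // plays the role of seekSet()
    uint32_t bytes = 0;
    int c;
    while ((c = file.read()) >= 0) {      // >= 0, so a zero byte is still data
        out.push_back(static_cast<char>(c));  // plays the role of Serial.write()
        ++bytes;
        if (c == '\n' && bytes > minBytes) break;
    }
    return offset + bytes;
}

// True once every byte has been sent; no unsigned subtraction involved.
bool transferComplete(uint32_t offset, uint32_t fileSize) {
    return offset >= fileSize;
}
```

In the real sketch the caller would close the file after every chunk, as the snippet above does, and call remove() only once `transferComplete` reports true.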

Can you provide links to the RTC, SD and GPRS hardware?

edit: Beware of digital pin 8. Some models of GPRS and SD shields use that pin.

GPRS module: Welcome to ELECFREAKS WIKI — ELECFREAKS WIKI

  • based on SIM900
  • uses digital pin 5 for reset and digital pin 6 for power up
  • has issues sending continuously large amounts of data
  • has power supply issues, probably responsible for the above one
  • has issues with turning on
    etc.

SD card module: http://www.lctech-inc.com/Hardware/Detail.aspx?id=0c3b6f7a-d101-4a60-8b56-3abfb7fd818d

  • SS is connected to digital pin 4
  • has 3.3v voltage regulator and supports both 3.3v and 5v vcc, connected to 3.3, no extra capacitors added... yet..

RTC module: Microbot - Real Time Clock module with DS1307

  • DS1307 based
  • 5v
  • i2c
  • using the "out" pin on digital pin 2 (interrupt 0) with 1hz clock "caught" upon RISING (once per second)

I am using pin 8 for MicroMag3 RESET pin.

Here are some screenshots of what happens to the SD card after some uptime (20-30 mins).

Maybe this caution will make sense?
http://www.elecfreaks.com/wiki/index.php?title=EFCom_GPRS/GSM_Shield#Cautions

SurferTim:
Maybe this caution will make sense?
http://www.elecfreaks.com/wiki/index.php?title=EFCom_GPRS/GSM_Shield#Cautions

I was thinking about this... I am using 1.5A 12V adapter, also tried with 2x 18650 li-ion batteries in series, from 7.4 to 8.4 v when charged... 2400 mAh, but I am not sure about the peak capacity of those batteries... should be above 1x I guess... (hope)

I was also thinking about adding one big fat capacitor after MIC 29302WU (on the GPRS board) to try to compensate some drops... I could also add one cap on the vcc of the sd card module, although its quite odd for the arduino to survive such a voltage drop that the sd wouldn't...

I don't know what type of voltage regulator the SIM has, but if it is linear, the 12V 1.5A power supply may not be enough current and too much voltage. That will cause a heat problem at the regulator.

I don't know about the li-ion batteries. I use 4S 5000ma li-po batteries for testing wifi router (6 watts - switching power supply, not linear) locations, and the voltage drops off pretty quick. Maybe a voltage check would be in order?

I am the author of SdFat which is also the base for SD.h.

I have found most bugs that result in junk file-names are due to overwriting SdFat internal memory. The SD cache often has a directory block with 16 directory entries so you get lots of junk names if it is over written.

Check loop indices, array dimensions, strings with no zero byte termination, bad pointers, and any other causes of overwriting memory.
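One place such an overwrite could hide is a fixed buffer filled by a helper that does not know the buffer's size, like the `char header[100]` passed to buildReportHeader() in the transfer code earlier in the thread. The following is only a sketch of a bounded variant, written as plain C++ so it can be tried on a PC. The extra parameters and the header layout are invented for illustration; the real function was never posted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical bounded variant of buildReportHeader(): the caller passes the
// buffer size, and the function reports truncation instead of writing past
// the end. The header text (device id, offset) is made up for this example.
bool buildReportHeader(char* buf, std::size_t size, unsigned deviceId, unsigned long offset) {
    assert(buf != nullptr && size > 0);
    int n = std::snprintf(buf, size, "POST /report?id=%u&offset=%lu\r\n", deviceId, offset);
    // snprintf never writes more than `size` bytes and always zero-terminates
    // when size > 0; n is the length the full string would have needed.
    return n >= 0 && static_cast<std::size_t>(n) < size;
}
```

A caller would use `buildReportHeader(header, sizeof header, ...)` and skip the send (or set an error code) when it returns false.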

Hello again, I decorated my project with some capacitors last night: 1000 uF and 220 uF on the output and input of the voltage regulator of the GPRS module, another 470 uF on the Vin and another 1000 uF on the 5V. It started to look like a xmas tree, but it had no actual effect whatsoever.

I then commented just one line of code:

//sd.remove(filename);

and yes, it's a miracle, the project lived for a very long time (until I turned it off, actually) and outlasted the previous uptime record (20 min) many times over. If what fat16lib says is so, then it should stop working very soon, because when I don't delete the files I get a new file every 5 minutes... 12 directory entries per hour.

None of the other libraries I use write to PROGMEM, so I guess that shouldn't be a problem for SdFat. I had problems with SD.h (the old and modified one) because it was quite a bit larger, and my ROM image was 29.5 KB. After uploading it to the ATmega328P-PU nothing worked, because there was not enough space on the flash for buffers, cache, etc. With the new SdFat my ROM image shrank to 26.5 KB and everything seems to be working fine. (edit: except this issue with the FAT16 filesystem corruption lol)

Do you get corrupt files if you don't comment out the remove()?

If you still have the problem with remove(), it is likely something is writing over SdFat memory.

Remove() needs to do lots of writes to the SD and would likely cause file-name problems if SdFat memory is overwritten.

Unfortunately, yes.

It lasted much longer, though... 20 mins vs. 2 h 30 mins. Any ideas what could have caused that?

I have news! After disabling the file removal, the uptime was far longer than before, right? Now, after I disabled data logging while the GPRS is working, the device seems to be stable; it has been working for about 15 hours now. This is not the best scenario, because the SD card will get full pretty quickly, and mostly because of the data loss... every 5 mins I get a blank spot of about 1.5-2 mins while the GPRS is running... Somehow, this leads me to think that either voltage drops or interference from the GPRS TX bursts are causing the FAT16 corruption. Sounds pretty hopeless...

Modern SD cards are very tolerant to voltage drops. They don't commit writes when power is failing.

The SD data blocks have very powerful ECC so hardware write errors would be detected when you read the Sd on a PC or Mac.

This is a software problem.

How do you run data logging while GPRS is working?

You can't have a file open more than once and you can't call SdFat functions in an ISR.

My GPRS handling code does not hold MCU control for the entire transfer. I don't use delay(). The handler is designed as a state machine. From state 1 (power on) to state 20 (power off), each state has an entry procedure and result- or time-dependent procedures. The handler holds MCU control for a very short while, but many times through the whole process of transferring data. Therefore my application runs its loop() pretty normally and checks the ISR flag every time: if the interrupt has occurred it runs the logger, if not, it continues. If both the logger and the GPRS handler are idle (have nothing to do in the current loop()), a sleep is set until the next interrupt. While the GPRS is running, sleep is disabled... here is an example...

void loop()
{
    if (interrupted) // show new time only when new interrupt signaled
    {
        interrupted = false;

        seconds++;

        idle_logger = false;

        if (seconds % GPRS_IDLE_TIME == 0)
        {
            idle_gprs = false;
        }

        if (seconds % BATTERY_MONITOR_IDLE == 0)
        {
            idle_battery_monitor = false;
        }
    }

    if (!idle_logger && idle_gprs) //test test test
        runLogger();

    if (!idle_gprs)
        runGPRS();

    if (!idle_battery_monitor)
        runBatteryMonitor();

    errorLog();

    if (idle_logger && idle_gprs && idle_battery_monitor)
        setSleep();
}

The logger resets its idle status every time, so after log is being written the system goes back to sleep, unless the gprs is doing something. The gprs holds its idle status to "false" until the final state procedure has been complete. That's how I combine both processes. You can see in my previous posts part of the code of "runLogger()".
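For anyone unfamiliar with this style of handler, a much reduced sketch of the idea might look like the following. It is plain C++ rather than Arduino code, it has four working states instead of the twenty described, and every name in it (`Gprs`, `GprsState`, `step`, `modemReady`, `chunksLeft`) is made up for illustration. The modem is replaced by a flag, so this only shows the control flow: each call does one small piece of work and returns.

```cpp
#include <cassert>

// Hypothetical, much reduced version of a non-blocking GPRS handler.
enum class GprsState { PowerOn, WaitReady, Send, PowerOff, Idle };

struct Gprs {
    GprsState state = GprsState::Idle;
    int chunksLeft = 0;       // chunks still to send in this session
    bool modemReady = false;  // would be set when the modem answers

    // Begin a transfer session of `chunks` chunks.
    void start(int chunks) {
        assert(chunks > 0);
        chunksLeft = chunks;
        state = GprsState::PowerOn;
    }

    bool idle() const { return state == GprsState::Idle; }

    // Do at most one small piece of work, then hand control back to loop().
    void step() {
        switch (state) {
            case GprsState::PowerOn:   state = GprsState::WaitReady; break;
            case GprsState::WaitReady: if (modemReady) state = GprsState::Send; break;
            case GprsState::Send:      if (--chunksLeft == 0) state = GprsState::PowerOff; break;
            case GprsState::PowerOff:  modemReady = false; state = GprsState::Idle; break;
            case GprsState::Idle:      break;
        }
    }
};
```

In a loop() like the one above, `step()` would take the place of runGPRS() and `idle()` would feed the sleep decision.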

If it is not the voltage drops that cause the issue, then maybe signal interference occurs... Maybe I should wrap the Arduino and the SD module in some sort of a Faraday cage :smiley:

Once the data gets into an SD card there just are not undetected write errors.

Noise on the SPI bus is the only thing left.

You can check that by enabling CRC on SPI transfers between the Arduino and SD.

I added software CRC to SdFat so edit SdFatConfig.h at about line 35 and change USE_SD_CRC to 1 or 2.

/**
 * To enable SD card CRC checking set USE_SD_CRC nonzero.
 *
 * Set USE_SD_CRC to 1 to use a smaller slower CRC-CCITT function.
 *
 * Set USE_SD_CRC to 2 to use a larger faster table driven CRC-CCITT function.
 */
#define USE_SD_CRC 0

Then all transfers on the SPI bus will be CRC protected. If a transfer is corrupted, the SdFat call involved will fail with an error return.
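To give an idea of what the "smaller slower" option computes, here is a bitwise CRC-16-CCITT sketch (polynomial 0x1021, initial value 0), which is the CRC the SD specification defines for data blocks. It is an illustration only, not the code inside SdFat, and the function name `crcCcitt` is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-16-CCITT: polynomial 0x1021, initial value 0, no reflection.
// Small but slow: eight shift steps per data byte.
uint16_t crcCcitt(const uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    uint16_t crc = 0;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= static_cast<uint16_t>(data[i]) << 8;  // feed byte into the top bits
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<uint16_t>(crc << 1);
        }
    }
    return crc;
}
```

The sender appends this value to each block and the receiver recomputes it; a single flipped bit on the bus gives a different CRC, which is how noise on the SPI lines would show up as a failed call instead of silent corruption.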

Thank you very much, I will try this out, because since I've disabled writing while gprs is on the device worked for more than 24 hours without any faults.

Will respond back with results soon!

As a test, try running the GSM and the Arduino/SDcard on two separate batteries.

This will put to rest the problem area.

A barrier of opto-isolators between the two may help your problem.

Try to write a RESET message into the SDcard.

Your memory may be getting corrupted.

Good Luck

fat16lib:
Once the data gets into an SD card there just are not undetected write errors.

Noise on the SPI bus is the only thing left.

You can check that by enabling CRC on SPI transfers between the Arduino and SD.

I added software CRC to SdFat so edit SdFatConfig.h at about line 35 and change USE_SD_CRC to 1 or 2.

/**
 * To enable SD card CRC checking set USE_SD_CRC nonzero.
 *
 * Set USE_SD_CRC to 1 to use a smaller slower CRC-CCITT function.
 *
 * Set USE_SD_CRC to 2 to use a larger faster table driven CRC-CCITT function.
 */
#define USE_SD_CRC 0

Then all transfers on the SPI bus will be CRC protected. If a transfer is corrupted, the SdFat call involved will fail with an error return.

I actually failed in attempting this, because it made my ROM image a bit larger than before and I guess that the flash memory was not enough for SdFat buffers, everything crashed in a reset loop...

The fact that no bug occurs when I don't write to the SD while the GPRS is running leads me to think that the problem is not with SdFat but with some sort of electromagnetic interference that messes up the SPI bus (which runs below the GPRS board). I will try to isolate the two parts of the device somehow, because using an AC/DC adapter with larger capacity didn't change anything, which rules out power issues causing voltage drops on the SPI.

donvukovic:
As a test, try running the GSM and the Arduino/SDcard on two separate batteries.

This will put to rest the problem area.

A barrier of opto-isolators between the two may help your problem.

Try to write a RESET message into the SDcard.

Your memory may be getting corrupted.

Good Luck

The memory card works perfectly in other test applications... It's probably OK. I will do some more tests and will report back soon...

I guess that the flash memory was not enough for SdFat buffers, everything crashed in a reset loop...

Flash use will not cause a crash in a reset loop. If you are that close to running out of RAM, you will likely have a problem if you log data and run gprs at the same time. A pin change interrupt in SoftwareSerial can cause a stack overflow while an SdFat function like remove() is executing.

If possible check the amount of free stack by adding this include:

#include <SdFatUtil.h>

And this print in setup()

  Serial.println(FreeRam());

You need 200-300 bytes of free RAM in addition to any you allocate in functions.

I looked at your SD module and these often fail.

The problem is that these modules don't use proper level shifters on MOSI, SCK, and CS. These signals should be converted from 5V to 3.3V with an IC based level shifter. Most SD cards are not designed to accept 5V signals.

Too bad you can't use CRC on the SD to check for data transfer errors. You can check for any detected SD problem like this:

if (sd.card()->errorCode()) {
  // print SD I/O error code
  Serial.println(sd.card()->errorCode(), HEX);
}

Here are typical SD modules with a level shifter in addition to a 3.3V regulator

http://www.pjrc.com/teensy/sd_adaptor.html