file.seek() seems to fail beyond some value

I’m building a dictionary with the Arduino Pro Micro and an 8 GB SD card.
The first thing to note is that CardInfo reports this about the 8 GB card:
Initializing SD card…Wiring is correct and a card is present.

Card type: SDHC

Volume type is FAT32

Volume size (bytes): 3706716160
Volume size (Kbytes): 3619840
Volume size (Mbytes): 3535

Files found on the card (name, date and size in bytes):
DATA.TXT 2014-12-06 20:35:36 1782580536

DATA.TXT is a very large text file (1.7 GB) containing many definitions, stored with a fixed 12000-byte record size. This way, no matter the length of a definition, I should be able to seek to n*12000 to get the nth record from the file.

But seek() seems to silently fail (it even returns success) above some offset.

Experimentally I have determined this offset to be around 948000 - I cannot seem to seek past this position in the file.

Any thoughts?
Thanks,
Josh

ReadWrite2.ino (2.6 KB)

No thoughts or ideas from anyone?

Some additional things to note:

I’m using Arduino IDE 1.0.6 on Ubuntu Linux.

Maybe relevant, maybe not, but note that the cardinfo sketch reports the SD card size as 3.5GB, which is incorrect.
I can attach the (compressed!) DATA.TXT if it will help.

Thanks,
Josh

Did you format the card with SDFormatter, as recommended by fat16lib?

If CardInfo says it is 3.5 GB then it is 3.5 GB... would it be a cheap eBay card by any chance? SdFatLib is a mature library and does not make mistakes like that.

Regards,

Graham

The SD card is a microSD, Lexar brand. It came preformatted, purchased at Office Depot. I did not use SDFormatter on the card.

I wrote a Perl script on the Linux machine to step through the file with the same seek logic as the sketch, and the data is definitely there and laid out properly.

Thanks, Josh

Thanks for the guidance on SDFormatter. Unfortunately, it's a Windows- or Mac-only tool, and I'm running Ubuntu. Some extensive research on the 'net about SD cards seems to indicate they don't really need much in the way of care and feeding on formatting, except that skipping the first 4 MB of the card will result in higher write performance. Ubuntu recognizes the card (via a USB reader) as:

[55527.927703] scsi 6:0:0:0: Direct-Access Mass Storage Device 1.00 PQ: 0 ANSI: 0 CCS
[55527.927937] sd 6:0:0:0: Attached scsi generic sg1 type 0
[55528.193547] sd 6:0:0:0: [sdb] 15644672 512-byte logical blocks: (8.01 GB/7.45 GiB)
[55528.193761] sd 6:0:0:0: [sdb] Write Protect is off
[55528.193763] sd 6:0:0:0: [sdb] Mode Sense: 03 00 00 00
[55528.193916] sd 6:0:0:0: [sdb] No Caching mode page found
[55528.193917] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[55528.194961] sd 6:0:0:0: [sdb] No Caching mode page found
[55528.194964] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[55528.196011] sdb: sdb1
[55528.196864] sd 6:0:0:0: [sdb] No Caching mode page found
[55528.196866] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[55528.196878] sd 6:0:0:0: [sdb] Attached SCSI removable disk

fdisk reports that indeed, the first partition starts 4MB into the device:

Disk /dev/sdb: 8010 MB, 8010072064 bytes
214 heads, 8 sectors/track, 9138 cylinders, total 15644672 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1         8192    15644671     7818240    b  W95 FAT32

...So I'm not sure what any special formatter is going to do for me.

I'm now executing a full physical read of the device in Ubuntu (with dd) to verify I really have 8GB here, although it seems unlikely this would be counterfeit (being Lexar brand purchased at a brick-and-mortar Office Depot local to me here in Raleigh, NC).

Also note, I have been unable to reproduce this problem with comparable logic written in Perl on my Linux machine, with the same card, which strongly points to something in the Arduino SD (or maybe seek) library implementation.

Thanks, Josh

It looks like I got a full read of 8GB using dd on linux.

sudo time dd if=/dev/sdb of=/dev/null bs=1024k
7639+0 records in
7639+0 records out
8010072064 bytes (8.0 GB) copied, 437.046 s, 18.3 MB/s
0.01user 8.14system 7:17.04elapsed 1%CPU (0avgtext+0avgdata 1848maxresident)k
15644568inputs+0outputs (0major+511minor)pagefaults 0swap

How do I pursue troubleshooting the SD library and/or the seek() function on the Arduino?

Thanks, Josh

Doesn't the SD library use unsigned longs and therefore limit the size of the disk to 4GB?

Given that information on the unsigned long limitation, I resized and reformatted the partition to 3GB, and recopied the file, but the same behavior persists.

Any other potential limitations?

Thanks, Josh

countrypaul: Doesn't the SD library use unsigned longs and therefore limit the size of the disk to 4GB?

Are we talking about SD.h or SdFat.h at this point? SdFat can handle the latest SDXC cards, 256 GB and up!

SDFormatter, as the SD specification group's official formatter, correctly aligns clusters to flash boundaries, which improves performance... The FIRST thing you will see when this sort of thing crops up is: DO NOT USE OS FORMATTING UTILITIES. Just saying...

Regards,

Graham

SdInfo uses a 32-bit unsigned for the volume size, so it won't correctly report the size of your card. This is an old bug that never gets fixed.

Linux is really poor for formatting SD FAT volumes; it doesn't understand the SD standard. Windows is a little better, but not great.

Seek should work for large files with SD.h, but I wrote its core, an old version of SdFat, in 2009.

You might test with a new version of SdFat. seekSet uses a 32-bit unsigned and should work. Use SDFormatter to get proper 32 KB clusters so seekSet will be efficient.

Thank you for the great info.

I have some skills in the Linux arena - I think I could write a utility that Does The Right Thing(tm) if you can give me some critical guidance (that is, elaborate on "it doesn't understand the SD standard," "Windows is a little better but not great," and so on). I did some research but had trouble finding the meat of these "critical standards" that seem to swirl around SD usage. Before this Arduino work, I used SD cards quite a bit and never had any issues. But I am willing to learn and contribute.

All that aside, I am at your service and willing to do whatever it takes to get this working, including trying out the new SdFat, giving good feedback, committing patches, etc. Where can I get that newer version of SdFat to try?

Working first, efficient second. Clearly something is broken at the moment...

I am definitely motivated. You see, this is a project that will end up being a Christmas present for my wife (shhh! it's a secret), so I am definitely committed.

Thanks, Josh

Found this on GitHub - is this you, fat16lib? https://github.com/greiman/SdFat

Will give it a try this evening.

Thanks, Josh

DragonJ:
Found this on Github, is this you fat16lib?
GitHub - greiman/SdFat: Arduino FAT16/FAT32 exFAT Library

Will give it a try this evening.

Thanks,
Josh

That’s the one.

Graham

SD cards used on a PC or Mac are accessed in large chunks, since those machines have huge caches.

Arduino SD libraries buffer only one 512-byte block, but the actual flash erase blocks are a multiple of 16 KB. This causes a huge number of erase and copy operations inside the SD card.

Each size of SD card is designed with a flash layout and format that optimize access. It is best to use the standard format for best performance.

So I tried the latest version of SdFat from github, but no change in behavior.

I decided to reimplement the seek() call as a simple for loop reading one byte at a time, which is INCREDIBLY slow, but... it works.

So I am now thinking that seek() is somehow broken...

Thanks, Josh

I wrote a 2 GB file, downloaded your program and tried it.

I only typed in characters until I got to position 122,880,000. It worked fine using SD.h.

Do you have access to a Windows PC? If so, do a check of the SD card for corruption. Linux is great for ext3/ext4, but windoze is better for vfat.

Here is the last seek from my test.

56
incrementor is
256
index is
122880000
data.txt:
seek to
122880000
File position:
122880000
ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZAReceived

I wrote the file using the SdFat bench example on a Due since it writes at 4 MB/sec but did the test with 1.0.6 on an Uno.

I ran another test with this at the end of your loop so I don’t need to type. I wrote the file position into each record:

  index += 64000UL*512UL;

Here are tests near the end of the file.

seek to
1966080000
File position:
1966080000
1966080000 LMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZAdata.txt:
seek to
1998848000
File position:
1998848000
1998848000 LMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZAdata.txt:
seek to
2031616000
Seek error!

I am using a 32 GB card formatted with SdFormatter. It takes about 1.5 sec to seek to 1,998,848,000.

Thank you for all your help. I did get this resolved.

I am embarrassed to say it was an "off by one" problem: the DATA.TXT file I generated did not have 12000-byte records but 12001-byte records, because the Perl script that created it added a line feed at the end of each record. Thus each subsequent record started one byte deeper in the file than expected.

I apologize for wasting everyone's time with this... I totally owe beers on this one!

Thanks, and Happy Holidays, Josh

Thanks for reporting the problem. After my last post I concluded that there must be a problem with the record layout and have been curious to know the solution.

It wasn't a waste of time; it has been a while since I verified that seek works correctly for large files. I finally ran it with a maximum-length 4 GB file. You should use an unsigned long index for files longer than 2 GB.

Perl was/is a neat scripting language. I went through a phase of trying to use it for everything after Larry Wall developed it in the late 1980s. I had been using awk since the late 1970s, and Perl was a great upgrade.