Searching a 'large' file on SD?

ghlawrence2000 · August 21, 2013, 4:56pm

Hello all!!

Has anyone previously had a requirement similar to what I am trying to do?

I have a file (roughly 27 MB) which contains approximately 260,000 lines of variable-length, colon-separated fields.

It has a defined structure as follows :-

Int(6):Char(6):Char(60):Char(4):Int(2):Float(3.1):Int(2):Float(3.1):Int(7):Int(7):Char(1):Char(2):Char(20):Char(60):Char(3):Char(11):Char(1):Int(3):Int(3):Int(3)

The field widths shown are maximums...

A snippet of the mentioned file here :-

258316:SO0003:Ysgubor-wen Ho:SO00:51:43.2:3:26.4:203500:300500:W:RH:Rho Cyn Taf:Rhondda,Cynon,Taff:X:01-MAR-1993:I:170:0:0
258317:SN6895:Ysgubor-y-coed:SN68:52:32.5:3:56.3:295500:268500:W:CE:Cered:Ceredigion:X:01-MAR-1993:I:135:0:0
258318:SO0873:Ysgwd-ffordd:SO06:52:21.1:3:20.6:273500:308500:W:PW:Powys:Powys:X:01-MAR-1998:U:136:147:0
258319:SJ1930:Ysgwennant:SJ02:52:51.9:3:11.8:330500:319500:W:PW:Powys:Powys:X:01-MAR-1993:I:125:0:0
258320:SO0537:Ysgwydd Hwch:SO02:52:1.6:3:22.6:237500:305500:W:PW:Powys:Powys:H:21-MAY-2007:U:160:0:0
258321:SO1200:Ysgwydd-gwyn-isaf Fm:SO00:51:41.8:3:16:200500:312500:W:CF:Caer:Caerphilly:FM:01-MAR-1993:I:171:0:0
258322:SO3113:Ysgyrd Fach:SO20:51:48.9:2:59.6:213500:331500:W:MM:Monm:Monmouthshire:H:01-MAR-1993:I:161:0:0
258323:SO3317:Ysgyryd Fawr:SO20:51:51.1:2:57.9:217500:333500:W:MM:Monm:Monmouthshire:H:01-MAR-1993:I:161:0:0
258324:SS5598:Yspitty:SS48:51:40:4:5.4:198500:255500:W:CT:Carm:Carmarthenshire:O:01-MAR-1993:I:159:0:0
258325:SN4826:Yspitty Ifan:SN42:51:55:4:12.2:226500:248500:W:CT:Carm:Carmarthenshire:X:01-MAR-1993:I:146:0:0
258326:SM7923:Ystafelloedd:SM62:51:52:5:12.2:223500:179500:W:PB:Pemb:Pembrokeshire:X:01-MAR-1993:I:157:0:0
258327:SN7608:Ystalyfera:SN60:51:45.7:3:47.4:208500:276500:W:NP:Nth Pt Talb:Neath Port Talbot:O:01-MAR-1993:I:160:0:0

I need to search field 3 as quickly as possible, possibly narrowing the results using fields 14 and/or 13....

Clearly, starting at the beginning and searching through to the end would be an extremely time-consuming process.... Especially if the search were to yield nothing....
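
Whatever search strategy is used, each candidate line first has to be split so that field 3 (and optionally 13 or 14) can be compared. A minimal sketch in plain C, assuming lines shaped like the snippet above; `get_field` is a hypothetical helper name, not part of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the 1-based colon-separated field `n` of `line` into `out`
 * (capacity `cap`, always NUL-terminated). Returns the field length,
 * or -1 if the line has fewer than n fields or the field does not fit. */
int get_field(const char *line, int n, char *out, size_t cap)
{
    assert(line != NULL && out != NULL && n >= 1 && cap >= 1);
    const char *start = line;
    for (int i = 1; i < n; i++) {
        start = strchr(start, ':');      /* find the next separator */
        if (start == NULL) { out[0] = '\0'; return -1; }
        start++;                         /* step past the colon */
    }
    /* The field ends at the next colon, line ending, or end of string. */
    size_t len = strcspn(start, ":\r\n");
    if (len >= cap) { out[0] = '\0'; return -1; }
    memcpy(out, start, len);
    out[len] = '\0';
    return (int)len;
}
```

Called as `get_field(line, 3, buf, sizeof buf)` on the Yspitty line above, this should leave `Yspitty` in `buf`. It scans without modifying the line, so the same buffer can be reused to pull out field 13 or 14 afterwards.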

To complicate matters further, the file contains accented characters which do not 'play well' with toupper() and tolower().
For example :-

30:NC3249:A' Chèir Ghorm:NC24:58:24.1:4:52:949500:232500:W:HL:Highld:Highland:X:23-JUN-2008:U:9:0:0
31:NG2605:A' Chill:NG20:57:3.5:6:30.7:805500:126500:W:HL:Highld:Highland:O:01-MAR-1993:I:39:0:0
32:NC2105:A' Chìoch:NC20:58:.2:5:1.2:905500:221500:W:HL:Highld:Highland:X:01-MAR-1993:I:15:0:0
33:NC5729:A' Chioch:NC42:58:13.9:4:25.6:929500:257500:W:HL:Highld:Highland:X:01-FEB-1998:I:16:0:0
34:NG8144:A' Chioch:NG84:57:26.3:5:38.5:844500:181500:W:HL:Highld:Highland:H:01-FEB-1998:I:24:0:0
35:NH0509:A' Chioch:NH00:57:8.1:5:12.9:809500:205500:W:HL:Highld:Highland:X:01-AUG-1994:I:33:0:0
36:NH1115:A' Chìoch:NH00:57:11.5:5:7.2:815500:211500:W:HL:Highld:Highland:H:01-MAR-1993:I:34:0:0
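
One way around toupper()/tolower() is a hand-rolled folding function. The sketch below ASSUMES the file is single-byte ISO-8859-1 (Latin-1); the thread does not state the encoding, and UTF-8 would need a different approach. `fold_latin1` and `cmp_caseless` are hypothetical names. It lower-cases each byte and then strips the common accents, so that `Chìoch` and `Chioch` compare equal:

```c
#include <assert.h>
#include <stddef.h>

/* Fold one byte to an unaccented lower-case letter, ASSUMING Latin-1.
 * ASCII A-Z and the Latin-1 capitals 0xC0-0xDE (except 0xD7, the
 * multiplication sign) sit exactly 0x20 below their lower-case forms. */
unsigned char fold_latin1(unsigned char c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 0xC0 && c <= 0xDE && c != 0xD7))
        c = (unsigned char)(c + 0x20);
    if (c >= 0xE0 && c <= 0xE5) return 'a';   /* a with accents */
    if (c == 0xE7)              return 'c';   /* c cedilla */
    if (c >= 0xE8 && c <= 0xEB) return 'e';   /* e with accents */
    if (c >= 0xEC && c <= 0xEF) return 'i';   /* i with accents */
    if (c == 0xF1)              return 'n';   /* n tilde */
    if (c >= 0xF2 && c <= 0xF6) return 'o';   /* o with accents */
    if (c >= 0xF9 && c <= 0xFC) return 'u';   /* u with accents */
    return c;
}

/* Caseless, accent-blind comparison; returns <0, 0, >0 like strcmp(). */
int cmp_caseless(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;; a++, b++) {
        unsigned char ca = fold_latin1((unsigned char)*a);
        unsigned char cb = fold_latin1((unsigned char)*b);
        if (ca != cb) return (int)ca - (int)cb;
        if (ca == '\0') return 0;
    }
}
```

In the sample above, `A' Chìoch` appears both before and after the plain `A' Chioch` entries, which suggests the file's own sort order ignores accents. If so, an accent-blind comparison like this one is also what a search relying on that order would need.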

The file is sorted numerically on field 1 (i.e. 1 - 258422), and field 3 is in alphabetical order. Field 2 and all the other fields are effectively in random order.

Some sort of caseless 'closest match' style search is what I need.

My first idea was to break the file down into 'A', 'B', 'C' ... on field 3, but there is no possibility I can do that....

I have already spent a significant amount of time on this problem myself, and basically achieved sweet Fanny Adams! Any and all help would be most gratefully received and appreciated!!

Any ideas please?

This is one 'small' problem in a MUCH larger overall project I have brewing, further details to be announced once more progress has been made!

Regards and thanks,

Graham

system · August 21, 2013, 5:31pm

~~speed~~ safety cameras?
Would it be simpler to reorganise the data and have separate index files, based on place-name/lat-long/whatever?
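
As a rough illustration of the index-file idea, the sketch below records the byte offset of every line start as a fixed-width 4-byte value, so record `i` can later be reached with one seek into the index and one seek into the data. It is written against standard C `FILE *` so it can be tried on a PC; the function names and the little-endian 4-byte layout are assumptions made for this example, and on the Arduino the same seeks and reads would go through the SD library's file object instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write the byte offset of the start of every line in `data` to `index`
 * as 4-byte little-endian values. Returns the line count, or -1 on a
 * write error. */
long build_line_index(FILE *data, FILE *index)
{
    assert(data != NULL && index != NULL);
    long count = 0;
    uint32_t offset = 0;
    int at_line_start = 1;
    int c;
    rewind(data);
    while ((c = fgetc(data)) != EOF) {
        if (at_line_start) {
            unsigned char le[4];
            for (int k = 0; k < 4; k++)
                le[k] = (unsigned char)((offset >> (8 * k)) & 0xFF);
            if (fwrite(le, 1, 4, index) != 4) return -1;
            count++;
            at_line_start = 0;
        }
        if (c == '\n') at_line_start = 1;
        offset++;
    }
    return count;
}

/* Read entry `i` back from the index: one seek, one 4-byte read.
 * Returns the data-file offset, or -1 if the entry cannot be read. */
long index_lookup(FILE *index, long i)
{
    unsigned char le[4];
    uint32_t v = 0;
    if (fseek(index, i * 4L, SEEK_SET) != 0) return -1;
    if (fread(le, 1, 4, index) != 4) return -1;
    for (int k = 3; k >= 0; k--)
        v = (v << 8) | le[k];
    return (long)v;
}
```

For roughly 260,000 lines such an index would be about 1 MB. Because the entries are fixed width and follow the file's existing order, they can be probed by record number without reading the 27 MB data file from the start.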

tylernt · August 23, 2013, 7:07pm

EDIT: It's a "binary" search. This is regular C code for searching an array, but could easily be adapted to work with a file on an SD card on an Arduino:
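
A minimal sketch of what such an array search might look like (not tylernt's original listing, which is not reproduced here), assuming the names are already sorted in `strcmp()` order. `lower_bound` is a hypothetical name. It returns the position of the first entry that is not less than the key, which is either the exact match or the nearest following neighbour, so it doubles as a simple 'closest match':

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first element of the sorted array `names`
 * (length n) that is >= key; returns n if every element is smaller. */
size_t lower_bound(const char *const names[], size_t n, const char *key)
{
    assert(names != NULL && key != NULL);
    size_t lo = 0, hi = n;               /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2; /* midpoint without overflow */
        if (strcmp(names[mid], key) < 0)
            lo = mid + 1;                /* key lies after mid */
        else
            hi = mid;                    /* key is at mid or before it */
    }
    return lo;
}
```

Each pass halves the window, so about 18 comparisons are enough for 260,000 records. To work on the SD file, `names[mid]` would be replaced by reading record `mid` from the card, and `strcmp()` by whatever caseless comparison matches the file's sort order.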
