I have a file stored on the SDC that has many rows (say about 200) with the following format:
<#123, CDEFGHI, LMNOPQRST, UVWX, YZ1234567890>
It's created from an Excel CSV file, and each record has a fixed length.
The numeric field after the '#' (call it ABC) is the unique identifier. Based on a user request, I need to be able to access the record that matches the criteria and display each of the 5 fields on a display unit.
Once it is displayed, some operations are done, and when the user enters the next record ID I need to mark the just-displayed record with some remark like "OK".
What I want to know is:
Option 1: Use the ABC field as a multiplier and access each field in the record with an offset. Thus, to reach record number 10, I use 10 x 42, where 42 is the length of each record. Having reached it, I read each field using seek() with an offset for each field. For field 2 it will be (10 x 42) + 6, and so on.
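The arithmetic in Option 1 can be sketched in plain C++. This is only an illustration: the field offsets below are hypothetical and depend on the actual row layout, and on the Arduino the computed position would be passed to File.seek() followed by a read, rather than indexing a buffer.

```cpp
#include <string>

// Assumed fixed record length, as in the post (42 bytes per row).
const long RECORD_LEN = 42;

// Hypothetical byte offsets of each field inside one record.
// Offset 6 for the second field matches the post's "(10x42)+6" example;
// the rest must be worked out from the real row layout.
const long FIELD_OFFSET[5] = {1, 6, 14, 24, 29};

// Byte position of field f (0-4) of record n (0-based):
// seek here, then read the field's fixed width.
long fieldPos(long n, int f) {
  return n * RECORD_LEN + FIELD_OFFSET[f];
}

// On the Arduino this would be used roughly as:
//   file.seek(fieldPos(10, 1));   // jump straight to field 2 of record 10
//   file.read(buf, FIELD_WIDTH);  // then read that field's fixed width
```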
Option 2: Reach the required record by searching for '#', then reading the 3 digits after it and matching them against the user criteria. Having reached it, loop to read the consecutive fields using the comma delimiter.
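Option 2's scan-and-split approach can be sketched like this, using an in-memory string in place of the SD file stream (on the Arduino you would read characters from the File object instead; function names here are illustrative):

```cpp
#include <string>
#include <vector>
#include <sstream>

// Split one <...> record into its comma-separated fields.
std::vector<std::string> splitRecord(const std::string& rec) {
  std::vector<std::string> fields;
  std::string body = rec.substr(1, rec.size() - 2);  // strip '<' and '>'
  std::stringstream ss(body);
  std::string field;
  while (std::getline(ss, field, ',')) {
    if (!field.empty() && field[0] == ' ') field.erase(0, 1);  // drop ", " space
    fields.push_back(field);
  }
  return fields;
}

// Scan the file text for '#', compare the 3 digits after it with the
// requested ID, and return the matching record's fields (empty if none).
std::vector<std::string> findRecord(const std::string& file, const std::string& id3) {
  for (size_t i = 0; (i = file.find('#', i)) != std::string::npos; ++i) {
    if (file.compare(i + 1, 3, id3) == 0) {
      size_t start = file.rfind('<', i);
      size_t end = file.find('>', i);
      return splitRecord(file.substr(start, end - start + 1));
    }
  }
  return {};
}
```

Note that every lookup walks the file from the beginning, which is the cost PaulS points out below.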
Option 1 involves some maths, and when the record size changes it requires a lot of corrections to the search function.
Option 2 is more generic and can handle record size changes with little or no correction.
Which of the above is reliable and good? Or is there another method?
Thanks
I think that you've already answered that question.
Perhaps a more relevant question is "Which method is faster?". Obviously, not having to read each character is going to be much faster. Method 1 jumps to the start of record n. Method 2 reads each character, to see if it is the start of record n. Obviously method 1 is going to be much faster.
PaulS:
I think that you've already answered that question.
Perhaps a more relevant question is "Which method is faster?". Obviously, not having to read each character is going to be much faster. Method 1 jumps to the start of record n. Method 2 reads each character, to see if it is the start of record n. Obviously method 1 is going to be much faster.
Except the poster told us that the file was created with Excel. Excel doesn't save CSV files with a fixed record length (as far as I know).
Just saving a spreadsheet as CSV will usually result in records that are different length.
You could try some tricks to make all the fields the same width (such as having the record number start at 100, and manually making sure all the text fields are the same length).
The next issue is adding a code at the end of each line. Adding extra characters in the middle of an existing file generally involves copying everything to a new file. VERY slow. One way to deal with this is to start with an additional field at the end of each record, maybe ",NO". Then you can change the "NO" to "OK" and the file stays the same length.
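Because "NO" and "OK" are the same length, that status can be flipped by overwriting two bytes in place, with no copying and no change to the file size. A minimal sketch of the idea in plain C++ (on the Arduino, SD.h's File supports the same pattern via seek() followed by write(); the function name and the offset parameter here are illustrative):

```cpp
#include <fstream>
#include <string>

// Overwrite the 2-character status field at byte offset `pos` in place.
// Works only because the replacement is exactly the same length.
void setStatus(const std::string& path, long pos, const char* twoChars) {
  std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
  f.seekp(pos);
  f.write(twoChars, 2);
}
```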
One way to deal with these issues is to preprocess the file when the Arduino starts up. In the setup() function, read through each line of the CSV file and write out a new formatted file with all the records the same length.
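The core of that preprocessing pass is just padding each line out to a fixed width. A sketch, shown as a pure string transformation (on the Arduino it would read PLAN.CSV line by line and write the padded lines to a new file; the record length is whatever your longest row needs):

```cpp
#include <string>

// Pad a CSV line with trailing spaces so every record is recordLen bytes.
// Lines already longer than recordLen are returned unchanged (an error
// case the preprocessing step would need to flag).
std::string padRecord(std::string line, size_t recordLen) {
  if (line.size() < recordLen) line.append(recordLen - line.size(), ' ');
  return line;
}
```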
I haven't tested this, but there is an extended database library on the Arduino Playground. Here is a post showing it used with an SD card:
You could create a new database file, then read the CSV file one line at a time and create one database record for each line in the CSV file. Now you have a database that can be easily searched and updated. IN THEORY, of course, since I haven't tried it yet.
Just saving a spreadsheet as CSV will usually result in records that are different length.
You can say that again!! Since I am pre-processing the Excel file with a LabVIEW app prior to transmitting it via Serial, I make sure that each row is transmitted as a fixed-length entity with proper start/end markers and delimiters.
I do as below:
I first remove the existing PLAN.CSV and ACTUAL.CSV from the SDC.
Then I get the original user Excel file, duly processed, onto the SDC as PLAN.CSV.
I immediately make a copy of this file, called ACTUAL.CSV, on the SDC.
The file structure is this: <#....., ....., ......, ....>
Once I finish processing each row in my code, I change all the '#' in ACTUAL.CSV to '&'.
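Since '#' and '&' are both single bytes, that marking step is the same in-place trick as the "NO" to "OK" change. A sketch on an in-memory copy of ACTUAL.CSV (on the Arduino the same one-byte change can be done with seek() to the marker's position followed by write('&'), so the file length never changes; the function name is illustrative):

```cpp
#include <string>

// Replace the '#' marker of a processed record with '&', in place.
// hashPos is the byte offset of that record's '#'.
void markProcessed(std::string& actual, size_t hashPos) {
  if (hashPos < actual.size() && actual[hashPos] == '#') actual[hashPos] = '&';
}
```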
Thanks for pointing me to the new link ... I will go through it and revert. So kind of you.
The sample rows of strings that I need to parse one at a time are attached, and a screenshot of the parsed output is also attached. I have two minor (?) issues in it, as commented in the code.
I have tried my level best to spot the reason, but without luck.
(Not to worry about the shifted fields in the Serial output - this is due to the space padding done to keep the row width constant.)