I have a .txt file on an SD card containing 5000 rows. The data looks something like this:
This is the FIRST line
And the second line
Miss SMITH
ISnt it greaT
and so on for five thousand rows
I want to search this .txt file (case insensitive) and return all of the rows that contain a provided keyword.
For example, given the keyword "is", I want the following rows returned from the data above:
This is the FIRST line
Miss SMITH
ISnt it greaT
I know that I need to be careful not to hog memory or processing power (I'm using an ESP32).
Are there any pointers or suggestions anyone can give on good examples, methods, libraries or functions that I should be using or avoiding before I start coding?
We need to see the code you have for reading the file. We need to know if you want to do a case sensitive search, as you implied, or a case insensitive search, as your example output implies, and exactly what you want to see as output - your input and output do not match.
PaulS:
We need to see the code you have for reading the file. We need to know if you want to do a case sensitive search, as you implied, or a case insensitive search, as your example output implies, and exactly what you want to see as output - your input and output do not match.
Updated my initial post to reflect this. Thanks for your guidance.
/*
String indexOf() and lastIndexOf() functions
Examples of how to evaluate, look for, and replace characters in a String
created 27 Jul 2010
modified 2 Apr 2012
by Tom Igoe
This example code is in the public domain.
http://www.arduino.cc/en/Tutorial/StringIndexOf
*/
void setup() {
// Open serial communications and wait for port to open:
Serial.begin(9600);
while (!Serial) {
; // wait for serial port to connect. Needed for native USB port only
}
// send an intro:
Serial.println("\n\nString indexOf() and lastIndexOf() functions:");
Serial.println();
}
void loop() {
// indexOf() returns the position (i.e. index) of a particular character in a
// String. For example, if you were parsing HTML tags, you could use it:
String stringOne = "<HTML><HEAD><BODY>";
int firstClosingBracket = stringOne.indexOf('>');
Serial.println("The index of > in the string " + stringOne + " is " + firstClosingBracket);
stringOne = "<HTML><HEAD><BODY>";
int secondOpeningBracket = firstClosingBracket + 1;
int secondClosingBracket = stringOne.indexOf('>', secondOpeningBracket);
Serial.println("The index of the second > in the string " + stringOne + " is " + secondClosingBracket);
// you can also use indexOf() to search for Strings:
stringOne = "<HTML><HEAD><BODY>";
int bodyTag = stringOne.indexOf("<BODY>");
Serial.println("The index of the body tag in the string " + stringOne + " is " + bodyTag);
stringOne = "<UL><LI>item<LI>item<LI>item</UL>";
int firstListItem = stringOne.indexOf("<LI>");
int secondListItem = stringOne.indexOf("<LI>", firstListItem + 1);
Serial.println("The index of the second list tag in the string " + stringOne + " is " + secondListItem);
// lastIndexOf() gives you the last occurrence of a character or string:
int lastOpeningBracket = stringOne.lastIndexOf('<');
Serial.println("The index of the last < in the string " + stringOne + " is " + lastOpeningBracket);
int lastListItem = stringOne.lastIndexOf("<LI>");
Serial.println("The index of the last list tag in the string " + stringOne + " is " + lastListItem);
// lastIndexOf() can also search for a string:
stringOne = "<p>Lorem ipsum dolor sit amet</p><p>Ipsem</p><p>Quod</p>";
int lastParagraph = stringOne.lastIndexOf("<p");
int secondLastGraf = stringOne.lastIndexOf("<p", lastParagraph - 1);
Serial.println("The index of the second to last paragraph tag " + stringOne + " is " + secondLastGraf);
// do nothing while true:
while (true);
}
Thank you. OK, let me see if I can pull this off the second time around...
We need to see the code you have for reading the file.
No code yet. I'm hoping for any high-level recommendations on functions, examples, methods, libraries and best practices that can help steer me in the right direction before I jump in head first and make something incredibly inefficient.
We need to know if you want to do a case sensitive search, as you implied, or a case insensitive search, as your example output implies,
Case insensitive throughout for both input and output.
and exactly what you want to see as output - your input and output do not match.
I would want to see this as my output:
This is the FIRST line
Miss SMITH
ISnt it greaT
but I would be equally happy with:
this is the first line
miss smith
isnt it great
No code yet. I'm hoping for any high-level recommendations on functions, examples, methods, libraries and best practices that can help steer me in the right direction before I jump in head first and make something incredibly inefficient.
There are many choices that you have to make. You need to write code to read the file. Try that, doing nothing with the data beyond printing it.
Then, figure out how to store the data. Storing the data in a String is easy, but there are serious issues related to the String class on the Arduino. Storing the data in a string (a NUL-terminated char array) is only marginally more difficult, but you do need to decide what maximum record length you want to deal with.
Speaking of records, you need to decide what constitutes a record, so you can recognize when you have encountered the end of a record, so you know that it is now time to determine if the record contains the string of interest.
How you do that depends on whether you are saving the record in a String or in a string. Both String methods and string functions are well documented. strstr() might prove useful.
If you determine that the string of interest is in the record, print the record.
Then, clean up the string or String, and repeat the reading process, until you encounter the end of the file.
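The steps above could be sketched roughly as follows. This is only an illustration in plain C++ under a few assumptions: records end at a newline, the maximum record length is 100 bytes including the terminator, and the names RecordReader, feed and recordContains are made up for this example. On the board, the characters would come from whatever file-reading call you settle on.

```cpp
#include <cstddef>
#include <cstring>

// Assumed maximum record length, including the terminating NUL.
const std::size_t MAX_RECORD = 100;

// Hypothetical helper: collects characters into a fixed-size,
// NUL-terminated record and reports when a record is complete.
struct RecordReader {
  char record[MAX_RECORD];
  std::size_t length = 0;
  bool pendingReset = false;

  // Feed one character. Returns true when a newline completes a record;
  // the finished text is then in `record`. The next call starts a new record.
  bool feed(char c) {
    if (pendingReset) {        // "clean up" the previous record
      length = 0;
      pendingReset = false;
    }
    if (c == '\r') return false;   // ignore carriage returns (Windows line endings)
    if (c == '\n') {
      record[length] = '\0';
      pendingReset = true;
      return true;
    }
    if (length < MAX_RECORD - 1) {
      record[length++] = c;    // characters beyond the limit are dropped
    }
    return false;
  }
};

// Case-sensitive check: strstr() returns a null pointer when there is no match.
bool recordContains(const char *record, const char *keyword) {
  return std::strstr(record, keyword) != nullptr;
}
```

Each time feed() returns true, pass the record to recordContains() and print it if it matches. Note that strstr() is case sensitive, so "ISnt it greaT" would not match "is" with this version.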
And what kind of board are you using? 5000 rows is a lot to process for an Uno or Nano, and even a Mega will have to deal with its limitations. A case-insensitive search is easiest if you convert everything to lower (or upper) case first and then search; once a keyword is found, you can refer back to the location in the original file to create the desired output.
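A minimal sketch of the "convert to lower case first" idea, in plain C++. It lower-cases a copy of each line, so the original text is still available for printing. The helper names toLowerCopy and lineMatches and the 100-byte line limit are assumptions for illustration, and the keyword is expected to be lower case already.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Copy src into dst as lower case, truncating to fit dstSize (including the NUL).
void toLowerCopy(const char *src, char *dst, std::size_t dstSize) {
  if (dstSize == 0) return;
  std::size_t i = 0;
  for (; src[i] != '\0' && i < dstSize - 1; ++i) {
    dst[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(src[i])));
  }
  dst[i] = '\0';
}

// True if `line` contains `lowerKeyword`, ignoring the case of the line.
// `lowerKeyword` must already be lower case. Only the first 99 characters
// of the line are examined (assumed maximum line length).
bool lineMatches(const char *line, const char *lowerKeyword) {
  char lowered[100];
  toLowerCopy(line, lowered, sizeof(lowered));
  return std::strstr(lowered, lowerKeyword) != nullptr;
}
```

When lineMatches() returns true, print the untouched line rather than the lowered copy, which gives the first of the two output styles the original poster asked for.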
The size of the file shouldn't matter. Just read it line by line into a buffer (whose size is important) and Serial.print the buffer if it contains the text being sought.
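To show why the buffer size matters more than the file size, here is a rough host-side sketch that uses the standard C fgets() as a stand-in for whichever line reader the SD library provides. The function name printMatchingLines and the 100-byte buffer are assumptions. Memory use stays at one buffer regardless of how many rows the file has, but any line longer than the buffer is cut short.

```cpp
#include <cstdio>
#include <cstring>

// Writes every line of `in` that contains `keyword` (case sensitive) to `out`
// and returns the number of matching lines.
int printMatchingLines(std::FILE *in, const char *keyword, std::FILE *out) {
  char line[100];   // assumed buffer size; the only per-line memory used
  int matches = 0;
  while (std::fgets(line, sizeof(line), in) != nullptr) {
    std::size_t len = std::strlen(line);
    // fgets() keeps the newline, so a missing one means the line was
    // truncated (or it is the last line and the file has no final newline).
    bool complete = (len > 0 && line[len - 1] == '\n');
    if (std::strstr(line, keyword) != nullptr) {
      std::fputs(line, out);
      if (!complete) std::fputc('\n', out);
      ++matches;
    }
    if (!complete) {
      // Skip the rest of an over-long line so it is not read as a new line.
      int c;
      while ((c = std::fgetc(in)) != EOF && c != '\n') {
      }
    }
  }
  return matches;
}
```

A keyword that only appears in the skipped tail of an over-long line is missed, which is the trade-off to weigh when choosing the buffer size.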
On my ESP32 it takes 580 milliseconds to search through 5000 records and pull out any keyword matches. More than acceptable for my purposes.
#include <SdFat.h>

const uint8_t chipSelect = 5;  // SD chip select pin
SdFat sd;
SdFile file;
const size_t LINE_DIM = 100;   // Maximum line length plus space for zero byte
char line[LINE_DIM];

void setup(void) {
  Serial.begin(57600);
  String keyword = "fruit";    // keyword IS case sensitive
  String datatoreview;         // copy of the current line so indexOf() can be used - probably a smarter way to do this!
  int totalcounter = 0;        // number of lines read from the file
  int selectedcounter = 0;     // number of lines that contain the keyword
  int n;                       // signed, so an error return from fgets() ends the loop

  // Try changing SPI_FULL_SPEED to SPI_QUARTER_SPEED if you cannot connect to the SD card
  if (!sd.begin(chipSelect, SPI_FULL_SPEED)) {
    Serial.println("SD Begin failed");
    return;                    // nothing to search without the card
  }
  if (!file.open("massivedatafile.txt", O_READ)) {
    Serial.println("open failed");
    return;
  }

  unsigned long millisStart = millis();  // start the stopwatch
  while ((n = file.fgets(line, sizeof(line))) > 0) {
    totalcounter++;
    datatoreview = line;
    if (datatoreview.indexOf(keyword) != -1) {
      Serial.print(line);      // print lines that match
      selectedcounter++;       // increment the counter for matching records
    }
  }
  unsigned long millisStop = millis();   // stop the stopwatch
  file.close();

  Serial.print("Task took ");
  Serial.print(millisStop - millisStart);
  Serial.print(" milliseconds to select ");
  Serial.print(selectedcounter);
  Serial.print(" records from a total of ");
  Serial.println(totalcounter);
}

void loop(void) { }
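The sketch above matches case sensitively, while the original question asked for a case-insensitive search. One possible way to close that gap, without copying each line into a String, is a small comparison helper like the one below. The name containsIgnoreCase is made up for this example, and it only handles ASCII letters.

```cpp
#include <cctype>

// Hypothetical helper: true if `needle` occurs anywhere in `haystack`,
// ignoring ASCII case. Works directly on char arrays, with no extra buffer.
bool containsIgnoreCase(const char *haystack, const char *needle) {
  if (*needle == '\0') return true;   // an empty keyword matches every line
  for (; *haystack != '\0'; ++haystack) {
    const char *h = haystack;
    const char *n = needle;
    // Walk both strings while the lower-cased characters agree.
    while (*h != '\0' && *n != '\0' &&
           std::tolower(static_cast<unsigned char>(*h)) ==
               std::tolower(static_cast<unsigned char>(*n))) {
      ++h;
      ++n;
    }
    if (*n == '\0') return true;      // reached the end of the keyword: match
  }
  return false;
}
```

In the sketch, the test datatoreview.indexOf(keyword) != -1 could then become containsIgnoreCase(line, "fruit"), and the datatoreview copy would no longer be needed. The timing quoted above was measured with the String version, so it may differ with this change.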