Creating a third .txt file based on two other .txt files (SOLVED)

Hey.
I don't have any code to post yet, but I have the idea and would like to know how it could be done in practice.

I have two .txt files stored in the SPIFFS of an ESP32-WROOM.

Each file has a few lines, and each line consists of a number, an alphanumeric identifier, and a name, like this:

FileA.txt
0,A0001,João
4,Z0144,Maria
3,Y0008,Kelly
1,B0003,Pedro
2,C0004,Alfredo

FileB.txt
0,A0001,João
3,Y0008,Kelly
4,Z0144,Maria

I need to create FileC.txt, using some if condition, so that it contains the lines of FileA.txt that are NOT in FileB.txt. It would look like this:

FileC.txt
1,B0003,Pedro
2,C0004,Alfredo

The second field, the alphanumeric one (B0003, C0004, etc.), is the one that must be used for the comparison. All the other fields must be ignored by the code that does the search.

In other words, only the field after the first comma matters. The other fields are just copied along with the line.

Does anyone have a suggestion I could try?

Many thanks.

How big are the files? The approach will be different if both files will fit in memory vs having to read the files multiple times.

Do you intend to do this on a PC or on a microcontroller?
Which controller do you intend to use?
One way, though very inefficient and time-consuming, would be to read one line from the first file, then read through the whole second file and check for a match.
If there is no match, write the line to the third file, File C.
Then read the second line from the first file, and so on.
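
The line-by-line approach described above could look roughly like this. It is only a sketch in standard C++ so that it can be tried on a PC: std::istream and std::ostream stand in for the SPIFFS files, and secondField and copyUnmatchedLines are hypothetical names, not part of any library. On the ESP32 the same loop would work on the File objects returned by SPIFFS.open instead.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text between the first and second comma,
// or an empty string if the line has fewer than two commas.
std::string secondField(const std::string& line) {
    std::size_t c1 = line.find(',');
    if (c1 == std::string::npos) return "";
    std::size_t c2 = line.find(',', c1 + 1);
    if (c2 == std::string::npos) return "";
    return line.substr(c1 + 1, c2 - c1 - 1);
}

// For every line of A, rescan all of B from the start. The line is copied
// to C only when no line of B has the same second field.
void copyUnmatchedLines(std::istream& fileA, std::istream& fileB,
                        std::ostream& fileC) {
    std::string lineA, lineB;
    while (std::getline(fileA, lineA)) {
        const std::string key = secondField(lineA);
        if (key.empty()) continue;      // skip blank or malformed lines

        fileB.clear();                  // reset the EOF state of the last pass
        fileB.seekg(0);                 // rewind B to its first line
        bool found = false;
        while (!found && std::getline(fileB, lineB)) {
            found = (secondField(lineB) == key);
        }
        if (!found) fileC << lineA << '\n';
    }
}
```

This needs almost no RAM, but File B is read once for every line of File A, which is why it gets slow as the files grow.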

As asked in reply #2, how many lines are in those files?

Can you paste it into a spreadsheet program? If so, it should be easy to solve. You can use DOS: enter x:type file.txt > c.txt and, when the prompt comes back, type x:type file.txt >> c.txt. When the prompt comes back again, c.txt will have both files concatenated in it. On Linux, use "cat" in place of type.

It does sound like a task better fitted to a PC (whether Windows, *nix, or Mac).

Simpler:

copy a.txt + b.txt c.txt

A maximum of 50Kb for each file. Thanks for trying to help.

I intend to use only the ESP32, with each file at a maximum of 50Kb.

I will not use a PC, just the ESP32, to create and add lines to the three files FileA, FileB and FileC.

If there is sufficient memory on the ESP32, read in the 2nd file, storing in an array only the field you are concerned with. Then start copying the 1st file to the output file, searching the array for each line and skipping the line when there is a match.
The search is much faster if you sort the array first.
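
A minimal sketch of that idea, assuming the key fields of the 2nd file fit in RAM. It is written in standard C++ so it can be tested on a PC; the streams stand in for the SPIFFS files, and keyOf, loadSortedKeys and writeDifference are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the field between the first and second comma,
// or an empty string when the line does not have two commas.
std::string keyOf(const std::string& line) {
    std::size_t c1 = line.find(',');
    if (c1 == std::string::npos) return "";
    std::size_t c2 = line.find(',', c1 + 1);
    if (c2 == std::string::npos) return "";
    return line.substr(c1 + 1, c2 - c1 - 1);
}

// Read the 2nd file once, keep only the key field, then sort the array
// so that it can be searched with a binary search.
std::vector<std::string> loadSortedKeys(std::istream& fileB) {
    std::vector<std::string> keys;
    std::string line;
    while (std::getline(fileB, line)) {
        const std::string key = keyOf(line);
        if (!key.empty()) keys.push_back(key);
    }
    std::sort(keys.begin(), keys.end());
    return keys;
}

// Copy the 1st file to the output, skipping every line whose key
// is found in the sorted array.
void writeDifference(std::istream& fileA,
                     const std::vector<std::string>& keys,
                     std::ostream& fileC) {
    std::string line;
    while (std::getline(fileA, line)) {
        const std::string key = keyOf(line);
        if (key.empty()) continue;
        if (!std::binary_search(keys.begin(), keys.end(), key)) {
            fileC << line << '\n';
        }
    }
}
```

Each file is read only once. Only the keys are kept in memory, not the whole lines, so the array stays much smaller than the file itself.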

What does each line, record, look like?

FileA.txt

3,A003,EMILLY MONTEIRO,F3
2,A002,LUCCA DA MATA,F2
1,A001,SERGIO FRANCA,F1
4,A004,JOANA FRANCA,EM1
6,G001,DAVI LUIZ ARAUJO,EM3
3,A003,EMILLY MONTEIRO,F3
12,H0123,CLARA ALMEIDA,EM3
11,F786,ANA LAURA BARBOSA,EM2

FileB.txt

3,A003,EMILLY MONTEIRO,F3,Quarta-feira,26/07/2023,10:04:36
6,G001,DAVI LUIZ ARAUJO,EM3,Quinta-feira,27/07/2023,18:26:42
3,A003,EMILLY MONTEIRO,F3,Quinta-feira,27/07/2023,19:14:51
12,H0123,CLARA ALMEIDA,EM3,Quinta-feira,29/07/2023,17:24:52
11,F786,ANA LAURA BARBOSA,EM2,Quinta-feira,29/07/2023,17:25:10
4,A004,JOANA FRANCA,EM1,Quinta-feira,27/07/2023,11:20:07

FileC.txt (it should contain the lines that are in FileA.txt and NOT in FileB.txt, using the second field, the code after the first comma, as the search key):

2,A002,LUCCA DA MATA,F2
1,A001,SERGIO FRANCA,F1

What would a piece of code look like that creates FileC.txt when some if condition is met?

FileC.txt = Repeats the lines of FileA.txt that are NOT in FileB.txt

Hi @sergioarduino,

here is a sketch that seems to do what you intend:

#include <list>
#include <SD.h>
#define CS_PIN 5

File root;
std::list<String> indexList;
std::list<String> outList;

void setup() {
  Serial.begin(115200);

  Serial.print("Initializing SD card... ");

  if (!SD.begin(CS_PIN)) {
    Serial.println("Card initialization failed!");
    while (true);
  }

  Serial.println("initialization done.\n");
  createIndexList("FileB.txt");
  evaluateFile("FileA.txt");
  writeResultToFile("FileC.txt");
}

void loop() {
  delay(100);
  // nothing happens after setup finishes.
}

void createIndexList(String aFile){
  indexList.clear();
  String aLine = "";
  File textFile = SD.open("/"+aFile);
  if (textFile) {
    Serial.println("Creating Index List from "+aFile);
    while (textFile.available()) {
      char c = textFile.read();
      if (c >= ' ') {
        aLine = aLine + c;
      } 
      if (c == 10){
        String idx = stripIndex(aLine);
        if (idx > "") {
          indexList.push_back(idx);
        }
        aLine = "";
      }
    }
    textFile.close();
  } else {
    Serial.println("error opening "+aFile);
  }
  Serial.println();
  showList();
} 

String stripIndex(String Line){
  int c1 = Line.indexOf(',');
  int c2 = Line.indexOf(',', c1+1);
  String index = "";
  if (c1 >= 0 && c2 > c1){
    index = Line.substring(c1+1, c2);
  }
  return index;
}

void showList(){
  Serial.println("===== Index List ========");
  for (auto idx : indexList){
    Serial.println(idx);
  }
  Serial.println("=========================\n");
}  

void evaluateFile(String aFile){
  String aLine = "";
  outList.clear();
  File textFile = SD.open("/"+aFile);
  if (textFile) {
    Serial.println("Evaluating "+aFile);
    Serial.println("\n======= New Entries ========");
    while (textFile.available()) {
      char c = textFile.read();
      if (c >= ' ') {
        aLine = aLine + c;
      } 
      if (c == 10){
        String idx = stripIndex(aLine);
        if (idx > "") {
          if (!indexFound(idx)) {
            Serial.println(idx);
            outList.push_back(aLine);
          }
        }
        aLine = "";
      }
    }
    textFile.close();
  } else {
    Serial.println("error opening "+aFile);
  }
  Serial.println("============================\n"); 
} 

boolean indexFound(String aIndex){
  boolean found = false;
  for (auto ix : indexList){
    if (aIndex == ix) found = true;
  }
  return found;
} 

void writeResultToFile(String wFile){
  File tFile = SD.open("/"+wFile, FILE_WRITE);
  if (tFile) {
    Serial.println("Writing to "+wFile);
    Serial.println("\n========= Entries ========");
    for (auto txt : outList){
       tFile.println(txt);
       Serial.println(txt);
    }
    tFile.close();
    Serial.println("============================\n"); 
  } else {
    Serial.println("error opening "+wFile);
  }
} 

I have not checked the sketch for efficiency or memory consumption. There may be some flaws, but it could be a starting point for further steps.

You can try it on Wokwi.

I could not check writing to the SD card on Wokwi, but you can try that in a real application.

Good luck!

Fantastic.
Thank you very much.

I don't have an SD card module. I'm using SPIFFS, so I made this modification. I just don't understand why evaluateFile finds only the first new line. Even though there are more lines below, that is, more entries that exist in FileA and do not exist in FileB, it only reports the first one. I wonder why?

#include <list>
#include "FS.h"
#include "SPIFFS.h"
//#include <SD.h>
//#define CS_PIN 5

File root;
std::list<String> indexList;
std::list<String> outList;

void listDir(fs::FS &fs, const char * dirname, uint8_t levels)  {
 Serial.printf("Listing directory: %s\r\n", dirname);
 File root = fs.open(dirname);
   if(!root) {
      Serial.println("− failed to open directory");
      return;
   }
   if(!root.isDirectory()) {
      Serial.println(" − not a directory");
      return;
   }

   File file = root.openNextFile();
   while(file){
      if(file.isDirectory()){
         Serial.print("  DIR : ");
         Serial.println(file.name());
         if(levels){
            listDir(fs, file.name(), levels -1);
         }
      } else {
         Serial.print("  FILE: ");
         Serial.print(file.name());
         Serial.print("\tSIZE: ");
         Serial.println(file.size());
      }
      file = root.openNextFile();
   }
}

void readFile(fs::FS &fs, const char * path) {
   Serial.printf("Reading file: %s\r\n", path);

   File file = fs.open(path);
   if(!file || file.isDirectory()) {
       Serial.println("− failed to open file for reading");
       return;
   }

   Serial.println("− read from file:");
   while(file.available()){
      Serial.write(file.read());
   }
}

void writeFile(fs::FS &fs, const char * path, const char * message) {
   Serial.printf("Writing file: %s\r\n", path);

   File file = fs.open(path, FILE_WRITE);
   if(!file){
      Serial.println("− failed to open file for writing");
      return;
   }
   if(file.print(message)) {
      Serial.println("− file written");
   } else {
      Serial.println("− write failed");
   }
}

void appendFile(fs::FS &fs, const char * path, const char * message) {
   Serial.printf("Appending to file: %s\r\n", path);

   File file = fs.open(path, FILE_APPEND);
   if(!file) {
      Serial.println("− failed to open file for appending");
      return;
   }
   if(file.print(message)) {
      Serial.println("− message appended");
   } else {
      Serial.println("− append failed");
   }
}

void renameFile(fs::FS &fs, const char * path1, const char * path2) {
   Serial.printf("Renaming file %s to %s\r\n", path1, path2);
   if (fs.rename(path1, path2)) {
      Serial.println("− file renamed");
   } else {
      Serial.println("− rename failed");
   }
}

void deleteFile(fs::FS &fs, const char * path) {
   Serial.printf("Deleting file: %s\r\n", path);
   if(fs.remove(path)) {
      Serial.println("− file deleted");
   } else {
      Serial.println("− delete failed");
   }
}

void setup() {
  Serial.begin(115200);

  // Initialize SPIFFS
  if(!SPIFFS.begin(true)){
      Serial.println("SPIFFS MOUNT FAILED");
      return;
  }
//  Serial.print("Initializing SD card... ");
//  if (!SD.begin(CS_PIN)) {
//    Serial.println("Card initialization failed!");
//    while (true);
//  }

  Serial.println("initialization done.\n");
  createIndexList("FileB.txt");
  evaluateFile("FileA.txt");
  writeResultToFile("FileC.txt");
}

void loop() {
  delay(100);
  // nothing happens after setup finishes.
}

void createIndexList(String aFile){
  indexList.clear();
  String aLine = "";
  File textFile = SPIFFS.open("/"+aFile);
//  File textFile = SD.open("/"+aFile);
  if (textFile) {
    Serial.println("Creating Index List from "+aFile);
    while (textFile.available()) {
      char c = textFile.read();
      if (c >= ' ') {
        aLine = aLine + c;
      } 
      if (c == 10){
        String idx = stripIndex(aLine);
        if (idx > "") {
          indexList.push_back(idx);
        }
        aLine = "";
      }
    }
    textFile.close();
  } else {
    Serial.println("error opening "+aFile);
  }
  Serial.println();
  showList();
} 

String stripIndex(String Line){
  int c1 = Line.indexOf(',');
  int c2 = Line.indexOf(',', c1+1);
  String index = "";
  if (c1 >= 0 && c2 > c1){
    index = Line.substring(c1+1, c2);
  }
  return index;
}

void showList(){
  Serial.println("===== Index List ========");
  for (auto idx : indexList){
    Serial.println(idx);
  }
  Serial.println("=========================\n");
}  

void evaluateFile(String aFile){
  String aLine = "";
  outList.clear();
  File textFile = SPIFFS.open("/"+aFile);
//  File textFile = SD.open("/"+aFile);
  if (textFile) {
    Serial.println("Evaluating "+aFile);
    Serial.println("\n======= New Entries ========");
    while (textFile.available()) {
      char c = textFile.read();
      if (c >= ' ') {
        aLine = aLine + c;
      } 
      if (c == 10){
        String idx = stripIndex(aLine);
        if (idx > "") {
          if (!indexFound(idx)) {
            Serial.println(idx);
            outList.push_back(aLine);
          }
        }
        aLine = "";
      }
    }
    textFile.close();
  } else {
    Serial.println("error opening "+aFile);
  }
  Serial.println("============================\n"); 
} 

boolean indexFound(String aIndex){
  boolean found = false;
  for (auto ix : indexList){
    if (aIndex == ix) found = true;
  }
  return found;
} 

void writeResultToFile(String wFile){
  File tFile = SPIFFS.open("/"+wFile, FILE_WRITE);
//  File tFile = SD.open("/"+wFile, FILE_WRITE);
  if (tFile) {
    Serial.println("Writing to "+wFile);
    Serial.println("\n========= Entries ========");
    for (auto txt : outList){
       tFile.println(txt);
       Serial.println(txt);
    }
    tFile.close();
    Serial.println("============================\n"); 
  } else {
    Serial.println("error opening "+wFile);
  }
} 

FileA.txt:
0,B123,USUARIO TESTE,E5
1,X001,SERGIO FRANCA,F1
2,H002,LUCCA DA MATA,F2
3,M789,EMILLY MONTEIRO,F3
4,S111,JOANA FRANCA,EM1

FileB.txt:
0,B123,USUARIO TESTE,E5,segunda-feira,31-07-2023,19:58:00
1,X001,SERGIO FRANCA,F1,segunda-feira,31-07-2023,19:58:00
2,H002,LUCCA DA MATA,F2,segunda-feira,31-07-2023,19:58:00

FileC.txt: (Should be)
3,M789,EMILLY MONTEIRO,F3
4,S111,JOANA FRANCA,EM1

But it just prints:
3,M789,EMILLY MONTEIRO,F3

I guess the reason is this:

The code as it stands expects a Line Feed after each relevant line. You can try adding an empty line after the last entry. That should work.

If you don't like that restriction, you can change the reading algorithm so that it processes a line when it sees a Line Feed ( c == 10 ) and also when aLine is not empty after EOF has been reached.
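
As a rough illustration of that change, here is the same char-by-char loop in standard C++, with the extra check after the loop. The stream stands in for the SPIFFS file and readLines is a hypothetical name. In the sketch above, the loop condition would be textFile.available() and the char would come from textFile.read().

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects lines char by char and also keeps whatever is left in the
// buffer once the end of the file is reached.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string aLine;
    char c;
    while (in.get(c)) {
        if (c == '\n') {                      // Line Feed (10) ends the line
            if (!aLine.empty()) lines.push_back(aLine);
            aLine.clear();
        } else if (static_cast<unsigned char>(c) >= ' ') {
            aLine += c;                       // drops '\r' and other controls
        }
    }
    // EOF reached: the last line had no Line Feed, so handle it here.
    if (!aLine.empty()) lines.push_back(aLine);
    return lines;
}
```

In evaluateFile and createIndexList the equivalent would be to run the same stripIndex handling once more after the while loop whenever aLine is not empty.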

You are amazing.
Cheers.
Hugs.

Thanks .. But it was a "flaw" in my quick solution... :wink:

It's a minor thing. What matters is your willingness to help.


How does this part look?

if (c == 10){
        String idx = stripIndex(aLine);
        if (idx != "") {
//        if (idx > "") { 
           if (!indexFound(idx)) {
            Serial.println(idx);
            outList.push_back(aLine);
           }          
        }
       aLine = "";
      }
    }
    textFile.close();