Reading only the first character of each line on the SD card

Hi

I'm still searching for an easy, fast way to read a file on an SD card.
My intention is to find out how many lines are written in the file.

Do I have to read the whole line until I reach an end of line ('\n'),
then read again (it automatically reads the next line, I suppose), and so on until I find an EOF?

Or is it possible to read the first character, then check whether that first character is an EOF,
and if not, go to the next line?

And what about the read pointer: if I read a file, close it, and want to read it again, will it start again at the first line?
..

greetings

If you have fixed length lines/records in the file then get the size of the file and divide by the size of the record to get the number of lines/records. Remember to add the \n to the size of the record.
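As a rough sketch of the fixed-length case, the arithmetic could look like the following. The helper name recordCount and its parameters are made up for illustration, and it assumes every line ends in a single '\n' (files written with "\r\n" line endings would need +2 instead of +1).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of fixed-length records in a file.
// payloadLen is the number of characters per line, not counting the '\n'.
uint32_t recordCount(uint32_t fileSize, uint32_t payloadLen) {
  const uint32_t recordLen = payloadLen + 1;  // add the '\n' terminator
  assert(recordLen > 0);                      // guards against wrap-around
  return fileSize / recordLen;
}
```

On the Arduino the first argument would be the value returned by file.size(), for example recordCount(file.size(), 30) for lines of 30 characters.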

If you don't, then you need to scan the file and keep reading each character, counting the \n, until you reach end of file. The count will be the number of lines. This is a simple while loop.

Every time you open a file for reading the file pointer should be reset to the start of the file.

Thanks for the answer..

The lines in my file are not all the same length in bytes, so file.size() cannot be used.

I hoped I could skip most of the characters in the file to save time. I have no idea how fast the Arduino can read.
My file normally consists of 296 lines and around 9 kbytes; each byte is one character, I think.

greetings

Here is a sketch that demonstrates counting lines.

#include <SD.h>
#define BUF_DIM 32
#define CHIP_SELECT 10
File file;
//-------------------------------------------------------
uint32_t lineCount(File* f) {
  char buf[BUF_DIM];
  uint32_t nl = 0;
  int n;
  
  // rewind file
  f->seek(0);

  // count lines
  while ((n = f->read(buf, sizeof(buf))) > 0) {
    for (int i = 0; i < n; i++) {
      if (buf[i] == '\n') nl++;
    }
  }
  return nl;
}
//-------------------------------------------------------
void setup() {
  Serial.begin(9600);
  Serial.println("start");
  if (!SD.begin(CHIP_SELECT)) {
    Serial.println("begin error");
    return;
  }
  file = SD.open("TEST.TXT");
  if (!file) {
    Serial.println("open error");
    return;
  }
  uint32_t m = micros();
  uint32_t n = lineCount(&file);
  m = micros() - m;
  Serial.print(file.size());
  Serial.println(" bytes");
  Serial.print(n);
  Serial.println(" lines");
  Serial.print(m);
  Serial.println(" micros");
  Serial.println("done");  
}
void loop() {
}

Here is output from a test file:

start
9388 bytes
299 lines
53084 micros
done

It takes about 0.053 seconds with a 32 byte read buffer.

The time varies with buffer size. Here are other results:

One byte read buffer.

start
9388 bytes
299 lines
327632 micros
done

Ten byte read buffer.

start
9388 bytes
299 lines
72732 micros
done

100 byte read buffer.

start
9388 bytes
299 lines
47452 micros
done


Thanks for the code and samples.

Sorry, I'm not so familiar with the code; could you please explain it in more detail?

? uint32_t // what is this type? An unsigned 32-bit int? What does the _t stand for?

? f->seek(0); // resetting the pointer?

while ((n = f->read(buf, sizeof(buf))) > 0) // read the file in chunks of size buf; if n == 0 there are no more characters
{
  for (int i = 0; i < n; i++) // loop to check whether any of the characters is an end of line
  {
    if (buf[i] == '\n') nl++;
  }
}

What will happen if there are lines with no characters, so only a '\n'? Will that be detected as EOF, or is n = 1 in this case?
Is n = 0 when EOF is detected?

About the buffer size: the larger the buffer, the faster the file is read,
but a buffer of size 1 results in a read time of 0.32 sec,
so a buffer of 32 is really fast at 0.053 sec.
I was thinking in the direction of this code:

File fh;
char ch;
int lineNo = 0;
int chPos = 0;
// available() returns 0 at end of file, so no separate EOF test is needed
while (fh.available() && chPos < 30)
{
  ch = fh.read(); // read character
  if (ch != '\n')
  {
    chPos++;
  }
  else
  {
    lineNo++;
    chPos = 0;
  }
}

Please put your code within code tags or it will look screwy. Use the # button on the posting page.

uint32_t

Unsigned 32 bit integer. The _t is just a convention to show this is a TYPE definition.

f->seek(0)

Go to the start of the file (seek to position 0 in the file).

What will happen if there are lines with no characters, so only a '\n'? Will that be detected as EOF, or is n = 1 in this case?
Is n = 0 when EOF is detected?

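As I read the code, the counter is only incremented if there is a \n. An empty line still consists of one \n character, so it is read and counted like any other line and is not mistaken for EOF; n is 0 only when there is nothing left to read. So if there is no \n then there is no line in the file.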
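One consequence of counting only \n is that a last line without a trailing \n is not counted. The following is a minimal sketch of one way to include it, written as plain C++ over a memory buffer rather than an SD file; the function name countLines is hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts '\n' bytes, plus one more if the data ends with an unterminated line.
uint32_t countLines(const char* data, size_t len) {
  assert(data != nullptr || len == 0);
  uint32_t nl = 0;
  for (size_t i = 0; i < len; i++) {
    if (data[i] == '\n') nl++;  // empty lines are single '\n' bytes and count too
  }
  if (len > 0 && data[len - 1] != '\n') nl++;  // last line has no '\n'
  return nl;
}
```

In a sketch that reads the file in chunks, the same idea would mean remembering the last byte seen and adding one after the loop if that byte was not '\n'.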

About the buffer size: the larger the buffer, the faster the file is read,
but a buffer of size 1 results in a read time of 0.32 sec,
so a buffer of 32 is really fast at 0.053 sec.

There is already buffering in the SD library so there is not much point creating really big buffers in your code (wastes memory for not much gain at all). What you need is a balance between excessive function calls (wasting CPU cycles) to get one character at a time out of the library, and the amount of local memory you are using. This will depend on your application, and some experiments are a good way to get parameters that help you understand the differences. There is a point where there are no more speed gains and this is probably the optimal balance.

What does this mean:

((n = f->read(buf, sizeof(buf))) > 0)

Every time, a buffer's worth of data is read, then every char is scanned, and if it is equal to "\n" the counter is increased.
Then the next read is done. But what does n stand for?
Is that the number of characters or bytes that were read?
So it is not scanning for an EOF signature, but reads until the number of bytes read is 0?

To be clear, there are two loops to look at:

while ((n = f->read(buf, sizeof(buf))) > 0)     //read a buffer in sequence of size buf, if n=0    no characters any more
  {
      for (int i = 0; i < n; i++)      // check in a loop  is one of the characters is a end of line
      {
        if (buf[i] == '\n') nl++;
      }
  }

n is the number of characters returned from the read function. The while loop will stop when there are no more characters available (n <= 0). The for loop is the one that is doing the counting by scanning the data read (0 to n bytes) and incrementing nl when a \n is found. The number of lines, at the end, is contained in the variable nl.
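To watch how n behaves without an SD card, the two loops can be tried on a desktop compiler against an in-memory stand-in. MemFile below is a hypothetical struct invented for this sketch; it is not the SD library's File class and only imitates the "returns the number of bytes copied, 0 at the end" behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical in-memory stand-in for a file (not the SD library's File).
struct MemFile {
  const char* data;
  size_t size;
  size_t pos;

  // Copies up to count bytes into buf and returns how many were copied;
  // returns 0 once the end of the data has been reached.
  int read(char* buf, size_t count) {
    assert(pos <= size);
    size_t left = size - pos;
    size_t n = count < left ? count : left;
    std::memcpy(buf, data + pos, n);
    pos += n;
    return static_cast<int>(n);
  }
};

// Same two-loop structure as lineCount() in the thread.
uint32_t lineCount(MemFile* f) {
  char buf[32];
  uint32_t nl = 0;
  int n;
  f->pos = 0;  // rewind
  while ((n = f->read(buf, sizeof(buf))) > 0) {
    for (int i = 0; i < n; i++) {
      if (buf[i] == '\n') nl++;
    }
  }
  return nl;
}
```

With 15 bytes of data and an 8 byte buffer, successive calls to read() return 8, then 7, then 0, and the 0 is what ends the while loop.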

If you ever need to understand what a function does, the best place to look is the function definition. You will find that the read function returns the number of characters read.

There is already buffering in the SD library so there is not much point creating really big buffers in your code (wastes memory for not much gain at all).

I wrote SdFat and an old version of SdFat is the base for the SD.h library. SD.h is just a simple wrapper to change the SdFat API.

The overhead for reading a single byte is very high, even though SdFat has an internal 512 byte block buffer. I used read(buf, count) to minimize this overhead.

From the above test you can see that a 32 byte buffer speeds up lineCount() by about a factor of six. There is very little memory cost for this buffer since it is allocated on the stack only during the call to lineCount(). The only problem could be stack overflow if lineCount() is called with less than about 40 bytes of free RAM.
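The factor can be checked from the timings posted earlier in the thread. The helper below is only a worked-arithmetic sketch; the name speedup is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Ratio of the one-byte-buffer time to the time measured with another buffer size.
double speedup(uint32_t oneByteMicros, uint32_t otherMicros) {
  assert(otherMicros > 0);
  return static_cast<double>(oneByteMicros) / otherMicros;
}
```

With the posted numbers, speedup(327632, 72732) is about 4.5 for the 10 byte buffer, speedup(327632, 53084) is about 6.2 for the 32 byte buffer, and speedup(327632, 47452) is about 6.9 for the 100 byte buffer, so going from 32 to 100 bytes gains relatively little.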

Thanks Marco_c, your code is clear now.

@fat16lib you wrote:

"There is very little memory cost for this buffer since it is allocated on the stack only during the call to lineCount()."

But if I'm correct, after use it will still be allocated in SRAM and cannot be used by other code anymore; it will not be released.

But if I'm correct, after use it will still be allocated in SRAM and cannot be used by other code anymore; it will not be released.

You are wrong. Local storage for functions is allocated on the stack. Study the following, it will help you learn to use functions more effectively.

Here is how C++ SRAM is allocated:

The globals area, where global variables are stored.
The heap, where dynamically allocated variables are allocated from.
The stack, where parameters and local variables are allocated from.

Here is how the stack is used in a function call. This is the sequence of steps that takes place when a function is called:

  1. The address of the instruction beyond the function call is pushed onto the stack. This is how the CPU remembers where to go after the function returns.

  2. Room is made on the stack for the function’s return type. This is just a placeholder for now.

  3. The CPU jumps to the function’s code.

  4. The current top of the stack is held in a special pointer called the stack frame. Everything added to the stack after this point is considered “local” to the function.

  5. All function arguments are placed on the stack.

  6. The instructions inside of the function begin executing.

  7. Local variables are pushed onto the stack as they are defined.

When the function terminates, the following steps happen:

  1. The function’s return value is copied into the placeholder that was put on the stack for this purpose.

  2. Everything after the stack frame pointer is popped off. This destroys all local variables and arguments.

  3. The return value is popped off the stack and is assigned as the value of the function. If the value of the function isn’t assigned to anything, no assignment takes place, and the value is lost.

  4. The address of the next instruction to execute is popped off the stack, and the CPU resumes execution at that instruction.

Typically, it is not important to know all the details about how the call stack works. However, understanding that functions are effectively pushed on the stack when they are called and popped off when they return gives you the fundamentals needed to understand recursion, as well as some other concepts that are useful when debugging.

Here is a sketch that illustrates stack usage.

uint16_t f() {
  char a[256];
  Serial.println(FreeRam());
  // make sure compiler allocates a[]
  for (int i = 0; i < 255; i++) a[i] = i;
  uint16_t r = 0;
  for (int i = 0; i < 255; i++) r += a[i];
  return r;
}
//----------------------------------------------------------------
void setup() {
  Serial.begin(9600);
  Serial.println(FreeRam());
  Serial.print(f());
  Serial.println(" f return value");
  Serial.println(FreeRam());
}
void loop() {}
//------------------------------------------------------------------------------
#ifdef __arm__
// should use unistd.h to define sbrk but Due causes a conflict
extern "C" char* sbrk(int incr);
#else  // __arm__
extern char *__brkval;
extern char __bss_end;
#endif  // __arm__

/** Amount of free RAM
 * \return The number of free bytes.
 */
int FreeRam() {
  char top;
#ifdef __arm__
  return &top - reinterpret_cast<char*>(sbrk(0));
#else  // __arm__
  return __brkval ? &top - __brkval : &top - &__bss_end;
#endif  // __arm__
}

It prints the following:

1824
1563
65409 f return value
1824

So, 1824 bytes of RAM were available before the call to f(). 1563 bytes were available while f() was executing, so f() used 261 bytes of stack. 1824 bytes were available again after f() returned.
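The return value 65409 can also be reproduced on a desktop compiler. This is a sketch under the assumption that char on the board is a signed 8-bit type, which is modelled here explicitly as int8_t; the function name sumLikeF and its count parameter are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the two loops in f(), with the element type made explicit.
// Assumption: the board's char is signed 8-bit, modelled here as int8_t.
uint16_t sumLikeF(int count) {
  assert(count >= 0 && count <= 256);
  int8_t a[256];
  // Values 128..254 wrap to -128..-2 when stored in a signed 8-bit element
  // (wrapping is what common compilers do for this conversion).
  for (int i = 0; i < count; i++) a[i] = static_cast<int8_t>(i);
  uint16_t r = 0;
  for (int i = 0; i < count; i++) r += a[i];  // result is reduced modulo 65536
  return r;
}
```

For count = 255 the elements 0..127 sum to 8128 and the wrapped elements -128..-2 sum to -8255, giving -127, which stored in a uint16_t is 65536 - 127 = 65409.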

Whoa, OK. I did not know that! :blush:

Thanks for the explanation, you know it very well :slight_smile:

many thanks ....