Sort a list of file names from SD card

I am attempting to alphabetically sort a list of files on an SD card, as part of a project to make an SD file browser using a Teensy 4.1 and a TFT display. Qsort seems to be a good solution for this, but all the examples I've seen use a char * array with the strings added in the declaration before Setup.

char *listItems[] = { "Zorro", "Alex", "Allan", "Celine", "Bill", "Forest", "Dexter" };

Since these are then in the stack rather than the heap, the array contents cannot be changed in runtime. This in unworkable because the array will need to change every time the directory changes.
Below is some amalgamated code from a couple of sources which seems to be able to write a list of files names to a char * array, then print it. However when I try to Qsort it, my Teensy stops, then reboots - I presume due to memory error. My limited knowledge cannot find what might be causing this, although I'm presuming it's a trivial oversight; if someone could point it out I'd be very grateful!

//main code from http://anyexample.com/programming/c/qsort__sorting_array_of_strings__integers_and_structs.xml
//char array population code from http://www.goodliffe.org.uk/arduino/sort_array.php

#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 


#define listItemsTotal 100 //maximum number of file names to store
#define listItemMaxChar 100 //maximum number of characters per file name, including null terminator
char *listItems[listItemsTotal];
char tempString[listItemMaxChar];//temporary string of the max number of characters per file name
byte listLength = 0;

#include <SD.h>
#define SDCARD_CS_PIN    BUILTIN_SDCARD
SdFat sd;
File dir;
File file;
char curDir[256] = "/"; //address of current directory being accessed, set to root

#include "SPI.h"


void setup()
{
  Serial.begin(9600);

  if (!(SD.begin(SDCARD_CS_PIN))) {
    while (1) {
      Serial.println("Unable to access the SD card");
      delay(500);
    }
  }

  scanDir();
 
/* print pointers array */ 
    for(int i=0; i<listLength; i++){
        Serial.println(listItems[i]);
    }

  Serial.println("Sorting....");  
  /* sort array using qsort functions */ 
  size_t listItems_len = sizeof(listItems) / sizeof(char *);
//  Serial.println(listItems_len);
    qsort(listItems, listItems_len, sizeof(char *), cstring_cmp);

  Serial.println("Result:");    
    /* print sorted string array */ 
    for(int i=0; i<listItemsTotal; i++){
      Serial.println(listItems[i]);
    }

    Serial.println("Finish");
}


  /* qsort C-string comparison function */
//edited C version of this function from //https://www.benjaminwuethrich.dev/2015-03-07-sorting-strings-in-c-with-qsort.html
//explained here https://www.bytellect.com/resources/sorting-in-c-with-the-qsort-function-bytellect001.pdf
int cstring_cmp(const void *a, const void *b){ 
    char *ia = *(char **)a;
    char *ib = *(char **)b;
    return strcmp(ia, ib);
}


//count the number of files in the current directory
//assign each file/directory name in the current directory to the listItems array
void scanDir(){
  freeListMemory(); // Start by freeing previously allocated malloc pointers
  File root = SD.open(curDir);
  while (true) {
    File entry = root.openNextFile();
    if (!entry) {break;} // no more files
    else {
//      listItems[listLength][0] = entry.name(); //add file name to list array
//      listItems[listLength] = (char[128]) malloc(128);
      sprintf(tempString, "%s", entry.name());//save file name to temporary string with null terminator at the end
      listItems[listLength] = (char *)malloc(listItemMaxChar);//assign enough memory for 100 chars to current list item pointer
      sprintf(listItems[listLength],"%s",tempString);
      listLength++; //increment counter of files
      entry.close();
    }

  }
}

void freeListMemory(){
  // If we have previous items in the list, then free the memory
      for (byte i=0; i<=listLength; i++)
        {
          free(listItems[i]);
        }
        
    listLength = 0;//reset list length
}

void loop(){
}

that's freeing one too many items:

      for (byte i=0; i<=listLength; i++)
        {
          free(listItems[i]);
        }

use for (byte i=0; i<listLength; i++)

how many files do you have? what's your Arduino (how much SRAM do you have?)

instead of doing

      listItems[listLength] = (char *)malloc(listItemMaxChar);//assign enough memory for 100 chars to current list item pointer

you could just allocate enough bytes for the tempString (including the trailing null)

      listItems[listLength] = (char *) malloc(strlen(tempString) + 1);  //assign enough memory for the current file name to current list item pointer

I've not checked the rest

Good spot! I copied that line from another example which was starting its For loops at 1 rather than 0...thought I'd fixed most of it but that one slipped through.

Probably in the hundreds in some directories...it's going to be a music player.

I'm using a Teensy 4.1, which has 1024kb of RAM. Hopefully this will be plenty?

I've made both changes but no luck so far!

you only have space for 100 pointers

#define listItemsTotal 100 //maximum number of file names to store
char *listItems[listItemsTotal];

I am aware of this, sorry I should have clarified: this test I'm running has less than 10 files in the root directory. I've given myself tons of headroom though as you pointed out: up to 100. When added to the main project, it may have to deal with hundreds of files, but I will adjust the array size accordingly.
As a test, I reduced the listItemsTotal down to 15, but this didn't change anything.

I’d like to hear someone say that qsort is an appropriate algorithm for this use case.

Have you looked at any other sorting methods and picked from among on any rational bases?

Qsort seems to be a good solution for this

isn’t inspiring my confidence.

a7

what do you see in the Serial Monitor?

what happens if you do this in the loop()


  /* print pointers array */
  Serial.println("Initial List:");
  for (int i = 0; i < listLength; i++) Serial.println(listItems[i]);

  Serial.println("Sorting....");
  qsort(listItems, listLength, sizeof(char *), cstring_cmp);

  Serial.println("Result:");
  for (int i = 0; i < listLength; i++) Serial.println(listItems[i]);

  Serial.println("Finish");

(instead of your code ➜ you don't want to sort the null pointers)

He has to as he allocated them dynamically

listItems[listLength] = (char *)malloc(listItemMaxChar);//assign enough memory for 100 chars to current list item pointer

ah, missed that my bad

System Volume Information
MusicBee
datalog.bin
Ftest1
SDTEST1.wav
SDTEST with super long file name.wav
New Folder Test
Sorting....

...and then it hangs and restarts.

So we get the unsorted list of files/directories as they are read from the SD card's root folder, but as soon as we get to the Qsort section it all falls over.

I suspect the comparison function may be to blame. However, my knowledge of casting types and referencing/deferencing pointers is patchier than the wifi coverage in my house, so it could be something completely different.

The general consensus across various forum posts and articles on sorting in C stretching back to the mid-2000s was that Bubble Sort was clunky and slow, and Qsort was the fastest and used the least amount of memory. I can't point you to a specific source for that information though.
I have also tried to use a couple of sorting libraries endorsed by Arduino (Kicksort and QuickSortLib) but ran into incompatibility issues with the latest version of the SDfat library, which I have to use to accommodate long file names.
If you have an alternative method that might be better for this use case then please enlighten me!

No, I just wondered if you did anything other than pick one out of the air.

Bubble Sort is a pretty low bar! And yes, qsort is very popular. But it seems to have become sorta the "no brainer" go-to sort. But you know what they say about goto. :expressionless:

I thought if you were going to struggle to get something working, it ought to be the right sort to begin with.

What are the incompatibilities between theSD library and qsort? I do not see why they would be in contention for anything, they tots should be separate one from the other.

I would recommend getting qsort to function separately from the SD thing. I would also do this outside the context of uploading to a microprocessor, no matter how powerful. You would have a much faster time getting sort to work on an array of pointers to char in a regular programming environment - it has been done, of course, and it isn't yet clear to me why you having trouble, so divide and conquer.

Then move the SD-less working qsort to the target, it may fail and have nothing to do with the SD stuff.

Then mix in the SD stuff.

I would do now, but it's tired and I'm getting late. But my curiosity is raised.

With a few hundred items there are probably a few other sorts of sorting that would be fine. It does depend on many factors one of which is where and how the list comes into existence in the first place.

a7

OK I just hacked my way through I think the very document you refer to above, and made this with the GDB online C compiler.

I'd be happy to see you do whatever it takes to make this into an upload able program, and see if it can work as well on your Teensy.

I gave myself until the top of the hour… just made it.

# include <stdio.h>
# include <stdlib.h>
# include <string.h>

// array of pointers to strings

char *myWords[] = {
           "zebra", "wildebeest", "duck", "swan", "author", "programmer", "engineer",
           "elephant", "donkey", "house", "home", "computer", "rabbit", "ear", "nose"
};
       
unsigned elementCount = sizeof(myWords) / sizeof(myWords[0]);

int CompareStrings(const void *p1, const void *p2)
{
	char *s1 = *(char **)p1;
	char *s2 = *(char **)p2;
	return strcmp(s1, s2);
}

int main()
{
    printf("Hello Qsort Test\n\n");
    
    qsort(myWords, elementCount, sizeof(myWords[0]), CompareStrings);
    
    for (int i = 0; i < elementCount; i++) printf("%s\n", myWords[i]);

    return 0;
}

a7

Wow that was a quick turnaround!

A few weeks ago I managed to do something similar to this, just by copying an example from the internet (link below) and modifying it very slightly. This runs perfectly on the Teensy:

//main code from http://anyexample.com/programming/c/qsort__sorting_array_of_listItems__integers_and_structs.xml

#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 

char *listItems[] = { "Zorro", "Alex", "Allan", "Celine", "Bill super long name to test what happens to super long names", "Forest", "Dexter" };
    size_t listItems_len = sizeof(listItems) / sizeof(char *);


void setup()
{
  Serial.begin(9600);
  
/* print unsorted string array */ 
    for(int i=0; i<listItems_len; i++){
      Serial.println(listItems[i]);
    }
  Serial.println("Result:");
  
/* sort array using qsort functions */ 
    QSort();
 
    /* print sorted string array */ 
    for(int i=0; i<listItems_len; i++){
      Serial.println(listItems[i]);
    }

}

void loop(){
}

//sort alphabetically
void QSort(){
  /* sort array using qsort functions */ 
//  listItems.toCharArray(listItemsChar,sizeof(listItemsChar);
    size_t listItems_len = sizeof(listItems) / sizeof(char *);
    qsort(listItems, listItems_len, sizeof(char *), cstring_cmp);
}

  /* qsort C-string comparison function */ 
int cstring_cmp(const void *a, const void *b){ //edited C version of this function from //https://www.benjaminwuethrich.dev/2015-03-07-sorting-strings-in-c-with-qsort.html
    const char *ia = *(const char **)a;
    const char *ib = *(const char **)b;
    return strcmp(ia, ib);
  /* strcmp functions works exactly as expected from
  comparison function */ 
}

The only difference I can see between this and my version is how the char * array is populated/allocated in memory.
To the best of my understanding, in this example the array is in the stack, and its size is determined when declared. For the SD browser version it all gets populated and allocated dynamically.

Is there a difference (besides where the array is stored) between:

char *myarr[] = { "string1", "string2"};

and

#define arrStrings 100
char *myarr[arrStrings]; //list of strings

#define stringLen 100
char tempString[stringLen]; //temporary to store string ready to add to char * array

void setup(){

//assign first string to array
   sprintf(tempString, "%s", "string1");//save file name to temporary string
      myarr[0] = (char *)malloc(strlen(tempString) + 1);//assign enough memory for the new string
      sprintf(myarr[0],"%s",tempString); //add new string into first element of char * array

//assign second string to array
   sprintf(tempString, "%s", "string2");//save file name to temporary string
      myarr[1] = (char *)malloc(strlen(tempString) + 1);//assign enough memory for the new string
      sprintf(myarr[1],"%s",tempString); //add new string into second element of char * array

//etc

}

which I am overlooking?
Does this store the arrays differently, and therefore confuse qsort?

Completely agree! As this is my first deep dive, qsort was what I settled on due to its popularity, but if there is a better option which is easier to implement then I would love to know.

You're right, the issues were not SDFat related! Looking back through my sketches, I couldn't get the examples from either library to compile (Arduino IDE 1.8.15, Teensyduino 1.54), so I abandoned them early on in favour of implementing qsort without a library thinking "How hard can it be?" (famous last words).

Since you are using a T4.1 you might consider using a more modern approach using the standard std::sort algorithm. Here some prove of principle code which avoids fiddling around with memory management.

#include "SdFat.h"
#include <string>
#include <vector>
#include <algorithm>

// some aliases to reduce typing....
using string = std::string;
using strVec = std::vector<string>;

SdFat sd;

void setup()
{
    while (!Serial) {}
    sd.begin(SdioConfig(FIFO_SDIO));

    // read filnames from some directory (here: root)
    SdFile dir("/", O_RDONLY);
    strVec filenames = getFilenames(dir);

    // use the stl sorting algorithmus
    std::sort(filenames.begin(), filenames.end(), [](string a, string b) { return a < b; });

    // print the sorted names
    for (string& name : filenames)
    {
        Serial.println(name.c_str());
    }
}

void loop(){
}

//-----------------------------------------------
strVec getFilenames(SdFile& dir)
{
    strVec filenames;

    SdFile file;
    while (file.openNext(&dir, O_RDONLY))
    {
        constexpr size_t maxFileLen = 30;

        char buf[maxFileLen];
        file.getName(buf, maxFileLen);
        filenames.emplace_back(buf);  // directly construct the string in the vector to avoid copies
    }
    return filenames;  // Ok, std::vector implements move semantics -> nothing will be copied here
}

This works almost perfectly first go! I had dismissed all C++ examples out of hand, not realising that they would be possible (it's also not a language that I'm at all familiar with).

The output is now:

Unsorted list:
System Volume Information
MusicBee
datalog.bin
Ftest1
SDTEST1.wav
SDTEST with super long file n
New Folder Test
Sorting...
Sorted list:
Ftest1
MusicBee
New Folder Test
SDTEST with super long file n
SDTEST1.wav
System Volume Information
datalog.bin

Presumably the std::sort function is case-sensitive, thereby causing the "datalog.bin" file to be sorted last?

EDIT: Also increased the maxFileLen to 100, to capture longer file names. This ensures"SDTEST with super long file name.wav" is stored, sorted, and printed in full.

@luni64 That's very slick, and presumably the end.

However it remains troubling that the qsort didn't work, so I am going to be counting kits, cats, sacks and wives for a bit before y'all pop off with ONE! One person going to St. Ives...

The sort works fine no matter where the pointers point. The OP said something about the strings being on the stack; I do not think they were ever on the stack, just in memory statically at compile time OR mallocated memory during run time. Qsort don't care. Even if they were on the stack, as long as that's where they stayed there should have not caused then problem.

Here is @thecomfychair2's placement, I just further hacked my hack to copy out the strings from their statically positioned locations into dynamically allocated memory, all code is just riffing off the various pieces from the original code and the document explaining how-to sort string arrays.

# include <stdio.h>
# include <stdlib.h>
# include <string.h>

/* so I can use good old printf */
int serial_putc(char c, FILE *) 
{
  Serial.write(c);

  return c;
} 

void printf_begin(void)
{
  fdevopen(&serial_putc, 0);
}

# define arrStrings 32
char *myarr[arrStrings]; //list of strings

#define stringLen 32
char tempString[stringLen]; //temporary to store string ready to add to char * array

void setup() {
	Serial.begin(115200);

  printf_begin();
  printf("Jello Whirled!\n\n");
}

void loop() {
	ymain();

	printf("\nTha's'all Folks!\n\n");
	while (1);
}

char *myWords[] = {
           "zebra", "wildebeest", "duck", "swan", "author", "programmer", "engineer",
           "elephant", "donkey", "house", "home", "computer", "rabbit", "ear", "nose"
};
   
unsigned elementCount = sizeof(myWords) / sizeof(myWords[0]);

int CompareStrings(const void *p1, const void *p2)
{
	char *s1 = *(char **)p1;
	char *s2 = *(char **)p2;
	return strcmp(s1, s2);
}

// move myWords to  array of pointers to dynamically allocated strings

void setupX()
{
    for (int ii = 0; ii < elementCount; ii++) {
        sprintf(tempString, "%s", myWords[ii]);
        myarr[ii] = (char *) malloc(strlen(tempString) + 1);
        sprintf(myarr[ii],"%s",tempString);
    }
}

int ymain()
{
    printf("Hello Qsort Test\n\n");
    
	setupX();

    printf("go to bed\n\n");

    for (int ii = 0; ii < elementCount; ii++) printf("%s\n", myarr[ii]);
    
    printf("\n\n");
   
    qsort(myarr, elementCount, sizeof(myarr[0]), CompareStrings);
    
    for (int ii = 0; ii < elementCount; ii++) printf("%s\n", myarr[ii]);

    printf("\n\n");

    return 0;
}

Obvsly these are alligators, and you have just jumped right TF out of the swamp, so no never mind if there is little interest. I just think there may be something to learn.

a7

With an SD card you can sort huge lists by

1 > Make a text file of unsorted words (file names read) and then
2 > Build a file of ordered links on the SD.

NEVER rearrange data in SD files, ALWAYS write index files pointing into the data.
The same data can be linked in many ways, the order is kept external to the data.

Step 2 can be done by reading the file to find each next-lowest-text-string to append a link. A byte array can hold 1 bit per word and then during the search for next word, don't check words already indexed because their bit gets made 1 when they get indexed. Do NOT change the original data file, that can wear the SD out quicker.

Yes, it is. Unfortunately std::string has no case insensitive compare (at least none I'm aware of). But you can easily work around by using stricmp to do the comparison. I.e., replace the sorting by this line:

 std::sort(filenames.begin(), filenames.end(), [](string a, string b) { return stricmp(a.c_str(), b.c_str()) < 0 ; });

I just saw that the constant is somehow badly named. Would rename it to maxFileNameLen or similar to avoid later confusion...