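Have you changed all ints used to hold values that could fit in a char (signed 8 bit, -128 to 127, same range as int8_t) or byte (unsigned 8 bit, 0 to 255) to a char or byte? On an 8-bit Arduino an int is 16 bits (and so is a short, it is not an 8-bit type), so every one you shrink saves a byte.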
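To put a number on that, here is a minimal sketch, assuming an 8-bit AVR board where int is 16 bits. The array names and the count of 64 are made up for illustration, and int16_t stands in for the AVR int so the sizes come out the same if you compile it on a PC.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical example: 64 readings that never leave the range 0..255.
const size_t kReadings = 64;

int16_t readingsWide[kReadings];    // what "int" costs on an 8-bit AVR: 2 bytes each
uint8_t readingsNarrow[kReadings];  // same data held as byte/uint8_t: 1 byte each

// Bytes of RAM saved by narrowing the array.
size_t bytesSaved() {
  return sizeof(readingsWide) - sizeof(readingsNarrow);
}
```

On a board with 2K of RAM, one narrowed array like this is already a few percent of the total.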
Have you moved all the constant text you print over serial into flash with the F() macro?
I don't remember your code, but most projects that are more than tiny can be RAM-trimmed something wicked.
With libraries, unless you go through the source you have no idea how heavy any of them are on RAM. The stack lives in RAM as well. Nested function calls and calls with a lot of passed args can grow the stack way down towards the heap. Recursion is self-nesting and can be particularly fierce at stack growth. When the stack corrupts the heap, that is bad. When the heap corrupts the stack, especially a return address, it's crash time.
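One way to take pressure off the stack is to turn recursion into a loop. This is only a sketch with made-up function names, not anything from your code: both versions compute the same sum, but the recursive one needs a new stack frame per level (unless the compiler happens to optimize it away), while the loop stays in a single frame.

```cpp
#include <stdint.h>

// Recursive: every level adds a return address and saved state to the stack,
// so stack use grows with n.
uint32_t sumToRecursive(uint16_t n) {
  if (n == 0) return 0;
  return n + sumToRecursive(n - 1);
}

// Iterative: one stack frame no matter how large n is.
uint32_t sumToIterative(uint16_t n) {
  uint32_t total = 0;
  while (n > 0) {
    total += n;
    --n;
  }
  return total;
}
```

On a 2K board I would not call the recursive version with anything but a small n; the loop does not care.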
Maybe the most important lessons you will take away from this are:
- You can't predict what's going to work, or how, on a "real", non-trivial project (and sometimes not even on a trivial one).
- Don't get boards made (or buy everything) until you at least have a working prototype. Even then, remember Murphy!