Forgetfulino - Upload your (compressed) code to the board - IDE 2.x extension

Hi everyone!

I’ve just published my first Arduino library and I’d really love to hear what the community thinks about it.

It’s called Forgetfulino and the idea is pretty simple: it embeds the original sketch source code directly inside the firmware, so you can retrieve it later through Serial. Basically, if you ever upload a sketch and later lose the .ino file, the board can still “remember” the code that was flashed.

The library works by converting the sketch into a flash-stored array during compilation and then reading it directly from flash at runtime, so it uses zero RAM. It should work across several architectures (AVR, ESP8266, ESP32, SAMD, RP2040).

I built it mainly because I’ve had more than one moment of “where did that sketch go?” after uploading something to a board.

LIMITATIONS:

One current limitation is that you need to run a small Python script before compiling. The script reads the .ino file and generates the header that will be embedded into the firmware. So every time the sketch changes, you need to run the script again to update the embedded source.
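For readers who want a feel for what such a generator step involves, here is a minimal sketch in Python. It is not the library's actual script: the function names `make_header` and `generate`, and the array name `forgetfulino_source`, are assumptions made up for illustration. It only shows the general technique of turning sketch text into a `PROGMEM` byte array inside a header.

```python
from pathlib import Path


def make_header(source: str, array_name: str = "forgetfulino_source") -> str:
    """Render sketch text as a C header holding a flash-resident byte array.

    The array name and header layout are hypothetical, not the library's own.
    """
    data = source.encode("utf-8")
    if not data:
        raise ValueError("sketch source is empty")
    rows = []
    # Emit 12 bytes per row to keep the generated header readable.
    for i in range(0, len(data), 12):
        chunk = data[i:i + 12]
        rows.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(rows)
    return (
        "#pragma once\n"
        "#include <Arduino.h>\n\n"
        f"const unsigned int {array_name}_len = {len(data)};\n"
        f"const unsigned char {array_name}[] PROGMEM = {{\n{body}\n}};\n"
    )


def generate(ino_path: str, header_path: str) -> None:
    """Read the .ino file and write the generated header next to it."""
    text = Path(ino_path).read_text(encoding="utf-8")
    Path(header_path).write_text(make_header(text), encoding="utf-8")
```

Running `generate("MySketch.ino", "sketch_source.h")` (both file names are placeholders) before each compile would refresh the embedded copy, which is exactly the manual step described above.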

Ideally I’d like to automate this step, but I haven’t found a clean way to do it yet without modifying board definitions or using approaches that feel a bit too hacky. Personally, I would love it if the Arduino IDE allowed developers to optionally run pre-compile scripts or batch files. It would make this kind of workflow much simpler.

Since this is my first library, I’d really appreciate some feedback from people with more Arduino experience than me.

A few things I’d love to know:

  • Would you ever use something like this in your projects?

  • Does the workflow make sense to you?

  • Is there anything you would change or improve?

  • Any ideas for additional features?

I’m especially curious if there’s a cleaner way to handle the generator step, or if the current approach feels reasonable.

If anyone wants to try it, feedback, criticism, or brutal honesty are all welcome. I'm here to learn.

Thanks!
:slightly_smiling_face:

I moved your topic to an appropriate forum category @Blu-Vector.

In the future, when creating a topic please take some time to pick the forum category that best suits the subject of your topic. There is an "About the _____ category" topic at the top of each category that explains its purpose.

This is an important part of responsible forum usage, as explained in the "How to get the best out of this forum" guide. The guide contains a lot of other useful information. Please read it.

Thanks in advance for your cooperation.

Quoted from the repository readme:

Works with sketches of any size, no memory limitations

Not so sure about that. There's more leeway on non-AVR boards as they have more flash, but if the source is being stored in flash as an array, then that would be taking up a sizeable chunk of flash that could otherwise be available to store compiled code. In that sense, it is limiting the amount of memory available for the compiled sketch.

Thank you for the feedback! You're absolutely right about flash usage; that was a bold statement. I'll clarify that claim: the library uses zero RAM (it reads directly from flash), but yes, it does occupy flash space that could otherwise be used for code.

Do you handle sketches with multiple .ino files, or a .ino and a bunch of .cpp and .h files in the sketch's folder?

I like the idea, and it is in itself a nice programming exercise.

Can you tell when one would need this functionality?

E.g. for my libraries I just print the version number and one can find the code on GitHub.
I think having version control in some way is important because, if one gets the source of a sketch, it might not be the latest and greatest.

I have not dived into your code yet, but I hope it has at least a timestamp as "poor man's version control".

After a quick glance at the code, I see a lot of "separator strings" that are lookalikes. Making them identical would reduce the footprint.

Finally, does the library use compression to minimize the footprint?
You could gzip the source code in flash to save roughly a factor of 2 of the flash needed.
The library could then print instructions on how to unzip the binary stream.

@J-M-L

You’re absolutely right about the limitation with the main .ino file. Once a project grows into multiple .cpp and .h files, the whole idea of this library starts to lose its meaning anyway. At that point you are already in the territory where proper version control makes sense: you have a repository, tags, commits, maybe CI, and printing a version number that maps to a commit on GitHub is the correct solution.

This library is really aimed at a very different situation: small sketches, quick prototypes, experiments, teaching setups, or those one-off projects where you wrote something quickly, uploaded it, and three years later you find the board in a drawer and think “what exactly is running on this thing?”.

Another situation where this happens is with field prototypes. The version in the repository might be the “best” or latest one, but it is not always the one that actually works on that specific piece of hardware. Real devices accumulate small fixes, tweaks, and adjustments over time that never make it back to the repository. You can track everything with digital twins and strict documentation, but that quickly becomes a time management problem. In those situations it can be very useful if the device itself can simply tell you what code produced the firmware.

It also happens in places where there is no real operational continuity, for example hacker spaces, university prototype labs, shared workbenches, or teaching environments where many people touch the same hardware over time. In those contexts it’s very easy for boards to accumulate without anyone being completely sure what firmware was last uploaded. And, most importantly, in environments where we cannot enforce versioning on everyone, like a company.

In those cases setting up a full versioning workflow often feels like overkill, and people end up relying on folders like “final_v3_really_final”. This is simply meant as a convenience tool for that scenario, where the board itself can tell you what code produced the firmware without having to guess.

Thanks again for the feedback.

@robtillaart

I really appreciate the detailed feedback.

About the use case, I replied above! :slight_smile:

Regarding compression, I actually experimented with it. I tried using zlib and also played with Unishox to zip the source before storing it in flash. Compression itself worked, but I ran into problems when reading and streaming the compressed data back reliably from flash on the microcontroller side. Because of that I kept the current version simple and stored the raw text. I agree though that compression would make sense and it’s something I would like to revisit, because even a simple gzip could roughly halve the flash footprint.

Your comment about timestamps is interesting because that is actually where another idea came from while reading the discussion on Reddit. In the past I used macros like __FILE__, __DATE__, __TIME__, and __TIMESTAMP__ to print build information over Serial. It worked surprisingly well as a sort of “poor man’s versioning”. The board could literally tell you which file compiled it and when.

Seeing the feedback there made me think about a second approach that could combine both worlds. I’m thinking about a small companion library called “Githolino”.

Instead of embedding the full source code in flash, it would only embed lightweight compile metadata such as the file name and timestamp using those macros. Since they are compile-time constants they basically cost nothing in terms of resources.

The device could then report something like “this firmware was built from file X at time Y and has this unique ID”. A small tool on the host side could query the Git repository, match the file and timestamp, and automatically retrieve the exact version of the code that produced the firmware.
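A rough idea of what that host-side matching could look like, as a sketch only. Githolino does not exist yet, so everything here is hypothetical: the function names, and the assumption that the board prints the values of `__DATE__` and `__TIME__` as-is. The lookup simply asks Git for the last commit touching the file before the build time, which is a best guess rather than proof, since uncommitted edits and timezone differences are not accounted for.

```python
from datetime import datetime


def parse_build_stamp(date_macro: str, time_macro: str) -> datetime:
    """Parse the text of __DATE__ ("Jan  5 2024") and __TIME__ ("14:03:09").

    The result is a naive local time of the build machine (no timezone).
    """
    return datetime.strptime(f"{date_macro} {time_macro}", "%b %d %Y %H:%M:%S")


def git_lookup_command(file_name: str, built_at: datetime) -> list[str]:
    """Build a git command that prints the hash of the newest commit
    touching file_name that is not later than the build time."""
    stamp = built_at.isoformat(sep=" ")
    return ["git", "log", "-1", "--format=%H", f"--before={stamp}", "--", file_name]


built = parse_build_stamp("Jan  5 2024", "14:03:09")
command = git_lookup_command("Blink.ino", built)  # file name is a placeholder
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True, cwd=repo_dir)` inside the repository to obtain the candidate commit hash.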

In that model the device tells you what it is actually running, and Git provides the full history.

So Forgetfulino is the brute force approach where the device remembers everything, while Githolino would be the lightweight approach where the device only provides the key needed to reconstruct the exact source from the repository.

I’m starting to think about the architecture of the second option, as this might actually be the cleaner design, possibly implemented as a small tool between the device and Git.

Curious what you think about that direction.

Thanks for the clarification, that makes sense.

gzip.compress should be easy to use in Python to get the compressed file; then it's just a matter of transforming this into the PROGMEM array. Dumping this out will of course generate a binary image (the gzipped file), so you can't just copy and paste from the Serial Monitor. Using a different terminal, you could redirect the stream to a file, and that would then be what you unzip.
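As a minimal illustration of that suggestion (the function names and the array name `sketch_gz` are made up for the example, and the sample sketch text is just filler), the host-side part could look roughly like this:

```python
import gzip


def compress_source(source: str) -> bytes:
    """Gzip the sketch text; mtime=0 keeps the output identical across runs."""
    return gzip.compress(source.encode("utf-8"), compresslevel=9, mtime=0)


def to_progmem(blob: bytes, name: str = "sketch_gz") -> str:
    """Turn the compressed bytes into a C array declaration stored in flash."""
    hex_bytes = ", ".join(f"0x{b:02x}" for b in blob)
    return (
        f"const unsigned int {name}_len = {len(blob)};\n"
        f"const unsigned char {name}[] PROGMEM = {{ {hex_bytes} }};\n"
    )


# Filler sketch text, repeated so there is something worth compressing.
sketch = 'void setup() { Serial.begin(9600); }\nvoid loop() { Serial.println("hi"); }\n' * 20
blob = compress_source(sketch)
header = to_progmem(blob)

# Round trip on the host: this is what unzipping the dumped file should give back.
restored = gzip.decompress(blob).decode("utf-8")
```

How much is saved depends heavily on the sketch; very short sketches can even grow slightly because of the gzip header, so it is worth comparing `len(blob)` with the raw size before committing to it.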

That’s actually a good suggestion.

Gzip can easily reach ~50% reduction on plain source code. The main issue I ran into was on the microcontroller side when streaming the stored data back reliably from flash. Since the output becomes a binary stream, you can’t just copy-paste from the Serial Monitor anymore. You really need a terminal that can redirect the stream to a file and then unzip it on the host.

Conceptually, though, I agree with you: the pipeline would be quite straightforward: compress → store in flash → stream raw bytes → reconstruct file on the host → unzip.

This discussion actually made me think that the really interesting place for something like this would be directly in the Arduino IDE itself. Imagine if the IDE had an optional flag during upload that stores the sketch in flash in compressed form. Given that gzip can reduce source size by around half, the overhead would become much smaller.

Then the IDE could expose a simple button like “Retrieve sketch from board”. The IDE would read the compressed blob from flash, reconstruct the sketch and open it again locally. That would feel much more integrated and much less hacky than doing it entirely from a library.

So I still like the idea of experimenting with gzip inside the library, but I’m starting to think the real long-term solution would be something implemented at the tooling level rather than purely inside firmware.

It would be pretty easy to write a serial command tool in Python that would actually extract the binary by sending the right command to the Arduino, then unzip it and save the file somewhere.

(I’ve written a small tutorial on interfacing with Python. See Two ways communication between Python3 and Arduino)

You could use something like cat flash | base64 | Serial to handle the binary zip file.
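To make the base64 idea concrete, here is a small sketch that simulates both ends in Python. The BEGIN/END marker lines and the function names are assumptions invented for the example; on a real board the framing would be printed by the firmware. The point is that the payload becomes plain text, so it survives a copy-paste from the Serial Monitor, at the cost of roughly one third more bytes on the wire.

```python
import base64
import gzip

# Hypothetical marker lines that let the host find the payload among other output.
BEGIN = "-----BEGIN SKETCH-----"
END = "-----END SKETCH-----"


def frame(blob: bytes, width: int = 64) -> str:
    """What the board could print: base64 text wrapped between two markers."""
    b64 = base64.b64encode(blob).decode("ascii")
    lines = [b64[i:i + width] for i in range(0, len(b64), width)]
    return "\n".join([BEGIN, *lines, END]) + "\n"


def recover(serial_text: str) -> str:
    """Host side: extract the framed payload, base64-decode it, then gunzip."""
    lines = [line.strip() for line in serial_text.splitlines()]
    start = lines.index(BEGIN) + 1   # raises ValueError if the marker is missing
    stop = lines.index(END, start)
    payload = "".join(lines[start:stop])
    return gzip.decompress(base64.b64decode(payload)).decode("utf-8")


sketch = "void setup() {}\nvoid loop() {}\n"
captured = "boot ok\r\n" + frame(gzip.compress(sketch.encode("utf-8"))) + "other log line\n"
recovered = recover(captured)
```

Here `captured` stands in for whatever text was copied or logged from the serial port, including unrelated lines before and after the framed block.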


Githolino,
Create a QR code from the git commit + filename + ....

Such QR code could also point to a website with more info about the sketch, or a site selling veggies, whatever.

So, just invented the QRuino :slight_smile:

Hi @Blu-Vector.

You might investigate whether this could be implemented via an extension.

Additional capabilities can be added to Arduino IDE 2.x via VS Code extensions:

In addition to the standardized VS Code extension API, Arduino IDE also makes Arduino-specific information available to extensions. This information is provided by the "Arduino IDE API for VS Code extensions" extension:

That extension is pre-installed in Arduino IDE, so the information is always available for use by 3rd party extensions.

You can see some examples of how that is utilized by existing Arduino IDE 2.x extensions:

Due to the tremendous popularity of VS Code, there is a lot of information available on the Internet about creating VS Code extensions in general. This information will be applicable even when making extensions that target Arduino IDE 2.x specifically. So if you are searching for information, make sure to refrain from adding the "Arduino" keyword to your searches.

That’s actually a very interesting direction, thanks for pointing this out.

I implemented a watchdog. Here’s a video of how everything works.

Thanks for your ideas, guys!

@J-M-L

@ptillisch

@robtillaart

Check out the new Forgetfulino on steroids.

The idea is amazing. I starred your repo on GitHub.

But why did you post the update in a separate topic?

I wanted to keep track of it here, but the new Forgetfulino turned out to be a totally different tool :). Do you think a single topic would have been better?