Arduino Config Library

This library has been in the works for a while now... so I'm proud to present version 1.0 of the Config library! :smiley:

Basically, this is a library for easy storage, retrieval, and update of any type of data you can think of... all stored in nonvolatile EEPROM.

Keep in mind, this is version 1.0. It has a few bugs, but largely works well. I tested it, and there are no memory leaks related to this code. The code is rather large, however. Anyone with suggestions about how to make it smaller, please chime in.

Check out the documentation (will be posted to http://blog.internaldrive.info in the next couple days) and the example. Those have all the information you need to start developing with this library.

Download it here. I'm eager to hear your thoughts and will base future versions off of your suggestions or improvements.

Thanks!
Morgan

Big picture comments

  • there is a lot of EEPROM relative to RAM on the ATmegas: 512 bytes of EEPROM to 1024 bytes of RAM on the 168, 1024 to 2048 on the 328, and 4k to 8k on the 1280.
    So using the copy-into-RAM-to-parse approach would require 50% of memory free for the purpose :(.

  • the copy-then-write approach leads to much busy work; maybe there is a way to achieve your goals with most changes only impacting the value of a single config variable.

Ideas like:

  • length encoded config variable pairs
  • deleted space markers (say combined with versions below)
    would allow updates without global changes.

It is likely good to impose a packing size on the variables so that gaps are reused more often.
[there are a few approaches:
KISS - 1 / all variables are the same size - no fragmentation possible
KISS - 2 / use only half of EEPROM, so free space is always enough no matter the size distribution and update order
hope - 3 / allocate on a boundary that deals with most variables, so packing using old single spaces is likely to release enough space for an update to a large variable
4 - fall back to update in place if no space is available for a second copy
]

Memory layout could be in [bytes]:
1      {version 1... 255 or deleted 0, eg [3]}
2      {variable name length eg [5]}
3-7    {name eg [r][a][z][y][d]}
8      {value length eg [4]}
9-12   {value eg [0xDE][0xAD][0xBE][0xEF]}
13     {check byte [calculated-value]}
14-16  {buffer-length-padding-to-multiple-of-N, eg 16}
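To make the layout concrete, here is a minimal sketch that serialises one entry into a plain byte buffer. The names `kChunk`, `check_byte`, `entry_size` and `encode_entry` are made up for illustration, the additive check byte is just one possible choice, and filling the padding with 0xFF is an assumption; none of this is taken from the library.

```cpp
#include <stdint.h>
#include <string.h>

// Hypothetical padding granularity (the "N" in the layout above).
const uint8_t kChunk = 16;

// Simple additive check byte; the algorithm is an assumption, not part of the proposal.
uint8_t check_byte(const uint8_t *p, size_t n) {
  uint8_t sum = 0;
  for (size_t i = 0; i < n; ++i) sum += p[i];
  return sum;
}

// Bytes an entry occupies once padded up to a multiple of kChunk.
size_t entry_size(uint8_t nameLen, uint8_t valueLen) {
  // version + name length + name + value length + value + check byte
  size_t raw = 1 + 1 + nameLen + 1 + valueLen + 1;
  return (raw + kChunk - 1) / kChunk * kChunk;
}

// Serialises one entry into out[]; returns the padded size, or 0 if it does not fit.
size_t encode_entry(uint8_t version, const char *name, const uint8_t *value,
                    uint8_t valueLen, uint8_t *out, size_t outSize) {
  size_t rawNameLen = strlen(name);
  if (rawNameLen > 255) return 0;          // name length must fit in one byte
  uint8_t nameLen = (uint8_t)rawNameLen;
  size_t total = entry_size(nameLen, valueLen);
  if (total > outSize) return 0;

  size_t i = 0;
  out[i++] = version;                      // byte 1: version (0 = deleted)
  out[i++] = nameLen;                      // byte 2: name length
  memcpy(out + i, name, nameLen);          // name, without a terminating NUL
  i += nameLen;
  out[i++] = valueLen;                     // value length
  memcpy(out + i, value, valueLen);        // value bytes
  i += valueLen;
  out[i] = check_byte(out + 1, i - 1);     // covers everything after the version byte
  ++i;
  while (i < total) out[i++] = 0xFF;       // padding up to the chunk boundary
  return total;
}
```

With the example from the layout (name "razyd", a 4 byte value) the raw entry is 13 bytes and pads out to 16, matching bytes 1 to 16 above.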

And then
* version numbers on variable values
* updates as writing a new version followed by deleting the old version (setting the version to 0)
would prevent crashes from destroying a config.

The above could be implemented in two main functions:

int /* bytes read, 0 if not found */ config_find_and_read(const char *config_name, int bufSize, char *buf)
{
/* scan for latest version that has a valid checksum and copy data into supplied buffer */
return 0; /* placeholder until implemented */
}
bool /* write succeeded */ config_write_value(const char *config_name, int valueLength, const char *value)
{
/* - scan for latest version - if not found use version 0. */
/* find free space (possibly compact smaller variables)
(return false if not enough space)
*/
/* write new value with next version # (255 -> 1) in found free space */
/* for all older versions of config_name, mark deleted, ie set version to 0 */
/* return success */
return false; /* placeholder until implemented */
}
With a C++ wrapper layer to do  conversions from real data-types into char pointer, sizeof(type) etc...
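A rough sketch of what such a wrapper layer might look like. The two `config_*` functions here are backed by a tiny RAM table purely so the example runs off-target; that table is a stand-in and not the proposed EEPROM layout. The template names `config_write` and `config_read` are hypothetical, and the wrapper assumes T is a plain, trivially copyable type.

```cpp
#include <stdint.h>
#include <string.h>

// --- Stand-in storage so the sketch runs anywhere; NOT the EEPROM layout. ---
struct Slot { char name[16]; char data[16]; int len; bool used; };
static Slot g_slots[8];

bool config_write_value(const char *name, int valueLength, const char *value) {
  if (valueLength < 0 || valueLength > (int)sizeof(g_slots[0].data)) return false;
  if (strlen(name) >= sizeof(g_slots[0].name)) return false;
  Slot *target = nullptr;
  for (Slot &s : g_slots) {
    if (s.used && strcmp(s.name, name) == 0) { target = &s; break; }  // overwrite existing
    if (!s.used && target == nullptr) target = &s;                    // first free slot
  }
  if (target == nullptr) return false;
  strcpy(target->name, name);
  memcpy(target->data, value, (size_t)valueLength);
  target->len = valueLength;
  target->used = true;
  return true;
}

int config_find_and_read(const char *name, int bufSize, char *buf) {
  for (Slot &s : g_slots) {
    if (s.used && strcmp(s.name, name) == 0) {
      if (s.len > bufSize) return 0;       // caller's buffer is too small
      memcpy(buf, s.data, (size_t)s.len);
      return s.len;
    }
  }
  return 0;                                // not found
}

// --- The typed wrapper layer: converts real data types to char pointer + sizeof. ---
template <typename T>
bool config_write(const char *name, const T &value) {
  return config_write_value(name, (int)sizeof(T),
                            reinterpret_cast<const char *>(&value));
}

template <typename T>
bool config_read(const char *name, T &out) {
  T tmp;
  int n = config_find_and_read(name, (int)sizeof(T),
                               reinterpret_cast<char *>(&tmp));
  if (n != (int)sizeof(T)) return false;   // missing, or stored with a different size
  out = tmp;
  return true;
}
```

A sketch could then call `config_write("servo.centre", centre)` with a `uint16_t` and read it back with `config_read("servo.centre", centre)`; reading it into a `uint32_t` fails because the stored size differs.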

Small picture comments:

You use two variables to identify a config item: section and name.
This increases the code size...

The user of the library can implement their own config structure in the name, with punctuation, for free.

The following is a pretty expensive way of getting the next highest power of 2; it is inline and maybe GCC will convert it to a constant load - if not, there might be better ways for a micro-controller with no 16 bit multiply..
bytes = 0x01 << (uint16_t)ceil( log2( Size() ) )
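One possible alternative that avoids floating point entirely is a shift loop. This is only a sketch, `next_pow2` is a made-up name, and it assumes sizes fit in 16 bits:

```cpp
#include <stdint.h>

// Smallest power of two >= n, using only shifts and compares.
// n == 0 yields 1. Returns 0 if the result would not fit in 16 bits (n > 32768).
uint16_t next_pow2(uint16_t n) {
  if (n > 0x8000u) return 0;
  uint16_t p = 1;
  while (p < n) p <<= 1;   // at most 15 iterations
  return p;
}
```

This avoids pulling ceil() and log2() into the sketch, which should help with the code size concern as well.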

You allocate the value buffer on the heap, inside the constructor, which will inflate the memory required by a heap prefix/postfix chunk.

You could use the template to create the value as a member variable.
Then if the object is a static variable of the sketch, the minimum memory is used, and if it is an auto variable, it is all on the stack rather than on the heap.
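As an illustration of the idea, a class template can hold the value directly as a member, so nothing is allocated in the constructor. `InlineConfigItem` is a hypothetical name and not the library's actual class:

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical class; shows the value stored inside the object itself.
template <typename T>
class InlineConfigItem {
 public:
  explicit InlineConfigItem(const char *name) : name_(name), value_() {}

  const char *name() const { return name_; }
  T &value() { return value_; }
  const void *bytes() const { return &value_; }        // raw view for the storage layer
  static constexpr size_t size() { return sizeof(T); } // known at compile time

 private:
  const char *name_;
  T value_;   // lives wherever the object lives: no malloc/new, no heap chunk header
};
```

Declared at file scope in a sketch, an `InlineConfigItem<uint16_t>` sits in static memory; declared inside a function, it sits on the stack.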

====
Really big picture comments :slight_smile:

First, my comments come from reading your code, and I hope that they are useful to you. You have done a lot of work that I have not commented on, as it seems completely reasonable and rational to me.

And your code is clear and easy to read, which is a great thing; otherwise I would not be able to make these comments.

However, an important thing that is missing in the package is a short statement of objective.

I ** think ** it might be something like this...

This Config package provides the ability to store configuration values identified by name in the EEPROM of an Arduino.

The configuration will be accessible by all sketches that run on the Arduino - so multiple versions of a sketch can share configuration data.

Variables can be read by sketches without needing to know the datatypes [I think you implemented this in your code - not sure].

Configuration variables can be treated like very short files...

And if the above is the objective, a simple sketch that dumps the contents of the configuration variables to the serial output would be a good test tool.

And a more complex sketch that allows the deletion/insertion by hand of config values would be useful.

=========

To restate my comment from above: your library is well written and very clear.

You could make the code more understandable for others now, and yourself in a few years time, by including short 'statements of intent' in the code before each larger section.

====
I hope that these comments are of assistance to you.

Dafid

Thanks for the comments. I realize some of this could be done with less work and memory usage on the side of the microcontroller. You got the "statement of intent" right -- the major problem for me though is trying to implement some algorithm that doesn't require me to align everything to powers of two. I've tried, but don't know where to start. I just want to create small "files" on the Arduino that can be accessed and written to on the go. If you look in the readme, I am attempting to get some of that stuff (including that massive memory allocation) fixed.
The section and name thing does increase the code size, I agree. I think I should drop the "section" thing.
This library was actually an offshoot of a standalone CSV parser library I made to parse the data from here. That's why there is some left over section code that I haven't got around to removing yet.
This library is a work in progress, I agree. It has some potential, though, as a complex config filesystem for the Arduino.
Anyway, I'm posting on the go. I will look over your comments more thoroughly soon. I am hoping to push 1.0.1 tonight or tomorrow, then I will probably branch off separate source trees for 1.1.0 and 2.0.0, 1.1.0 with the "easy" 2^x alignment and 2.0.0 with the better version.

Thanks a lot for your comments and source work. Much appreciated. I will do more source work tonight, but I'm getting hungry... haha.

Morgan

The only thing I am confused about in your comments is the "version." What would this byte do? Also, what about the "checksum" byte? Wouldn't a 1-byte checksum result in a greater probability of checksum collisions?

The intention of the checksum on each entry is to allow a subsequent sketch to reject entries that have been corrupted or partially written.

The intention of the version/delete flag is to allow a sketch to write a new version of the config data as a sequence of operations that could be interrupted, and after it is all completed delete the older version of the data - with it being clear at all times what is old data and what is new.

Now, however, I see a fault in the checksum byte intention - imagine a sketch is updating a config value as the power is pulled; there is a 1 in 256 chance that the stale byte still present at the location where the new checksum would be stored happens to match the data written before the sketch was interrupted :(.

This could be prevented by writing the version/delete flag as deleted, and only setting the real version # once the write of the rest of the entry, including the checksum, is complete.
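A minimal sketch of that write order, using a RAM array as a stand-in for EEPROM so it can be run off-target. The entry here is simplified to [version][length][value...][check] with no name field, and `ee_read`, `ee_write`, `write_entry`, `entry_is_valid` and the `g_writesLeft` power-loss hook are all made-up names; on a real board the two helpers would wrap EEPROM.read() and EEPROM.write().

```cpp
#include <stddef.h>
#include <stdint.h>

static uint8_t g_eeprom[64];   // RAM stand-in for EEPROM
static int g_writesLeft = -1;  // test hook: >= 0 simulates power loss after that many writes

static uint8_t ee_read(size_t addr) { return g_eeprom[addr]; }

static bool ee_write(size_t addr, uint8_t b) {
  if (g_writesLeft == 0) return false;   // "power pulled"
  if (g_writesLeft > 0) --g_writesLeft;
  g_eeprom[addr] = b;
  return true;
}

// Writes the entry at addr, committing the real version byte last.
bool write_entry(size_t addr, uint8_t version, const uint8_t *value, uint8_t len) {
  if (version == 0) return false;                          // 0 is reserved for "deleted"
  if (addr + 3u + len > sizeof(g_eeprom)) return false;    // would run off the end
  if (!ee_write(addr, 0)) return false;                    // 1. mark the slot as deleted
  if (!ee_write(addr + 1, len)) return false;
  uint8_t sum = len;
  for (uint8_t i = 0; i < len; ++i) {
    if (!ee_write(addr + 2 + i, value[i])) return false;
    sum = (uint8_t)(sum + value[i]);
  }
  if (!ee_write(addr + 2 + len, sum)) return false;        // 2. body and check byte
  return ee_write(addr, version);                          // 3. commit: real version last
}

// An entry is trusted only if it is not deleted and its check byte matches.
bool entry_is_valid(size_t addr) {
  if (ee_read(addr) == 0) return false;
  uint8_t len = ee_read(addr + 1);
  if (addr + 3u + len > sizeof(g_eeprom)) return false;
  uint8_t sum = len;
  for (uint8_t i = 0; i < len; ++i) sum = (uint8_t)(sum + ee_read(addr + 2 + i));
  return sum == ee_read(addr + 2 + len);
}
```

If the write is interrupted anywhere before step 3, the slot still reads as deleted, so a stale byte that happens to match the checksum can no longer make a half-written entry look valid.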

dafid

Oh, good idea. What about storing config items as a normal distribution where the more common numbers of bytes are clustered together?

One of the questions I have is about padding. Say someone creates a config item called "uint16_t" with a 16 bit value: [0xDE] [0xAD]. Let's put that as the first value in EEPROM.
Say the second is an array of bytes (or characters, whatever) called "array" with the value [0xCA] [0xFE] [0xBA] [0xBE] [0xDE] [0xAD] (6 bytes).
And the third is a 32 bit value "uint32_t" [0xDE] [0xAD] [0xBE] [0xEF]

What if the second value is lengthened? How about shortened? How about being removed entirely?
I agree with your "placeholder" idea (setting the version to zero and such). But if it was lengthened, where would the new data be moved to? What would happen to the old location? Would a shortening keep it in the same location but add NULL padding? And would a deletion set the version flag to zero?
There are so many options to consider here. One thing I don't want (for the new version, at least) is excessive fragmentation.

Morgan,

First, I do not think variables changing size on an Arduino is likely - sounds more like a server or PC consideration...
Perhaps you have a need for it that I do not understand?

However, note that the sketch is the slave of the user: it has no control over when the power is disconnected, and has only byte-at-a-time write access to EEPROM. In this scenario, where debugging is painful, a fault in the Config library results in persistent and inexplicable failures in the sketch - so the library should be very careful.

Hence the suggestions to ALWAYS write a new copy of the data before deleting the old value, and to trust the value only after the very last byte is written.

If the config was not important across restarts, the sketch programmer would just hold it in a normal variable.

In my case, I need to spend 10 minutes running a special sketch that moves a servo while reporting the value - while I watch and then record the centre value for subsequent use. And the value changes with the geometry of the control arms to the boat. It is a morning's messing about...
Followed by compiling the magic numbers into the sketch as #defines.

If the same Config system was used in a plane then I would expect even more care.

Of course it depends on the use of the sketch.. In the environment of the Arduino it is important that the code be small and not wasteful of the resources.. as there are only a few available!

Here are two imagined uses of the EEPROM by myself:-

Record sequence of GPS locations visited (not a case where I would use Config):

  • lots of data values (compactness reqd)
  • no expectation to reuse beyond the day or weekend - transferred to PC for mapping etc after trip.
    I would use a simple function to record an array of values.
    [Would be cool to have a flash logger as this sort of sketch is likely quite small and there is 16x as much flash as EEPROM :)].

My case for Config - record for various model boats:

  • center PWM of servos.
  • speed versus PWM for winches.
    Maybe
  • waypoints for navigation

no more than ten per sketch;

Given these are on an Arduino, with no keyboard, maybe a dump to log window / clear EEPROM / reload selected entries from PC interaction is a good way to address the build-up over time of unused config values.

So given the above... of my original approaches,
KISS - 2 / use only half of EEPROM, so free space is always enough no matter the size distribution and update order

is simple, easy to implement, and adequate to purpose!

What do you think?

It is not clear what you need the library for .. so my comments are perhaps not applicable.

Dafid

No, you got it right. I might as well say why I created the library here and now...
I want to make a config "filesystem" that allows the dynamic update of small "files" in EEPROM with the call of a function. This filesystem will persist through updates of sketches, so programmers don't need to worry about the EEPROM in their programming.
Why variables "need" to change size is the case of character arrays and such. For example, if I had a "hostname" config item for network configuration, I would want it to be able to be updated with an array of arbitrary length, eg. "Arduino" or "ArduinoMega". Of course, the other option here would be to just specify a maximum length in the ConfigItem's constructor. Also, I want it to be future-proof: say the programmer wants to change the size of an integer from 8 bits to 16 bits. The class needs to cope with those changes.
I like your KISS idea about using half of EEPROM, but I want to reduce fragmentation as much as I can. I have tried to be careful in development and will be working to test it extensively to make the code smaller and faster.

OK, I understand the change in sizes now, but it is not the main case, surely.
The idea of an object to represent a config value seems to add complexity to the implementation of the storage.
It is a pretty easy task to make a wrapper object with templated read() and write() methods; this class would need to deal with changing the size of the variable, of course.
In the storage layer, with write-then-delete as the update mechanism, the size of the old versus new value is not important apart from impacts on fragmentation - see next reply.

For fragmentation, I think the following policy for allocating space for the next write reduces it to a minimum.

Try each of these sub-methods in turn until success or no more methods are implemented:

  1. search deleted chunks for a space of the right size
  2. allocate from free space at the end
  3. one of the following:
     KISS 3) stop.
     maybe 3) copy the EEPROM buffer into code memory (see CODEMEM), ie flash, then put a marker in EEPROM to say the data needs to be fetched from flash if the sketch crashed, and rewrite it back to EEPROM without fragments.
       BAD points:
       • the user can replace the sketch while the process is waiting to get the values back from flash, losing the config.
       • writing flash is slow relative to most things, so updates to config are not predictable in time.
     Complex 3) compact by repeated application of algorithms that are either:
       • not guaranteed to work, or
       • using memory to buffer values, which makes it more likely to work but does not protect the contents from a user restart
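Steps 1 and 2 of the policy above could be sketched roughly as follows, again over a RAM stand-in for EEPROM. The simplified header ([version][size in chunks]) and the names `find_space`, `claim`, `format_eeprom` and `kNoSpace` are assumptions made for the example. It also assumes erased EEPROM reads as 0xFF and that versions are limited to 1..254, so that 0xFF can mark space that has never been written.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

const size_t kEepromSize = 128;
const size_t kChunk = 16;
const size_t kNoSpace = (size_t)-1;
static uint8_t g_eeprom[kEepromSize];  // RAM stand-in for EEPROM

// Erased EEPROM reads as 0xFF, which this sketch treats as "never written".
void format_eeprom() { memset(g_eeprom, 0xFF, sizeof g_eeprom); }

// Writes the simplified two byte header of an entry.
void claim(size_t addr, uint8_t version, uint8_t chunks) {
  g_eeprom[addr] = version;
  g_eeprom[addr + 1] = chunks;
}

// Returns the address for an entry needing `chunks` chunks, or kNoSpace.
size_t find_space(uint8_t chunks) {
  if (chunks == 0) return kNoSpace;
  size_t addr = 0;
  while (addr + kChunk <= kEepromSize) {
    uint8_t version = g_eeprom[addr];
    if (version == 0xFF) {
      // 2. reached untouched space at the end: allocate here if it fits
      return (addr + chunks * kChunk <= kEepromSize) ? addr : kNoSpace;
    }
    uint8_t used = g_eeprom[addr + 1];
    if (used == 0) return kNoSpace;                   // corrupt header, give up
    if (version == 0 && used == chunks) return addr;  // 1. deleted chunk of the right size
    addr += used * kChunk;                            // skip to the next entry
  }
  return kNoSpace;                                    // KISS 3) stop
}
```

Only exact-size deleted chunks are reused here, which keeps the scan simple but relies on the chunk size to make exact matches likely.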

===========
I don't think the above is a solution really..

Rules 1) + 2) work for a user that needs only 30 variables of sizes between 10 and 30 bytes.. average storage size, ie allowing for chunking, of 20 bytes..

ie 30*20 to store variables = 600
16 bytes for a free 16 byte chunk and 32 bytes for a free 32 byte chunk
total 648 bytes.
average waste might be a bit high, say 30 * 1/2 chunk size = 240 bytes.

if the chunk size was 8, then the above data would probably be

average storage size 16 [saying size distribution is uniform]

so 30*16 = 480
plus free chunks of 8, 16, 24, 32 = 120
or 600 bytes.

==
a really pushed author might use two byte variable names [sketch index][variableindex] each containing 2 bytes, plus 2 lengths, version + checksum - so EEPROM footprint is 8 bytes

allowing 127 variables to be stored and updated on a 168.

=======
As I said, I don't have a compaction/defragmentation solution with no downsides for the author that wants to store 10 small variables mixed in with a variable that continuously changes size..

The off-line compaction and editing of the config variables still appeals for my use cases, as I play and rework a lot.

And the cases where growth of a variable leaves a trail of growing husks seem a bit too unrealistic to bother with any of the 3)s listed above..

thoughts?

You might want to use inih (GitHub: benhoyt/inih), a simple .INI file parser in C, good for embedded systems.

It's working for me on an Arduino.