Topic: Option to include sketch in flash?
Arlington, MA, USA

It would be nice if the Arduino IDE had some sort of source code control (SCC) functionality built in. I know I'd appreciate that.

Not meaning to get religious on the matter, and I freely admit that my familiarity with the Arduino toolchain is superficial, but unless there's a timestamp in the binary, or more debugging data than I'd expect (for a microcontroller), it ought to be possible for different source files to compile to the same binary. Hence a SHA1 computed solely against the binary couldn't *uniquely* identify a set of sources. I'd expect that changes in comments wouldn't affect the binary, nor would changes in variable names. Those are a couple of examples I can think of; perhaps there are others. In either of those cases, you could certainly state correctly that they are *equivalent* sources, and perhaps therefore the discrepancies don't matter.

San Francisco Bay Area

Just use Git and make regular backups to external media of your choice, like everybody should.

It's so easy these days. 16GB USB sticks (40MB/s) cost almost nothing, external 2.5'' disks are almost free considering the storage space they offer. Unless we're talking about backing up a video collection.

If the AVR chips had 'quasi unlimited' storage, I wouldn't object as much. But I would never rely on the source code being stored on the chip as well. Murphy's Law would get you anyway, trust me on that. It's much better to keep your valuable code in good condition and safe somewhere else.

I do use source control, and I do back up to external media, and I even make periodic off-site backups.  I still have to keep careful notes to ensure I've got the right version of a project for a given board, and I still get it wrong sometimes because I forgot to update my notes when I did a quick last-minute rev on a sketch in the field.  Losing source code is not the problem I'm trying to solve; I'm trying to come up with a fool-proof way of associating a specific source version with the physical object it goes with.

I sometimes velcro a USB stick to larger things like test equipment to keep track of relevant source/drivers/scripts/manuals/notes.  This works extremely well, except when somebody borrows it and loses the USB stick (in which case I make a new USB stick from my backups).  I was just proposing a built-in version of this for Arduino.

Cheers,
 - Dean

San Francisco Bay Area

Don't get me wrong, I like the concept of storing sketches in the device, it is an inspiring problem and got me thinking. It has several advantages (especially finding the right code as you pointed out) no doubt about that, but it is not a final solution.

This problem is called deployment management. [very recognizable]

I've also heard it called Release Control, Release Engineering, Product Engineering, and various other things over the years.  I have yet to find a company that has completely nailed it.

As a developer I need a solution that I can trust; it should work every time I want it to. Storing sketches in the Arduino is not always possible (due to size) and therefore I cannot rely on it.

I find this an interesting view, because the '328 (or any processor) has a finite set of resources that your program could eventually outgrow.  There are always "grey area" resources like circular logs in RAM or extra debug ports that you jettison along the way as resources get tight.  Source-in-flash is just another such thing that you could use until the program outgrows it.  You then decide that (1) it is OK to stop using this feature, or (2) it is valuable to your workflow and you find a bigger chip.

Embedded developers are always making tradeoffs about how to deploy the resources of the platform.  Would you give up on circular log buffers or debugging ports because some designs don't have enough resources to afford them?  Of course not - they are tools that you deploy when appropriate.  Source-in-flash is just another tool to be deployed when appropriate.  That doesn't make it unreliable, it just makes it another decision in the tradeoff calculations.

I do get your point that just storing a reference to the source is orders of magnitude less expensive than storing the source, drastically altering the tradeoff decision.

So if one wants to spend energy in solving deployment management for Arduino, we should think of a way that is:
- transparent for the programmer - (don't make the programmer do things that can be automated --> KISS)
- configurable (switch on/off etc)
- works for all deployments
- robust, reliable
- and so on.

Storing a sketch in an Arduino does not always work, as it fails on a crucial point, imho: SIZE. That doesn't mean it has no value; on the contrary, it can be very useful as you pointed out. I am just stating it isn't reliable enough. Storing a reference to the source (etc.) takes 16 bytes (UUID), which is about 0.05% of the 32K of flash and independent of sketch size. And yes, there will be applications that don't even have these 16 bytes free. A real final solution should work even for this case. That means that the reference to the code should be stored in the Arduino but at the same time can't be stored in the Arduino. This is a typical TRIZ contradiction.

Solving that contradiction => the binary code itself is the reference (mmm 32K keys, no good...)

Making 32K keys more practical: after uploading a sketch, avrdude reads the complete memory back and a SHA1 hash of it is used as the reference to the source code. So if an Arduino comes back through the mail, one can read its memory back, compute the SHA1 hash, and one has the reference to the sources.

That said, this reference will not be the only way to access the sources, full text indexing of all your sketches is very well possible these days, so such things need to be in the final solution too.

The complexity to realize SketchWithin, SHA1 and UUID is comparable. The differences between the SHA1 and UUID versions are:
- SHA1 will generate a new code for every source iteration, where with the UUID this is optional
- UUID uses (at least) 16 bytes, SHA1 uses 0 bytes of Arduino memory
- SHA1 will detect image tampering, UUID will not  (key lost??)
- UUID will probably be faster than SHA1.

My final choice would be using UUID, and the SHA1 at release moments. In the cases where I need the last 16 bytes I should really consider a new, larger platform.

EPILOGUE:
In short, storing a sketch in the Arduino is useful in many cases, as you pointed out. However, it won't solve the "what version of code have I deployed?" problem in all cases. The SHA1 and UUID solutions will perform better, especially for large sketches. My choice would be using UUID all the time, and the SHA1 at release moments. In the cases where I need the last 16 bytes I should consider a new platform :)

Again thanks for this inspiration,

This reasoning all seems sound to me.  I also agree that if you are down to the last 16 bytes and can't afford space for the UUID, you are probably ready to look for another platform.  I think the utility of your solution would be high enough that I'd go find a way to shrink the image by 16 bytes rather than forgo the UUID.

As for the SHA1, I agree with tastewar that if it is a hash of just the executable image, then it won't capture things like updated comments, formatting changes, or even logic changes that compile down to the same opcode stream.  If it is always used in conjunction with (and includes in its hash) the UUID, then it would cover trivial source changes.  Also, I think you would need to know the range of memory over which to calculate the hash - you wouldn't want to do the full 32K since that may include garbage from prior uploads, and even runtime flash data.
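To make the "range of memory" point concrete, here is a rough host-side sketch in C++ of how the bytes to hash could be selected from a read-back image. It assumes that erased AVR flash reads back as 0xFF, that the chip was erased before the upload, and that the caller knows where the application region ends (for example, below a bootloader). The name `hashableLength` and the `appRegionSize` parameter are invented for illustration; the SHA1 itself would then be computed over the first `hashableLength(...)` bytes with whatever tool you prefer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns how many leading bytes of a read-back flash
// image are worth hashing. `appRegionSize` caps the range (for example, to
// exclude a bootloader at the top of flash); trailing 0xFF bytes inside that
// range are treated as erased, unused flash and skipped.
std::size_t hashableLength(const std::vector<std::uint8_t>& image,
                           std::size_t appRegionSize)
{
    std::size_t end = image.size() < appRegionSize ? image.size() : appRegionSize;
    while (end > 0 && image[end - 1] == 0xFF) {
        --end;  // step back over erased cells
    }
    return end;
}
```

A sketch whose last real byte happens to be 0xFF would lose that byte from the range, so the reference hash kept with the sources would have to be computed with the same trimming rule.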

Cheers,
 - Dean

Left Coast, CA (USA)

On the KISS theme, maybe go for the 80/20 rule for functionality vs. resource requirements. Just run a 32-bit CRC (or maybe a hash code) on the source file and store the result as a long in the flash (even simpler, in EEPROM). That would give the basic hook for a method to verify that the flash code was created from a specific source file, no?
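As a rough illustration of the CRC idea, the host-side C++ below computes a 32-bit CRC over the text of a source file. It assumes the common reflected CRC-32 (polynomial 0xEDB88320, as used by zip and zlib); the post does not say which CRC is meant, and `crc32OfText` is just an illustrative name.

```cpp
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, the zip/zlib variant).
// Slow but tiny; fine for checksumming a sketch file on the host.
std::uint32_t crc32OfText(const std::string& data)
{
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

One wrinkle with this approach: the CRC cannot be pasted into the same file it was computed from without changing the CRC, so the value would have to live in a separately generated header or be written to EEPROM after the upload.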


Lefty

« Last Edit: June 19, 2011, 12:39:34 am by retrolefty »

Global Moderator
Netherlands


The way I solve it for my real projects (not my tinker projects) is to have a version string in the code containing name & version number (approx. 25 bytes). This string is printed at startup, so as long as the device can restart it will produce its name and version. If it can't be restarted it is time to upgrade anyway. This method is not foolproof as I don't update the version number with every increment ...

If time permits, I'll check today whether I can recover a UUID signature from a hex dump, and whether the compiler optimizes it away.

Rob Tillaart

Dutch section (Nederlandse sectie) - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Dallas


If it's possible with AVR executable images, a digital signature would solve some of the problems.  As an added bonus, if the developer keeps the signing key a secret, the provenance of an image can be determined.

Global Moderator
Netherlands

OK, took some time today, ran a few tests with essentially some variations on the following code.

Code:
volatile char UUID[]  = "<UUID=da51a9f0-9a49-11e0-aa80-0800200c9a66>";

char version[] = "UUID_TEST 0.04";

void setup()
{
  Serial.begin(115200);
  Serial.println(version);
  UUID[0] = UUID[0];  // volatile self-assignment keeps the array from being optimized away
}

void loop(){}

Then I retrieved the whole 32K image with avrdude (Windows 7 command prompt):

cd "C:\Program Files (x86)\arduino-0021\hardware\tools\avr\bin"
avrdude -C "C:\Program Files (x86)\arduino-0022\hardware\tools\avr\etc\avrdude.conf" -v -v -v -v -p atmega328p -c stk500 -U flash:r:"C:/arduino.bin":r -P\\.\COM5 -b57600


Viewing the binary easily reveals the UUID. See picture attached.
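For anyone who would rather not eyeball a hex viewer, the lookup can be scripted on the host. Below is a minimal C++ sketch, assuming the image was read back in raw binary form (the `:r` format in the avrdude command above) and that the signature uses the `<UUID=...>` tag from the test sketch. `findUuidTag` is a made-up name, not part of any Arduino or avrdude tooling.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: scans a read-back flash image for "<UUID=...>" and
// returns the text between the tag delimiters, or "" when no tag is found.
std::string findUuidTag(const std::vector<std::uint8_t>& image)
{
    static const std::string open = "<UUID=";
    auto start = std::search(image.begin(), image.end(), open.begin(), open.end());
    if (start == image.end()) {
        return "";                          // opening tag not present
    }
    start += static_cast<std::ptrdiff_t>(open.size());
    auto stop = std::find(start, image.end(), static_cast<std::uint8_t>('>'));
    if (stop == image.end()) {
        return "";                          // tag was never closed
    }
    return std::string(start, stop);
}
```

This also shows what the tag structure buys: without the `<UUID=` marker, a tool would need some other way to know where in the image the 16 bytes live.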

Some notes:
Both the volatile qualifier on the UUID array and the assignment UUID[0] = UUID[0] are needed; otherwise the compiler optimizes the array away.

size without UUID string: 1908 bytes
size with UUID string: 1960 bytes

So this proof of concept implementation took 52 bytes. Possible savings:
* Remove the tag structure <UUID=...> (-7) => 45 bytes
* Remove - signs in the UUID (-4) => 41 bytes
* Make the UUID binary (-17) => 24 bytes.
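The last step in that list (-17) follows from the encoding: 32 hex characters plus the terminating NUL take 33 bytes as text, against 16 bytes in binary. The leftover 8 bytes of the 24 are presumably the code for the self-assignment, though that is a guess from the numbers, not something measured here. A small host-side C++ sketch of the packing, with `packUuid` and `hexValue` as hypothetical helper names:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Converts one hex digit to its value, or throws on anything else.
static std::uint8_t hexValue(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: packs a textual UUID (dashes optional) into 16 bytes.
std::array<std::uint8_t, 16> packUuid(const std::string& text)
{
    std::array<std::uint8_t, 16> out{};
    std::size_t nibbles = 0;
    for (char c : text) {
        if (c == '-') continue;             // dashes carry no information
        if (nibbles >= 32) throw std::invalid_argument("UUID too long");
        const std::uint8_t v = hexValue(c);
        // Even nibbles fill the high half of a byte, odd nibbles the low half.
        if (nibbles % 2 == 0) out[nibbles / 2] = static_cast<std::uint8_t>(v << 4);
        else                  out[nibbles / 2] |= v;
        ++nibbles;
    }
    if (nibbles != 32) throw std::invalid_argument("UUID too short");
    return out;
}
```

The resulting 16 values are what would go into a `uint8_t` array initializer in the sketch.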

The proof-of-the-pudding test:
Code:
volatile uint8_t UUID[] = { 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65,
                          0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65 };  // character e 16 times
                         
char version[] = "UUID_TEST 0.06";

void setup()
{
  Serial.begin(115200);
  Serial.println(version);
  UUID[0] = UUID[0];  // volatile self-assignment keeps the array from being optimized away
}

void loop(){}

Size without UUID uint8_t array: 1908 bytes
Size with UUID uint8_t array: 1932 bytes
So the UUID signature now takes 24 bytes.

OK, tinkered enough ;)
Rob




Attachment: UUID.jpg (138.04 KB, 806x443)
« Last Edit: June 19, 2011, 03:53:28 am by robtillaart »

SF Bay Area (USA)

RAM is in short supply.  You MUST figure out how to put the UUID in flash only (and not have it garbage collected.)

Global Moderator
Netherlands

First try:
Code:
#include <avr/pgmspace.h>

//volatile char UUID[]  = "<UUID=da51a9f0-9a49-11e0-aa80-0800200c9a66>";
volatile uint8_t UUID[] PROGMEM =
            { 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65,
              0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65 };
                          
char version[] = "UUID_TEST 0.08";

void setup()
{
  Serial.begin(115200);
  Serial.println(version);
  uint8_t x = UUID[0];  // plain access; reading flash on AVR normally needs pgm_read_byte(&UUID[0])
}

void loop(){}

Code size: 1928 bytes, so 4 bytes less, but that means 16 bytes in PROGMEM and 4 in RAM; the data still needs to be referenced from RAM.
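One possible variation on this first try, offered as an untested sketch, not a measured result: declare the array `const` (newer avr-gcc versions require that for PROGMEM data) and read it with `pgm_read_byte`, so the access goes to flash and no RAM copy is involved. The `#else` branch is only a host fallback so the logic can be compiled off-target; `uuidByte` and `uuidChecksum` are illustrative names, and the effect on code size has not been checked here.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
// Host fallback so the logic can be compiled and tested off-target.
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

// The signature lives in flash only; `const` is required by newer avr-gcc.
const uint8_t UUID[16] PROGMEM = {
    0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65,
    0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65 };

// Reads one signature byte from program memory (a plain load on the host).
uint8_t uuidByte(uint8_t index)
{
    return pgm_read_byte(&UUID[index & 0x0F]);
}

// XOR of all 16 bytes: a cheap way to touch the whole array so that it
// stays referenced from the sketch.
uint8_t uuidChecksum()
{
    uint8_t sum = 0;
    for (uint8_t i = 0; i < 16; ++i) {
        sum ^= uuidByte(i);
    }
    return sum;
}
```

In a sketch, `setup()` would call `uuidChecksum()` or print the bytes over serial. Whether the array survives the linker without any such reference depends on the build flags, so that part remains an open question.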

Gotta think deeper ... ;)

Global Moderator
Netherlands

2.5 hours and many, many webpages later: I dived into assembly to find out how to declare an array in flash in assembly.
- http://www.avrfreaks.net/index.php?name=PNphpBB2&file=printview&t=35337&start=0 - was the eye-opening page

Code:
char version[] = "UUID_TEST 0.10";

void setup()
{
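  asm volatile(
  // No "\n" separates these literals, so the assembler likely sees the
  // single label ".cseguuid:" and emits the bytes inline in setup().
  ".cseg"   // use code segment
  "uuid:  .byte 101,102,101,102,101,102,101,102,101,102,101,102,101,102,101,102"
  );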
  Serial.begin(115200);
  Serial.println(version);
}

void loop(){}

Code size: 1924 bytes, so the 16 bytes of the UUID are in flash; no reference needed.

filling the "uuid array" can also be done in hex

  "uuid:  .byte 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65,0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65, 0x65"

or a bit shorter

  "uuid:  .word 0x6565, 0x6565, 0x6565, 0x6565, 0x6565, 0x6565, 0x6565, 0x6565"

So in the end storing a signature in the flash part of the code is only a few lines.

Thanks westfw for pushing me to the limits, I learned a few bits...
Rob

SF Bay Area (USA)

I didn't mean to make you dig deeply into advanced magics (though perhaps it was a good thing!)
It's also possible to get the linker to treat any file as "binary" to be explicitly included (maybe) in a final .elf/.hex file.
It would look something like
Code:
avr-objcopy -v -I binary -O elf32-avr -B avr  mysketch.pde sourcecode.o

(objcopy is one of those utilities that is a lot more complex than you would think it would be.  It handles OODLES of different formats!)
