Arduino IDE must use Makefile and precompilation

Why does the Arduino IDE still not use a Makefile generator and precompilation of the Wiring part? :disappointed_relieved:

Project compile time is really ugly.

The libWiring part would only need to be compiled once: when the user changes board settings.

And all compiling could be managed using an autogenerated Makefile for sketches.

How exactly would you have a Makefile do the preprocessing step, where Arduino automatically adds function prototypes and #include "Arduino.h"?

How would your Makefile parse the #include lines in the .ino files to configure which libraries it will compile?

Sure, make is very good at running the compiler from a fixed set of dependencies. But Arduino does much more. If you want to suggest using make, you really need to be specific about HOW to use make for the entire job Arduino does. Simply suggesting "just use make" without addressing the many things Arduino does, which make does not normally do, is a worthless suggestion.

Regarding the ugly compile time, you really should upgrade to Arduino 1.6.1 or 1.6.3. Modern versions of Arduino reuse previously compiled files, just like make does with a well-designed makefile.

I'm using version 1.6.3, and I still notice that everything gets compiled again when I start a new sketch. Takes 30 seconds, and is extremely annoying.

Takes 30 seconds, and is extremely annoying.

Perhaps you might consider what you paid for the compilers and the IDE.

stevenvh:
I'm using version 1.6.3, and I still notice that everything gets compiled again when I start a new sketch. Takes 30 seconds, and is extremely annoying.

Yeah, but how often are you starting a completely new sketch? When you Verify or Upload the same sketch again, with changes, almost everything should be reused from the previous compile of that same sketch.

Reusing compiled stuff between different sketches is possible, but risky, even if you haven't changed boards or other settings. When I originally wrote the speedup to reuse files (originally in Teensyduino in 2010, and contributed to Arduino in 2011), I designed it that way. It almost always works, but it can fail in rare cases where extra files in your sketch conflict in unexpected ways with stuff left over from those previous compiles.

The trouble with those rare fail cases is you get totally unexpected compile errors or other strange problems that aren't your fault. Because they're not entirely related to your own code, they're almost impossible to understand and fix. Quitting Arduino (or rebooting your computer) solves them, of course, since a new temporary folder is created when you restart Arduino. That "happened once, couldn't make it happen again after rebooting" behavior makes these bugs almost impossible for those of us working on the Arduino platform to investigate and fix, because users can't report meaningful info. Even when they provide their complete sketch and the set of libraries they used, reproducing the error requires all the other sketches and libraries they used in that session, and it can depend on doing things in the same order.

Even today, Arduino still has some very rare, very obscure bugs where things can conflict with each other in the temporary directory. One of the really unlikely cases, which I've known about for years but never developed a fix for because nobody had encountered it in practice, was recently reported and discussed on Arduino's issue tracker.... only just now, after all these years and hundreds of thousands of people actively using the software! You may believe the build process is simple. If it looks simple, then Arduino is doing a good job of making things easy for you as a user. But I can assure you, there are some really thorny corner cases!

My point is there's a good reason Arduino starts with a fresh temporary directory and recompiles so much stuff when you start a brand new sketch. You may believe Arduino should reuse files from your other previously compiled sketches. I'm here to tell you, as the original author of this speedup and a long-time contributor to Arduino's code and especially the build process, I've been down that road. I introduced very rare, subtle, difficult bugs in trying to do so. Arduino's cautious approach may seem annoying to you, but I can tell you from experience it avoids some really unlikely but really tough problems.

I should mention, a feature I've been planning to someday implement (when I have lots of free time, like they'll ever happen) involves running more than one instance of the compiler at a time.

Believe me, I really want this. I actually do compile complete new sketches, or change boards or settings regularly, since most of what I do with Arduino is develop and fix libraries and Arduino itself. I'm regularly waiting for a full recompile.

I have an i7-3930K processor (12 threads), 32GB of memory, and an SSD, so I could really benefit from 10 to 12 copies of the compiler running in parallel. I sometimes rebuild the gcc toolchain... and when not running the single-threaded configure scripts, it really does make huge compile jobs run about 10X faster!

However, for users with crappy computers, running even 2 instances of the compiler could be a net loss, possibly a LOT slower. Even if you have a dual or quad core processor, the compiler uses quite a lot of RAM. So does the Java JRE. Multiple compiler instances could easily cause virtual memory swapping, which massively hurts performance. Traditional rotating hard drives, and especially ones without command queuing, can also hurt parallel processing when the compiler needs to read dozens of header files. Even a low-end SSD gives thousands of random seeks per second, but with rotating media you get only a couple hundred seeks/sec for uncached data... and odds are little will be cached on low-end PCs with little extra RAM.

Still, someday I'm going to try this. For years, my main hesitation has been the complex MessageSiphon code. Recently Federico got rid of that... which I believe solves the long-standing but harmless bug where the compiler messages end up in the wrong window, if you have multiple windows open and you change keyboard focus among the windows while the compile process is still running. I haven't studied his new code yet, but if it's clean and simple and can be made thread safe (a pretty big "if"), that could eliminate the roadblock that's always prevented me from pursuing this speedup.

But to publish this to all Arduino users, I think it would need to be optional, or done in a way that detects when it's doing more harm than good on low-end PCs.

Based on my experience of many years of analyzing live systems and writing s/w for high-performance disk drive controllers and operating system drivers for systems like Cray, Sun Microsystems, NASA, HP, Silicon Graphics, Pixar (back when it was a medical imaging company) and many others, I would expect that for systems with a real disk this will likely make things much slower.

I've seen many foolish attempts at trying to speed things up by doing things in parallel that totally break and end up slowing things down when the realities of how the file systems access the data and the mechanical realities of actual disk drives are taken into consideration.
(Yeah, I know you mentioned SSDs, but there are also some non-obvious issues there as well.)
It always sounds like a good idea to do things in parallel, but more often than not it simply adds a tremendous amount of complexity and in many situations creates new issues and actually runs slower on the systems that most people have.

One of the worst attempts at trying to make things faster was the i/o performance "enhancements" in Windows Vista.
If you used the file manager GUI to copy a directory tree it would fire up as many as 200 copies in parallel.
What ended up happening is that the copy operation slowed down by 2+ orders of magnitude. That's right, more than 100 times slower than if you copied the files one at a time, serially.

When there is only a single spindle and hence a single read head doing seeks/reads/writes, doing things serially is faster than trying to do more than one thing in parallel.
Doing seeks costs way more performance than rotational delays.

The Vista example was even worse than just slowing things down for a single-spindle system by causing more seeks.
In their case, they also fired up so many copies that the system started paging. So not only were all those seeks being done (which kills throughput), but you were also reading and writing the copy data multiple times instead of just once because of the paging that was happening.

Many people said that the Vista performance issue was a myth, but if you want to see it first hand just copy a directory of, say, 200,000 files and time it. Even worse, when I tried to copy a directory tree of 500,000 files of about 500GB of data, the system took about 10 hours just to estimate how long the copy would take, which would be more than 72 hours.
The actual copy time if you did it from the command line (which copied the files one at a time) would be just a couple of hours.
That is what frustrated me the most. Vista would take longer to figure out the size of the data being copied to run their fancy copy operation with a progress bar than the actual copy would take using the "dumb and slow" method of 1 file at a time.
Then the actual copy, using their fancy new performance i/o methodology, when it eventually started, was even slower....
What a great system.....

Do that same copy on linux and it only takes about an hour or so.

Paul, you've been around long enough to know that make is just a scripted rule based system.
You obviously need some additional scripts/executables to perform some of the arduino magic since technically make really doesn't do anything itself.
make merely looks at rules and decides which rule needs to run based on dependencies. And then the rule calls other programs/scripts to do the actual work.

For example, in the makefiles you end up with a .ino.cc rule that is used to convert the user's .ino file to a .cc file for the compiler.
Make would need to call a tool to do the job of the conversion.
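To illustrate the idea (this is just a sketch, and "ino-to-cc" is an invented placeholder for whatever tool does the conversion), the rules might look something like:

%.cc: %.ino
	ino-to-cc $< > $@     # assumed helper: adds the prototypes and #include "Arduino.h"

%.o: %.cc
	avr-g++ $(CXXFLAGS) -c $< -o $@

The point is just that make owns the dependency checking, while the Arduino-specific magic lives in a small standalone tool that make calls.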

The value of using makefiles is that the build and dependency rules are no longer hard-coded inside the IDE itself but live outside it, where they can be changed/modified/updated by the user without having to modify and potentially rebuild the IDE.

The build methodology also becomes much more friendly to environments that don't want to use the Arduino IDE, like other IDEs, systems like Eclipse, or even users who want to write their own makefiles for doing automated builds.

My biggest beef with the arduino IDE is that it has taken a very "Windows-ish" mindset of being a monolithic blob rather than being a wrapper layer on top of some smaller, more focused tools.

My opinion is that the entire build methodology is wrong.
My preference would also be to change the build methodology completely.

I think the build system should build all the arduino "libraries" the user has into REAL libraries.
Then the user sketch would be built and linked against all the libraries to resolve all the needed references.
This methodology would also solve the problem of allowing arduino "libraries" to use other libraries.

The key is to place the REAL libraries into a persistent location so that they can be used by any sketch once they are built.
This location also has to have a hierarchy to support different architectures.
This also has the advantage that you only build an arduino library once or whenever it is modified/changed vs every single time a sketch is first built.
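As a rough sketch of how that could look in a makefile (the paths, library name, and flags here are all made up for illustration), each arduino "library" could be archived into a real .a file under a per-architecture, per-MCU directory:

ARCH    := avr
MCU     := atmega328p
LIBDIR  := $(HOME)/.arduino/prebuilt/$(ARCH)/$(MCU)    # assumed persistent location
SRCS    := $(wildcard libraries/Servo/*.cpp)
OBJS    := $(SRCS:.cpp=.o)

%.o: %.cpp
	avr-g++ -mmcu=$(MCU) -Os -c $< -o $@

$(LIBDIR)/libServo.a: $(OBJS)
	mkdir -p $(LIBDIR)
	avr-ar rcs $@ $^

A sketch build then only compiles the sketch itself and links with something like "avr-g++ -mmcu=$(MCU) sketch.o -L$(LIBDIR) -lServo -o sketch.elf", so the library is rebuilt only when its own sources change.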

make is really good at handling dependencies and could easily handle all this kind of stuff.
With respect to the IDE's handling of dependencies, I still run into issues occasionally where the IDE screws up and doesn't rebuild an arduino library when changing processor architectures.

Ironically, if I look WAY back in the arduino repository, it looks like the IDE originally used make in the very early days but then slipped into being a monolithic blob that tries to do everything.

--- bill

I should mention, a feature I've been planning to someday implement (when I have lots of free time, like they'll ever happen) involves running more than one instance of the compiler at a time.

Based on my experience ...I would expect that for systems with a real disk this will likely make things much slower.

If Arduino were to use Make, this would come for free, via make's "-j " switch.
In the real world on large compiles, it helps quite a lot. As an example, I tried it on avrdude 5.10 (which I have lying around.) "Make -j8" was significantly faster (about 4x) than a single-threaded make, and make -j30 was only slightly slower (I have an 8core system with 14G of RAM.) IIRC, moderate values have some benefit even on single-core systems.

make
real    0m5.767s

make -j2
real    0m2.955s

make -j4
real    0m1.842s

make -j8
real    0m1.399s

make -j16
real    0m1.462s

Paul did mention that compiling gcc itself gets much faster.
For the very large compiles (6+ hours) on very large "compile farms" we had something to implement "-j auto" that took into account the current number of users and compiles...
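On a single machine, the usual stand-in for "-j auto" is just to ask the OS how many CPUs are available and feed that to -j. A minimal sketch (the "all" target and fallback value are assumptions; nproc is the Linux/coreutils way, other systems use sysctl -n hw.ncpu or similar):

JOBS ?= $(shell nproc 2>/dev/null || echo 2)    # fall back to 2 jobs if nproc isn't available

fastbuild:
	$(MAKE) -j$(JOBS) all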

westfw:
Based on my experience ...I would expect that for systems with a real disk this will likely make things much slower.
If Arduino were to use Make, this would come for free, via make's "-j " switch.
In the real world on large compiles, it helps quite a lot. As an example, I tried it on avrdude 5.10 (which I have lying around.) "Make -j8" was significantly faster (about 4x) than a single-threaded make, and make -j30 was only slightly slower (I have an 8core system with 14G of RAM.) IIRC, moderate values have some benefit even on single-core systems.

Paul did mention that compiling gcc itself gets much faster.
For the very large compiles (6+ hours) on very large "compile farms" we had something to implement "-j auto" that took into account the current number of users and compiles...

I've also seen that using parallel make does improve things, at least on *nix-based OSes, when you have a decent amount of memory so that all the compiler executables are cached and you don't go too crazy on the number of parallel jobs.
Not sure how well this kind of stuff works on Windows.

6+ hours for a build. wow. I'm curious what that was and when that was.
Was that 6+ hour build very recent?

But yep, here is another example of how you can take advantage of capabilities in existing tools like make vs having to re-invent the wheel.

6+ hours for a build. wow. I'm curious what that was and when that was.

cisco's IOS. As of my retirement (4y ago today, FB tells me.) Big, relatively monolithic source.
You had a choice of checking out over the network-based SCCS (clearcase), which was fast (for the "checkout") and had your compiles work on the huge but nfs-based compile farm, or copying everything to a local zippy PC (slow "checkout", smaller and fewer processors, but local disk and no other users.) They'd end up taking about the same time for an actual initial compile, IIRC.

How exactly would you have a Makefile do the preprocessing step, where Arduino automatically adds function prototypes and #include "Arduino.h"?

Presumably, the IDE would build the my-sketch.cpp file in a temp directory, and compute the list of needed libraries, just as it does now. Then it would run "make" one or more times ("make -f corelib.mk", "make -f userlib.mk", "make -f sketch.mk"; where corelib.mk is static and off in the IDE somewhere, userlib.mk is constructed based on the #includes in the sketch (and/or additional recursion, since that's a current problem), and sketch.mk is pretty trivial.)
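As for what sketch.mk itself might contain, a minimal sketch (all file and variable names here are invented; BUILD_DIR, USERLIB_A, CORE_A and the flags would be passed in by the IDE when it invokes make) could be something like:

SKETCH_CPP := $(BUILD_DIR)/my-sketch.cpp     # produced by the IDE's preprocessing step
SKETCH_OBJ := $(SKETCH_CPP:.cpp=.o)

$(SKETCH_OBJ): $(SKETCH_CPP)
	avr-g++ $(CXXFLAGS) -c $< -o $@

$(BUILD_DIR)/sketch.elf: $(SKETCH_OBJ) $(USERLIB_A) $(CORE_A)
	avr-g++ $(LDFLAGS) $^ -o $@

$(BUILD_DIR)/sketch.hex: $(BUILD_DIR)/sketch.elf
	avr-objcopy -O ihex -R .eeprom $< $@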

westfw:
Presumably, the IDE would build the my-sketch.cpp file in a temp directory, and compute the list of needed libraries, just as it does now. Then it would run "make" one or more times ("make -f corelib.mk", "make -f userlib.mk", "make -f sketch.mk"; where corelib.mk is static and off in the IDE somewhere, userlib.mk is constructed based on the #includes in the sketch (and/or additional recursion, since that's a current problem), and sketch.mk is pretty trivial.)

I'd have the IDE build a template makefile that imports all the rules from the IDE installation, and have another GUI-less tool be called by make for the .ino.cc conversion rule.
(I prefer .cc over .cpp for C++ files, for those systems that can't handle upper and lower case file names and confuse .c with .C)

But I'd still like to change the overall build methodology to build all the arduino libraries in a persistent area, so that once they are built you never have to compile a library unless it changed, and that includes when changing between architectures.
This allows arduino "libraries" to call and use other arduino "libraries", and also means that sketch builds are very quick, since you are only compiling the user's sketch files and then linking against the libraries.

Well, the new "appdata" based storage scheme should support that idea...
There is the usual problem that (AFAIK) there's no way to figure out that a core library compiled with a -mmcu=atmega328 compile switch needs to be recompiled when you switch to an atmega168. (gcc "solves" this for startup code by having ~200 different crtXXX.o files, one for each "possibly distinct" chip. Yuck.)
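One way to dodge that at the makefile level would be to fold the -mmcu switch into the output path, so each chip gets its own copy of the core library. A rough sketch of the idea (the directory layout is invented here):

MCU     ?= atmega328p
COREDIR := build/core/$(MCU)            # MCU name becomes part of the output path

$(COREDIR)/%.o: cores/arduino/%.cpp
	@mkdir -p $(dir $@)
	avr-g++ -mmcu=$(MCU) -Os -c $< -o $@

$(COREDIR)/libcore.a: $(patsubst cores/arduino/%.cpp,$(COREDIR)/%.o,$(wildcard cores/arduino/*.cpp))
	avr-ar rcs $@ $^

Switching from atmega328p to atmega168 then just selects (or builds) a different libcore.a instead of silently reusing objects compiled for the wrong chip.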