Replace Unicode characters in variable / function names with their hex representation

I study at a German school, and since I’m better at programming than any of our teachers, most people come to me when they run into problems.
However, about a third of these people have simply used an “ä” or something similar in the name of a variable or function.

error: stray '\033' in program
error: stray '\244' in program

Output like this doesn’t really hint at what’s wrong with your code if you haven’t programmed much before.

I looked through the Arduino IDE source, and I think it wouldn’t be too hard to implement something in the arduino-preprocessor repo that replaces every non-ASCII character with u_ plus its hex value (or something similar) before the data is fed to libclang and g++.
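To make the idea concrete, here is a minimal sketch of the replacement step. It is not taken from arduino-preprocessor; the function name `escapeNonAscii` and the exact `u_XX` scheme (one `u_` plus two hex digits per UTF-8 byte) are just assumptions for illustration. It also blindly rewrites the whole input, including comments and string literals.

```cpp
#include <string>

// Hypothetical helper, not part of arduino-preprocessor.
// Replaces every non-ASCII byte (value >= 0x80) with "u_" followed by
// its two-digit uppercase hex value. ASCII bytes are copied unchanged.
std::string escapeNonAscii(const std::string& src) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(src.size());
    for (unsigned char byte : src) {
        if (byte < 0x80) {
            out += static_cast<char>(byte);   // plain ASCII, keep as is
        } else {
            out += "u_";                      // marker suggested in the post
            out += kHex[byte >> 4];           // high nibble
            out += kHex[byte & 0x0F];         // low nibble
        }
    }
    return out;
}
```

With UTF-8 input, `int zähler = 0;` would come out as `int zu_C3u_A4hler = 0;`, which g++ accepts as an ordinary identifier. The error messages would then show the mangled name, so a real implementation would probably want to map names back before reporting them.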

I’d just make a PR myself, but I can’t get the arduino-preprocessor (cloned from the repo) to compile on my system(s). Any thoughts? Is it a bad idea? Should I try harder to get it working myself and make a PR? :stuck_out_tongue:

Also, if anyone can compile the preprocessor, I’d greatly appreciate info on your installed packages and distro. I really want to do this myself, but I keep getting undefined reference errors even though I passed every single clang / llvm library, both the ones on my system and the downloaded ones, to the linker :confused:

This does not concern Arduino or its IDE; it concerns the standardization of the C / C++ languages.

But you’d still want to allow comments to contain Unicode, if the non-English programs I’ve seen are any indication. That makes it much more difficult.

And strings, too.
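For what it’s worth, leaving comments and strings alone mostly comes down to a small state machine. Below is a rough sketch under my own assumptions: the function name `escapeOutsideCommentsAndStrings` and the `u_XX` scheme are made up for illustration, and it does not handle raw string literals, C++14 digit separators like `1'000`, or backslash line continuations inside `//` comments.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of arduino-preprocessor.
// Escapes non-ASCII bytes as "u_XX" only in ordinary code; comments,
// string literals and character literals are copied through untouched.
std::string escapeOutsideCommentsAndStrings(const std::string& src) {
    enum class State { Code, LineComment, BlockComment, String, Char };
    static const char kHex[] = "0123456789ABCDEF";
    State state = State::Code;
    std::string out;
    out.reserve(src.size());

    for (std::size_t i = 0; i < src.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';

        switch (state) {
        case State::Code:
            if (c == '/' && next == '/') {
                state = State::LineComment;
            } else if (c == '/' && next == '*') {
                // Copy both characters so "/*/" is not read as open + close.
                out += "/*";
                ++i;
                state = State::BlockComment;
                continue;
            } else if (c == '"') {
                state = State::String;
            } else if (c == '\'') {
                state = State::Char;
            } else if (c >= 0x80) {
                // Non-ASCII byte in code: emit the hex form instead.
                out += "u_";
                out += kHex[c >> 4];
                out += kHex[c & 0x0F];
                continue;
            }
            break;
        case State::LineComment:
            if (c == '\n') state = State::Code;
            break;
        case State::BlockComment:
            if (c == '*' && next == '/') {
                out += "*/";
                ++i;
                state = State::Code;
                continue;
            }
            break;
        case State::String:
        case State::Char:
            if (c == '\\' && i + 1 < src.size()) {
                // Copy the escape sequence so \" or \' does not end the literal.
                out += src[i];
                out += src[i + 1];
                ++i;
                continue;
            }
            if ((state == State::String && c == '"') ||
                (state == State::Char && c == '\'')) {
                state = State::Code;
            }
            break;
        }
        out += static_cast<char>(c);  // default: copy the byte unchanged
    }
    return out;
}
```

So `int zähler = 0; // Zähler` would keep the “ä” in the comment and only mangle the variable name. Whether this belongs in the preprocessor at all is a separate question, of course.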

I found this nice example:

“Übertragungsgeschwindigkeit”... wow.