
Topic: Designing a new programming language for Arduino (Read 14023 times)

UKHeliBob

I'm with you on that Robin.
I can read and understand x = x + 1 etc until the cows come home but I cannot do the same for & and * as used for pointers.  I have to stop and think when I read code containing them.  I am getting better but I am not there yet.

However, as I have said before, we need to understand what this new language is for.  Is it going to be a "be all and end all" language or a stepping stone to C++?  If the latter, then let's keep it as C-like as possible but make the environment more helpful.
Please do not send me PMs asking for help.  Post in the forum then everyone will benefit from seeing the questions and answers.

pYro_65

Quote
The thing that I want to see made more intuitive is * and & as used for pointers. I understand pointers but I just can't get my brain to interpret those symbols intuitively
This new-fangled language might not even need pointers. For instance, C# has value types and reference types, which do not need an explicit '&'; it is more a matter of whether something is a struct/intrinsic (value type) or a class (reference type). Or take JavaScript, where virtually everything is a reference and you just take it for granted.

The major factor with these languages is automatic memory management (garbage collection or reference counting), which allows a lot of background work to be done with little user interaction. C/C++ was designed for places where these features may not be suitable (although C++11 did add minimal support for garbage-collected implementations).


What is it you struggle with? Is it that the same symbol has different meanings in different contexts? '&' can be bitwise AND, declare a reference, or take the address of a variable; '*' can declare a pointer, dereference a pointer, or multiply.
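To make those contexts concrete, here is a small C++ sketch that uses each meaning of '&' and '*' once. The function name symbolDemo is invented for illustration.

```cpp
// The same two symbols, read in each of the contexts listed above.
int symbolDemo() {
    int value = 6;
    int mask = value & 3;          // '&' as bitwise AND: 6 & 3 == 2
    int& alias = value;            // '&' in a declaration: alias is a reference to value
    int* pointer = &value;         // '*' declares a pointer; '&' takes the address of value

    alias = 7;                     // writes through the reference, so value is now 7
    int product = *pointer * mask; // first '*' dereferences (7), second '*' multiplies by 2
    return product;                // 14
}
```

Reading the line that computes product out loud ("the value at pointer, times mask") is a reasonable way to practise telling the two uses of '*' apart.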

Why not make yourself some defines that make these things easier (use words rather than symbols).
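A minimal sketch of what such defines might look like. The macro names ADDRESS_OF and VALUE_AT are hypothetical, chosen only for this example; nothing like them ships with Arduino.

```cpp
// Hypothetical helper macros (names invented for illustration).
#define ADDRESS_OF(variable) (&(variable))
#define VALUE_AT(pointer)    (*(pointer))

// Doubles the integer that 'target' points to.
void doubleInPlace(int* target) {
    VALUE_AT(target) = VALUE_AT(target) * 2;
}

// Copies the input, doubles the copy through a pointer, and returns it.
int doubledCopy(int input) {
    int working = input;
    doubleInPlace(ADDRESS_OF(working));
    return working;
}
```

Note that the pointer declaration itself (int* target) still needs the symbol, so macros like these only help with taking addresses and dereferencing.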

Quote
I have to stop and think when I read code containing them.
The one thing I can recommend is to use them more; it's the simplest way of pushing through a barrier. (Obvious, I know, but it's the only way I was able to understand operators like '->*' and '.*'; explanations just did not cut it until put into practice.)
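For anyone who has not met '->*' and '.*' before, they apply a pointer-to-member to an object or to a pointer to an object. A small practice example, where the Motor type and its members are made up for illustration:

```cpp
// A made-up type with one data member and one member function.
struct Motor {
    int speed;
    int doubled() const { return speed * 2; }
};

int memberPointerDemo() {
    int Motor::* field = &Motor::speed;              // pointer to a data member
    int (Motor::* method)() const = &Motor::doubled; // pointer to a member function

    Motor motor{10};
    Motor* motorPtr = &motor;

    motor.*field = 15;             // '.*' applies the member pointer to an object
    return (motorPtr->*method)();  // '->*' applies it through a pointer: 15 * 2
}
```

The parentheses around motorPtr->*method are required, because the function-call operator binds more tightly than '->*'.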
Forum Mod anyone?
https://arduino.land/Moduino/

Robin2

#47
Nov 17, 2015, 06:19 pm Last Edit: Nov 17, 2015, 06:21 pm by Robin2
Quote
Why not make yourself some defines that make these things easier (use words rather than symbols).
Unfortunately they would not make it easier to read another person's code.

I guess I could write a parser that goes through the other person's code and replaces * and & with my #defines  :)   :)    (for avoidance of doubt, I am NOT going to try)

...R
Two or three hours spent thinking and reading documentation solves most programming problems.

YemSalat

#48
Nov 17, 2015, 07:03 pm Last Edit: Nov 18, 2015, 01:54 am by YemSalat
Quote
However, as I have said before, we need to understand what this new language is for.  Is it going to be a "be all and end all" language or a stepping stone to C++?  If the latter, then let's keep it as C-like as possible but make the environment more helpful.
Right now I think that the best approach is to make a semi-subset of C++ with lighter syntax and a couple of built-in language constructs for some core Arduino development things (like basic time handling, interrupts, etc.)
So the language would have all the usual things like primitives/loops/functions/classes.
It would also be good to allow for easy compatibility with current Arduino libraries and, as you said, an easier transition to C/C++, so keeping the syntax as C-like as possible is the way to go in my opinion.


Quote
This new-fangled language might not even need pointers. For instance, C# has value types and reference types, which do not need an explicit '&'; it is more a matter of whether something is a struct/intrinsic (value type) or a class (reference type). Or take JavaScript, where virtually everything is a reference and you just take it for granted.
Well, we definitely can't afford a proper GC on the Arduino, but I do think that some common tasks that use pointers can be abstracted away, with pointers used behind the scenes to optimize things while providing nicer syntax.

Quote
Why not make yourself some defines that make these things easier (use words rather than symbols).
I was actually thinking of adding a couple of built-in operators, e.g. ref() to create a pointer and valueOf() to get the value it points to, etc.
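As a rough idea of how ref() and valueOf() could map onto plain C++ in the generated code, here is a sketch using two small templates. This is an assumption about one possible lowering, not the project's implementation; the namespace sketch and the function refDemo are invented for the example.

```cpp
namespace sketch {

// ref(x): produce a pointer to x (what '&x' does in C++).
template <typename T>
T* ref(T& variable) {
    return &variable;
}

// valueOf(p): access the value p points to (what '*p' does in C++).
template <typename T>
T& valueOf(T* pointer) {
    return *pointer;
}

// Usage: modify a variable through a pointer without writing '&' or '*'.
int refDemo() {
    int counter = 1;
    int* handle = ref(counter);
    valueOf(handle) += 4;
    return counter; // counter was changed through the pointer
}

} // namespace sketch
```

Because both helpers are trivial inline templates, a compiler would normally reduce them to the same machine code as the bare operators.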



A small update

Unfortunately I did not have much time to work on the language yesterday. I did a few tests and worked on the evaluator. It now generates a kind of "byte-code" (if you can call it that) that is later going to be consumed by the code generator. I have not yet started on the generator itself, but I am hoping to start tomorrow and finish in 1-2 days, depending on how busy I will be.

The "byte-code" looks like this: ..[INIT INT $1 $2][FOR_BEGIN][L_PAREN]... (pretty 'literal')

I also worked on the IDE a little bit and implemented a simple file handling system.


Everything is stored locally on your computer, so there is no downloading or uploading going on. Browser localStorage allows for up to 5 MB of space by default, which should be enough for most use cases, but the app can request more if required. You can, however, "download" any file as a text document or simply copy-paste it. Also, once the IDE is loaded it becomes completely available offline (via AppCache), so it will work even if you reload the page or reboot your computer.


As I already said I am hoping to have more spare time in the next few days to work on the compiler.
The C++ code generated by the first version will most likely be quite ugly, so please do not expect any great optimizations or elegant solutions in the first release.

Robin2

Keep going.

It is nice to see a man hard at work.

...R

YemSalat

#50
Nov 20, 2015, 08:52 am Last Edit: Nov 20, 2015, 09:19 am by YemSalat
Another progress report


I worked on the compiler for the past few days and I gotta say it is starting to get a bit more complicated: the compiler now needs more code to account for the language features.
Most of the complexities are introduced by interactions between types.

Initially I was only checking types for literals and variables and wanted to leave part of the type checking to the C compiler, but now I think that it would be much more useful to do a complete type check so that we produce completely type-safe C code. So I had to introduce yet another stage in the compilation process, the Type Checker. It goes through the parse tree (post-order), assigns a type to every single expression and checks them all.

I also added special built-in type conversion functions: int(), float(), str(), etc.
For example to convert a string to an integer, you would do the following:
int a = int( "42" ) // 42
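For reference, a conversion like int( "42" ) could be lowered to the C library's strtol() in the generated C++. This is only a sketch of one possible mapping; the helper name stringToInt is made up here and does not come from the project.

```cpp
#include <stdlib.h>

// Hypothetical helper the code generator might emit for int( "..." ).
// strtol() parses a base-10 integer and stops at the first non-digit.
int stringToInt(const char* text) {
    return static_cast<int>(strtol(text, nullptr, 10));
}
```

A real implementation would also have to decide what happens when the text is not a number at all; strtol() returns 0 in that case, which is indistinguishable from parsing "0" unless the end pointer is checked.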

As I mentioned before, I would like to overload the '+' operator for strings, so you can concatenate two strings like so: "Hello" + " World"

But I am not sure whether it's a good idea to overload '+' for concatenating strings with other types,
e.g. "hello" + 42

It would be good to have some input from the community on this.



So far I came up with the following rules for the overloaded '+' operator:
Strings:
String + (Int | Float | Bool) = String

Numbers:
Int + Float = Float, etc.. // same as C

Others:
Int + Bool, etc.. = ILLEGAL
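The rules above can be sketched as a single lookup function of the kind a type checker might call for every '+' node. The enum and function names are invented for illustration. Two cases are assumptions rather than rules from the post: String + String is treated as a String, and a String appearing only on the right-hand side is treated as illegal because the rules do not list it.

```cpp
// Types known to this sketch; Illegal marks a rejected expression.
enum class Type { Int, Float, Bool, String, Illegal };

static bool isNumber(Type type) {
    return type == Type::Int || type == Type::Float;
}

// Result type of 'left + right' under the rules listed above.
Type plusResultType(Type left, Type right) {
    if (left == Type::Illegal || right == Type::Illegal) {
        return Type::Illegal;               // errors propagate upwards
    }
    if (left == Type::String) {
        return Type::String;                // String + (Int | Float | Bool | String)
    }
    if (isNumber(left) && isNumber(right)) {
        // Same promotion as C: any Float operand makes the result Float.
        return (left == Type::Float || right == Type::Float) ? Type::Float
                                                             : Type::Int;
    }
    return Type::Illegal;                   // Int + Bool, Bool + Bool, Int + String, ...
}
```

In a post-order walk of the parse tree, both operand types are already known by the time a '+' node is visited, so one call like this per node is enough.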


I have not worked on the IDE since the last update and am hoping to finish the compiler by tomorrow, most likely without concatenation of different types in the first version. We'll see how it goes and what you have to say about this as well.
After the compiler is done - I'll just need a little bit of time to 'link' it with the IDE and it should be good for the first release.

Robin2

#51
Nov 20, 2015, 10:53 am Last Edit: Nov 20, 2015, 10:54 am by Robin2
Quote
As I mentioned before, I would like to overload the '+' operator for strings, so you can concatenate two strings like so: "Hello" + " World"
I presume you are talking about strings (small s) and not Strings (capital S).

Quote
But I am not sure whether it's a good idea to overload '+' for concatenating strings with other types,
e.g. "hello" + 42
In a system that uses duck-typing this might make sense. In a system that requires typed variables I think it would just be confusing. However, even Python requires "hello" + str(42) and Ruby requires "hello" + 42.to_s

...R

YemSalat

Quote
I presume you are talking about strings (small s) and not Strings (capital S).
Yep, sorry, I just capitalized the first letters in the example; in the language all those entities are primitives.

Quote
In a system that uses duck-typing this might make sense. In a system that requires typed variables I think it would just be confusing. However, even Python requires "hello" + str(42) and Ruby requires "hello" + 42.to_s
Yep, good point, perhaps it's best not to include concatenation of different types.

Plus, I wanted to introduce sub-string expressions anyway.
So one can write: foo = "This is how a {{ variable }} can be represented inside a string"

Could also add 'filtering' functionality: {{ number | %d }}

YemSalat

#53
Nov 21, 2015, 05:48 pm Last Edit: Nov 22, 2015, 02:51 am by YemSalat
Have spent some more time on the project and the evaluator is pretty much complete at this point :)
It can now do a full semantic analysis of all the language entities (respecting the scope), and point out various errors.

A couple of screenshots of some tests I've done today (this is not the IDE UI):



*Underlined on the image above:
Semantic error on line 9 column 5
Error message: variable "foo" was already initialized as 42 on line 3
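An error like the one above typically comes from a symbol table that remembers where each name was first declared. The following C++ sketch shows the general idea only; the Scope class and its members are hypothetical and are not taken from the project's evaluator.

```cpp
#include <map>
#include <string>

// What the checker remembers about a declared variable.
struct Declaration {
    int line;
    std::string initialValue;
};

// One scope's symbol table.
class Scope {
public:
    // Returns an empty string on success, or an error message if the
    // name has already been declared in this scope.
    std::string declare(const std::string& name, int line,
                        const std::string& initialValue) {
        auto existing = symbols_.find(name);
        if (existing != symbols_.end()) {
            return "variable \"" + name + "\" was already initialized as " +
                   existing->second.initialValue + " on line " +
                   std::to_string(existing->second.line);
        }
        symbols_[name] = Declaration{line, initialValue};
        return "";
    }

private:
    std::map<std::string, Declaration> symbols_;
};
```

Respecting nested scopes would then be a matter of keeping a stack of such tables and searching from the innermost one outwards.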


Some others (clickable):
   

I am about to add functions to the language. For now they will look the same as C functions (hope everybody is OK with that), but I do have plans to change some things around in the future.

I am also thinking about adding the ability for IDE users to generate and upload test suites. The way I see it, you would write a piece of code and specify the output it produces; it then gets added to a global database, so that new language versions can be tested against all the previous tests. Any thoughts on this?

I am proposing this as I am afraid I might miss some non-obvious errors, including ones in the language semantics.

YemSalat

#54
Nov 22, 2015, 01:47 pm Last Edit: Nov 22, 2015, 02:02 pm by YemSalat
A few more updates

IDE

I finished integrating the new evaluator into the IDE.
I had to spend a couple of hours banging my head against the wall trying to figure out the timing for it (like delays after the code has changed, etc.), and the fact that the parser runs in a parallel process does not help at all. But I figured it out in the end, so I am happy with the result.

Interesting fact:
I did some performance tests, evaluating a 2000-line file which contains all language entities.
Chrome and Firefox on my 3-year-old Core i5 laptop parse it within 250-300 ms. I also tried IE11 on Windows 8, and surprisingly it got the best result of all: around 70-80 ms.


The Language

I added functions.

Right now they are very much like C functions, but I also added docstrings like in Python, which allow you to annotate your functions with some documentation. The syntax for them is just three slashes (/// text..), so they look like comments:

/// This function multiplies two integers
int multiply (int a, int b) {
 return a * b
}


This allows the IDE to show you tooltips containing the function description when you mouse over them in the code.
Docstring comments will also be kept when the code is transpiled to C++ (all other comments are ignored by the parser).


I am working on the code generator and to be honest it is a bit harder than I thought. As I don't have too much experience with Arduino, I might need to create a topic or two in the Programming questions section soon :)

Anyways, we are getting really close to the release, I will be a bit busy with other stuff in the next couple days, not sure if I'll do any more updates before the demo, but we'll see how it goes.

UKHeliBob

Quote
I might need to create a topic or two in the Programming questions section soon :)
Questions from beginners are always welcome.  :)

Well done with progress so far.

Robin2

Quote
I might need to create a topic or two in the Programming questions section soon :)
I suggest you keep the discussion in this Thread at least until you publish a working version. I find it very difficult to follow a discussion over multiple Threads. (Can't walk and chew gum at the same time)

...R

YemSalat

OK, it probably IS best to keep everything in the same thread.


Just to get a couple small things out of the way:

1) Is there any particular reason for having the setup() {} routine in every sketch?

2) What is the preferred way of concatenating two strings? (meaning arrays of chars, do you think sprintf is a good fit for this?)


Thanks!

UKHeliBob

Quote
Is there any particular reason for having the setup() {} routine in every sketch?
There will almost certainly be things that need to be done once in every program, such as setting up pinMode()s and starting the Serial interface, hence the setup() function.  It also suits the way in which the Arduino environment runs setup() once and then calls loop() repeatedly from the hidden main() function.
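The relationship between setup(), loop() and the hidden main() can be modelled on a desktop compiler as follows. This is a simplified stand-in, not the Arduino core's actual source: the real main() also initialises the hardware and loops forever, whereas the runSketch function here (a name invented for this example) takes an iteration count so that it terminates.

```cpp
// Stand-ins for a sketch's two entry points; they only count calls.
static int setupCalls = 0;
static int loopCalls = 0;

void setup() { ++setupCalls; } // one-time work: pinMode(), Serial.begin(), ...
void loop()  { ++loopCalls; }  // work that repeats for as long as the board runs

// Simplified model of the hidden main(): setup() once, then loop() repeatedly.
// 'iterations' bounds the loop so the model can run and finish on a desktop.
void runSketch(int iterations) {
    setup();
    for (int i = 0; i < iterations; ++i) {
        loop();
    }
}
```

A new language could generate an empty setup() automatically when the user does not write one, which would keep the C++ side unchanged.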

Quote
What is the preferred way of concatenating two strings? (meaning arrays of chars, do you think sprintf is a good fit for this?)
strcat() is the obvious solution, whereas sprintf() is more powerful but uses more memory.  The standard Arduino implementation of sprintf() does not support float formatting (%f), although support can be added at the expense of more memory use.  snprintf() is safer to use than sprintf() because it never writes more than the buffer size you give it.
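A short comparison of the two approaches for joining two char arrays. The function names are invented for the example; strcpy(), strcat(), strlen() and snprintf() are the standard C library calls.

```cpp
#include <stdio.h>
#include <string.h>

// Joins two C strings with strcpy()/strcat(). strcat() does no bounds
// checking itself, so the length is checked by hand first.
bool joinWithStrcat(char* out, size_t outSize,
                    const char* first, const char* second) {
    if (strlen(first) + strlen(second) + 1 > outSize) {
        return false; // would overflow the buffer; nothing is written
    }
    strcpy(out, first);
    strcat(out, second);
    return true;
}

// Joins two C strings with snprintf(), which truncates instead of
// overflowing. Returns false if the result did not fit completely.
bool joinWithSnprintf(char* out, size_t outSize,
                      const char* first, const char* second) {
    int needed = snprintf(out, outSize, "%s%s", first, second);
    return needed >= 0 && static_cast<size_t>(needed) < outSize;
}
```

With a 12-byte buffer, both functions produce "Hello World" from "Hello" and " World". With an 8-byte buffer, the strcat() version refuses to write, while the snprintf() version writes the truncated "Hello W" and reports that the result did not fit.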

Robin2

Quote
1) Is there any particular reason for having the setup() {} routine in every sketch?
I think requiring it provides consistency that is very valuable for beginners.

...R
