Using functions (noob help with syntax)

I have never done any programming before, aside from extremely basic programs for a calculator and a bit of HTML. I've been trying to figure out the syntax for functions, but all of the tutorials on the web site seem to be written for people who already have programming experience rather than in plain English that I might actually be able to understand. Can someone help me out?

What I need to know: (1) Sometimes they use

int function()

and sometimes they use

void function()

As I understand it, you use int if you are returning something, but I don't know what that means. I thought int was for variable declarations.

(2) I'm not sure what's supposed to go in the parentheses. I've seen various numbers and also the word "void," so what do each of those things do? Is void the same as not putting anything there? The parentheses define parameters, right? But what parameters are you defining there?

I may have other questions about other things later, but these are the two bugging me at the moment. I've read through Getting Started With Arduino, but he doesn't do a very good job of explaining all the code there, either...

In plain English, the part in front of the function name is the type of value you expect to get back, and the part in the parentheses is the type of value you send in. Void is just another way of saying nothing, so void function(void) doesn't expect any input and will not send anything back. Just a blank () implies (void).
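To make that concrete, here is a minimal sketch in plain C++ (no Arduino calls, so it can be tried on a desktop compiler). The names addOne, countBlink and blinkCount are made up for illustration and are not from any tutorial.

```cpp
// "int" in front: this function hands a whole number back to whoever called it.
// "int number" in the parentheses: the caller must send one whole number in.
int addOne(int number) {
  return number + 1;
}

int blinkCount = 0;  // hypothetical variable, only here so the void function has a job

// "void" in front: this function does a job but hands nothing back.
// Empty parentheses mean it takes no input; writing (void) would mean the same.
void countBlink() {
  blinkCount = blinkCount + 1;
}
```

So you would write something like int n = addOne(4); to catch the value that comes back, but just countBlink(); on its own line, because there is nothing to catch.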

This is probably the best starting reference http://www.cplusplus.com/doc/tutorial/

Ugh, that web site scared me away on the first page about functions... it's worse than here at giving me plain English! Let me go through an example code from the arduino.cc tutorial on functions and you can tell me if I'm on the right track. I still don't get what you mean by sending and receiving (where am I sending to and receiving from?) or why you would use int over a different type. Can you provide some examples of why I would use int, void, or a different type?

Anyway, here's the tutorial code. Hopefully I understand at least some of it.

void setup(){
  Serial.begin(9600);
}

void loop(){
  int i = 2;
  int j = 3;
  int k;

  k = myMultiplyFunction(i, j); // k now contains 6
  Serial.println(k);
  delay(500);
}

int myMultiplyFunction(int x, int y){
  int result;
  result = x * y;
  return result;
}

So in the setup we are opening up the serial port. In the loop, we are establishing some variables, then calling on the myMultiplyFunction function, then printing the result of that function to the serial port (which is useful because a computer could pick that up, right?). In the final part, we are establishing the function. int is used because the function will produce an int variable. Inside the parentheses are the parameters of the function, which means that, when we call on it in the loop...

k = myMultiplyFunction(i, j);

...we are saying that the variables i and j should be used as the values of x and y for myMultiplyFunction. In this case, i=2 and j=3. What I don't understand is why we declare the variables (x and y) during myMultiplyFunction instead of earlier when we declare all the other variables. I also don't understand why we declare "result" as part of the function instead of earlier. At any rate, the loop is now running myMultiplyFunction with x=2 and y=3, thus result=6, as established in the function. I'm not exactly clear on why we need the command "return" at the end. Does it just establish that the value for myMultiplyFunction should be whatever the value of result is? As a final question, even though result, x, and y are established during myMultiplyFunction, can they still be called upon in other functions? For instance, could I put

myMultiplyFunction(i, j);
Serial.println(result);

instead of

k = myMultiplyFunction(i, j);
Serial.println(k);

?

The missing piece of the puzzle is scope. This means that different parts of your program can "see" different variables. For example, the variables 'x' and 'y' declared as part of the function definition for 'myMultiplyFunction' can only be referenced inside myMultiplyFunction. If you try to reference x, y, or result from outside myMultiplyFunction, you will get a compile error, because they are not in scope outside that function. In fact, you can define 'x', 'y', and 'result' in the loop() function, but they will be different, completely separate variables.

Similarly, you can't reference 'i', 'j', or 'k' defined in loop() from within myMultiplyFunction(). They are not in scope.

That's why you need to use the 'return' statement -- it sets the value for the function, and then returns. By the way, you wouldn't say that a function produces an int variable. You would say that the function returns an integer value.
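Here is a small sketch of the scope rules described above, in plain C++ rather than a full Arduino sketch. The function callerExample is a made-up stand-in for loop().

```cpp
int myMultiplyFunction(int x, int y) {
  int result = x * y;  // x, y and result exist only inside these braces
  return result;       // hand the value back to the caller
}

// Hypothetical stand-in for loop(), so the example works off the board.
int callerExample() {
  int i = 2;
  int j = 3;
  int k = myMultiplyFunction(i, j);  // the returned value is copied into k
  // int bad = result;  // would not compile: 'result' is not visible here
  return k;
}
```

The commented-out line is the Serial.println(result) idea from the question: the name result simply does not exist outside myMultiplyFunction, which is why the value has to travel back through return.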

If a function is declared as void:

void voidfunc()
{
return;
}

that means that the function does not return a value. In other languages, it would be called a procedure or a subroutine. If you try to use the value of a void function, like this: int i = voidfunc(); you will get a compile error.

You can still use return in a void function, but you aren't allowed to give it a value. If you don't have a return in a void function, it will return when it hits the closing brace ( } ) for the function. If you hit the closing brace in a non-void function, without a return, there's no telling what value it will return -- it's undefined.
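As a rough illustration of return inside a void function, here is a sketch in plain C++. The names storeReading and lastReading are invented for the example.

```cpp
int lastReading = 0;  // hypothetical variable, only so the effect is visible

void storeReading(int value) {
  if (value < 0) {
    return;  // leave early; a void function may not put a value after 'return'
  }
  lastReading = value;
}  // reaching this closing brace returns to the caller as well
```

Calling storeReading(7) stores 7, while storeReading(-3) bails out at the early return and leaves lastReading alone.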

Another important concept is that function call parameters are 'by value'. This means that when you say: myMultiplyFunction(i, j); the VALUES in the 'i' and 'j' variables are copied into the 'x' and 'y' variables inside the function. Changing 'x' or 'y' in myMultiplyFunction will have no effect on the 'i' or 'j' variables back in loop().
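A quick way to convince yourself of the 'by value' behaviour is a sketch like this one (plain C++; doubleIt and callerKeepsItsValue are made-up names):

```cpp
int doubleIt(int x) {
  x = x * 2;  // changes only the function's own copy of the number
  return x;
}

// Hypothetical caller: checks that its own variable was left untouched.
bool callerKeepsItsValue() {
  int i = 5;
  int doubled = doubleIt(i);  // the value 5 is copied into x
  return i == 5 && doubled == 10;
}
```

Even though doubleIt overwrites x, the caller's i is still 5 afterwards, because x was only ever a copy.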

I hope this answers some of your questions.

Regards,

-Mike

Yes, I think all of them are answered now! So you declare the x and y variables when you declare the function because otherwise the variables wouldn't exist yet, correct? But you could declare them outside of the functions (up top where you'd normally put the #define lines) and have them be accessible by all the functions, right? In which case, couldn't I have something more like this?

int x;
int y;
int result;

void setup(){
  Serial.begin(9600);
}

void loop(){
  int i = 2;
  int j = 3;

  myMultiplyFunction(i, j);
  Serial.println(result);
  delay(500);
}

int myMultiplyFunction(x, y){
  result = x * y;
  return result;
}

Or would that cause other problems?

Just to make sure I'm clear on the other part, by using "int" instead of "void" at the beginning of the function declaration (am I using the right terminology?), I am basically saying that whenever we call on the function elsewhere, it will be basically like using a variable of type "int." In this case, the variable result and the function myMultiplyFunction have the same value. And finally, the word "return" is just used to specify what the value for the function will be if the function is not a void function.

If you declare 'x' and 'y' outside of any function, they are called 'global' variables. That means that any function in that file can access them. (You can also use an 'extern' statement to allow functions in other files to access them.)

Global variables can be very handy, but they also have pitfalls, and limiting the use of global variables is considered a good programming practice. The problem is that if you are accessing the global variables from multiple places, it's hard to keep up with who is doing what.

In the example you gave, the parameter list still needs types, (int x, int y), before it will compile. Once it has them, when myMultiplyFunction() references 'x' and 'y', it will still be accessing the local copies created in the function definition. That's because if you have global and local variables of the same name, the local copies are used. Having global and local variables of the same name is considered a poor programming practice because of the potential for confusion, both for people reading the program and people writing the program!
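Here is a small sketch of that "local wins over global" rule, in plain C++. The starting values 100 and 200 and the function multiplyGlobals are made up purely to show which x and y get used.

```cpp
int x = 100;  // global x
int y = 200;  // global y

int myMultiplyFunction(int x, int y) {
  // Inside here, x and y mean the parameters, not the globals above.
  return x * y;
}

// Hypothetical second function with no x or y of its own.
int multiplyGlobals() {
  // Nothing local is called x or y here, so these are the globals.
  return x * y;
}
```

Calling myMultiplyFunction(2, 3) multiplies 2 by 3 and never touches the globals, while multiplyGlobals() multiplies 100 by 200.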

The 'return' statement can be used to set the return value of the function, but it also causes the function to terminate. It is possible to have more than one return statement in a function:

int myfunc(int var1,int var2)
{
    if (var1 > 100)
      return var1;
    return var2;
}

If var1 is greater than 100, the value of myfunc() is var1 and the function returns. It never gets to the 'return var2;' statement. If var1 is less than or equal to 100, the function returns the value of var2.

Regards,

-Mike

I'm a little confused as to why it would still be using local copies of x and y. I thought they were only created locally before because the function said (int x, int y), whereas now it only references the names of the variables, which I thought would just tell it to use the pre-existing variables. Are you saying that it creates separate variables that are just not defined as any particular type in my example? If so, is that what always happens within the parentheses? I guess I'm still a little confused about those parentheses...

Would these be a correct usage of the "extern" statement:

void loop() {
  extern int i = 2;
}

If not, could you provide a correct example? While I'm at it, is "i=2" the same thing as "i = 2"? I've been wondering why everyone uses the spaces when it would be quicker not to use them.

The curly braces define the scope of a variable. A variable has a lifetime that starts with a { and ends at the matching }.

Variables in a function definition statement are populated when the function is called, and exist for the life of that call, but no longer.

This happens because data is pushed onto a stack when the call is made. Then, a jump to the function code is made, and the values are popped off the stack and placed in the appropriate variables. When the function ends, a branch to the return location is made, and the stack is cleared. All local variables (local to that function) are destroyed at that time.

Data passed to a function can be passed two ways - by value and by reference. Typically, the data is passed by value. A copy of the contents of a variable is pushed on the stack, for the function to pop.

The other way is by reference. The address of the variable is pushed onto the stack, for the function to pop.

Passing by value is cleaner. The function can affect the caller only through the value it returns.

Passing by reference is faster, since no copy of the data is made, but the risk is that the called function can change variables owned by the caller.

Sometimes this is desired, so pointers can be passed, or the & operator can be used.
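The three ways of handing a variable to a function can be sketched side by side like this (plain C++; the function names are invented for the example):

```cpp
// By value: n is a copy, so the caller's variable is untouched.
void setToZeroByValue(int n) {
  n = 0;
}

// By reference (the & in the parameter list): n IS the caller's variable.
void setToZeroByReference(int &n) {
  n = 0;
}

// By pointer: n holds the address of the caller's variable,
// and *n means "the variable at that address".
void setToZeroByPointer(int *n) {
  *n = 0;
}
```

With int a = 5; the call setToZeroByValue(a) leaves a at 5, setToZeroByReference(a) changes it to 0, and the pointer version is called as setToZeroByPointer(&a), where & means "the address of a".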

An extern statement is used when a variable is defined in one file, but used in another. It tells the compiler that a global variable with that name and type exists somewhere else. There is little reason for using extern in an Arduino sketch.
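For what it's worth, here is roughly what extern looks like in use. Normally the two halves would live in two different files; they are squeezed into one block here only so the example stands alone. The names sharedCounter and readSharedCounter are made up.

```cpp
// What the "other" file would contain: a declaration only.
// It creates no storage; it just promises the variable exists somewhere.
extern int sharedCounter;

int readSharedCounter() {
  return sharedCounter;  // allowed because of the promise above
}

// What the "owning" file would contain: the one real definition.
int sharedCounter = 42;
```

Note that the extern line sits outside any function and has no "= value" on it. Writing extern int i = 2; inside loop() is not accepted, since an extern line inside a function cannot also give the variable a starting value.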

White spaces are used to make code easier to read by humans. The compiler ignores all white space. The statement "i=2;" and the statement "i = 2;" are equivalent, but the second form is easier to read.

Atleastinmyopinionitis.

I think the only part of that I understood was that spaces don't matter (although, does it really ignore all white space? For instance, is voidmyfunction() the same as void myfunction()?). The rest was a lot of techno jargon that makes no sense to me. Not trying to be obtuse, I'm just saying I'm still very new to all this and I really need explanations that involve as little programming vocabulary as possible. The more you can define by way of example or simile, the better.

For instance, can you show me how, in my example from before, why myMultiplyFunction(x,y) would still call local variables, even when they are declared earlier? I don't understand your explanation at all, or at least it doesn't seem to answer my question.

Is my usage of extern correct? I don't know what you mean by "used in another file." I understand that it's not that useful because you can just declare global values at the top, but I'd still like to know how it works.

Thanks again for all the help so far. I know a lot of programmers and the like hate explaining things to newbies because everyone should just be able to look things up on Google, but I tried that and found that it wasn't any good. I need real people to help me learn.

is voidmyfunction() the same as void myfunction()?

No; although the compiler will ignore whitespace, it can't ignore lack of whitespace. "void" is a 'C' keyword, meaning it is a reserved part of the 'C' language. It cannot be used to name a function or variable. However "voidmyfunction" is a perfectly allowable function name, so if you had a line like:

voidmyfunction (int x, int y);

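you would have written, in old-style C, the prototype for a function with the implicit default type "int". (C++, which is what Arduino sketches are compiled as, does not allow a missing return type and reports an error instead.)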

Sorry about the jargon, but it is pretty necessary (as in all technical discourse) to avoid ambiguity.

Atleastinmyopinionitis

Which is an inflammation of the atleastinmyopinion.

For instance, can you show me how, in my example from before, why myMultiplyFunction(x,y) would still call local variables, even when they are declared earlier? I don't understand your explanation at all, or at least it doesn't seem to answer my question.

Suppose I called you on the phone, and asked you to multiply two numbers for me. I'd need to tell you what those numbers were. You'd probably write them down, even though you wouldn't necessarily give them names.

Done writing? OK, you just made a local copy. Cross out the first number and make it 32. Doesn't change the number I have.

So, multiply the two numbers, and tell me the result. Now I hang up, and, as far as I'm concerned, you no longer exist. I can't see what you've written down as any of the local variables, and you can't see (or influence) what I did with the results or with the numbers I asked you to multiply.

Now, suppose we lived in the same city. I ask you to head over to 47th Street, and find the third house on the left, and the 4th house on the right. Multiply the numbers together, and spray paint the result on the billboard at the end of the road. I'll come by later to get the result.

Now, we're using pass-by-reference, since I'm not actually telling you what the numbers are. Rather, I'm telling you where to find them.

So you go there, and you find that the numbers are 1,234,567 and 8,456,367,543. You're feeling lazy, and decide to change the number on the house on the left to 2, and the number on the house on the right to 3.

You multiply them, and record 6 for me to find later.

Now, someone else comes along, and decides that the numbers should be 27 and 13.

I come along now, and see that you have multiplied 27 and 13, and gotten 6.

See why local variables are sometimes better than global?
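The 47th Street story can be written out as a sketch like the one below (plain C++). The names houseLeft, houseRight, billboard, lazyMultiply and someoneElse are stand-ins invented to match the story.

```cpp
// Globals standing in for the two house numbers and the billboard.
long long houseLeft = 1234567;
long long houseRight = 8456367543LL;
long long billboard = 0;

// The lazy helper: changes the house numbers, then records the product.
void lazyMultiply() {
  houseLeft = 2;
  houseRight = 3;
  billboard = houseLeft * houseRight;
}

// Anyone can repaint the houses, because they are global.
void someoneElse() {
  houseLeft = 27;
  houseRight = 13;
}
```

After lazyMultiply() and then someoneElse(), the billboard still says 6 while the houses say 27 and 13. Nothing in the program is "wrong" on any single line, which is what makes this kind of bug hard to track down.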

No technical jargon was inflamed in the making of this post.

Thank you savitch for the link, I solved my problem from there!

I'm new to Arduino also, and my C is long forgotten.

I was trying to send multiple variables into the function and get them out also.

Thanks.

Ok, I think I get it. So it sounds like some spaces matter, but others do not... I assume that the difference is that you want to have spaces between commands and variables and the like, but they aren't needed after a comma or before or after math operators. Does that sound right?

As for the other part, it sounds like you are saying that, even though I created the variables at the top to make them global, when I call myMultiplyFunction, it is creating new local copies. Is that right? If so, can the same be said if I were to create variables i and j at the top and reference them in the loop function (or any other function)? I think I understand the basic difference between them, I'm just confused about when and why local copies of global variables are made.

Thanks for breaking it down into an example like that for me. It really does make it easier to see what's going on. Hopefully Groove also sees how it is entirely possible to explain without the jargon, though obviously it will be easier to explain with the jargon once I understand it. Programming is like learning a language for the first time. What things qualify as trees? Do I live in a house or a duplex? That's why I need explanations.

Scope is probably one of the hardest concepts to grasp initially, because analogies have to be stretched to relate it to the real world. You almost have to imagine the { } braces as enclosing a little bit of your program's "world", and what your program can "see" beyond the enclosure is determined by the rules of scoping.

However, scope is a really important concept, because it can: a) save you a lot of work when you get it right, and b) cause you a lot of work (mostly debugging) when you get it wrong.
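One way to picture the braces as enclosures is a sketch like this (plain C++; total, scopeDemo, outer and inner are made-up names):

```cpp
int total = 0;  // declared outside all braces: visible everywhere below

int scopeDemo() {
  int outer = 1;            // visible until scopeDemo's closing brace
  {
    int inner = 10;         // visible only inside this inner pair of braces
    total = outer + inner;  // code inside can see outward
  }
  // 'inner' no longer exists here; using it on this line would not compile
  return total;
}
```

Code inside a pair of braces can look outward at anything declared in the enclosing braces, but nothing outside can look in.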