Which method uses LESS CPU ticks?

Hi All - This is one for the people that know the inner workings of C and the MCU chips, the OS and the array functionality library.

This is relating to CNC G-code development on a Due but is probably equally relevant to all Arduinos.

We have 2 arrays each with 100 elements or values.

When the first element in array A has been dealt with, it is normally removed by popping it in the sketch.

The sketch then loads the element at position 0 of array B onto the top of array A and pops it out of array B.

Here is the alternative scenario:

Instead of popping the elements out of the arrays, would it not be more efficient to have a pointer integer pointing to the next element to be used in each array and instead of "popping" it, simply increment the "next to use" pointer value by 1 and when it reaches 99, reset it to 0 ?

To know where the next element is to be inserted use another pointer 'next insert above" to mark where the next element is to be placed and when it is placed, increment it by 1 , also returning to 0 after 99.

When the 2 pointer values are only 1 apart, set a bit to 1 ("array full") or 0 ("array empty") depending on which pointer has the higher value.

The question is does the MCU or OS array handler use the method of rippling all the values down the array (or reserved memory location set for use by the array) when a value gets popped or does it use some strategy similar to the "pointer" example?

Obviously simply changing the value of 2 pointer per array or reserved memory block would take far less ticks than the "value ripple" method which would need each of the 99 remaining values to be moved down 1 position in the reserved memory block for the array.

The whole pointer method above will still need some thought and work to get it working bug free but you get the idea.

Our project will probably need every spare tick we can find!

And every bit of program memory - we only need sine and co-sine so we are thinking of stripping everything else out the trigonometry library - things like that.

REALLY "BARE BONES" !!

And we have a 6-pack riding on the answer we get here ;>}

Thanks everyone.

When the first element in array A has been dealt with, it is normally removed by popping it in the sketch.

If it's an array you can't POP or PUSH. After that of course the rest of your post makes no sense!

Mark

What you are describing is called a circular buffer (aka circular queue, cyclic buffer or ring buffer), it's a common programming technique to avoid shifting data in memory and just repositioning a pointer / an index in an array.

That's the way to go, don't shift things in memory

NOBODY can answer your question, as you've asked it. The ONLY way to know which of several approaches to a particular piece of code is more "efficient" is the compile both, profile them, and compare the sizes and performances based on whatever your particular definition of "efficient" is. In general, the compiler, and MCU architecture, have as much, sometimes more, effect on performance than the source code. The same code, run on two different MCUs (i.e. the Dues ARM7 vs the Unos AVR) will give VASTLY different results, even after accounting for the hugely different clock speeds.

Regards,
Ray L.

Our project will probably need every spare tick we can find!

Why? If you really do need to save microseconds on a Due, you're using the wrong processor. Get a Teensy 3.2 or even a 3.6 - both are faster and have more memory than a Due.

we are thinking of stripping everything else out the trigonometry library

Why? If you don't call a function, the linker won't add it to your code.

Pete

MadAubrey:
Instead of popping the elements out of the arrays, would it not be more efficient to have a pointer integer pointing to the next element to be used in each array and instead of "popping" it, simply increment the "next to use" pointer value by 1 and when it reaches 99, reset it to 0 ?

This is called a "circular buffer" and is a very common technique. You can use a pair of array indexes or you can use a pair of pointers.

Just beware of off-by-one bugs: is your index the index of the most recently added item, or the index of the blank space where the next item goes? Note that you can never store the full 100 items in the array, because there's no way to tell the difference (just from looking at the indexes) whether the array has zero or 100 items in it.

(This is mathematically the case. It's a "pigeonhole" proof: there are 100 different relative offsets that your indexes may have, but 101 different possible numbers of items in the array.)

Having said that - if your current method is slow, it may be because you are using a loop to shift your stack down rather than using memcpy().

Even changing an Int to a byte can (does) affect performance and code size in addition to saving a byte of SRAM.

BUT as you have not written the program you are wasting time and effort worrying about performance.

FIRST get it written and debugged then if it is not fast enough ......

16MHz is a fantastic amount of CPU power when to don't have to waste it on MS crap!

Mark

holmes4:
Even changing an Int to a byte can (does) affect performance and code size...

No, that is NOT always true. It is VERY much dependent on the specific CPU. The OP is using a Due, which has a 32-bit ARM Cortex, and with most 32-bit RISC processors using byte instead of int makes NO difference in performance, or code size.
Regards,
Ray L.

Instead of popping the elements out of the arrays, would it not be more efficient to have a pointer integer pointing to the next element to be used in each array and instead of "popping" it, simply increment the "next to use" pointer value by 1 and when it reaches 99, reset it to 0 ?

Yes.
Note that C arrays are much simpler than PHP arrays, so they don't HAVE any built-in operators like push or pop. You'd have to implement that yourself, and it would normally be done just as you said - with some sort of pointer or index.

struct fifo_array {
   int inptr;
   int outptr;
   mytype data[100];
}

with most 32-bit RISC processors using byte instead of int makes NO difference in performance

In some cases, 32bit cpus are slower and use larger code to do bytewide operations, because the compiler has to insert some sort of "convert32bits_to_8bits" instruction every now and then...

I disagree with some of the comments above. Algorithmic cost can be assessed through formal methods (yes some knowledge of target architecture helps and optimizer in compilers are using some of those techniques) but when you compare an algorithm whose complexity is linked to the number of data O(n) where n is not "small' to something that is a small constant, then There is no need to code and compare two methods: shifting 99 elements of 8,16,or 32 bits every time something is pushed or popped is always going to be less efficient than a moving a pointer regardless of the intrinsic architecture (unless you find a processor with 99 basic elements wide bus / memory access / instructions which to my knowledge does not exist. Current top consumer architectures are 64 bits, 8 bytes, some with a few 128 bit (16 byte) vector instructions that are optimized at register level but not for memory manipulation)

Circular buffers is the way to go for what you express. (IMHO)

That sounds like a good route to Shlemiel The Painter's Algorithm

Consider that moving items in an array is expensive. You have to pull one item out of the main memory and store it to one side while you move the next item up into that space. On regular computers, that's actually stupendously expensive because your 3GHz Pentium doesn't have a 3GHz memory bus - the memory is significantly slower than the processor. The processor spends a lot of time waiting for the next item to come from memory. The Arduino is a little different because its memory is all on the one chip and it all runs at the same clock speed. But doing this for 100 items in an array still adds up to a significant amount of time for the Arduino.

But pointers to memory are just one single number. They can be kept in a register. Moving a pointer to point at the next memory location takes just one single clock cycle.

If you're worried about speed then you definitely don't want to have any trig functions anywhere inside your main loop. There's a lot of ways to simulate a trig function from linear interpolation to just simply storing a list of the values you know you're going to need. For most purposes, you can actually tolerate large errors like +/-10% and your sinewave still looks like a sinewave. If that's close enough for you then cosine of all angles up to 25 degrees can be approximated by the single value 1.0 That's a 50-degree range which uses only one storage element and takes just a few clock cycles to return that number. Then remember that sine is just cosine minus 90 degrees, so you probably only need to store one set of lookup values.

J-M-L:
I disagree with some of the comments above. Algorithmic cost can be assessed through formal methods

I think the answer to this question is simple enough not to need "methods" - just a few minutes of thinking about what needs to happen.

Pointers and / or a circular buffer.

...R

Hi everyone and thanks for all the comments.

This question was a theoretical / hypothetical one so the only “code” there is is our “pseudo-code” which is plain old English and just lists the steps that will be needed to get the job done, sometimes with the particular command that should be used.

In this case I had used “array_shift()” which is a PHP command that returns the first value in the array, removes it and moves everything down.

A few hours later I realized that in PHP (for the most part anyway) arrays are dynamic and that “array_shift()” will only work on a dynamic array - here we are talking about a fixed length array.
A dynamic array will probably be very expensive as far as resources are concerned.

Although I have been programming for more than 30 years I have never had the need to learn the “C” language and have just started reading through the language command list and figuring out that, how etc each command does - have not reached the “array” section yet but I will get there while I wait for Banggood to come to the party with our components.

J-M-L
THAT “circular queue / cyclic buffer” is exactly what I was looking for!
Thank You SIR ;>}

RayLivingston
I disagree!
Everybody who replied DID answer my question and more importantly, brought a whole lot of other knowledge my way.
I agree that the “suck it and see” approach is often the best way but making decisions about static vs dynamic arrays is a no-brainer if you do not absolutely have to have a dynamic array.
Your most valued contribution was your comment about “compilers and MCU architecture” playing an important roll in all of this.

Please remember that the last time I used a compiled language in real life was before the public internet existed!
Since then I have been using on the fly interpreted script languages like ASP, PHP, Javascript etc.
I am aware that different MCU’s have different instruction sets but I was always under the impression that a compiler is a compiler is a compiler.
But from your comment I see that I may need to see if one compiles a specific script more efficiently than another.

el_supremo
You make a good point there BUT you need to realize the following:

Aubrey+soldering iron+discrete components=expensive blue smoke

This project started off before the Arduino was developed and at that time I blew up about 30 high spec PIC16 MCU’s before the project eventually got put in a box under the workbench.
The pin-out plug in connectors on the Uno/Due really appeal to me!
Your comment that if a function is not needed then the linker won’t add it is something that I either never knew or had forgotten.

PaulMurrayCbr
You also brought “circular buffer” into the discussion - Thanks to you too!
I agree that the “pointer” section will have to be carefully thought out and debugged and I definitely do not want to go shifting data all over the place!
Once a specific piece of data is in the array, it “should” stay exactly where it is until it has been used and then it gets replaced - period!

holmes4
You raise a very good point as well!
Using bytes as pointers WILL be far more efficient as far as memory usage goes.
However I see that RayLivingston has added a codicil.
Guess I will have to look into the whole situation there too.

westfw
Thanks for that!
I am starting to realize that C does not have a multitude of “standard” functionality and that is probably why it is so powerful.
You can do just about anything if you build the function yourself because all the low level commands are exposed and not hidden away as in some other programming languages.

J-M-L
I think you are correct as far as the “circular buffer” is concerned.
Getting the pointers and the “full/empty” bit working correctly is going to be fun though!

MorganS
Thanks.

Robin2
Thanks to you too. Once I have finished reading through the commands and explanation list I have found I am going to investigate “circular buffers” (if it is not a native of the language that is)

To all of you who commented here please understand that I am NOT one of those “Please do my homework for me” kind of people!

We all do the “copy/paste” thing, they are called libraries!

However this project will be coded from start to finish (with the exception of libraries of course!) by my fingers.

EVERY error or bug will be mine and hopefully I can sort them all out all by myself and learn mountains in the process.

Hopefully I will never be posting an actual piece of code here because it is concepts and methodology that are going to be my biggest problem.

Which way to do something would be better in what circumstances etc.

Once again, thanks to all and everybody have a good one.

Aubrey.

Robin2:
I think the answer to this question is simple enough not to need "methods" - just a few minutes of thinking about what needs to happen.

Pointers and / or a circular buffer.

...R

Yep - which is my initial point as well in #2.

I was reacting to

The ONLY way to know which of several approaches to a particular piece of code is more "efficient" is the compile both, profile them, and compare the sizes and performances based on whatever your particular definition of "efficient" is.

and emphasis on ONLY.

MadAubrey:
To all of you who commented here please understand that I am NOT one of those "Please do my homework for me" kind of people!

we noticed that - hence your posts getting good participation. Right attitude - keep up the good work!

MadAubrey:
Although I have been programming for more than 30 years I have never had the need to learn the "C" language

This is not a C language issue - it is a general programming concept. But I can understand how someone who has always used interpreted languages may not have been exposed to the way things are done "under the hood". That's the beauty of interpreted languages :). It's a pity that they are not practical for a microprocessor.

I suspect that some time (not a lot) studying assembler programming would actually more clearly expose the relative "cost" of moving data in memory versus using a pointer or index to the memory locations. Even C/C++ can obscure what is really happening. If you have a grasp of the activity at the chip level then it is much easier to see what C/C++ is doing and to infer what is likely to be going on in the engine-room of an interpreted language.

...R

it's not just interpreted language actually.

Some of the newer programing languages are offering high level abstractions like Strings, Arrays, collections etc (Have a look at Swift for example) directly as part of the language.

So new comers in this world who are not really studying the underlying computer architecture and consequences of the abstraction won't get exposed much to the cost of doing things in one way or another. That also means they can focus on getting their app done, at the end of the day Mhz and MB or RAM come cheap those days in computers and smartphones... for most of them that will be "good enough".

of course, when you have that background and come to the micro-controller world, this is a different story!

J-M-L:
we noticed that - hence your posts getting good participation. Right attitude - keep up the good work!

Thanks for that! But be warned, sometimes I come up with weird ideas.

Robin2:
This is not a C language issue - it is a general programming concept. But I can understand how someone who has always used interpreted languages may not have been exposed to the way things are done "under the hood". That's the beauty of interpreted languages :). It's a pity that they are not practical for a microprocessor.

You are 100% correct, it is a "general" concept.
But sometimes what works in one language performs poorly in another.
And the overheads of an interpreter are seriously big. And slow.

Robin2:
I suspect that some time (not a lot) studying assembler programming...

I did try assembler/machine code way back in the Sinclair / Commodore era - twice - the first time and the last time - they were both the same time!!! Not for me, my head is too flat!

MadAubrey:
Thanks for that! But be warned, sometimes I come up with weird ideas.

that's OK the world needs more creative thinkers and dreamers :slight_smile: