Author Topic: Help Choosing Upgrade from ATMEGA328 on UNO for Production Version of PCB  (Read 3710 times)
Newbie
*
Karma: 0
Posts: 23

Hi!

We've finished our prototype using an Arduino UNO and a Roving Networks RN-XV for connectivity.

We are ready to have our design built onto a custom circuit board, but the issue is that our code is almost exactly 32K in size, and we were thinking of giving ourselves more room just in case. Also, it wouldn't hurt if things ran a little faster.

So, I was wondering if anyone had advice for us. If you needed to move up in storage and/or possibly speed, what would the next best step up from an ATMEGA328 be?

We will only be manufacturing about 3,000 boards a year, so I was thinking about even jumping up to the new ARM Cortex-based chip they are using for the Due, but it might not be possible due to other hardware engineering concerns (I'm a software guy).

But there is the 640, the 1280, the 2560, and so on. Any advice is appreciated. Thanks!

Anaheim CA.
Faraday Member
**
Karma: 48
Posts: 2935
...

Why would you be considering a larger or faster processor? And now a Due? Unless your code is plain vanilla, it will have to be rewritten, and what of testing? Again?
A 644 would fill the bill as to larger code space. However, you would still be at the prototype stage until the code is checked out in its final resting place.
Remember that with the Due you are writing for a totally different core, an ARM processor, not our lovable little 8-bitters, and that's a whole different world.
When I did engineering, not so long ago (5 years), I made a prototype board for code that was written for me. My job was the product design and development. I had an engineer who wrote sample code to my spec, which was breadboarded first. It was an iterative process in which the first article from the board house was used to wring out my electronics and the code that drove it. The parts were ordered at that time, and when the boards were done I had two stuffed, one for me and one for the SW engineer. Funny thing, but it is iterative for both the design and the designer. The more I did that process, the better I got at designing.
There is also that I had a bargain with my employer: he stayed in his office and didn't look over my shoulder, and I fixed any mistakes I made. Fortunately we only ran 100 boards at a time for the first 100 or 200 boards, but it did make me aware of my errors because I did this on my own time. YMMV.

Doc
« Last Edit: July 20, 2013, 09:04:07 pm by Docedison »

--> WA7EMS <--
“The solution of every problem is another problem.” -Johann Wolfgang von Goethe
I do answer technical questions PM'd to me with whatever is in my clipboard

Global Moderator
Boston area, metrowest
Brattain Member
*****
Karma: 549
Posts: 27425
Author of "Arduino for Teens". Available for Design & Build services. Now with Unlimited Eagle board sizes!

'1284P.
128K flash, 16K SRAM, 4K eeprom.
http://www.crossroadsfencing.com/BobuinoRev17/

Designing & building electrical circuits for over 25 years. Check out the ATMega1284P based Bobuino and other '328P & '1284P creations & offerings at  www.crossroadsfencing.com/BobuinoRev17.
Arduino for Teens available at Amazon.com.

Global Moderator
Melbourne, Australia
Brattain Member
*****
Karma: 511
Posts: 19351
Lua rocks!

Quote
We will only be manufacturing about 3,000 boards a year so I was thinking about even jumping up to the new Arm Cortex based chip they are using for the Due, but it might not be possible due to other hardware engineering concerns (I'm a software guy)

Personally I wouldn't change processors (to a completely different type) unless you want to do a lot more work and testing. As others have said, other chips in the same line (like the 644 or 128) would give you more RAM and program memory, and still run virtually identical sketches.

http://www.gammon.com.au/electronics

Please post technical questions on the forum - not to me by personal message. Thanks a lot.

Newbie
*
Karma: 0
Posts: 23

Part of my trouble is that, looking at the Atmel page, there are 24 different 64K-flash processors listed in the same family as the 328 on the Uno. Wow!

So, based on the comments posted here, I'm going to look at the 644. I was just really thrown when I saw their list of processors in the same family.

Personally, since they want this thing shipped asap, my thought was to just stick with the 328 we've tested thoroughly with. But they are really pushing to have more program space since we've maxed out the 32K and they may want to add more features in a future firmware upgrade.

I've indicated that this will mean we are going to have to delay producing the final board design until we can test everything again, but they seem to really want to add more program space at this point.

NOTE: Speaking as a software engineer, our original plan was greatly delayed because we were having all kinds of unexplained problems trying to test the software in real-world conditions. As it turned out, the hardware engineer who put together the original prototype circuit board had made lots of mistakes that caused lots of noise on the pins. Once we realized this and got another hardware engineer to come in and find and fix the issues, things worked so much better.

I have a new found appreciation for the importance of good hardware engineers!!!!!!!!!

Global Moderator
Melbourne, Australia
Brattain Member
*****
Karma: 511
Posts: 19351
Lua rocks!

First, you possibly might save space on the Atmega328 by doing some things more efficiently. Without seeing your code it is hard to comment. Second, why? Why if you are making 3000 boards (and presumably shipping them) do you want more space? You aren't going to recall them, are you, and upgrade them all?

If you were using the bootloader you can save the bootloader space (at least 512 bytes) by programming them through ICSP (which you would want to do anyway to save time).
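
To put numbers on that, here is a rough sketch of the flash budget. It assumes the 32 KiB part on the Uno and the 512-byte bootloader figure mentioned above; the names `kFlashBytes`, `kBootloaderBytes`, and `appFlashBytes` are made up for illustration.

```cpp
#include <cassert>  // only needed for host-side assert() checks
#include <cstdint>

// ATmega328 flash size: 32 KiB.
constexpr std::uint32_t kFlashBytes = 32UL * 1024UL;

// Assumption: a 512-byte bootloader, the minimum figure quoted in the post.
constexpr std::uint32_t kBootloaderBytes = 512;

// Flash left for the application, with or without a bootloader installed.
constexpr std::uint32_t appFlashBytes(bool bootloaderInstalled) {
    return bootloaderInstalled ? kFlashBytes - kBootloaderBytes : kFlashBytes;
}

static_assert(appFlashBytes(true) == 32256, "budget with a 512-byte bootloader");
static_assert(appFlashBytes(false) == 32768, "full flash when programmed over ICSP");
```

So dropping the bootloader buys back 512 bytes out of 32768, which is about 1.6% of the part.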

Quote
Also, it wouldn't hurt if things ran a little faster.

Those other boards won't run faster. However your Atmega328 can run at 20 MHz if it isn't already (change the crystal if you are using one). Again, your code might be able to be improved if it seems slow.

Sr. Member
****
Karma: 11
Posts: 358

Maybe there is a way to optimize your code instead of buying a more expensive microcontroller. Would it be possible to post your code? A hint: avoid using the Arduino libraries. They are usually bloated and inefficient anyway and can be replaced with better code. That saves a lot of space.

Newbie
*
Karma: 0
Posts: 23

Thanks for the reply. The thinking is that we might produce a V2 version of the firmware with more features and do a field upgrade of the firmware for those who request the more advanced features some time in the future. We've got lots of software development resources in-house, but very little in the way of hardware engineering resources, so we only want to go through the trouble of designing a custom PCB once.

The code is pretty tight. We've already gone through several times and optimized it quite thoroughly. There are a few places we think we might be able to release a few hundred more bytes, but that's about it.

Good idea on the bootloader though, that will indeed give us another 512 bytes.


Sr. Member
****
Karma: 11
Posts: 358

Are you using the Arduino libraries, e.g. Serial.println()? If so, that is your problem: the Arduino libraries are very bloated, slow, and inefficient.

Global Moderator
Melbourne, Australia
Brattain Member
*****
Karma: 511
Posts: 19351
Lua rocks!

Do you have some proof of that statement?

Some libraries (eg. digitalWrite) can do things  in a less efficient way than ones that know the pins at compile time. Sometimes there is a trade-off of ease of use vs space.

However I don't know that the Arduino libraries are, per se, egregiously slow and inefficient.

Please supply some proofs in the form of an Arduino library compared to some other library that is much faster and less bloated. One that achieves the same things.


Dallas, TX USA
Faraday Member
**
Karma: 70
Posts: 2763

Quote
Do you have some proof of that statement?

YES

Quote
Some libraries (eg. digitalWrite) can do things  in a less efficient way than ones that know the pins at compile time. Sometimes there is a trade-off of ease of use vs space.

Not necessarily. Part of the problem with the digitalWrite() code isn't the flexibility.
There are better ways to implement the exact same interface
that do more at compile time and less at run time.
And there are other much faster ways to deal with the i/o in libraries even when the pins
are not constants and not known at compile time - see below.

From my perspective some of the Arduino core code appears to be written by
folks that are not very experienced with C code, using the C pre-processor to avoid runtime cycles,
or realtime programming.


Quote
However I don't know that the Arduino libraries are, per se, egregiously slow and inefficient.

I laughed out loud, but I believe you are serious.
There are many places that are just plain silly and are VERY slow compared to what can be, and is,
done in other libraries.
I'll offer a few below.


Quote
Please supply some proofs in the form of an Arduino library compared to some other library that is much faster and less bloated. One that achieves the same things.


Ok, so here are a few.
Have a look at the shiftOut() code.
The code can be made substantially faster if it checked the
direction (bit order) first and then dropped down into separate loops.
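
A minimal sketch of that idea, written to compile on a host PC rather than an AVR. The `digitalWrite()` here is a fake that records what a shift register would latch, and `shiftOutSplit`, the pin numbers, and the constant values are assumptions for illustration. This is not the Arduino core code.

```cpp
#include <cassert>  // only needed for host-side assert() checks
#include <cstdint>
#include <vector>

// Host-side stand-ins for the Arduino constants (assumed values, illustration only).
constexpr std::uint8_t LOW = 0, HIGH = 1;
constexpr std::uint8_t LSBFIRST = 0, MSBFIRST = 1;
constexpr std::uint8_t kDataPin = 2, kClockPin = 3;  // hypothetical pin numbers

std::uint8_t g_dataLevel = LOW;           // last level written to the data pin
std::vector<std::uint8_t> g_clockedBits;  // data level seen on each rising clock edge

// Fake digitalWrite(): records what a shift register would latch.
void digitalWrite(std::uint8_t pin, std::uint8_t level) {
    if (pin == kDataPin) {
        g_dataLevel = level;
    } else if (pin == kClockPin && level == HIGH) {
        g_clockedBits.push_back(g_dataLevel);
    }
}

// The bit order is tested once, outside the loops, instead of once per bit.
void shiftOutSplit(std::uint8_t dataPin, std::uint8_t clockPin,
                   std::uint8_t bitOrder, std::uint8_t value) {
    if (bitOrder == LSBFIRST) {
        for (std::uint8_t i = 0; i < 8; ++i) {
            digitalWrite(dataPin, (value & 0x01) ? HIGH : LOW);
            value = static_cast<std::uint8_t>(value >> 1);
            digitalWrite(clockPin, HIGH);
            digitalWrite(clockPin, LOW);
        }
    } else {
        for (std::uint8_t i = 0; i < 8; ++i) {
            digitalWrite(dataPin, (value & 0x80) ? HIGH : LOW);
            value = static_cast<std::uint8_t>(value << 1);
            digitalWrite(clockPin, HIGH);
            digitalWrite(clockPin, LOW);
        }
    }
}
```

The second loop does cost some extra flash, so whether this counts as a win depends on whether speed or size matters more in a given sketch.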

Have a look at the HardwareSerial code.
It needlessly uses unsigned int for the head/tail indexes.
The reason I say this is needless is that even though the code declares them
as volatile, the code will break if the values get larger than a single byte because other
parts of the code do not properly deal with atomicity.
Changing these to uint8_t makes the code faster and saves a few hundred bytes.
Nothing is lost by making this change since the code won't work right if the buffers require
larger than 8-bit values anyway.
(I make this change to every single Arduino release.)
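
For illustration, a ring buffer with single-byte indexes might look roughly like the sketch below. It is not the HardwareSerial source; `RingBuffer`, `ringPut`, `ringGet`, and the 64-byte size are assumptions. The point is that on an 8-bit AVR a one-byte index is read or written in a single access, so the ISR and the main loop cannot see a half-updated value.

```cpp
#include <cassert>  // only needed for host-side assert() checks
#include <cstdint>

// Assumed buffer size: a power of two, at most 256, so every index fits in one byte.
constexpr unsigned kBufferSize = 64;
static_assert(kBufferSize <= 256 && (kBufferSize & (kBufferSize - 1)) == 0,
              "size must be a power of two no larger than 256");

struct RingBuffer {
    volatile std::uint8_t head = 0;  // next slot to write (producer, e.g. the RX ISR)
    volatile std::uint8_t tail = 0;  // next slot to read (consumer, e.g. loop())
    std::uint8_t data[kBufferSize];
};

// Store one byte; returns false when the buffer is full (one slot is kept empty).
bool ringPut(RingBuffer &rb, std::uint8_t byte) {
    const std::uint8_t next =
        static_cast<std::uint8_t>((rb.head + 1u) & (kBufferSize - 1u));
    if (next == rb.tail) {
        return false;
    }
    rb.data[rb.head] = byte;
    rb.head = next;
    return true;
}

// Fetch one byte; returns -1 when the buffer is empty.
int ringGet(RingBuffer &rb) {
    if (rb.head == rb.tail) {
        return -1;
    }
    const std::uint8_t byte = rb.data[rb.tail];
    rb.tail = static_cast<std::uint8_t>((rb.tail + 1u) & (kBufferSize - 1u));
    return byte;
}
```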

Then there is the baud rate calculation. This eats up a large amount of code because of the math.
A small table would be less flexible but would dramatically reduce the code size.
The non-Arduino avr-libc helper routines do all the baud calculations
at compile time instead of run time. It saves many hundreds of bytes to do it this way vs
the way the Arduino code does it.
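
As a rough sketch of the compile-time approach: the function below uses the usual rounded formula for the AVR USART divisor in normal-speed mode, the same arithmetic avr-libc's `<util/setbaud.h>` performs with macros. The names `ubrrFor`, `kCpuHz`, and `kUbrr9600` and the 16 MHz clock are assumptions for illustration.

```cpp
#include <cassert>  // only needed for host-side assert() checks
#include <cstdint>

// USART divisor for normal-speed asynchronous mode (U2X = 0), rounded to nearest:
//   UBRR = (F_CPU + 8 * baud) / (16 * baud) - 1
constexpr std::uint16_t ubrrFor(std::uint32_t cpuHz, std::uint32_t baud) {
    return static_cast<std::uint16_t>((cpuHz + 8UL * baud) / (16UL * baud) - 1UL);
}

// Assumption: a 16 MHz clock, as on the Uno.
constexpr std::uint32_t kCpuHz = 16000000UL;

// Evaluated entirely by the compiler; no division code ends up in flash.
constexpr std::uint16_t kUbrr9600 = ubrrFor(kCpuHz, 9600);
static_assert(kUbrr9600 == 103, "16 MHz / 9600 baud");
```

The catch is the one mentioned above: the baud rate has to be known at build time, so a sketch that picks its rate at run time would need the small table instead.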

Back to digitalWrite(), digitalRead():
As far as digitalWrite() goes there are several implementations that are MUCH faster
when the values are constants. The Teensyduino code, which uses the exact same
API, has the best of all worlds.
It automatically will collapse the operation down to a single instruction when the arguments are constants.
I even wrote a library for constants that looks/works just like the Arduino version
and can even automagically do multi-bit i/o like 8 pins at once if the pins specified are in the proper order.
Here is the link: http://code.google.com/p/mcu-io/
(see the avrio portion of the tree to get the avrio.h header file)
with a call like:
avrio_digitalWrite8pins(p0, p1, p2, p3, p4, p5, p6, p7, byteval);
With the appropriate pins and a constant byte value,
that can set all 8 pins in a single instruction vs around 40+ us.
This will be more than 300 times faster than the Arduino core code to
do the same operation.
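
To show the general shape of the compile-time trick, here is a host-compilable sketch. It is not the Teensyduino or avrio implementation; `g_fakePort`, `writePin`, and `write8Pins` are hypothetical, and a plain variable stands in for a real port register. With the bit number as a template argument the mask is a constant, which is what lets a compiler for AVR reduce the write to a single set-bit or clear-bit instruction.

```cpp
#include <cassert>  // only needed for host-side assert() checks
#include <cstdint>

// Host-side stand-in for an 8-bit output register such as PORTD (assumption).
volatile std::uint8_t g_fakePort = 0;

// Bit number fixed at compile time, so the mask is a constant expression
// and no pin-to-port lookup table is needed at run time.
template <std::uint8_t Bit>
inline void writePin(bool level) {
    static_assert(Bit < 8, "bit index must be 0..7");
    constexpr std::uint8_t mask = static_cast<std::uint8_t>(1u << Bit);
    if (level) {
        g_fakePort = static_cast<std::uint8_t>(g_fakePort | mask);
    } else {
        g_fakePort = static_cast<std::uint8_t>(g_fakePort & ~mask);
    }
}

// Eight pins that map, in order, onto bits 0..7 of one port: a single store.
inline void write8Pins(std::uint8_t value) {
    g_fakePort = value;
}
```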

My biggest beef with the Arduino core code is that it just looks so amateurish.
There are often better and faster ways to implement the code and if there was a bit
better overall design it could be made substantially better and faster.

Just to give another data point, with respect to library code.
fm's new LiquidCrystal library is a direct replacement for the LiquidCrystal
library that ships with the IDE.
The latest work on fm's new LiquidCrystal library
is getting much higher throughput to an lcd (3.5x) over a shift register using a single wire
(yes you read that correctly,  a single Arduino pin) bit banging in s/w
than the standard LiquidCrystal library using 6 pins.
This is a great demonstration, that there are ways to do portable code in libraries that
live within the confines of existing arduino core code APIs, that still allow users to enter pin numbers
in constructors and yet get much higher performance.
The code in the fm's library has high speed shiftout routines that are portable across
PICkit and Arduino and it does not depend on constant pin numbers that are known
at compile time.

And as a performance example,
the stock LiquidCrystal library using 6 Arduino pins in 4 bit mode
can update a 16x2 display around 86 frames per second.
When the code is simply re-factored and optimized, and still using digitalWrite(), that jumps to around 300 frames per second.
With a shift register using 1 wire you can get 318 frames per second.
With a shift register using 2 wires you can get 388 frames per second.
With a shift register using 3 wires you can get 480 frames per second.

While the shift register code doesn't use digitalWrite() it is not using direct port i/o with hard coded constants.
It allows the user to specify the pins within the constructor.
This shows how poor the digitalWrite() and the LiquidCrystal library are compared
to what can be done even when the code is portable across processors.
Just re-writing/optimizing the lcd library code and still using digitalWrite(), the LCD can be driven around 3.5x faster.
And then when using a shift register and fewer Arduino pins and avoiding digitalWrite():
1 pin is 3.7x faster than LiquidCrystal using 6 pins
2 pins is 4.5x faster than LiquidCrystal using 6 pins
3 pins is 5.6x faster than LiquidCrystal using 6 pins.

So yeah, right there is just one example of how "egregiously slow and inefficient"
an Arduino supplied library is compared to what can be done.

There are also inefficiencies in things like the Print class.
Then there are cases where certain classes end up getting linked in
even when not used because of the way ISRs are declared.
gcc has some capabilities that can be used to solve this.
Some libraries end up gobbling up RAM that is never used: because
of the way the code is written, it statically declares RAM for all the instances,
and it ends up being linked in, again because of the way the ISRs are declared/used.

So there are just a few examples off the top of my head.

--- bill

Global Moderator
Melbourne, Australia
Brattain Member
*****
Karma: 511
Posts: 19351
Lua rocks!

Yes, well:

Quote
The code can be made substantially faster if the code checked for direction first then dropped down into separate loops.

Certainly code can be made faster if it is made larger (separate loops). However doesn't that fail the "bloated" test?

Quote
As far as digitalWrite() goes there are several implementations that are MUCH faster when the values are constants.

Absolutely. However the digitalWrite code allows for variables, so this isn't the same.

You need to provide proof of code that is faster and smaller. Not just faster or smaller.

Global Moderator
Melbourne, Australia
Brattain Member
*****
Karma: 511
Posts: 19351
Lua rocks!

Quote
Just re-writing/optimizing the lcd library code and still using digitalWrite(), the LCD can be driven around 3.5x faster
And then when using a shift register and fewer Arduino pins and avoiding digitalWrite():

I wrote to Adafruit a while back offering a version that used the SPI library rather than their bit-banged version. They responded that "few people need fast updates", so I sympathise. Here's the link:

http://forums.adafruit.com/viewtopic.php?f=19&t=19079&start=15

Quote
1) the screen is very small and few people need fast updates


But again, the speed was at the expense of having the flexibility of using any pin you want. It's swings and roundabouts.

Valencia, Spain
Faraday Member
**
Karma: 152
Posts: 5757

If you want to reduce code size, the best way is some judicious use of noinline. The Arduino compiler is very aggressive about inlining things.

I usually define a macro:

#define NOINLINE __attribute__ ((noinline))

Then you just add "NOINLINE" to functions which aren't too speed critical. With a bit of experimentation you can make a huge difference to code size.
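
A small example of how the macro is applied, assuming a GCC or Clang toolchain (the attribute is compiler-specific). The `checksum` helper is hypothetical; any function that is called from several places and is not timing-critical is a candidate.

```cpp
#include <cassert>  // only needed for host-side assert() checks
#include <cstdint>

#define NOINLINE __attribute__ ((noinline))

// Hypothetical helper used from many call sites. Marking it NOINLINE keeps a
// single copy of the body in flash instead of one copy per call site.
NOINLINE std::uint16_t checksum(const std::uint8_t *data, std::uint8_t length) {
    std::uint16_t sum = 0;
    for (std::uint8_t i = 0; i < length; ++i) {
        sum = static_cast<std::uint16_t>(sum + data[i]);
    }
    return sum;
}
```

Comparing the reported sketch size before and after each change is the easiest way to see which functions are worth marking.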

No, I don't answer questions sent in private messages (but I do accept thank-you notes...)

Atlanta, USA
Edison Member
*
Karma: 56
Posts: 1848
AKA: Ray Burne

OMG!!!
Quote
We've finished our prototype using an Arduino UNO ...

Seriously, you need to read about software/hardware development life-cycles!  This is akin to playing with bottle-rockets to learn how to build a sounding rocket.

You code in an environment as close as possible to what you are going to produce.  You test in an environment identical to production.

GeeWhiz... 3K boards a year... you need to get this right.


Ray