Don't Cross The Streams (FP scientific calculator serial co-processor)

OK, I'm sorry about the title, but I'll blame it on too much coffee, the real, non-decaf stuff. :astonished:

Most people probably think of a co-processor as one of those math chips, like the 8087, that were plugged into the IBM PC and its clones back in the dark ages. They allowed coprocessor-aware software such as Lotus 1-2-3 to calculate faster, because software-only calculations were soooo slow. But, essentially, a co-processor can be any processor that is programmed for a somewhat dedicated function: math, I/O, or even scientific calculations. I wrote this sketch for two reasons: first, to learn more about using Streams and the many functions available that help minimize the code I need to write; and second, because I wanted to experiment with offloading floating point from an integer-only uC rather than dealing with integer arithmetic and then scaling to decimal.

The goal: An Arduino sketch to perform basic math and some trigonometric functions using the single precision FP library, 4-bytes. The Arduino must accept keyboard input on an element by element basis AND must accept an entire calculation sequence as a stream with flexible use of delimiters. The Sketch should be flexible in output format so that I could easily test (verbose) and should simply stream the answer out the serial port otherwise.

This is NOT a tutorial on correct use of String or Print objects. This is NOT a tutorial on parsing commands from an input stream, although the sketch does perform a sequential validation and extraction of tokens. This sketch could be written better, but a lot of duplication in echoing status information to the screen in interactive mode was desired.

Please accept this sketch as a fun exercise. It is public domain stuff, so bend it, shake it, hack it... but please post enhancements for all. I have not implemented a second uC to use as a command chip, but I am hopeful to complete that effort in a week or so. If this concept works adequately, anyone should be able to host a GPS and off-load the calculations of bearing and distance to the co-processor. With 328P chips being $2 in quantities of 25, the idea of a very cheap calculator that does not impact the main uC should be workable. One idea that is likely to need implementing is an output pin on the Arduino to signify that an answer is ready. This would allow the host uC to use an interrupt routine to snatch the returned RS232 answer stream without crudely waiting on an answer.

Test environment: 328 Nano hosted via USB
Have fun. Due to size, the FULL CODE is an attachment... sorry.
If anyone would like to recommend a decent and tested double-precision library for FP, I believe there is enough Free RAM to provide for such an exercise.

Ray

Commands (NOT case sensitive):
"DIV","MUL","ADD","SUB","Y^X","LOG","NLG","10X","1/X","e^X",
"SQR","X^2","SIN","COS","TAN","ASN","ACS","ATN","DEG","RAD",
"DEC"
**Note:** DEC sets the number of decimals output, from 0 to 9
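For readers who want to see how a case-insensitive command match might work, here is a minimal sketch in plain C++ (it compiles on a PC, not just on an Arduino). The table is copied from the list above; the helper names `sameToken` and `findCommand` are made up for illustration and are not taken from the attached sketch. The returned index corresponds to the "Found at location" number shown in verbose mode.

```cpp
#include <cassert>
#include <cctype>

// Command table copied from the list above. The array index is the
// "location" reported in verbose mode (SIN is at location 12).
static const char* const kCommands[] = {
  "DIV","MUL","ADD","SUB","Y^X","LOG","NLG","10X","1/X","e^X",
  "SQR","X^2","SIN","COS","TAN","ASN","ACS","ATN","DEG","RAD",
  "DEC"};
static const int kCommandCount = sizeof(kCommands) / sizeof(kCommands[0]);

// Hypothetical helper: compare two tokens, ignoring letter case.
static bool sameToken(const char* a, const char* b) {
  for (; *a && *b; ++a, ++b) {
    if (std::toupper(static_cast<unsigned char>(*a)) !=
        std::toupper(static_cast<unsigned char>(*b))) {
      return false;
    }
  }
  return *a == *b;  // true only if both strings ended together
}

// Hypothetical helper: return the table index of a token, or -1 if unknown.
int findCommand(const char* token) {
  assert(token != nullptr);
  for (int i = 0; i < kCommandCount; ++i) {
    if (sameToken(token, kCommands[i])) return i;
  }
  return -1;
}
```

With this, `findCommand("sin")` and `findCommand("SIN")` both give the same index, and an unrecognised token gives -1 so the caller can decide how to report it.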

Examples: (all RS232 serial to PC over USB using Arduino terminal)
Input: sin 45
Output:
Enter Instruction: SIN Found at location 12
Prompting for X: 45.00 D-->R = 0.79 Sin(X) = 0.7071068

Milliseconds = 18 Free RAM = 1452

Input: add 2.2 3.0 sub 3.1 4.5 mul 113 3.1 div 355 113
Output:
Enter Instruction: ADD Found at location 2
Enter first number secoond number: 2.20 3.00
a + b = 5.1999998

Milliseconds = 39 Free RAM = 1452

Enter Instruction: SUB Found at location 3
Enter minuend subtrahend: 3.10 4.50
a - b = -1.3999998

Milliseconds = 73 Free RAM = 1452

Enter Instruction: MUL Found at location 1
Enter multiplicand multiplier: 113.00 3.10
a * b = 350.3000183

Milliseconds = 82 Free RAM = 1452

Enter Instruction: DIV Found at location 0
Enter dividend divisor: 355.00 113.00
a / b = 3.1415929

Milliseconds = 75 Free RAM = 1452

Compiled with verbose mode off (0):
Input: add 2.2 3.0 sub 3.1 4.5 mul 113 3.1 div 355 113
Output:
5.1999998
-1.3999998
350.3000183
3.1415929

Input: dec 3 add 2.2 3.0 sub 3.1 4.5 mul 113 3.1 div 355 113
Output:
5.200
-1.400
350.300
3.142
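The streamed examples above can be mimicked on a PC for testing. The following is only a rough sketch, assuming standard C++ streams in place of the Arduino Serial functions; the function name `evaluateStream` and the choice of delimiters (spaces, commas, semicolons) are assumptions for illustration. It handles just the four two-operand commands and returns one single-precision result per command.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical host-side helper: evaluates a stream of operation-first
// commands (ADD, SUB, MUL, DIV), each followed by two numbers.
// Spaces, commas and semicolons are treated as delimiters (an assumption).
std::vector<float> evaluateStream(std::string input) {
  for (char& c : input) {
    if (c == ',' || c == ';') {
      c = ' ';
    } else {
      c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
  }
  std::istringstream in(input);
  std::vector<float> results;
  std::string cmd;
  float a = 0.0f, b = 0.0f;
  while (in >> cmd) {
    // Stop at the first command this sketch does not know.
    if (cmd != "ADD" && cmd != "SUB" && cmd != "MUL" && cmd != "DIV") break;
    // Stop if an operand is missing or is not a number.
    if (!(in >> a >> b)) break;
    if (cmd == "ADD")      results.push_back(a + b);
    else if (cmd == "SUB") results.push_back(a - b);
    else if (cmd == "MUL") results.push_back(a * b);
    else                   results.push_back(a / b);
  }
  return results;
}
```

Because the operation comes first, the code knows how many operands to pull from the stream before it reads them, which is what keeps the loop this short.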

Calculator.ino (10.3 KB)

Does it support math coprocessor chips?

eg. https://solarbotics.com/product/17386/

The uM-FPU V3.1 chip supports 32-bit IEEE 754 compatible floating point and 32-bit integer operations.

Therefore, in a very limited way, it replaces the FP coprocessor with the AVR equivalent IEEE 32-bit lib.

If you download and look at the code you can see that you can customize operation verbs very easily to create any new command or subroutine.

It's just a sample framework written as a calculator. You can create a verb, say HD1, to turn pin D1 High and another verb, say LD1, to turn it low.... and without further programming, a second uC can control digital pin 1. Or... create one verb called D1? and send a 1 or 0 in the stream to parse with Serial.parseInt ...

But to answer the original question, I do not see why you could not offload to a math coprocessor.... surely would run up the co$t for a single-precision FP solution.

Ray

idea: Given enough 328's you can build a physical (rs232 based) Sieve of Eratosthenes

I've got dual '328 boards with plenty of pins for SPI, I2C, or Serial connections between chips if anyone is interested:
http://www.crossroadsfencing.com/BobuinoRev17/
$5 for bare boards mailed in the US

mrburnette:
Have fun. Due to size, the FULL CODE is an attachment... sorry.

Your .zip file is empty :slight_smile:

tronixstuff:
Your .zip file is empty :slight_smile:

Let's blame Vista for that one... uploaded again. Thanks!

Ray

UPDATE: Checked OK on download. HOWEVER, this .ino file has verbose set to 0 ... set to 1 to enable verbose mode.

mrburnette:
But to answer the original question, I do not see why you could not offload to a math coprocessor.... surely would run up the co$t for a single-precision FP solution.

There's a double precision version of the chip. That would give whole new capabilities to the Arduino (double precision math isn't supported by the Arduino libc).

http://www.micromegacorp.com/umfpu64.html

fungus:

mrburnette:
But to answer the original question, I do not see why you could not offload to a math coprocessor.... surely would run up the co$t for a single-precision FP solution.

There's a double precision version of the chip. That would give whole new capabilities to the Arduino (double precision math isn't supported by the Arduino libc).

Micromega: uM-FPU64

Yes. You are correct.
You bring up an interesting scenario. My idea of doing this little project was to move memory hungry FP library routines to a separate chip and to implement a simple scientific calculator that was software-extensible to allow new functions to be easily created. This bloat in calculations and libraries would not be "seen" by the main uC since a standard interface was being used (serial, I2C, etc.) and the code on the hosting uC need not be overly complex to pass intermediate data to the off-load chip and receive the results in calculations. However, if a 3rd chip (dedicated math coprocessor) was added to the mix, the primary uC would still not "know" anything about it... only the 2nd Arduino chip would be more complex in code and would manage handling the double-precision handoff. This makes for an expensive and somewhat more complex interface, but does certainly warrant consideration in some cases.

Rather, I am hopeful that a compatible 64-bit FP library can be found and integrated into the Arduino coprocessor. This would be the minimum cost, minimum complexity for the primary uC.

Ray

I do wonder about whether you really want to do this, other than the thrill of getting it to work. The problem with all co-processors is you lose a lot of the performance when you have to transfer data to/from the co-processor. So you tend to have to move more of the data from the main processor to the co-processor, and do more of the work in the co-processor, but often times these co-processors have limited memory spaces.

If you are doing significant floating point calculations (or occasional fp calculations, and the size of the emulation routines is too large), it is time to change to a processor with floating point instructions built-in.

MichaelMeissner:
I do wonder about whether you really want to do this <...>

**Agreed.**

I wrote this sketch for two reasons: first, to learn more about using Streams and the many functions available that help minimize the code I need to write; and second, because I wanted to experiment with offloading floating point from an integer-only uC rather than dealing with integer arithmetic and then scaling to decimal.

  • I feel point #1 is obvious and was well served
  • Point #2 probably is obvious if I explain that the integer-only uC is a PICAXE

Plus, while certainly not rocket science, the little sketch provides the foundation to play around and try other approaches to the ones I used. Having a working sketch as a foundation to investigate changes is helpful (I feel) to learning... and keeps 'trash questions' off the forum (or provides a convenient link to point lazy minds to when they cannot read online documentation.)

As to the reference to "lose a lot of the performance when you have to transfer data to/from the co-processor", my experience is that "performance" is a relative term. The biggest obstacle to working with an UNO is the 2K of RAM, and this RAM is significantly impacted by linking in the IEEE float library. Further, every transcendental function takes a hit on the flash storage. There 'may' be cases where off-loading is proper and performance of the main uC is not impacted because it is busy doing other chores and will accept the coprocessor answer in 40 ms or so over a port that is already implemented... moving a few ASCII bytes over serial requires no additional hardware or software in most cases. I'm expecting that the speed could be increased to 38.4K or 57.6K baud without any issues whatsoever, but I have not tried it outside of using the Arduino terminal.

The other, maybe less obvious, use is that the code can be extended to include new commands. For example, perhaps I need to routinely calculate the Hypotenuse of a Rt triangle. Pick a command name, say RTT, increment the variable "operations", add "RTT" to sStack[], implement the code needed in a new "case 21:" , snatch "a" and "b" from the serial stream, do the calculations, ... then pass the results back over the serial port. This is no different than defining the macro to the hardware coprocessor except that only a small serial stub is used in the primary uC... not a bunch of setup stuff.

Performance to some would be just to keep the main Arduino free of unnecessary libraries and RAM impact, to off-load infrequent calculations, and to accept the serial transfer penalties. If a hardware handshake were implemented between the two chips, then the main uC could use an interrupt to snag the results and be busy with other things during the wait.

But, I agree with everything you stated.

Still, a $2 ATmega328P, 16MHz xtal, and a couple of load caps would total less than $3... the performance may not be there but the cost is not there, either. And, you are programming in a standard, already understood, environment.

  • Ray
...
#define operations 22       
...
const char* sStack[ ] = {   // string literals are read-only, so point to them as const
  "DIV","MUL","ADD","SUB","Y^X","LOG","NLG","10X","1/X","e^X",
  "SQR","X^2","SIN","COS","TAN","ASN","ACS","ATN","DEG","RAD",
  "DEC", "RTT"};
...
    case 21:    // RTT calculate the Hypotenuse of a right triangle from sides a and b
            if (verbose) {Serial.print(F("Prompting for rt triangle side a and side b: ")); }
            a = Serial.parseFloat(); if (verbose) { Serial.print(a); Serial.print(F(" ")); }
            b = Serial.parseFloat(); if (verbose) { Serial.print(b); }
            if (verbose) {Serial.print(F(" Hypotenuse = ")); }
            Serial << _FLOAT(sqrt(a*a + b*b), decimals);
            if (verbose) {Serial.println(); }
            break;

Example
Input: rtt 10 10
Output:
Enter Instruction: RTT Found at location 21
Prompting for rt triangle side a and side b: 10.00 10.00 Hypotenuse = 14.1421356

Milliseconds = 46 Free RAM = 1436

Example
Input: dec 3 rtt 10 10
Output:
Enter Instruction: DEC Found at location 20
Prompting for Decimal places 0-10: Decimal Places = 3.00

Milliseconds = 21 Free RAM = 1436

Enter Instruction: RTT Found at location 21
Prompting for rt triangle side a and side b: 10.00 10.00 Hypotenuse = 14.142

Milliseconds = 83 Free RAM = 1436

Calculator.ino (10.9 KB)

This thread started as a fun project of building a floating point calculator that could be "called over RS232" by the main uC. Results would be sent back to the requesting processor via RS232. The idea is that the small investment in an ATmega328 (or other) Arduino on a breadboard would provide an inexpensive (approx. $3) capability by offloading the rather large IEEE floating point library (32-bit) to a second Arduino chip on a board-duino.

However, even with all the code for parsing and calculating a variety of scientific calculations and handshaking with the main uC, there is still some flash left over. Worse, nearly all of the 328's I/O is left over! Clearly, this presents an opportunity to extend the instruction set of the coprocessor to incorporate manipulation of the digital I/O lines (the same can be done with analog readings). As a proof, I extended the current calculator repertoire with a single new instruction, "DPR" (Digital Pin Read), which takes a single integer argument of 2 through 13. The syntax then is: DPR n, where n is 2-13. Attempting to read outside the allowed range will be indicated as an Error if in verbose mode and as a "2" if in processor-to-processor mode.

Another command could be constructed for Digital Pin Write, DPW, and another for Analog Pin Read, APR. As you can see from the snippets of code below, the implementation is very easy. While manipulating a remote uC's state over RS232 may seem completely insane, not every design requires near-instantaneous control; sometimes the external signals only need to be monitored occasionally, and RS232 is a nearly 'free' resource in the Arduino world.

Ray

Here are the changes to implement the new functionality:

#define Err 2
#define operations 23
byte DigPinNo;
const char* sStack[ ] = {   // string literals are read-only, so point to them as const
  "DIV","MUL","ADD","SUB","Y^X","LOG","NLG","10X","1/X","e^X",
  "SQR","X^2","SIN","COS","TAN","ASN","ACS","ATN","DEG","RAD",
  "DEC", "RTT","DPR"};
...
void setup()
  pinMode(3, INPUT);      //  sets the digital pin 3 as input
  pinMode(6, INPUT);      //  sets the digital pin 6 as input
  pinMode(7, INPUT);      //  sets the digital pin 7 as input
  pinMode(8, INPUT);      //  sets the digital pin 8 as input
  pinMode(9, INPUT);      //  sets the digital pin 9 as input
  pinMode(10, INPUT);      // sets the digital pin 10 as input
  pinMode(11, INPUT);      // sets the digital pin 11 as input
  pinMode(12, INPUT);      // sets the digital pin 12 as input
...
    case 22:    //DPR DigitalPin Read # valid 2 - 13  Return 0, 1 for state or 2 for Error
            if (verbose) {Serial.print(F("Prompting for Digital Pin Number 2 - 13: ")); }
            DigPinNo = Serial.parseInt(); 
            if (DigPinNo < 2 || DigPinNo > 13) {    // pin number out of range
              if (verbose) {
                Serial.print(DigPinNo);
                Serial.print(F(" Pin# Error"));
              } else {
                Serial << Err;                      // send error code 2 to the host uC
              }
              break;
            }
            if (verbose) { Serial.print(DigPinNo); }
            if (verbose) {Serial.print(F(" Logic State = ")); }
            Serial << digitalRead(DigPinNo);
            if (verbose) {Serial.println(); }
            break;

Calculator.ino (12.1 KB)

fun project!

After the DPR I also expect the APR (analogPinRead)

Because the APR has noise on the line, I would give it two parameters: APR pin [count] (the latter is optional).
Then it would average count readings to smooth out the noise.
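A possible shape for that averaging step, written as plain C++ so it can be tried on a PC: the function name `averageReadings` is hypothetical, and the reading source is passed in as a callable so that on an Arduino it could wrap `analogRead(pin)` while a test can feed canned values. Treating a missing or zero count as one reading is an assumption about how the optional parameter would behave.

```cpp
#include <cassert>

// Hypothetical helper for an "APR pin [count]" command.
// readFn is any callable that returns one raw reading; on an Arduino it
// could be a small wrapper around analogRead(pin).
template <typename ReadFn>
float averageReadings(ReadFn readFn, unsigned count = 1) {
  if (count == 0) count = 1;  // assumption: no count means a single reading
  unsigned long sum = 0;
  for (unsigned i = 0; i < count; ++i) {
    sum += static_cast<unsigned long>(readFn());
  }
  return static_cast<float>(sum) / static_cast<float>(count);
}
```

Summing into an unsigned long before dividing keeps the per-reading work to integer adds, with a single floating point division at the end.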

Having seen a lot of sketches most of the analogReads are converted directly by map()

That leads to the operand MPA (Map AnalogRead) which has 3 parameters x = MPA(pin, lowerOut, upperOut) as one knows the input is [0..1023]
e.g. "MPA 6 0.0 5.0" converts the analog read directly to 0..5Volt
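As a rough illustration of what MPA could compute: Arduino's built-in map() works in integer arithmetic, so a floating point version is sketched here in plain C++. The name `mapAnalog` is made up, and dividing by 1023 (so that a full-scale reading lands exactly on the upper output) is one possible convention, not something stated in the thread.

```cpp
#include <cassert>

// Hypothetical helper for an "MPA pin lowerOut upperOut" command.
// Scales a raw 10-bit reading (0..1023) linearly onto [lowerOut, upperOut].
float mapAnalog(int raw, float lowerOut, float upperOut) {
  if (raw < 0) raw = 0;        // clamp out-of-range input
  if (raw > 1023) raw = 1023;
  return lowerOut + (upperOut - lowerOut) * static_cast<float>(raw) / 1023.0f;
}
```

For "MPA 6 0.0 5.0" the coprocessor would read pin 6 and send back `mapAnalog(reading, 0.0f, 5.0f)`, so a mid-scale reading of 512 comes back as roughly 2.5.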

There are several other interesting functions to implement:

  • Hyperbolic variations of SIN etc

  • Pythagoras c = sqrt(a^2 + b^2)

  • root of 2nd-degree polynomial x = (-b +- sqrt(b^2 - 4ac)) / (2a) ==> it can have 0, 1 or 2 answers!

  • the option to download a user function that can be called e.g. USR = SIN(COS(x) * y)
    then call USR x y

  • how about the use of memory? Commands MS M+ M- MR MC (store add sub recall clear)
    the use of an index may allow 10 memories e.g. M+ [index]
    and then you want to call SIN M0 to get the sin of what is in memory 0

  • complex math? - http://arduino.cc/forum/index.php?topic=96080.0 -

  • boolean math - AND 011100001010101 1010110010101100
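The quadratic suggestion in the list above is the one case where the number of answers varies, so here is a small sketch of how that might be handled, in plain C++. The function name `solveQuadratic` and the convention of returning the root count alongside the roots are assumptions; only real roots are considered and a is assumed to be non-zero.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: solve a*x^2 + b*x + c = 0 for real roots.
// Returns how many roots were written to roots[] (0, 1 or 2).
// Assumes a != 0 (otherwise the equation is not quadratic).
int solveQuadratic(float a, float b, float c, float roots[2]) {
  assert(a != 0.0f);
  const float disc = b * b - 4.0f * a * c;  // discriminant
  if (disc < 0.0f) return 0;                // no real roots
  if (disc == 0.0f) {                       // one repeated root
    roots[0] = -b / (2.0f * a);
    return 1;
  }
  const float s = std::sqrt(disc);
  roots[0] = (-b - s) / (2.0f * a);
  roots[1] = (-b + s) / (2.0f * a);
  return 2;
}
```

A coprocessor command built on this could send the count first and then that many values, so the host always knows how much to read from the stream.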

keep us informed!

robtillaart:
fun project!
<...>

  • Pythagoras c = sqrt(a^2 + b^2)
  • root of 2nd-degree polynomial x = (-b +- sqrt(b^2 - 4ac)) / (2a) ==> it can have 0, 1 or 2 answers!
  • the option to download a user function that can be called e.g. USR = SIN(COS(x) * y)
    then call USR x y
  • how about the use of memory? Commands MS M+ M- MR MC (store add sub recall clear)
    the use of an index may allow 10 memories e.g. M+ [index]
    and then you want to call SIN M0 to get the sin of what is in memory 0
  • complex math? - Arduino Forum -
  • boolean math - AND 011100001010101 1010110010101100

keep us informed!

The RighT Triangle, RTT, is already implemented, Ex:
rtt 3 5
Enter Instruction: RTT Found at location 21
Prompting for rt triangle side a and side b: 3.00 5.00 Hypotenuse = 5.8309516

The use of "MR+ int value" (where int is the register #) and "MR- int value" would remove the 0-9 limitation imposed by the 3-character command parser! That then presupposes that we implement "MRC int" to clear a register and "MRR int" for Memory Register Recall. In this scenario, only 4 new commands will work across any size FP array that you wish to construct within available memory. A great use of such an implementation would be for statistical functions that would work on the FP array and simply return the results... maybe MAX and MIN, with no arguments required.
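To make the four-command idea a little more concrete, here is a minimal sketch of a register bank in plain C++. Nothing here comes from the attached sketch: the struct name `RegisterBank`, the size of 16 registers, and the choice to report a bad index by returning false are all assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical register bank behind MR+, MR-, MRC and MRR.
// The size (16) is arbitrary; it only has to fit in available RAM.
struct RegisterBank {
  static constexpr int kSize = 16;
  float reg[kSize];

  RegisterBank() { for (int i = 0; i < kSize; ++i) reg[i] = 0.0f; }

  bool valid(int i) const { return i >= 0 && i < kSize; }

  // MR+ int value: add value to a register.
  bool add(int i, float v) { if (!valid(i)) return false; reg[i] += v; return true; }
  // MR- int value: subtract value from a register.
  bool sub(int i, float v) { if (!valid(i)) return false; reg[i] -= v; return true; }
  // MRC int: clear a register.
  bool clear(int i) { if (!valid(i)) return false; reg[i] = 0.0f; return true; }
  // MRR int: recall a register into out.
  bool recall(int i, float& out) const { if (!valid(i)) return false; out = reg[i]; return true; }

  // MAX and MIN need no arguments: they scan the whole array.
  float maxValue() const {
    float m = reg[0];
    for (int i = 1; i < kSize; ++i) if (reg[i] > m) m = reg[i];
    return m;
  }
  float minValue() const {
    float m = reg[0];
    for (int i = 1; i < kSize; ++i) if (reg[i] < m) m = reg[i];
    return m;
  }
};
```

Because the register number travels as an ordinary integer argument, the 3-character command names never have to encode it, which is the point made above.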

I originally built the skeleton code as RPN, Reverse Polish Notation, but quickly realized that performance would suffer by having to save (and prompt) for variables before the Operation was known! Polish Notation, however, is perfect since the operation is immediately known and select code can prompt / pull values from the RS232 buffer. It also makes for an easily understood and easy expansion for others. I'm not a particularly good coder and I am hopeful that perhaps others will take up some of the ideas you suggested around complex math should this project be anything more than just a curiosity. Other than for fun on a rainy day, all of the prompts are non-essential. The recovered flash memory would be ideal for more equations and processing rules. The crude loop that I created from 0 to the end of the array which is held by "operations" probably can be done better (likely, much better.) That said, in the current version, there is over 50% of flash memory available for expansion. And the RAM available is 1400+ Bytes. Surely, some serious programming can be done with those resources. Some ideas for interacting with reading ports is here:

Thanks for the good words and encouragement. The Arduino is a fun and educational platform and I find that direct interaction with a piece of code is quite fun even for this old guy. Perhaps some students will take the code and program their math or physics homework! I just found my old HP-67 notes from my EE college days and found that I had written code and stored on mag-cards such things as: Untuned Primary / Tuned Secondary, Complex 3x3 Matrix, Simple LC filters, Solutions to B^C = A^D, Bessel functions of the first kind, Polar Impedance Calculator, Complex Network Transfer Functions, etc. Obviously the limitation on the 32-bit FP library would make some of this useless, but ... perhaps a better use of internal scaling (!none now!) would offer better results across a limited range.

  • Ray

Hi,

This is a very nice idea :slight_smile:
Is your dual_128 board still available ? in kit form ?
It is a nice startup for me to evaluate a 1x2 Forth Array that can
grow up to a 3x3 one if I can make it work.

Best regards,

Guy from Paris

GuyFortabat:
Hi,

This is a very nice idea :slight_smile:
Is your dual_128 board still available ? in kit form ?
It is a nice startup for me to evaluate a 1x2 Forth Array that can
grow up to a 3x3 one if I can make it work.

Best regards,

Guy from Paris

Probably need to PM Crossroads w/ this inquiry.

Ray

What does "PM Crossroads" mean?
Sorry for my poor understanding :),

Thanks,

Guy

PM = Personal Message (It is the IM button on the left of each message)

CrossRoads is a member of the forum

Ray, ever think about interfacing a scientific calculator chip to Arduino? Back in the mid 70's I had a s100 buss 1k Altair computer and I got hold of a board that did that very thing. worked like a charm.

steinie44:
Ray, ever think about interfacing a scientific calculator chip to Arduino? Back in the mid 70's I had a s100 buss 1k Altair computer and I got hold of a board that did that very thing. worked like a charm.

Yes, I thought about it a few years ago when I was playing with the PICAXE, but the UNO came out and had enough SRAM to do IEEE floating point math and, being a RISC core, it was faster than the PIC. Some of the PICAXE dudes went down the coprocessor route, but I threw in the towel and went full AVR.
Here:
http://www.micromegacorp.com/umfpu-v3.html

I played w/ an S100 Cromemco while working in Research. Pretty awesome. However I went down the less expensive road of 6502 until the 8080.

Ray

PS, this was just a one-off to show it could be done. If I ever get around to it, I will probably implement an RPN-style stack and the HP67 verbs.