Matrix library comments

Short version: I wrote a matrix class. Is there anything I could improve?
The library is in Matrix.h.
Example code is in Matrix.ino.

Long version below:

I wanted to perform some matrix operations, mainly for convenience of notation, but the existing MatrixMath library didn't really help with that.
I did use it initially and it worked for what I wanted to do, but 'simple' operations don't look simple. For example, with matrices P, F and Q,
computing P = F * P * F' + Q looks something like:

Matrix.Multiply(*F, *P, 3, 3, 3, *tmpM33_1);
Matrix.Transpose(*F, 3, 3, *tmpM33_2);
Matrix.Multiply(*tmpM33_1, *tmpM33_2, 3, 3, 3, *tmpM33_3);
Matrix.Add(*tmpM33_3, *Q, 3, 3, *P);

(where the tmpM33_N variables are temporary matrices required to hold the intermediate results)

So it didn't really make things any simpler! Plus I found it quite easy to make mistakes, like putting the wrong dimensions in the function parameters. Other mistakes I made included accidentally accessing elements beyond the range of the underlying array.

So I wrote an alternative library which deals with all that, and I believe is relatively memory efficient (by which I mean it does not use dynamic memory allocation).
With this you can do:

P = F * P * F.transpose() + Q;

Or another simple example:

Matrix<float, 3, 4> A;  // Declare a 3 row, 4 column matrix of floats
Matrix<float, 3, 4> B;
Matrix<float, 3, 3> C = A*B.transpose();

More examples in Matrix.ino.
The compiler will also catch errors in +, - and * operations where the matrices are not the correct size.
It took me several days of head scratching to work out that template function definitions for non-member functions need to have the template declaration on the same line as the function declaration (I hope I have used the correct terms in that statement). It was a good learning experience though.
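To illustrate how the compiler can catch size mismatches, here is a minimal sketch of the general technique. It is not the code from Matrix.h: SketchMatrix and its members are names I made up for this example. The dimensions are template parameters, and the non-member operator* shares one parameter (Inner) between the columns of the left operand and the rows of the right operand, so a mismatched product simply has no matching overload. Note that the template header has to directly precede the function it applies to; as far as standard C++ is concerned it may sit on the preceding line.

```cpp
// Hypothetical fixed-size matrix; dimensions are part of the type.
template <typename T, int Rows, int Cols>
class SketchMatrix {
public:
    SketchMatrix() {
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < Cols; ++c)
                data_[r][c] = T();  // zero-initialise
    }

    // Element access (no range check in this sketch).
    T& operator()(int r, int c) { return data_[r][c]; }
    const T& operator()(int r, int c) const { return data_[r][c]; }

    // The transpose has the dimensions swapped in its type.
    SketchMatrix<T, Cols, Rows> transpose() const {
        SketchMatrix<T, Cols, Rows> out;
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < Cols; ++c)
                out(c, r) = data_[r][c];
        return out;
    }

private:
    T data_[Rows][Cols];  // static storage, no dynamic allocation
};

// Non-member operator*. Inner appears in both argument types, so the
// columns of lhs must equal the rows of rhs or the call will not compile.
template <typename T, int Rows, int Inner, int Cols>
SketchMatrix<T, Rows, Cols> operator*(const SketchMatrix<T, Rows, Inner>& lhs,
                                      const SketchMatrix<T, Inner, Cols>& rhs) {
    SketchMatrix<T, Rows, Cols> out;
    for (int r = 0; r < Rows; ++r)
        for (int c = 0; c < Cols; ++c)
            for (int k = 0; k < Inner; ++k)
                out(r, c) += lhs(r, k) * rhs(k, c);
    return out;
}
```

With this sketch, a 3x4 matrix times the transpose of another 3x4 matrix gives a 3x3 result, while multiplying two 3x4 matrices directly is rejected at compile time.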

Other notes:
Does not support matrices of consts
No consideration of sparse matrices
Inversion will probably not work well with non-float types
The failure flag does not get reset after a successful operation
The accessor function operator() includes range checking

Possible changes:
Add a way to take a list of initial values when declared
Explicitly add dimensions as member variables? Could make easier for initial setup.
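One possible way to approach both of these changes is sketched below. This is only an illustration under my own assumptions: InitMatrix, rows, cols and get are hypothetical names, not part of Matrix.h. The constructor takes a reference to a plain 2D array whose size must match the matrix type, which avoids needing std::initializer_list (not available on every Arduino toolchain). The dimensions are exposed as compile-time constants, so they cost no RAM per object.

```cpp
// Hypothetical matrix with an array-initialising constructor.
template <typename T, int Rows, int Cols>
class InitMatrix {
public:
    // Dimensions as compile-time constants (no per-object storage).
    enum { rows = Rows, cols = Cols };

    // Default: all elements zero.
    InitMatrix() {
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < Cols; ++c)
                data_[r][c] = T();
    }

    // Copy initial values from a 2D array. An array with the wrong
    // dimensions does not match the parameter type, so it fails to compile.
    explicit InitMatrix(const T (&init)[Rows][Cols]) {
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < Cols; ++c)
                data_[r][c] = init[r][c];
    }

    T get(int r, int c) const { return data_[r][c]; }

private:
    T data_[Rows][Cols];
};

// Usage sketch:
//   const float init[2][3] = {{1, 2, 3}, {4, 5, 6}};
//   InitMatrix<float, 2, 3> M(init);
```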

Performance:
Most operations in the example file (using copious float operations) take ~7us on the Teensy 3.5 (except inverse which is ~14us).
On the ATmega328P, these operations take between 500us and 800us (except inverse which is ~1400us).

I have questions about some aspects:
It seems inconsistent to have some operators as member functions and some not (e.g. multiply). It also means that access to the underlying data is different since the non-member functions must use the public accessor function. I wasn't sure how to make the non-member functions into friends or include as member functions while retaining their flexibility. Comments?
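For reference, one approach that might address this is sketched here. It assumes nothing about the real Matrix.h; FriendlyMatrix, set and get are hypothetical names. Multiplication becomes a member template that is parameterised only on the column count of the right-hand side, and every instantiation of the class template is declared a friend of every other, so the member can read the private array of a differently sized matrix directly.

```cpp
// Hypothetical matrix where operator* is a member with direct data access.
template <typename T, int Rows, int Cols>
class FriendlyMatrix {
    // All instantiations are friends of each other, so a member of
    // FriendlyMatrix<T, 2, 3> may touch the data of FriendlyMatrix<T, 3, 2>.
    template <typename U, int R, int C> friend class FriendlyMatrix;

public:
    FriendlyMatrix() {
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < Cols; ++c)
                data_[r][c] = T();
    }

    void set(int r, int c, T value) { data_[r][c] = value; }
    T get(int r, int c) const { return data_[r][c]; }

    // Member template: rhs must have exactly Cols rows, but may have any
    // number of columns. The result type is deduced from both operands.
    template <int OtherCols>
    FriendlyMatrix<T, Rows, OtherCols>
    operator*(const FriendlyMatrix<T, Cols, OtherCols>& rhs) const {
        FriendlyMatrix<T, Rows, OtherCols> out;
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < OtherCols; ++c)
                for (int k = 0; k < Cols; ++k)
                    out.data_[r][c] += data_[r][k] * rhs.data_[k][c];
        return out;
    }

private:
    T data_[Rows][Cols];
};
```

The trade-off in this sketch is that the friend declaration is very broad (it also befriends matrices of other element types), which may or may not be acceptable.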

I don't particularly like the way it handles out of range requests, by just returning (a reference to) a dummy/junk variable, but I needed the function to always return a reference to something.
Can you think of a better way to do this?
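Two alternatives that could be considered are sketched below, purely as an illustration. CheckedMatrix, at and ptr are hypothetical names, and static_assert needs C++11 or later. When the indices are known at compile time they can be passed as template arguments and checked by the compiler, so no dummy element is needed. For runtime indices, returning a pointer lets the function signal failure with a null pointer instead of a reference to junk, at the cost of the caller having to check it.

```cpp
// Hypothetical matrix showing two ways to avoid a dummy return value.
template <typename T, int Rows, int Cols>
class CheckedMatrix {
public:
    CheckedMatrix() {
        for (int r = 0; r < Rows; ++r)
            for (int c = 0; c < Cols; ++c)
                data_[r][c] = T();
    }

    // Compile-time indices: an out-of-range access is a compile error.
    template <int R, int C>
    T& at() {
        static_assert(R >= 0 && R < Rows, "row index out of range");
        static_assert(C >= 0 && C < Cols, "column index out of range");
        return data_[R][C];
    }

    // Runtime indices: returns a null pointer when out of range,
    // otherwise a pointer to the requested element.
    T* ptr(int r, int c) {
        if (r < 0 || r >= Rows || c < 0 || c >= Cols) {
            return nullptr;
        }
        return &data_[r][c];
    }

private:
    T data_[Rows][Cols];
};

// Usage sketch:
//   CheckedMatrix<float, 3, 3> m;
//   m.at<1, 2>() = 5.0f;          // checked by the compiler
//   float* p = m.ptr(row, col);   // check p before dereferencing
```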

File structure - it's all currently in a .h file (since template classes can't be split into .h and .cpp files in the normal way). Is this reasonable or should I go with the approach of having a separate .cpp file and including it at the end of the .h file? Or something else?

Matrix.h (7.28 KB)

Matrix.ino (3.03 KB)

I have modified the inverse function to include partial pivoting to improve numerical stability (basically avoid dividing by very small numbers).
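For anyone unfamiliar with partial pivoting, the idea can be sketched as follows. This is a generic Gauss-Jordan inversion written for illustration, not the inverse routine in Matrix.h; invertWithPivoting is a hypothetical name and the 1e-6 singularity threshold is an arbitrary assumption. Before eliminating each column, the row with the largest absolute value in that column is swapped into the pivot position, so the division is never by a needlessly small number.

```cpp
#include <math.h>

// Inverts the N x N array 'a' in place. Returns false if the matrix looks
// singular; in that case 'a' is left partially reduced, not restored.
template <int N>
bool invertWithPivoting(float (&a)[N][N]) {
    float inv[N][N];
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            inv[i][j] = (i == j) ? 1.0f : 0.0f;  // start from identity

    for (int col = 0; col < N; ++col) {
        // Partial pivoting: find the row at or below the diagonal with the
        // largest magnitude in this column.
        int pivot = col;
        for (int row = col + 1; row < N; ++row)
            if (fabs(a[row][col]) > fabs(a[pivot][col])) pivot = row;

        if (fabs(a[pivot][col]) < 1e-6f) return false;  // assumed threshold

        // Swap the pivot row into place in both matrices.
        if (pivot != col) {
            for (int j = 0; j < N; ++j) {
                float t = a[col][j]; a[col][j] = a[pivot][j]; a[pivot][j] = t;
                t = inv[col][j]; inv[col][j] = inv[pivot][j]; inv[pivot][j] = t;
            }
        }

        // Scale the pivot row so the pivot element becomes 1.
        float p = a[col][col];
        for (int j = 0; j < N; ++j) {
            a[col][j] /= p;
            inv[col][j] /= p;
        }

        // Eliminate this column from every other row.
        for (int row = 0; row < N; ++row) {
            if (row == col) continue;
            float f = a[row][col];
            for (int j = 0; j < N; ++j) {
                a[row][j] -= f * a[col][j];
                inv[row][j] -= f * inv[col][j];
            }
        }
    }

    // Copy the result back into the caller's array.
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            a[i][j] = inv[i][j];
    return true;
}
```

A matrix such as {{0, 1}, {2, 0}} has a zero in the first pivot position, so elimination without the row swap would divide by zero, while the pivoted version handles it.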

Edit: actually, adding the overloaded parenthesis operator causes an issue when compiling for Teensy 3.5. Currently trying to understand the issue.

Matrix.h (8.59 KB)

I have fixed the error for Teensy 3.5 (really it was a general mistake that appeared to be picked up only by the different compilation options used for Teensy). I believe the error would not have made any difference to the results, but the code was not fully compliant with the language standard.

I have attached the updated version (cosmetically identical to the first one).

I'd be very appreciative of any feedback.

Matrix.h (9.2 KB)

BasicOperations.ino (3.25 KB)