I am trying to use the P1AM-100 Arduino, a commercial board that uses mature PLC-type I/O modules, in an industrial control environment. The board has an ARM-based CPU that looks like a SAMD21, with 256KB of flash and 32KB of RAM. (Not an overabundance of either, but it should be sufficient.) The system also has a P1AM-ETH Ethernet module, which I believe is W5500-based. All of the I/O ports are driven through SPI, including (of course) Ethernet. There are no direct-connect I/O pins except for one switch and one LED on the CPU card. (This might be a factor in the problems I'm seeing.) Routing all field I/O through the modules is essentially required in an environment using 480V 600A three-phase power. (Maximally noisy, in other words.) The I/O modules all have very good input signal conditioning and are proven. (Not new for Arduino, in other words.)
Up to this point, so good. I've written a number of test programs that prove that we can talk to the I/O ports, and through Modbus-TCP to other PLCs for remote I/O, which we will be using in the final product. I am using FreeRTOS in order to be able to use traditional (pre-Arduino) software isolation techniques. (Threads, basically, in order to avoid the wretched all-in-one-loop Arduino native bias.) The RTOS, at least, seems to behave well. Threads are completely isolated from each other, in that any one thread has exclusive access to a given I/O channel. (All Ethernet is on one thread, all I/O to the relay board on another. They must share the underlying SPI, though. RTOS queues are used to communicate with these I/O threads. Adding an RTOS mutex for SPI in these I/O threads made no difference, so I suspect that the SPI layer is probably not at fault.)
The problem occurs when I have a multithreaded test program that is independently flipping bits up and down on both the local and the remote I/O pins. If I unplug the Ethernet cable, all activity on the processor comes to a screeching halt. This is absolutely 100% unacceptable, death to the entire project really, because the local I/O tasks must continue operating even in the face of (we hope, transient) network problems. Mechanical damage, or even injury, might occur if the control program ever gets blocked at an unexpected point. It must continue running, no exceptions. Ever.
I even went so far as to eliminate all the 'local' (through SPI) I/O and have it only blink the one direct-connect LED, with no difference in behavior. (Thus eliminating all potentially-conflicting SPI activity.) Unplugging the Ethernet cable is as dramatic as unplugging the power, so far as continued operation is concerned.
If I plug the Ethernet cable back in, everything resumes normal operation after a delay. Somewhere in the driver software things are getting stuck, and in such a way that the RTOS is unable to do its thing. I get that Ethernet is off the table with the cable unplugged, but nothing else should even burp, much less stick permanently.
I have been looking at the drivers, trying to find unexpected delay() calls, while loops, etc.: anything that can get it stuck in an un-dispatchable state. So far, no luck. I found a few delay() calls that I replaced with the equivalent RTOS-friendly calls, with no improvement. Being sure you're even looking at the right source code is not easy in Arduino. Debugging is not easy in Arduino.
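To make concrete what I mean by an RTOS-friendly wait, as opposed to a bare spin on a status flag, here is a minimal sketch of the pattern. The names wait_bounded and WaitResult are hypothetical and mine; they are not part of the Ethernet driver, the P1AM library, or FreeRTOS. The clock and the yield are passed in as callables so the logic can be checked on a PC. On the target they might wrap millis() and vTaskDelay(1), but that is an assumption. I have not confirmed that an unbounded loop like this is what is actually stalling the board.

```cpp
#include <cstdint>

// Outcome of a bounded wait.
enum class WaitResult { Ready, TimedOut };

// Hypothetical helper, for illustration only.
// Polls `ready` until it returns true or `timeout_ms` has elapsed.
// Between polls it calls `yield_to_scheduler`, so a wait on a dead link
// gives other tasks a chance to run instead of spinning.
//   ready              -> callable returning bool (e.g. "link is up")
//   now_ms             -> callable returning a millisecond tick count
//   yield_to_scheduler -> callable that hands the CPU to other tasks
template <typename Ready, typename Now, typename Yield>
WaitResult wait_bounded(Ready ready, Now now_ms, Yield yield_to_scheduler,
                        std::uint32_t timeout_ms) {
    const std::uint32_t start = static_cast<std::uint32_t>(now_ms());
    while (!ready()) {
        // Unsigned subtraction stays correct when the tick counter wraps.
        const std::uint32_t elapsed =
            static_cast<std::uint32_t>(now_ms()) - start;
        if (elapsed >= timeout_ms) {
            return WaitResult::TimedOut;  // give up; caller decides what next
        }
        yield_to_scheduler();
    }
    return WaitResult::Ready;
}
```

The point of the pattern is that the caller always gets control back within a known time, whether or not the cable is plugged in, and that every pass through the loop is a dispatch point for the scheduler.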
Is this sort of thing ringing any bells? Does anyone use Arduino for anything but toybox-level software? I am dismayed at how hostile the Arduino environment seems to be to software techniques that have been in common use since at least the '70s, when I began doing this stuff. I believe the hardware to be more than capable of doing the jobs we need; it's the 'ease-of-use' Arduino platform that seems to be the problem. Rolling my own bare-metal program (sans Arduino) for this hardware, as we would have done in the '70s, is probably not going to fly; the fallback position would probably be to revert to a wretched ladder-logic PLC controller, which means it wouldn't be me doing it. Not Attractive.