Restoring Bridge connection (Arduino Mega + YUN shield)

Hi!

I am running an Arduino Mega + YUN shield.
Sometimes, for some reason, the Bridge connection is lost (probably because the YUN shield resets): my Blynk connection does not recover and I can no longer access the Arduino via "telnet localhost 6571".
I get the message "telnet: can't connect to remote host (127.0.0.1): Connection refused".

If I run "reset-mcu" from the SSH session, the Bridge connection and the Blynk one are restored immediately.

However, I would like to detect the loss of connection from the sketch and be able to restore the Bridge connection by software in my loop() function, without having to restart the Mega, so that I do not have to re-initialize all variables and lose the current data.

Any suggestions on how to check within the sketch whether the Bridge connection is active, and how to restore it?

Thank you in advance for your help.

I get the message "telnet: can't connect to remote host (127.0.0.1): Connection refused".

This usually means the Linux part of the bridge got killed for some reason. If you log in by SSH you can find the reason with this command:

cat /tmp/bridge.py-stderr.log

You can restart the bridge daemon by issuing the following command:

run-bridge &

If you need help interpreting the reason for the failure, post the output of the above "cat" command.

Hi pylon!
Thank you very much for your answer, it seems we are moving in the right direction.

This is the output of cat /tmp/bridge.py-stderr.log:

Traceback (most recent call last):
  File "bridge.py", line 70, in <module>
    import console
  File "/usr/lib/python2.7/bridge/console.py", line 124, in <module>
    console = Console()
  File "/usr/lib/python2.7/bridge/console.py", line 36, in __init__
    utils.try_bind(server, '127.0.0.1', port)
  File "/usr/lib/python2.7/bridge/utils.py", line 38, in try_bind
    return socket.bind((address, port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 125] Address already in use

With run-bridge & I can indeed re-establish the telnet localhost 6571 connection.

So, my questions now are:

  1. why was the bridge connection broken?
  2. how can I detect from Linux side that the bridge connection is broken so that I can automatically restore it (for example with a crontab task checking every x minutes the status and running run-bridge & if not OK)?

Thank you in advance for your help.

  1. why was the bridge connection broken?

Looks like the restart was done while the old program was still running, so it couldn't bind to the local port. Is it possible that you had a debugging session open while you ran reset-mcu?

  2. how can I detect from Linux side that the bridge connection is broken so that I can automatically restore it (for example with a crontab task checking every x minutes the status and running run-bridge & if not OK)?

A

netstat -an | grep 6571

shows you if the bridge is listening on port 6571 for telnet connections. If that doesn't show a listening connection you can restart the bridge.
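A minimal sketch of such a check as a script for cron, assuming the bridge shows up in "netstat -an" as a LISTEN line on port 6571. The names BRIDGE_PORT and bridge_is_listening are made up for this example, and the script only acts on a system where run-bridge is actually installed:

```shell
#!/bin/sh
# Hypothetical watchdog: restart the bridge if nothing listens on its port.
# BRIDGE_PORT and bridge_is_listening are illustrative names, not part of
# the YUN software.
BRIDGE_PORT=6571

# Reads "netstat -an" style output on stdin and succeeds only if some
# line has an address ending in :6571 and is in LISTEN state.
bridge_is_listening() {
    grep -E ":${BRIDGE_PORT}[[:space:]]" | grep -q "LISTEN"
}

# Do nothing on machines without run-bridge (e.g. when testing elsewhere).
if command -v run-bridge >/dev/null 2>&1; then
    if ! netstat -an | bridge_is_listening; then
        # Same command as suggested above for a manual restart.
        run-bridge &
    fi
fi
```

Saved under a path of your choice (for example /root/check-bridge.sh, which is only a placeholder) and made executable, it could be called from a crontab entry such as "*/5 * * * * /root/check-bridge.sh" to run the check every five minutes.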

Looks like the restart was done while the old program was still running, so it couldn't bind to the local port. Is it possible that you had a debugging session open while you ran reset-mcu?

I am not sure I understood this sentence: do you mean the YUN Shield restarted while the sketch on Arduino was running?
Yes, this seems to be the case and it was my assumption too.
The cat /tmp/bridge.py-stderr.log I posted was taken before running any reset-mcu and any run-bridge &.

Debugging session

Do you mean some Console.print() function? Yes, I still have some calls in my code.

However, thank you very much for your support so far, it is really appreciated.
I will further investigate the behaviour by simulating a bridge interruption with kill-bridge and try to apply an automatic recovery by running run-bridge via crontab.

Regards

I am not sure I understood this sentence: do you mean the YUN Shield restarted while the sketch on Arduino was running?

No, more or less the other way around. The Linux side of the bridge was started although another process was still around (or at least the TCP stack still had that port active in some state).
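To see which of the two cases applies, you can look at the state of every socket on that port. This is only a sketch: port_states is a made-up helper name, and it assumes the usual "netstat -an" column layout (local address in column 4, state in column 6):

```shell
# Hypothetical helper: print the TCP state of each socket whose local
# address ends in the given port, reading "netstat -an" output on stdin.
# Usage on the YUN:  netstat -an | port_states 6571
port_states() {
    awk -v port="$1" '$1 ~ /^tcp/ && $4 ~ (":" port "$") { print $6 }'
}
```

A LISTEN entry would mean some process still holds the port. If only entries such as TIME_WAIT show up, the old process is gone but the TCP stack is still holding on to old connections, which can also make bind() fail when the server does not set SO_REUSEADDR (whether bridge.py sets it is not something I have checked).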

The cat /tmp/bridge.py-stderr.log I posted was taken before running any reset-mcu and any run-bridge &.

Yes, I assumed you did it that way.

Do you mean some Console.print() function? Yes, I still have some calls in my code.

No, I mean that you were logged into the Yun shield by SSH and had the telnet command running, listening to the output of these Console.print() statements, while you ran reset-mcu.

I found a bug in the software: I was trying to access a global variable declared as PROGMEM without using the pgm_read_dword() function, which was probably causing the kind of unpredictable behavior you are describing.
After fixing the bug, the sketch has now been running for several days without any problem and with a stable Bridge connection.
So, let's see how it goes on.
Many thanks for your help, I have learnt some interesting things.