Goes unresponsive intermittently

I'd like to suggest an approach which doesn't fix the underlying problem, but does make your system resilient to it: a watchdog.

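There's a lot to be said about watchdogs, but in summary: a watchdog is a timer that resets the system unless the running code periodically restarts it ("kicks" it), so a program that hangs is recovered automatically instead of staying unresponsive. That makes them an essential part of any system which must remain working over very long periods in what might be a hostile environment.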
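To make the idea concrete, here is a minimal software sketch in Python. It is only an illustration under my own assumptions: I don't know your platform, and on a microcontroller you would normally enable the chip's hardware watchdog rather than write one yourself. The names `Watchdog`, `kick`, `stop` and `run_demo` are invented for this example, and the timeout values are arbitrary.

```python
import threading
import time


class Watchdog:
    """Software watchdog: calls on_timeout unless kick() arrives in time."""

    def __init__(self, timeout_s, on_timeout):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout  # recovery action (hypothetical hook)
        self._timer = None
        self._lock = threading.Lock()

    def kick(self):
        """Start or restart the countdown. Call this from the main loop."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.timeout_s, self.on_timeout)
            self._timer.daemon = True  # don't keep the process alive
            self._timer.start()

    def stop(self):
        """Disarm the watchdog, e.g. for a deliberate shutdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def run_demo():
    """Kick regularly (healthy), then stop kicking (simulated hang)."""
    fired = threading.Event()
    dog = Watchdog(timeout_s=0.5, on_timeout=fired.set)

    dog.kick()
    for _ in range(5):
        time.sleep(0.02)  # stands in for one pass of useful work
        dog.kick()
    healthy_while_kicked = not fired.is_set()

    # Simulated hang: no more kicks, so the watchdog should fire.
    fired_after_hang = fired.wait(timeout=5.0)
    return healthy_while_kicked, fired_after_hang


if __name__ == "__main__":
    print(run_demo())
```

In a real deployment the timeout action would be a reset or restart rather than setting a flag, for example exiting the process so a supervisor relaunches it. The important design point is that the kick happens in the main loop only, so that a hang anywhere in that loop stops the kicks.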

They should not be used as an excuse to leave bugs extant, of course. But they are a realistic response to the imperfection of the environment (and our coding skills).

So, I'm not proposing it as a solution to your problem, but as something for you to consider implementing anyway.