Details
Resolution: Unresolved
Minor
Professional Edition
Description
We had a device that was taking a long time to poll via SNMP. Eventually there were 6 poller processes running for just that one device, and none of them were finishing. The server logs said:
(URGENT: poller-wrapper.py poller not started because already running 6 processes, load average (5min) 11.41)
All polling on the server stopped until I killed those processes, and then the same thing happened again. I eventually removed the device, which resolved the issue. Is there a timer or setting that will kill these processes before they cause a problem like this again?
The device in question is having problems of its own: I am unable to log in to it and it is running very slowly. To fix the polling issue I removed it from Observium, then added it back to get the test debug file you asked for. How does the maximum number of poller-wrapper processes work? What stopped it at 6 processes in total, and is that the default? Is there a way to configure it to kill poller-wrapper processes that have run too long?
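As a stopgap while waiting for an answer, something like the following external watchdog could be run from cron. It is only a sketch and not an Observium feature: the `poller.php` match string, the 900-second threshold, and the function names are all assumptions made for illustration, and it relies on the `etimes` column from Linux procps `ps`.

```python
import os
import signal
import subprocess

MAX_AGE_SECONDS = 900  # assumed threshold; tune to your polling interval
MATCH = "poller.php"   # assumed substring identifying a per-device poller process


def find_stale(ps_output, max_age=MAX_AGE_SECONDS, match=MATCH):
    """Return PIDs of matching processes that have run longer than max_age seconds.

    Expects lines of the form "<pid> <elapsed seconds> <command line>".
    """
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # pid, elapsed seconds, full command line
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip headers, blank lines, and anything malformed
        pid, elapsed, args = int(parts[0]), int(parts[1]), parts[2]
        if match in args and elapsed > max_age:
            stale.append(pid)
    return stale


def kill_stale(dry_run=True):
    """Report stale pollers; send SIGTERM to them only when dry_run is False."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = [pid for pid in find_stale(out) if pid != os.getpid()]
    for pid in pids:
        print(f"stale poller process: {pid}")
        if not dry_run:
            try:
                os.kill(pid, signal.SIGTERM)
            except ProcessLookupError:
                pass  # the process exited between the ps call and the kill
    return pids
```

Calling `kill_stale()` only lists the matching PIDs; `kill_stale(dry_run=False)` actually terminates them. It is worth checking the dry-run output first so that the match string does not catch unrelated processes.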