Details

    • Help
    • Resolution: Unresolved
    • Minor
    • None
    • Professional Edition
    • Poller

    Description

      We had a device that was taking a long time to poll via SNMP. Eventually there were 6 poller processes running for just that one device, and none of them were finishing. The server log said:

      (URGENT: poller-wrapper.py poller not started because already running 6 processes, load average (5min) 11.41)

       

      All polling on the server stopped until I killed those processes. Then it happened again. Eventually I removed the device, which resolved the issue. Is there a timer or setting I can use to kill these processes before they cause a problem like this again?

        Activity

          [OBS-2710] Poller Hangs and Eventually Stops

          ajackson Andy Jackson added a comment:

          The device in question is having problems: I am unable to log in to it, and it is running very slowly. To fix the issue I removed it from Observium, then added it back to get the test debug file you asked for. How does the maximum number of poller-wrapper processes work? What stopped it at 6 processes in total? Is that the default? Is there a way to configure it to kill poller-wrapper processes that have run too long?
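
On the question of killing a process that has run too long: the thread does not name an Observium setting for this, so the following is only a generic sketch of bounding a command's run time with Python's standard `subprocess` module. The helper name `run_with_timeout` and the stand-in commands are hypothetical and are not part of Observium.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, output).

    If the command runs longer than timeout_s seconds, subprocess.run()
    kills the child process and raises TimeoutExpired, which is
    reported here as finished=False.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
            text=True,
        )
        return True, result.stdout
    except subprocess.TimeoutExpired as exc:
        # Partial output may be bytes or str depending on the platform.
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return False, partial


if __name__ == "__main__":
    # Stand-in for a slow poll: a child process that sleeps for 30 seconds.
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
    finished, _ = run_with_timeout(slow_cmd, timeout_s=1)
    print("finished" if finished else "killed after timeout")
```

For a real device the command list might be `["./poller.php", "-d", "-h", hostname]`, the debug command quoted in this thread; the timeout value is a local choice and should comfortably exceed a normal poll.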

          landy Mike Stupalov added a comment:

          This is protection against a runaway increase in processes that would stop the server from working. It exists for exactly such cases as yours.

          Normally, poller-wrapper should not run more than 4 processes at the same time, but more are still possible if the server load average (LA) is less than 10.
          These options are configurable (see: Global Settings -> Polling -> Wrapper).

          For your case, though, we need to know why polling that device took so long.
          Required debug:

          ./poller.php -d -h <broken_device>
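
The protection described above can be sketched as a small decision function. This is only an interpretation of the comment, not the actual poller-wrapper.py code: the function name, parameter names, and the exact comparison operators are assumptions, and the defaults of 4 processes and a load average of 10 are the values mentioned in the comment.

```python
def should_start_wrapper(running, load_avg_5min, max_running=4, max_load=10.0):
    """Decide whether another wrapper run may start (hypothetical logic).

    running        -- number of wrapper processes already running
    load_avg_5min  -- 5-minute server load average
    max_running    -- assumed normal process limit (4 in the comment)
    max_load       -- assumed load average limit (10 in the comment)
    """
    if running < max_running:
        return True, "ok"
    if load_avg_5min < max_load:
        # Over the normal limit, but the server is not overloaded.
        return True, "over process limit, but load is acceptable"
    return False, (
        "not started because already running %d processes, "
        "load average (5min) %.2f" % (running, load_avg_5min)
    )


if __name__ == "__main__":
    # The situation from the ticket: 6 processes, load average 11.41.
    allowed, reason = should_start_wrapper(6, 11.41)
    print(allowed, reason)
```

Under these assumptions the ticket's case (6 processes, load average 11.41) is refused, while an extra run would still be allowed at a low load average, which may be how the count reached 6. On Unix, `os.getloadavg()[1]` returns the 5-minute load average if a live value is wanted.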

          People

            landy Mike Stupalov
            ajackson Andy Jackson