SNMP Connectivity to VMware ESXi 7.0 Hosts Flaky

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • None
    • Poller
    • CentOS Linux release 7.9.2009 (Core)

    Description

      Running version 22.11.12360

      SNMP polling succeeds only intermittently, both for new ESXi 7.0 hosts and for existing (previously monitored) ESXi hosts since they were upgraded from v6.7 to v7.0.

      I will attach SNMP walk results taken both when the system is "visible" (live) and when polling fails (dead), as well as poller and discovery output.

      With all apologies, I don't know if this is a problem in Observium or in ESXi but wanted to start the ball rolling.  I have a support ticket open with VMware as well, and have sent them logs from two of the affected systems.

      Attachments

        Activity

          [OBS-4300] SNMP Connectivity to VMware ESXi 7.0 Hosts Flaky

          landy Mike Stupalov added a comment:

          I still do not have a solution. Observium just uses common external commands: fping for ping and net-snmp for SNMP.

          However, in your poller debug it is strange that I do not see any fping command being run.

          Is the device configured to skip ping checks?

          Can you try upgrading your CentOS system? It uses old software versions.

          NegwerIT Scott Driemeier-Showers added a comment:

          Working with VMware support, they've been running packet captures.

          When the requests fail, they see the SNMP requests come in from Observium on a random source port (e.g. 37981), but when the host attempts to respond, the port is unavailable. VMware is wondering why the port would close before the response can be sent.

          I am attaching four activity captures, two from the Observium side and two from the ESXi side.
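
          One possible reading of the capture above, not confirmed anywhere in this thread, is that the SNMP client gives up at its timeout and closes its UDP socket, so a late reply arrives at an ephemeral port that no longer exists. The sketch below reproduces only that general UDP behaviour on localhost. It is not Observium or net-snmp code; the names slow_agent, poll_once and simulate are hypothetical, and the delays are arbitrary.

```python
import socket
import threading
import time


def slow_agent(sock, delay):
    """Stand-in for an SNMP agent: wait `delay` seconds, then reply."""
    _, client_addr = sock.recvfrom(1024)
    time.sleep(delay)
    try:
        # If the client has already closed its socket, this datagram is
        # sent to a port nobody is listening on any more.
        sock.sendto(b"response", client_addr)
    except OSError:
        pass  # some platforms report the unreachable port here


def poll_once(agent_addr, timeout):
    """Send one request from an ephemeral port; close the socket on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(b"request", agent_addr)
        try:
            client.recvfrom(1024)
            return True
        except socket.timeout:
            return False  # leaving the `with` block closes the source port


def simulate(agent_delay, client_timeout):
    """Return True if the reply arrived before the client gave up."""
    agent = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    agent.bind(("127.0.0.1", 0))
    worker = threading.Thread(target=slow_agent, args=(agent, agent_delay))
    worker.start()
    ok = poll_once(agent.getsockname(), client_timeout)
    worker.join()
    agent.close()
    return ok


if __name__ == "__main__":
    print("fast agent:", simulate(agent_delay=0.0, client_timeout=1.0))
    print("slow agent:", simulate(agent_delay=0.3, client_timeout=0.1))
```

          If this is what is happening, comparing the request and response timestamps in the captures against the configured SNMP timeout should show the replies leaving the ESXi host after the timeout has expired.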

          NegwerIT Scott Driemeier-Showers added a comment:

          Thank you. I will continue to work this with VMware Support.

          --Scott

          landy Mike Stupalov added a comment:

          OK, the trouble is mainly with the SNMP response from the device.
          This is not related to Observium, so I am not sure how I can help.
          I see you have already set the SNMP timeout to 2 seconds.
          Try setting max repetitions for the device to 0 (this disables snmpbulkwalk), though I am not sure it will help.
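
          To compare the two walk modes by hand outside Observium, the suggestion above can be sketched as a small helper that builds the net-snmp command line. This is an illustration only and not Observium's own logic; build_walk_command is a hypothetical name, and it assumes SNMP v2c with the standard net-snmp options -t (timeout in seconds) and -Cr (max-repetitions for snmpbulkwalk).

```python
def build_walk_command(host, community, oid, timeout_s=2, max_repetitions=0):
    """Build a net-snmp walk command as an argument list.

    max_repetitions == 0 falls back to plain snmpwalk (GETNEXT requests);
    any positive value uses snmpbulkwalk with -Cr<N> (GETBULK requests).
    """
    common = ["-v2c", "-c", community, "-t", str(timeout_s)]
    if max_repetitions > 0:
        return ["snmpbulkwalk", *common, f"-Cr{max_repetitions}", host, oid]
    return ["snmpwalk", *common, host, oid]


if __name__ == "__main__":
    # Placeholder host and community; print the commands rather than run them.
    print(" ".join(build_walk_command("esxi-host", "public", ".1.3.6.1.2.1.1")))
    print(" ".join(build_walk_command("esxi-host", "public", ".1.3.6.1.2.1.1",
                                      max_repetitions=10)))
```

          Running both printed commands against an affected host and noting which one times out may help show whether bulk requests are a factor.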

          NegwerIT Scott Driemeier-Showers added a comment:

          Device event log screenshot attached.

          --Scott

          landy Mike Stupalov added a comment:

          Please attach screenshots of the device event log and overview.

          I am not sure, but from the poller debug it seems this device is disabled for some reason.

          NegwerIT Scott Driemeier-Showers added a comment:

          Files uploaded.

          bot Observium Bot added a comment:

          General questions and device support can be discussed in our Discord channel.

          Please collect and attach additional information about the device:

          • Full SNMP dump from the device:

            snmpwalk -v2c -c <community> -t 3 -Cc --hexOutputLength=0 -ObentxU <hostname> .1 > myagent.snmpwalk
            snmpwalk -v2c -c <community> -t 3 -Cc --hexOutputLength=0 -ObentxU <hostname> .1.3.6.1.4.1 >> myagent.snmpwalk

            If the device does not support SNMP version 2c, replace -v2c with -v1.

          • If you have problems with the discovery or poller processes, please run these and attach the debug output:

            ./discovery.php -d -h <device>
            ./poller.php -d -h <device>

          • Additionally, attach device and/or vendor-specific MIB files.

          This comment is added automatically.

          People

            landy Mike Stupalov
            NegwerIT Scott Driemeier-Showers