
[OBS-169] Policy discussion: Unix-Agent, scripts, paths, future development

Details

    Description

      This discussion is related to: BUG#164

      Hello guys
      I'm new to observium, so excuse me if I'm not familiar with some concepts.
      I want to initiate a discussion about future changes in observium, in particular the unix-agent scripts. I'm interested in writing more checks (postgresql, dhcp, openvz, etc.), but after a discussion on irc yesterday I want to clear up some concepts before I proceed with writing anything.
      My goal is to have, as much as possible, a working configuration out of the box. I'm looking for almost zero configuration. So when a new user comes to observium and copies all scripts to his check_mk directory, he should get a proper result every time and should not get broken graphs. The user should not have to care whether he has mysql, dmidecode or memcached installed and configured correctly before he copies the scripts.

      Please provide constructive criticism and ideas/problems that I have missed.

      1) Script output
      a) There is a problem with observium: if a script does not return the proper output, its graph is broken and sometimes other graphs break too. Some of the scripts right now do not check their output at all, which leads to problems. A check could also be implemented in the poller, but I'm not really sure how to do that YET, or what good it would do. So I propose that from now on there will be no "Configuration file missing", "Data fetch failure" or anything like that in the scripts.

      The script should succeed or return nothing at all.
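
      For illustration, a minimal sketch of that rule in a shell check; the binary, its flag and the section name are placeholders, not an existing script:

      #!/bin/sh
      # Print the section only when the prerequisite binary is present and the data
      # fetch succeeds; on any failure, emit nothing at all so the poller never sees
      # a "Configuration file missing"-style message in place of data.
      command -v myprogram >/dev/null 2>&1 || exit 0
      DATA=$(myprogram --stats 2>/dev/null) || exit 0
      [ -n "$DATA" ] || exit 0
      echo "<<<myprogram>>>"
      echo "$DATA"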

      b) There should be proper checks that the needed binaries and configuration files are installed BEFORE they are executed/included, OR the result of the execution/inclusion should be checked for success. There was a discussion yesterday about using full paths in scripts versus relying on $PATH to find the binary.

      • If I use a full path (/usr/bin/program), the script is not very portable out of the box, because the "program" may be in a different location on different OSes. But setting the full path gives me the ability to check whether the binary exists before I execute it. A nice warning // EDIT LINE HERE // DO NOT EDIT BENEATH can be put at the beginning of every script (a sketch of this follows after this list). - This option is what I prefer.
      • Another option is to try to discover the binary in $PATH, with `whereis` or `type`. This is not very portable because different systems return different output.
      • A third option is to rely on $PATH. This is a very convenient way to work out of the box on every OS, but there is no way to check whether the binary exists before trying to execute it. For example, when I set $ntpq="ntpq"; the program is executed without checking whether "ntpq" exists, so we would have to validate the command's output instead. This option is again a bit harder to implement because of system differences (sh: ntpq does not exist // ntpq program not found // -bash: ntpq: command not found // etc.).
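
      For illustration, a minimal sketch of the full-path variant with the edit line, as preferred above; the path and section name are placeholders the user would adjust:

      #!/bin/sh
      # // EDIT LINE HERE // adjust the path below for your OS
      NTPQ=/usr/bin/ntpq
      # // DO NOT EDIT BENEATH //
      # Warn on stderr (so the agent output stays clean) and exit silently
      # if the configured binary is missing or not executable.
      [ -x "$NTPQ" ] || { echo "$0: $NTPQ not found, skipping" >&2; exit 0; }
      echo "<<<ntp>>>"
      "$NTPQ" -p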

      2) Versions
      Sometimes the output of a command changes between program versions. For example, old versions of NFS have fewer stats than newer versions, which leads to different rrd files and different graphs. So clearly there should be a way to indicate the version in the script. Here is an example.
      <<<myprogram>>>
      version1
      stats:534
      stats2:31
      <<<myprogram>>>
      version2
      stats:235
      stats2:523
      stats3:28345

      Another way is to create different scripts for different versions like:
      <<<myprogram-v1>>>

      <<<myprogram-v2>>>

      I have no strong opinion on either option.
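
      For illustration, a sketch of the first option, where the script prints the program version as the first line of its section; all names and flags are placeholders:

      #!/bin/sh
      command -v myprogram >/dev/null 2>&1 || exit 0
      VERSION=$(myprogram --version 2>/dev/null) || exit 0
      # The poller can use the first line to pick the matching rrd/graph definitions.
      echo "<<<myprogram>>>"
      echo "$VERSION"
      myprogram --stats 2>/dev/null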

      3) Perl/PHP/Python Modules
      Sometimes your scripts require modules to get the needed stats. These should be documented in the script, and a check that they exist should be included. Modules are system-independent, but not all systems have them installed.
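
      For example, a sketch of a check that depends on a Python module and probes for it first, staying silent when it is missing; the module and section name are only illustrative:

      #!/bin/sh
      # Requires: python with the psycopg2 module (documented here, probed below).
      command -v python >/dev/null 2>&1 || exit 0
      python -c 'import psycopg2' >/dev/null 2>&1 || exit 0
      echo "<<<postgresql>>>"
      # ... gather and print the stats here ...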

      I'm willing to write/edit the existing documentation once we decide how this stuff should work. Full documentation will also be provided for my new additions.


        Activity

          Jules Jules added a comment -

          Hi there. I'm new here and hope this is the right place to post and ask.

          I've been asking myself whether polling via SSH, executing the observium_agent directly, shouldn't be an option of the unix-agent php task as well, instead of tunnelling through an insecure xinetd.

          I may be wrong, but the overhead that SSH execution produces is not as high as I expected compared to the existing xinetd TCP connection.

          My raw but working solution (php patch included) is described in one of my admin posts: http://ispire.me/polling-observium-unix-agent-with-ssh/
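
          For illustration, a rough comparison of the two transports from the poller's side; host, port, key and script path are placeholders, not project defaults:

          # xinetd transport: the poller opens a plain TCP connection to the agent port
          nc -w 10 client.example.com 6556

          # SSH transport: the poller runs the agent script over an authenticated channel instead
          ssh -i /etc/observium/agent_key observium@client.example.com /usr/local/bin/observium_agent.sh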


          adama Adam Armstrong added a comment -

          I think there are a lot more things that need to be changed with the way the agent works.

          http://www.observium.org/wiki/Future#Agent

          adama Adam Armstrong added a comment -

          dunno

          dobber Ivan Dimitrov added a comment -

          So what is the final decision?

          adama Adam Armstrong added a comment -

          For versioning, we also need some standardised way of communicating this information to the UI elements.

          I'd say that the module returning the version of its output would be best, rather than having different module names. Unless the versions are sufficiently different that there isn't much shared code.

          The Check_mk link will die sometime soon, when someone gets the motivation to do the work. We need to strip out everything we don't want in there and fork it. It's not actually very complex, and we don't really use any of the check_mk checker code at all!

          codekiller Dennis de Houx added a comment -

          Just to be sure, here is a transcript of the discussion about why not to use full paths:

          [11:20:28] <dobber_> but in the "//edit here" I should put the full path by default
          [11:21:43] <CodeKiller> we already had that discussion about full path
          [11:22:00] <dobber_> and i'm not converted
          [11:23:02] <CodeKiller> Full paths i'm against because of the reason not every server (os) is the same, some ppl use custom compiled things other ppl use pre installed packages etc. So a check should be portable over all the systems
          [11:23:28] <CodeKiller> and thats what PATH is
          [11:24:00] <CodeKiller> if i put /myown/compiled/sbin in PATH its going to work just fine when i run script
          [11:25:29] <CodeKiller> but if you hardcode the full path its only going to work for that OS
          [11:25:41] <dobber_> CodeKiller: I'm open for discussion on this topic. If the community agrees on any option, i'll follow.
          [11:25:54] <dobber_> but I need to create maybe 10-15 more scripts
          [11:26:01] <dobber_> so I need a clear guideline
          [11:26:04] <CodeKiller> observium agent is used on debian/ubuntu fedore rhel/centos solaris openbsd/freebsd ...
          [11:26:11] <dobber_> and I don't want to argue with you everytime
          [11:27:33] <CodeKiller> a) if you hardcode paths it will probabbly not be included
          [11:28:00] <CodeKiller> b) if it gets include there will be a patch in no time to change /usr/bin to /bin or /usr/local/bin as hardcoded path
          [11:28:20] <CodeKiller> so it will always break someone's installation
          [11:28:39] <dobber_> ok please make this clear in the jira discussion
          [11:28:47] <CodeKiller> i already did


          codekiller Dennis de Houx added a comment -

          My answers:

          1) Script output
          ----------------
          a) That is only possible for scripts that don't need any configuration files or adjustments in the script itself. As long as people don't read the documentation, there will always be errors. It is also easier to debug on the observium server (with the debug option, to see where the agent scripts fail) than on the client server itself. A way out of this would be to put an "exit 0" at the top of the script, with a line above it saying "please comment this out if you want this script to run", to force users to read the documentation, but this isn't user-friendly.
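
          A minimal sketch of that gate (the wording of the comment is made up):

          #!/bin/sh
          # Comment out the next line once you have read the documentation and configured this check.
          exit 0
          # ... actual check logic would follow here ...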

          b) I'm against full paths because not every server (OS) is the same; some people use custom-compiled things, other people use pre-installed packages, etc. So a check should be portable across all systems, for example with "type", and in case "type" detection fails, just run the script anyway. The // EDIT LINE HERE // DO NOT EDIT BENEATH I'm all for, because I use this in all of my scripts.
          Checks should always be made in the poller to see whether the data that is returned is actually the right data; this can also be used to detect different versions. Even if the executable exists, its output could still be wrong (memory leak, aborted, segfault, ...), so the agent would return bogus data if there is no check in the poller to verify it.
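
          For illustration, a sketch of the `type`-based lookup with the run-it-anyway fallback described above; the binary and section name are just examples:

          #!/bin/sh
          # Look the binary up via $PATH with `type`; if detection fails, note it on
          # stderr (which does not pollute the agent output) and try to run it anyway.
          if ! type ntpq >/dev/null 2>&1; then
              echo "ntpq not found via type, trying anyway" >&2
          fi
          # Emit nothing at all when the command itself fails or returns no data.
          OUTPUT=$(ntpq -p 2>/dev/null) || exit 0
          [ -n "$OUTPUT" ] || exit 0
          echo "<<<ntp>>>"
          echo "$OUTPUT"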

          2) Versions
          -----------
          I'm all for that, and that's pretty much what I already use in the ntpd script to detect server/client mode and different versions of ntpd.

          3) Perl/PHP/Python Modules
          --------------------------
          I also agree with this, but for apache this is documented in the wiki and still people don't read it. So this comes back to 1): it depends on who is deploying things and whether that person reads manuals or comments in the scripts. Checking whether a module exists is all good and well, but then you must check whether the checking system itself is installed. In other words, how do you check whether a perl module is installed from within a perl script when perl itself isn't installed? If you need to build checks for checks for checks, it's easier to just build deb/rpm packages with the dependencies in them (that's what I do with my own repository).
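
          One way around that chicken-and-egg problem is a small shell wrapper that probes for the interpreter first; a sketch only, with a hypothetical module name and helper script path:

          #!/bin/sh
          # Exit silently if perl itself is missing, then if the required module is
          # missing, and only then hand over to the real perl check.
          command -v perl >/dev/null 2>&1 || exit 0
          perl -MDBD::Pg -e 1 >/dev/null 2>&1 || exit 0
          exec perl /usr/local/lib/observium-agent/postgresql.pl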


          dobber Ivan Dimitrov added a comment -

          One more thing: basic system tools like sed, awk, cut and grep should work without specifying the path.

          codekiller Dennis de Houx added a comment -

          First off, before continuing this discussion, we should talk about porting/changing the core unix-agent script: don't let it use check_mk things, but make it fully independent, with its own directory and its own (adjustable) port.

          • Adjustable xinetd port in the agent's xinetd configuration
          • Adjustable default port in the observium config (config.php)
          • Adjustable per-host port in the database

          This way you can run check_mk alongside the observium agent without having 300 scripts that observium doesn't need (or collisions between scripts), and you get the option to run the agent on a different port for whatever reason (NAT forwarding, for instance).
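
          For illustration, a minimal xinetd stanza for an agent running on its own adjustable port; the service name, port and script path are placeholders, not project defaults:

          service observium_agent
          {
              type        = UNLISTED
              port        = 36602
              socket_type = stream
              protocol    = tcp
              wait        = no
              user        = root
              server      = /usr/local/bin/observium_agent.sh
              disable     = no
          }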


          People

            adama Adam Armstrong
            dobber Ivan Dimitrov