Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Professional Edition
    • Unix Agent
    • Debian 8.3 64-bit

    Description

      When I netcat the observium-agent of our storage-server I get the following for nfsd:

      <<<app-nfsd>>>
      rc 0 3262276 11714942
      fh 0 0 0 0 0
      io 1012421615 1079238768
      th 32 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
      ra 64 0 0 0 0 0 0 0 0 0 0 0
      net 14977013 0 14976885 6
      rpc 14976901 0 0 0 0
      proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      proc3 22 0 304 0 18 20 0 0 0 0 0 0 0 0 0 0 0 0 0 20 0 0 0
      proc4 2 2 14976465
      proc4ops 59 0 0 0 699531 4021436 4346 1438 0 0 13846271 4239317 0 31 0 30 84271 0 0 4366901 0 792334 0 14793981 0 90018 1165970 82534 822 469 11358 13046 4209683 4379447 0 72935 90018 90018 0 2975085 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

      The problem is that I see no RRD being generated in observium/rrd/storage1/, while I can see RRDs for other metrics. Also, no graphs are visible on the nfsd page of the storage machine (only titles), while other graphs work fine.

      Attachments

        1. poller.log
          107 kB
        2. poller.log
          15 kB

        Activity

          [OBS-1705] app-nfsd shows no stats

          I'm available for testing patches if needed, I'll keep an eye on email.

          veldenb Bernard van der Velden added a comment

          On our systems I don't see a difference in the number of values:

          # uname -a
          Linux storage1 3.16.0-7-amd64 #1 SMP Debian 3.16.59-1 (2018-10-03) x86_64 GNU/Linux
           
          # cat /proc/net/rpc/nfsd
          rc 0 828027905 643218917
          fh 12 0 0 0 0
          io 1302794912 2664671801
          th 128 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
          ra 256 0 0 0 0 0 0 0 0 0 0 0
          net 1471294987 0 1471180616 18
          rpc 1471250321 0 0 0 0
          proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          proc3 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          proc4 2 11 1471231510
          proc4ops 59 0 0 0 19377503 17214607 1023247 528078 0 9602946 428639048 7219272 0 1292 0 1273 7889412 0 0 19043186 0 9 3 1472650292 0 20 194325745 3163520 6106 4402702 2625660 101809 0 2625660 0 9729857 6 6 0 832521748 12 0 2 8 11 6 1261 0 0 0 0 0 0 8 447666891 0 0 0 3 8

          # uname -a
          Linux bernard 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
           
          # cat /proc/net/rpc/nfsd
          rc 0 0 0
          fh 0 0 0 0 0
          io 0 0
          th 8 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
          ra 32 0 0 0 0 0 0 0 0 0 0 0
          net 0 0 0 0
          rpc 0 0 0 0 0
          proc3 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          proc4 2 0 0
          proc4ops 72 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

          veldenb Bernard van der Velden added a comment
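To compare kernels without counting by eye, a small helper can report how many values follow each line name. This is only a sketch: `countNfsdValues` is a hypothetical name, not part of Observium or the agent. In the two outputs pasted above, proc3 declares 22 values on both kernels, while proc4ops declares 59 on 3.16 and 72 on 4.15, and proc2 is absent on 4.15.

```php
<?php
/**
 * Return [line name => number of values] for the text of /proc/net/rpc/nfsd.
 * Hypothetical helper for comparing kernels; not part of Observium.
 */
function countNfsdValues(string $text): array
{
    $counts = [];
    foreach (preg_split('/\R/', trim($text)) as $line) {
        $tokens = preg_split('/\s+/', trim($line));
        if ($tokens[0] === '') {
            continue; // skip blank lines
        }
        // Everything after the line name counts as a value,
        // including the leading "declared count" on proc3/proc4ops.
        $counts[$tokens[0]] = count($tokens) - 1;
    }
    return $counts;
}

// Only meaningful on a host where the nfsd module is loaded.
$path = '/proc/net/rpc/nfsd';
if (is_readable($path)) {
    print_r(countNfsdValues(file_get_contents($path)));
}
```

Running it on each host and diffing the two results shows at a glance which line types changed length.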

          Do you have access to multiple kernel versions to check that proc3 has the same number of variables?

          adama Adam Armstrong added a comment - Do you have access to multiple kernel versions to check that proc3 has the same number of variables?

          I agree. I can confirm that removing the extra array_shift() fixes the stats. I didn't add the dummy label; I don't think it's needed, but I can't check since we use NFSv4.

          We locally applied this patch to fix things for now:

          Index: includes/polling/applications/nfsd.inc.php
          ===================================================================
          --- includes/polling/applications/nfsd.inc.php	(revision 9671)
          +++ includes/polling/applications/nfsd.inc.php	(working copy)
          @@ -51,7 +51,6 @@
               {
                 $base = strtolower($tokens[0]);
                 array_shift($tokens);
          -      array_shift($tokens);
                 foreach ($tokens as $k => $v)
                 {
                   $datas[$base.($nfsLabel[$base][$k])] = $v;

          Adding proc4ops to the stats would be nice.

          It should be something like this, but it looks like code also has to be added in some other places, and a rewrite for readability would be needed so that it's easier to maintain and to add new features:

          $nfsLabel['proc4ops'] = array(
              'unused',
              'access_close_commit_create',
              'delegpurge',
              'recovery',
              'delegreturn',
              'getattr',
              'getfh',
              'link',
              'lock',
              'lockt',
              'locku',
              'lookup',
              'lookupp',
              'nverify',
              'open',
              'openattr',
              'open_confirm',
              'open_dgrd',
              'putfh',
              'putpubfh',
              'putrootfh',
              'read_readdir_readlink_remove_rename',
              'renew',
              'restorefh',
              'savefh',
              'secinfo',
              'setattr',
              'setcltid',
              'setcltidconf',
              'verify',
              'write',
              'rellockowner'
          );

          veldenb Bernard van der Velden added a comment
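The label list in the comment above folds several operations into single entries, which would shift every later value onto the wrong label. A minimal sketch of a one-label-per-value mapping follows, assuming the values after the leading count are indexed by NFSv4.0 operation number as defined in RFC 7530 (ACCESS = 3 through RELEASE_LOCKOWNER = 39). `NFS4_OP_LABELS` and `parseProc4ops` are hypothetical names, and values beyond index 39 (NFSv4.1 and later operations) are left unlabelled here.

```php
<?php
// NFSv4.0 operation names keyed by operation number (RFC 7530).
// Indexes 0-2 carry no countable operation and are skipped.
const NFS4_OP_LABELS = [
    3 => 'access', 4 => 'close', 5 => 'commit', 6 => 'create',
    7 => 'delegpurge', 8 => 'delegreturn', 9 => 'getattr', 10 => 'getfh',
    11 => 'link', 12 => 'lock', 13 => 'lockt', 14 => 'locku',
    15 => 'lookup', 16 => 'lookupp', 17 => 'nverify', 18 => 'open',
    19 => 'openattr', 20 => 'open_confirm', 21 => 'open_dgrd', 22 => 'putfh',
    23 => 'putpubfh', 24 => 'putrootfh', 25 => 'read', 26 => 'readdir',
    27 => 'readlink', 28 => 'remove', 29 => 'rename', 30 => 'renew',
    31 => 'restorefh', 32 => 'savefh', 33 => 'secinfo', 34 => 'setattr',
    35 => 'setcltid', 36 => 'setcltidconf', 37 => 'verify', 38 => 'write',
    39 => 'rellockowner',
];

/**
 * Map a "proc4ops" line to [operation label => counter].
 * Hypothetical helper; not the Observium implementation.
 */
function parseProc4ops(string $line): array
{
    $tokens = preg_split('/\s+/', trim($line));
    if (array_shift($tokens) !== 'proc4ops') {
        throw new InvalidArgumentException('not a proc4ops line');
    }
    array_shift($tokens); // drop the declared number of values (e.g. 59 or 72)

    $stats = [];
    foreach (NFS4_OP_LABELS as $op => $label) {
        if (isset($tokens[$op])) { // tolerate kernels that report fewer values
            $stats[$label] = (int) $tokens[$op];
        }
    }
    return $stats;
}

// Short synthetic line: three unused slots, then access and close.
print_r(parseProc4ops('proc4ops 5 0 0 0 7 9'));
```

Keying the labels by operation number rather than by position in a flat list keeps the mapping stable when a newer kernel appends more operations.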

          It also seems that the number of data points returned differs based on kernel version, which is pretty absurd.

          adama Adam Armstrong added a comment
          adama Adam Armstrong added a comment - edited

          Jesus, whoever originally wrote this code gave zero fucks for any future readability. Why is any of it doing what it's doing?

          The whole thing just needs rewritten so that it's actually readable, rather than arsing around with arrays of text labels and somehow building rrds out of them.

          akotelba Adrian Kotelba added a comment - edited

          It seems that the array of tokens is shifted too many times, so the last key is never read. The patch below may help (it removes one array_shift() and adds a "dummy" key to the NFSv3 labels to compensate).

          diff --git a/includes/polling/applications/nfsd.inc.php b/includes/polling/applications/nfsd.inc.php
          index fc396b88..e386c4f1 100644
          --- a/includes/polling/applications/nfsd.inc.php
          +++ b/includes/polling/applications/nfsd.inc.php
          @@ -36,6 +36,7 @@ if (!empty($agent_data['app']['nfsd']))
           );
           
           $nfsLabel['proc3'] = array(
          + "dummy",
           "null", "getattr", "setattr", "lookup", "access", "readlink",
           "read", "write", "create", "mkdir", "symlink", "mknod",
           "remove", "rmdir", "rename", "link", "readdr", "readdirplus",
          @@ -51,7 +52,6 @@ if (!empty($agent_data['app']['nfsd']))
           {
           $base = strtolower($tokens[0]);
           array_shift($tokens);
          - array_shift($tokens);
           foreach ($tokens as $k => $v)
           {
           $datas[$base.($nfsLabel[$base][$k])] = $v;
          

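The effect of the extra shift can be reproduced in isolation. The sketch below is not the Observium code: the label names and the `mapLine` helper are hypothetical, chosen only to mirror the metric names mentioned in this issue. It maps the `io` line from the description with two shifts (the current behaviour) and with one shift (the patched behaviour).

```php
<?php
// Hypothetical labels for illustration; the real $nfsLabel arrays in
// nfsd.inc.php may use different names.
$nfsLabel = [
    'io'  => ['r_bytes', 'w_bytes'],
    'net' => ['cnt', 'udp', 'tcp', 't_conn'],
];

/**
 * Pair the values of one agent line with their labels.
 * $shifts is the number of leading tokens dropped before pairing:
 * 1 drops only the line name, 2 also drops the first value.
 */
function mapLine(string $line, array $labels, int $shifts): array
{
    $tokens = preg_split('/\s+/', trim($line));
    $base   = strtolower($tokens[0]);
    $values = array_slice($tokens, $shifts); // re-indexed from 0

    $out = [];
    foreach ($values as $k => $v) {
        if (isset($labels[$base][$k])) {
            $out[$base . '_' . $labels[$base][$k]] = $v;
        }
    }
    return $out;
}

$line  = 'io 1012421615 1079238768';
$buggy = mapLine($line, $nfsLabel, 2); // write counter lands on r_bytes, w_bytes never set
$fixed = mapLine($line, $nfsLabel, 1); // both counters land on their own label

print_r($buggy);
print_r($fixed);
```

With two shifts the last label of a line never receives a value, which is consistent with the -nan readings reported in this issue for nocache, w_bytes and t_conn, each the last field on its line. Lines such as proc3 start with a count of the values that follow, which is why the patch adds a "dummy" label there instead of keeping the second shift.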

          Bump: is there any progress on this issue? I'm willing to provide extra information if necessary. I still can't see the number of writes on the storage server in nfsd...

          veldenb Bernard van der Velden added a comment

          I wasn't paying attention to the issue for a while, but it seems some stats for the module have been generated since April 13th... Ironic.

          Probably this commit:

          r7746 | sid3windr | 2016-04-12 23:26:08 +0200 (di, 12 apr 2016) | 1 line

          Major: introduction of new RRD create/update framework, code conversion is over halfway, low hanging fruit mostly picked. Simplifies a lot of the code, provides insight into our current RRD file settings. Could later be used to send certain named metrics to other places from the core functions.

          I see graphs now but a few metrics are still missing:
          NFSd RC
          "nocache" shows -nan

          NFSd I/O
          w_bytes shows -nan (I do get r_bytes, but I'm pretty sure the box is also being written to, because VMs are running on this machine)

          NFSd Net
          t_conn shows -nan

          The NFSd RPC and NFSd v3 graphs show only zeros, but I would expect some stats here as well.

          The new logfile as requested:
          poller.log

          veldenb Bernard van der Velden added a comment

          Haha, Adam wrote the incorrect module name in the command.

          Please run this and attach the debug output again:
          ./poller.php -d -m unix-agent -h <device>

          landy Mike Stupalov added a comment

          People

            sid3windr Tom Laermans
            veldenb Bernard van der Velden
            Votes: 0
            Watchers: 5
