Uploaded image for project: 'Observium'
  1. Observium
  2. OBS-2631

poller-wrapper.py fails to log when log file is a fifo pipe in python 3

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • None
    • Professional Edition
    • Poller
    • CentOS 7, Python 3.4.5, PHP 7.2 from remirepo

    Description

      When the log file is a named fifo pipe or stdout (for injecting logs directly into journald) and the poller wrapper is running on Python 3, it fails with an "OSError: [Errno 29] Illegal seek" on the `f = open(log_path,'a')` line.

      This is a known but unsolved python 3 bug as can be seen here https://bugs.python.org/issue27805 and https://bugs.python.org/issue20074.

      The solution we used for now is to overwrite the open function to the open mode if appropriate, we did not want to modify the poller-wrapper directly to avoid issues with upgrades.

      `python3 poller-wrapper-wrapper.py poller-wrapper.py -w 10`

       

      poller-wrapper-wrapper.py is attached.

      Attachments

        1. OBS-2631.patch
          1 kB
          Gunnar Guðvarðarson
        2. poller-wrapper-wrapper.py
          0.6 kB
          Gunnar Guðvarðarson

        Activity

          [OBS-2631] poller-wrapper.py fails to log when log file is a fifo pipe in python 3

          Commited (with little difference) in r10295.

          Thanks.

          landy Mike Stupalov added a comment - Commited (with little difference) in r10295. Thanks.

          We need all logs to go to the systemd journal, and not to file. We accomplish that by using the following systemd files to create and then listen on special FIFO files:

          observium-logs@.socket

          [Unit]
          Description=Plain-Old-Stupid-Log-File to Journal Transfer Assistant (%I.log)
           
          [Socket]
          SocketUser=observium
          SocketGroup=apache
          SocketMode=0660
          ExecStartPre=/usr/bin/rm -f /var/www/observium/logs/%I.log
          ListenFIFO=/var/www/observium/logs/%I.log
          RemoveOnStop=true
           
          [Install]
          WantedBy=sockets.target

          observium-logs@.service

          [Unit]
          Description=Observium Logs Proxy (%I.log)
          Requires=observium-logs@%i.socket
           
          [Service]
          User=observium
          ExecStart=/usr/bin/cat /dev/fd/3

          We then enable, and start those services like this:

          commands

          systemctl enable --now observium-logs@cleanup.socket
          systemctl enable --now observium-logs@db.socket
          systemctl enable --now observium-logs@debug.socket
          systemctl enable --now observium-logs@debug_auth.socket
          systemctl enable --now observium-logs@housekeeping.socket
          systemctl enable --now observium-logs@observium.socket
          systemctl enable --now observium-logs@poller.socket
          systemctl enable --now observium-logs@rrd.socket
          systemctl enable --now observium-logs@update\x2derrors.socket

          At this point, we're done and ready for logs. We get all log messages into the systemd journal, and are able to view them like so:

          example

          # journalctl -u observium-logs@db --since='2 minutes ago'
          -- Logs begin at Sat 2019-12-28 12:52:44 UTC, end at Tue 2020-02-18 18:22:56 UTC. --
          Feb 18 18:21:08 obs3 cat[696]: [2020/02/18 18:21:08 +0000] poller.php(119863): Failed dbQuery (#1146 - Table 'observium.lsp' doesn't exist), Query: SELECT * FROM `lsp` WHERE `device_id` = '61'
          Feb 18 18:21:53 obs3 cat[696]: [2020/02/18 18:21:53 +0000] poller.php(119859): Failed dbQuery (#1146 - Table 'observium.lsp' doesn't exist), Query: SELECT * FROM `lsp` WHERE `device_id` = '72'

          or, a different example:

          example

          # journalctl -u 'observium-logs@*' --since='2 minutes ago'
          -- Logs begin at Sat 2019-12-28 12:52:44 UTC, end at Tue 2020-02-18 18:23:44 UTC. --
          Feb 18 18:22:26 obs3 cat[700]: [2020/02/18 18:22:26 +0000] poller.php(119859): /var/www/observium/poller.php: device.example.com - 1 devices polled in 145.3 secs
          Feb 18 18:22:26 obs3 cat[700]: [2020/02/18 18:22:26 +0000] poller-wrapper2.py(119848): /var/www/observium/poller-wrapper2.py: processed 15 devices in 146.38 seconds with 10 threads, load average (5min) 0.19
          Feb 18 18:22:26 obs3 cat[700]: [2020/02/18 18:22:26 +0000] discovery.php(120878): /var/www/observium/discovery.php: new - 0 devices discovered in 0.002 secs

          gunnarg Gunnar Guðvarðarson added a comment - We need all logs to go to the systemd journal, and not to file. We accomplish that by using the following systemd files to create and then listen on special FIFO files: observium-logs@.socket [Unit] Description=Plain-Old-Stupid-Log-File to Journal Transfer Assistant (%I.log)   [Socket] SocketUser=observium SocketGroup=apache SocketMode=0660 ExecStartPre=/usr/bin/rm -f /var/www/observium/logs/%I.log ListenFIFO=/var/www/observium/logs/%I.log RemoveOnStop=true   [Install] WantedBy=sockets.target observium-logs@.service [Unit] Description=Observium Logs Proxy (%I.log) Requires=observium-logs@%i.socket   [Service] User=observium ExecStart=/usr/bin/cat /dev/fd/3 We then enable, and start those services like this: commands systemctl enable --now observium-logs@cleanup.socket systemctl enable --now observium-logs@db.socket systemctl enable --now observium-logs@debug.socket systemctl enable --now observium-logs@debug_auth.socket systemctl enable --now observium-logs@housekeeping.socket systemctl enable --now observium-logs@observium.socket systemctl enable --now observium-logs@poller.socket systemctl enable --now observium-logs@rrd.socket systemctl enable --now observium-logs@update\x2derrors.socket At this point, we're done and ready for logs. We get all log messages into the systemd journal, and are able to view them like so: example # journalctl -u observium-logs@db --since='2 minutes ago' -- Logs begin at Sat 2019-12-28 12:52:44 UTC, end at Tue 2020-02-18 18:22:56 UTC. -- Feb 18 18:21:08 obs3 cat[696]: [2020/02/18 18:21:08 +0000] poller.php(119863): Failed dbQuery (#1146 - Table 'observium.lsp' doesn't exist), Query: SELECT * FROM `lsp` WHERE `device_id` = '61' Feb 18 18:21:53 obs3 cat[696]: [2020/02/18 18:21:53 +0000] poller.php(119859): Failed dbQuery (#1146 - Table 'observium.lsp' doesn't exist), Query: SELECT * FROM `lsp` WHERE `device_id` = '72' or, a different example: example # journalctl -u 'observium-logs@*' --since='2 minutes ago' -- Logs begin at Sat 2019-12-28 12:52:44 UTC, end at Tue 2020-02-18 18:23:44 UTC. -- Feb 18 18:22:26 obs3 cat[700]: [2020/02/18 18:22:26 +0000] poller.php(119859): /var/www/observium/poller.php: device.example.com - 1 devices polled in 145.3 secs Feb 18 18:22:26 obs3 cat[700]: [2020/02/18 18:22:26 +0000] poller-wrapper2.py(119848): /var/www/observium/poller-wrapper2.py: processed 15 devices in 146.38 seconds with 10 threads, load average (5min) 0.19 Feb 18 18:22:26 obs3 cat[700]: [2020/02/18 18:22:26 +0000] discovery.php(120878): /var/www/observium/discovery.php: new - 0 devices discovered in 0.002 secs
          landy Mike Stupalov added a comment - - edited

          we already use python3 for years, but never seen your case

          landy Mike Stupalov added a comment - - edited we already use python3 for years, but never seen your case

          Hey, I attached a patch as discussed ... months ago on IRC. It basically fixes the issue our poller wrapper mitigated.

          p.s. maybe a good idea to start switching to python3?

          gunnarg Gunnar Guðvarðarson added a comment - Hey, I attached a patch as discussed ... months ago on IRC. It basically fixes the issue our poller wrapper mitigated. p.s. maybe a good idea to start switching to python3?

          Hey Mike, did you see my response, ticket has been in Status:PENDING RESPONSE for a while now...

          gunnarg Gunnar Guðvarðarson added a comment - Hey Mike, did you see my response, ticket has been in  Status: PENDING RESPONSE  for a while now...
          1. remove current poller log file
          2. run `mkfifo <log file path>`
          3. run `cat <log file path>`
          4. (in a different window) run poller-wrapper using python 2, observe that it works
          5. (in a different window) run poller-wrapper using python 3, observe that it crashes with an "Illegal seek"
          gunnarg Gunnar Guðvarðarson added a comment - remove current poller log file run `mkfifo <log file path>` run `cat <log file path>` (in a different window) run poller-wrapper using python 2, observe that it works (in a different window) run poller-wrapper using python 3, observe that it crashes with an "Illegal seek"

          I asked, show me real case when required this wrapper.
          Ie, write steps to reproduce the error..

          I use python3 and not have such troubles when wrapper started from cron.

          landy Mike Stupalov added a comment - I asked, show me real case when required this wrapper. Ie, write steps to reproduce the error.. I use python3 and not have such troubles when wrapper started from cron.
          gunnarg Gunnar Guðvarðarson added a comment - - edited

          The only reason we use the wrapper is because your wrapper is expecting the 'append' mode to work for the configured log files, it does on files but when the logs are set to a fifo pipe or char device like /dev/stdout it results in an invalid seek because python 3 opens file in write mode and then seeks to the end which special files do not allow, python 2 does not do that, and opens special files correctly.

          Optimally python 3 would be fixed, but it hasn't, so my suggestion is something similar to our stat and conditional mode flag for the open call.

          gunnarg Gunnar Guðvarðarson added a comment - - edited The only reason we use the wrapper is because your wrapper is expecting the 'append' mode to work for the configured log files, it does on  files but when the logs are set to a fifo pipe or char device like /dev/stdout it results in an invalid seek because python 3 opens file in write mode and then seeks to the end which special files do not allow, python 2 does not do that, and opens special files correctly. Optimally python 3 would be fixed, but it hasn't, so my suggestion is something similar to our stat and conditional mode flag for the open call.

          Which real case for use this additional wrapper?

          In our normal usage (from cron) this issue not happened..

          landy Mike Stupalov added a comment - Which real case for use this additional wrapper? In our normal usage (from cron) this issue not happened..

          People

            landy Mike Stupalov
            gunnarg Gunnar Guðvarðarson
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: