Uploaded image for project: 'Observium'
  1. Observium
  2. OBS-985

Open pipes after Observium matches a filtered syslog event

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • None
    • None
    • Ubuntu 14.04.01 LTS

    Description

      [First ever bug report on here, go easy on me!]

      Since switching to rsyslogd we’ve had several days where syslogging into Observium has simply ‘stopped’ without reason at random points. This is on Ubuntu 14.04.01 LTS. After some investigation we’ve found that it is running out of FIFO pipes, for example:

      lsof | grep rsyslogd –c
      10542

      Made up of 10,500 lines like:
      rsyslogd 57002 syslog 10w FIFO 0,8 0t0 122857431 pipe
      rsyslogd 57002 syslog 11w FIFO 0,8 0t0 122860687 pipe
      rsyslogd 57002 syslog 12w FIFO 0,8 0t0 122849736 pipe
      rsyslogd 57002 syslog 13w FIFO 0,8 0t0 122862727 pipe
      <and so on>

      You then have to restart rsyslogd and everything works again for up to 10-20 hours.

      Immediately after the restart if you issue:

      lsof -c rsyslogd

      You get a few open pipes at the bottom of the list, this number increases quickly and over time it grows to many pages and hits the limit around 10,000+ pipes, then it breaks again.

      After trial and error, I’ve narrowed it down to a specific feature within the Observium syslog.php script module (i.e. /etc/rsyslog.d/30-observium.conf).

      I noticed that the rate of increase is directly proportional to the number of messages which match the syslog filtering configuration entries:

      $config['syslog']['filter'][]

      If I remove ALL the filter entries then the rate drops significantly, from 50 new pipes every few minutes, to only a couple of new pipes per hour.

      Soooo... I believe that when a syslog event matches a filter (or is for whatever reason considered ‘not suitable’) for insertion into the Observium database, there is a pipe left open by something within the import process into Obserivum.

      I’m not particularly hot on this whole ‘pipe’ thing (had to google it to get this far), but the findings above should be easy to replicate. Just setup a filter which matches a common event and then generate the event, you should see an open pipe left for every event which matches a filter.

      I think we are seeing this quite a lot as we are using the filters to pull out a lot of crap which we are not interested in logging. So there are a lot of matches against the filters, for reference (and in case there is something in this syntax which is the actual issue) our full syslog config is:

      $config['enable_syslog'] = 1;
      $config['syslog']['filter'][] = 'Connection from UDP: [';
      $config['syslog']['filter'][] = 'ipSystemStatsTable node ipSystemStatsOutFragOKs not implemented';
      $config['syslog']['filter'][] = 'diskio.c';
      $config['syslog']['filter'][] = 'check pass; user unknown';
      $config['syslog']['filter'][] = 'bad username';
      $config['syslog']['filter'][] = 'Authentication failed';
      $config['syslog']['filter'][] = 'Refused user';
      $config['syslog']['filter'][] = 'flowset length error';
      $config['syslog']['filter'][] = 'field ignored';
      $config['syslog']['filter'][] = 'was not found when attempting to remove it';

      I’m afraid I’m not much help beyond identifying the steps required to generate the issue and how to see that it is occurring. No hope of me finding the actual code responsible, but - maybe this is enough for someone to see what is wrong as it's a very specific set of circumstances that cause it; and it's very easy to reproduce.

      Attachments

        Activity

          People

            landy Mike Stupalov
            robertw Robert Williams
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: