Monday, May 21, 2018

How to configure Nagios to monitor your systems and network

Before we start monitoring something with Nagios, we need to first understand its configuration structure.

# cd /usr/local/nagios/etc
# ls -l
-rw-rw-r-- 1 nagios nagios 12999 Apr 24 21:55 cgi.cfg
-rw-r--r-- 1 root   root      50 May 12 11:55 htpasswd.users
-rw-rw-r-- 1 nagios nagios 44868 May 12 14:46 nagios.cfg
drwxrwxr-x 2 nagios nagios  4096 May 14 01:22 objects
-rw-rw---- 1 nagios nagios  1312 Apr 24 21:55 resource.cfg

nagios.cfg is the main configuration file of Nagios.  It contains global parameters and includes the other, user-customized configuration files, e.g.
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
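
To see at a glance which object files a given nagios.cfg pulls in, something like the following sketch may help. The function name list_cfg_files is made up for this example, and the default path assumes the source install used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print every file that nagios.cfg includes via cfg_file=.
list_cfg_files() {
    # Default path assumes the /usr/local/nagios source install from this post.
    main_cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Keep only uncommented cfg_file= lines and print the path after "=".
    sed -n 's/^[[:space:]]*cfg_file=//p' "$main_cfg"
}

# Example: list_cfg_files /usr/local/nagios/etc/nagios.cfg
```

Lines that are commented out with "#" are skipped, so the output reflects only the files Nagios would actually load.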


Let's get started by example:

First, we define something for Nagios to monitor.  The basic unit is a host, which may have many services.
/usr/local/nagios/etc/objects/localhost.cfg
define host{
        use                     linux-server  ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name         localhost
        alias                   localhost
        address              127.0.0.1
}
define service{
        use                             local-service         ; Name of service template to use
        host_name                 localhost
        service_description    PING
        check_command        check_ping!100.0,20%!500.0,60%
}

The "use" statement tells the host and service to inherit from templates defined in templates.cfg, so let's have a look.  Note that "linux-server" is itself the child of another template, "generic-host".

/usr/local/nagios/etc/objects/templates.cfg
define host{
        name                                        generic-host ; The name of this host template
        notifications_enabled              1                   ; Host notifications are enabled
        event_handler_enabled           1                   ; Host event handler is enabled
        flap_detection_enabled           1                   ; Flap detection is enabled
        process_perf_data                   1                   ; Process performance data
        retain_status_information       1                   ; Retain status information across program restarts
        retain_nonstatus_information 1                   ; Retain non-status information across program restarts
        notification_period                  24x7            ; Send host notifications at any time
        register                                     0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
define host{
        name                          linux-server        ; The name of this host template
        use                             generic-host        ; This template inherits other values from the generic-host template
        check_period             24x7                    ; By default, Linux hosts are checked round the clock
        check_interval           5                          ; Actively check the host every 5 minutes
        retry_interval             1                          ; Schedule host check retries at 1 minute intervals
        max_check_attempts 10                        ; Check each Linux host 10 times (max)
        check_command        check-host-alive ; Default command to check Linux hosts
        notification_period     workhours          ; Linux admins hate to be woken up, so we only notify during the day
         ; Note that the notification_period variable is being overridden from
         ; the value that is inherited from the generic-host template!

        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups            admins       ; Notifications get sent to the admins by default
        register                        0                 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

A template defines common parameters that would otherwise be repeated over and over again by many hosts and services.  So, instead of including these parameters in every host and service definition, we create a template.  The template basically tells Nagios how and how often to check the host or service, and what to do when there is a state change.  Most parameters are pretty self-explanatory.  For example, "check_period  24x7" and "check_interval  5" say that this host should be monitored 24 hours a day, 7 days a week, and that Nagios should check the host every 5 minutes.
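
Because the template carries the common settings, a new host needs only a few lines of its own. The sketch below prints such a definition; the function name make_host_definition, the host name web01 and the address 192.0.2.10 are placeholders invented for this example, not values from the stock configuration.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal host definition that inherits
# everything else (check interval, notifications, ...) from linux-server.
make_host_definition() {
    # $1 = host_name, $2 = address
    cat <<EOF
define host{
        use             linux-server
        host_name       $1
        alias           $1
        address         $2
}
EOF
}

# Example (placeholder values): make_host_definition web01 192.0.2.10
```

The output could be appended to a file under /usr/local/nagios/etc/objects that nagios.cfg includes with a cfg_file line.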

How the parameters below work may not be obvious, so I will talk more about them.

"notification_options" - In which situations should Nagios send out notifications?  If we don't specify any, Nagios will send out notifications in all situations, but sometimes that is not what we want.  In the example above, "d,u,r" means "send me notifications when the host is DOWN, when it is UNREACHABLE, and when it RECOVERS from d or u".  Flapping means the host/service keeps switching between bad (d,u) and good (r) states; we will probably talk more about that later.

d = DOWN state
u = UNREACHABLE state
r = recoveries (OK state)
f = starts and stops flapping
s = scheduled downtime starts and ends
n = none (no host notifications will be sent out)
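
As a quick way to read an option string such as "d,u,r", here is a small sketch that spells out each letter using the table above. The function name is made up for this example; it is not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper: describe each flag of a host notification_options value.
describe_host_notification_options() {
    # Split the comma-separated list (e.g. "d,u,r") on commas.
    old_ifs=$IFS
    IFS=,
    for opt in $1; do
        case "$opt" in
            d) echo "d: notify on DOWN" ;;
            u) echo "u: notify on UNREACHABLE" ;;
            r) echo "r: notify on recovery" ;;
            f) echo "f: notify when flapping starts and stops" ;;
            s) echo "s: notify when scheduled downtime starts and ends" ;;
            n) echo "n: send no host notifications" ;;
            *) echo "$opt: unknown option" ;;
        esac
    done
    IFS=$old_ifs
}

# Example: describe_host_notification_options d,u,r
```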


check_command        check-host-alive
This is the command Nagios calls to determine the host's state.  To find out what it does, we have to look at another configuration file - commands.cfg

define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
 }

OK, what is "$USER1$"?  What is "check_ping"?  Again, we need to look at yet another configuration file - resource.cfg, which is quite simple:
$USER1$=/usr/local/nagios/libexec
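
To make the substitution concrete, here is a rough illustration of how the two macros in check-host-alive turn into a real command line. This is only a sketch with sed; Nagios does the expansion internally and this is not how it is implemented. The function name expand_macros is made up, and values containing "|" or "&" would confuse this simple version.

```shell
#!/bin/sh
# Hypothetical helper: replace $USER1$ and $HOSTADDRESS$ in a command_line.
expand_macros() {
    # $1 = command_line template, $2 = value of $USER1$, $3 = host address
    printf '%s\n' "$1" |
        sed -e 's|\$USER1\$|'"$2"'|g' -e 's|\$HOSTADDRESS\$|'"$3"'|g'
}

# Example:
# expand_macros '$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5' \
#     /usr/local/nagios/libexec 127.0.0.1
```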

Let's now run the command: /usr/local/nagios/libexec/check_ping

# /usr/local/nagios/libexec/check_ping
check_ping: Could not parse arguments
Usage:
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%  [-p packets] [-t timeout] [-4|-6]

In Nagios, a check can report WARNING or CRITICAL when it detects a problem, so in most plugins -w sets the warning threshold and -c sets the critical threshold.  When a time unit is involved, it is usually milliseconds.  In check_ping, "rta" is the round trip average and "pl" is the packet loss percentage.  So let's get back to the command_line:
$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
This means we send 5 ping packets to the host (-p 5), mark it "warning" if rta > 3000ms or packet loss reaches 80%, and mark it "critical" if rta > 5000ms or packet loss reaches 100%.
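
The threshold logic can be sketched as a small function. This is an illustration only, not the plugin's source: it assumes a threshold counts as crossed when the measured value meets or exceeds it, and the function name ping_state is made up. The exit codes follow the usual plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/sh
# Hypothetical helper: classify a ping result against -w and -c thresholds.
ping_state() {
    # $1 = measured rta (ms), $2 = measured packet loss (%)
    # $3,$4 = warning rta/pl thresholds, $5,$6 = critical rta/pl thresholds
    awk -v rta="$1" -v pl="$2" -v wrta="$3" -v wpl="$4" \
        -v crta="$5" -v cpl="$6" 'BEGIN {
        # Critical is tested first so it wins when both levels are crossed.
        if (rta >= crta || pl >= cpl) { print "CRITICAL"; exit 2 }
        if (rta >= wrta || pl >= wpl) { print "WARNING";  exit 1 }
        print "OK"; exit 0
    }'
}

# Example with the check-host-alive thresholds:
# ping_state 12.5 0 3000 80 5000 100
```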


notification_period     workhours
This time we go to "timeperiods.cfg".  You will find a few examples in this file, such as work hours and specific holidays.  Here is the 24x7 period that the templates above refer to:

define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday        00:00-24:00
        monday      00:00-24:00
        tuesday       00:00-24:00
        wednesday  00:00-24:00
        thursday      00:00-24:00
        friday          00:00-24:00
        saturday      00:00-24:00
}
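
Each weekday line is just a start-end range in HH:MM form. As a sketch of what "inside the period" means, the helper below tests a time against one such range; the function names are made up for this example, and the 09:00-17:00 range in the usage line is only an illustrative value.

```shell
#!/bin/sh
# Hypothetical helper: convert HH:MM to minutes since midnight.
to_minutes() {
    h=${1%:*}
    m=${1#*:}
    # Strip one leading zero so "08" or "09" is not parsed as octal.
    h=${h#0}
    m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# Hypothetical helper: succeed if time $1 lies in range $2 (HH:MM-HH:MM).
in_time_range() {
    now=$(to_minutes "$1")
    start=$(to_minutes "${2%-*}")
    end=$(to_minutes "${2#*-}")
    [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
}

# Example: in_time_range 09:30 09:00-17:00 && echo "inside the period"
```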


contact_groups            admins
Contacts are how Nagios notifies you when there are state changes.  Let's have a look at contacts.cfg

define contact{
        contact_name            nagiosadmin             ; Short name of user
        use                             generic-contact         ; this is from templates.cfg
        alias                           Nagios Admin            ; Full name of user
        email                          your_email_address@your_domain
        }
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
}


NOTE that "generic-contact" is in templates.cfg
define contact{
        name                                            generic-contact    ; The name of this contact template
        service_notification_period         24x7                    ; service notifications can be sent anytime
        host_notification_period              24x7                    ; host notifications can be sent anytime
        service_notification_options        w,u,c,r,f,s           ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options             d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email 
        host_notification_commands        notify-host-by-email 
        register                        0
}

NOTE that "notify-host-by-email" and "notify-service-by-email" are in commands.cfg.  These simply use the "/bin/mail" command that comes with the OS to send out the emails.  You can certainly send notifications by means other than email.  For instance, we can talk about how to use Telegram to send out the alarms.

define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
 }

# 'notify-service-by-email' command definition
define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
 }
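
To preview what such an email body looks like without sending anything, here is a sketch that feeds sample values through the same printf "%b" approach. The function name and the sample values are made up for this example; in real use Nagios fills in the macros, and the Date/Time line is left out here.

```shell
#!/bin/sh
# Hypothetical helper: print a host notification body from sample values.
preview_host_notification() {
    # $1 = notification type, $2 = host name, $3 = host state,
    # $4 = host address, $5 = plugin output
    # "%b" makes printf interpret the \n sequences, as in notify-host-by-email.
    printf "%b" "***** Nagios *****\n\nNotification Type: $1\nHost: $2\nState: $3\nAddress: $4\nInfo: $5\n"
}

# Example (sample values):
# preview_host_notification PROBLEM localhost DOWN 127.0.0.1 "CRITICAL - Host Unreachable"
```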


Not sure if you are already feeling a bit dizzy from all the jumping between configuration files...  I got that feeling at first too, but once you get used to the template-based style of configuration, it is actually not that difficult.  Next, we will use more real examples, as I found that the easier way to learn Nagios.

Saturday, May 12, 2018

Install Nagios (network monitoring system) in Ubuntu 16.04

Nagios is basically a free and open source monitoring and alerting system.  It has many built-in tools to monitor various network services, such as HTTP, ICMP, SNMP, SMTP, POP3, FTP, SSH, etc.  Even when a tool is not available in the community, you can easily write a script for Nagios to use for monitoring in your favorite language, such as PHP, Python, Perl, etc.

Nagios is powerful, but its free version lacks an easy and intuitive UI to help you configure the system.  Although it does have quite a learning curve, you will still be able to get the hang of it if you learn it patiently, step by step.

Installation
# install prerequisite packages
sudo -i
sudo apt-get update
sudo apt-get install -y autoconf gcc libc6 make wget unzip apache2 php libapache2-mod-php7.0 libgd2-xpm-dev


# create nagios user and group
useradd nagios
groupadd nagcmd
usermod -a -G nagios,nagcmd www-data


# download nagios source (you can check which release is the latest here: https://www.nagios.org/downloads/nagios-core/)

wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.3.4.tar.gz
# extract the source and cd into the source directory
tar xzf nagios-4.3.4.tar.gz
cd nagios-4.3.4
# configure and compile
./configure --with-nagios-group=nagios --with-command-group=nagcmd --with-httpd-conf=/etc/apache2/sites-enabled

*** Configuration summary for nagios 4.3.4 2017-08-24 ***:

 General Options:
 -------------------------
        Nagios executable:  nagios
        Nagios user/group:  nagios,nagios
       Command user/group:  nagios,nagcmd
             Event Broker:  yes
        Install ${prefix}:  /usr/local/nagios
    Install ${includedir}:  /usr/local/nagios/include/nagios
                Lock file:  /run/nagios.lock
   Check result directory:  ${prefix}/var/spool/checkresults
           Init directory:  /etc/init.d
  Apache conf.d directory:  /etc/apache2/sites-enabled
             Mail program:  /bin/mail
                  Host OS:  linux-gnu
          IOBroker Method:  epoll

 Web Interface Options:
 ------------------------
                 HTML URL:  http://localhost/nagios/
                  CGI URL:  http://localhost/nagios/cgi-bin/
 Traceroute (used by WAP):

make all
make install
make install-init
make install-commandmode
make install-config
make install-webconf



# enable apache2 rewrite and cgi module 
a2enmod rewrite
a2enmod cgi


# create password to restrict access to nagios web
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
systemctl restart apache2.service

# copy event handler scripts to nagios dir (this is not a necessary step)
cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
chown -R nagios:nagios /usr/local/nagios/libexec/eventhandlers

# in case you are curious, what is an event handler?  It is basically a script that is run when a host or service changes state.  For example, when an HTTP service is detected as DOWN, you may want to automatically call a script to restart the service or reboot the machine.  Or just to create a trouble ticket in the helpdesk system to notify front-line colleagues.
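
The decision logic of such a handler can be sketched as below. This is only an outline under my own assumptions: the function name is made up, it expects Nagios to pass the $SERVICESTATE$ and $SERVICESTATETYPE$ macros as arguments, and it only echoes the action it would take instead of restarting anything.

```shell
#!/bin/sh
# Hypothetical event handler logic for an HTTP service.
handle_http_event() {
    # $1 = $SERVICESTATE$ (OK, WARNING, UNKNOWN, CRITICAL)
    # $2 = $SERVICESTATETYPE$ (SOFT or HARD)
    case "$1" in
        CRITICAL)
            if [ "$2" = "HARD" ]; then
                # A real handler would restart the service here.
                echo "restart"
            else
                # Still retrying the check; do not act yet.
                echo "wait"
            fi
            ;;
        *)
            echo "nothing"
            ;;
    esac
}

# Example: handle_http_event CRITICAL HARD
```

Acting only on HARD states avoids restarting a service because of a single failed check.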


# install Nagios plugins
wget https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
tar xzf nagios-plugins-2.2.1.tar.gz
cd nagios-plugins-2.2.1
./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-openssl
make
make install

# when something you want to monitor cannot be done by the standard plugins, search the community first, which may save you a lot of time. (https://exchange.nagios.org/directory/Plugins/)

# Now, let's start Apache and Nagios
systemctl start apache2
systemctl enable nagios
systemctl start nagios
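
Whenever the configuration changes, it is worth validating it before (re)starting Nagios. The sketch below wraps the "nagios -v" pre-flight check; the function name is made up, and the default paths assume the /usr/local/nagios install from this post.

```shell
#!/bin/sh
# Hypothetical helper: run the Nagios pre-flight check on the main config.
verify_nagios_config() {
    nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    main_cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if [ ! -x "$nagios_bin" ]; then
        echo "nagios binary not found at $nagios_bin" >&2
        return 127
    fi
    # -v parses all configuration files and reports errors without starting the daemon.
    "$nagios_bin" -v "$main_cfg"
}

# Example: verify_nagios_config && systemctl restart nagios
```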

# Open a browser and point to http://your_ip/nagios, that's it!

You will now see under "Hosts" and "Services" that there is only localhost and a few local services.  Yes, it's monitoring the Ubuntu host itself...  Next, let's start monitoring something more interesting...

Friday, April 20, 2018

Converting LibreOffice ODP to Office PPT/PPTX with videos

Since I have not bought Microsoft Office, I use LibreOffice Impress to make presentations.  It is easy to export a presentation to PowerPoint format (ppt or pptx) and send it to others who work with PowerPoint.

However, the above statement is only true as long as you do NOT have videos in your presentations.  LibreOffice handles the videos fine as long as the file stays in odp format.  When you export it to ppt or pptx, the videos are simply lost...

After researching for a while, I found two things you should pay attention to when making presentations with videos, especially if you are like me and use the free LibreOffice.

(1) Video format - You should probably stick with WMV (or AVI) if your target is playing the presentation in PowerPoint, because these formats have greater compatibility with older versions of Office/Windows.  So you need video converting software when what you have is not wmv/avi.  Just google around and you should be able to find some free ones, but note that some free software might leave a watermark on your video.  I tried the following two, and both are OK.
Pavtube Free version from here: http://www.multipelife.com/free-video-dvd-converter-ultimate
Any Video Converter Free Edition: http://www.any-video-converter.com/download-avc-free.php

(2) If you have PowerPoint, then fine, just insert your wmv files into the slides.  But if you are also using LibreOffice, remember to check the "Link" box when inserting a video into a slide.


Also put the video files in the same folder as the ODP/PPT file.  Then save the presentation in "PPT" format.  Now you can verify that the PPT works by sending it to someone with PowerPoint, or simply by downloading the free PowerPoint viewer from Microsoft.  Note that you should NOT save as "PPTX"; when you do that, the videos disappear... I still cannot figure out why.  It does not matter whether you check the "Link" box or not, PPTX simply does not work.  I guess we might have to wait for an update from LibreOffice for that.

Friday, April 13, 2018

Buying an air purifier (update)

After using the air purifier for a few months, I'm quite sure it actually lowers the PM2.5 level.  How do I know?  Because I've also bought a PM2.5/10 detector (you can get a cheap one for around US$40-50).  Does it help relieve allergy symptoms for my cats and me?  Hmm, I'm not too sure.  Sometimes it seems a little better, but the result is not significant enough to convince me yet.  I do not run it 24 hours a day though, usually only 8-12 hours a day when I'm at home.

Anyway, one thing is for sure: the HEPA filter is expensive, so I bought some cheaper filters from 3M and put them in front of it in the hope that the HEPA filter lasts longer.  It looks promising, as the 3M filter got really dirty after 3 months while the HEPA filter still looks relatively clean.  I also put some sponge, which is really cheap, at the very front to keep large particles from getting inside.

Most economical air purifiers probably have a similar structure: (1) a front filter to block large particles (in my opinion, the holes are usually too large, so a lot of dust still gets in), (2) an activated carbon filter, (3) a HEPA filter

My setup: Front filter -> Sponge (really cheap) -> Activated carbon filter -> 3M filter (more costly, but still cheaper than a HEPA replacement) -> HEPA

[Photo: HEPA with 3M filter]

[Photo: Sponge]