Log book: 2018

Monday, May 21, 2018

How to configure Nagios to monitor your systems and network

Before we start monitoring something with Nagios, we need to first understand its configuration structure.

# cd /usr/local/nagios/etc
# ls -l
-rw-rw-r-- 1 nagios nagios 12999 Apr 24 21:55 cgi.cfg
-rw-r--r-- 1 root root 50 May 12 11:55 htpasswd.users
-rw-rw-r-- 1 nagios nagios 44868 May 12 14:46 nagios.cfg
drwxrwxr-x 2 nagios nagios 4096 May 14 01:22 objects
-rw-rw---- 1 nagios nagios 1312 Apr 24 21:55 resource.cfg

nagios.cfg is the main Nagios configuration file. It holds global parameters and pulls in the other, user-customized configuration files through cfg_file directives, e.g.:

cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

Let's get started by example:

First, we define something for Nagios to monitor. The basic unit is a host, which may have many services.

/usr/local/nagios/etc/objects/localhost.cfg

define host{
use linux-server ; Name of host template to use
; This host definition will inherit all variables that are defined
; in (or inherited by) the linux-server host template definition.
host_name localhost
alias localhost
address 127.0.0.1
}

define service{
use local-service ; Name of service template to use
host_name localhost
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}

The "use" statements tell the host and the service to inherit from templates defined in templates.cfg, so let's have a look. Note that "linux-server" is itself a child of another template, "generic-host".

/usr/local/nagios/etc/objects/templates.cfg

define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
}

define host{
name linux-server ; The name of this host template
use generic-host ; This template inherits other values from the generic-host template
check_period 24x7 ; By default, Linux hosts are checked round the clock
check_interval 5 ; Actively check the host every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each Linux host 10 times (max)
check_command check-host-alive ; Default command to check Linux hosts
notification_period workhours ; Linux admins hate to be woken up, so we only notify during the day
; Note that the notification_period variable is being overridden from
; the value that is inherited from the generic-host template!
notification_interval 120 ; Resend notifications every 2 hours
notification_options d,u,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
}

A template defines common parameters that are used over and over again by many hosts and services. Instead of repeating these parameters in every host and service definition, we create a template. The template basically tells Nagios how and how often to check the host or service, and what to do when its state changes. Most parameters are pretty self-explanatory: for example, "check_period 24x7" and "check_interval 5" say that this host should be monitored 24 hours a day, 7 days a week, and that Nagios should check it every 5 minutes.
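
To make the inheritance idea concrete, here is a small Python sketch. It is not how Nagios is implemented; it is a simplified model that only shows "child values override parent values". The dictionaries hold a handful of directives copied from the examples above, and the resolve() helper is a hypothetical name of my own. It also assumes a single parent per "use" line.

```python
# Hypothetical, simplified model of Nagios template inheritance.
# Only a few directives from the examples above are included.
TEMPLATES = {
    "generic-host": {
        "notifications_enabled": "1",
        "notification_period": "24x7",
    },
    "linux-server": {
        "use": "generic-host",
        "check_interval": "5",
        "notification_period": "workhours",
    },
}


def resolve(definition, templates):
    """Return the effective directives: inherited values first, then local overrides."""
    effective = {}
    parent_name = definition.get("use")
    if parent_name is not None:
        # Walk up the chain first so that the oldest ancestor is applied first.
        effective.update(resolve(templates[parent_name], templates))
    # Values set locally win over anything inherited.
    effective.update({k: v for k, v in definition.items() if k != "use"})
    return effective


localhost = {"use": "linux-server", "host_name": "localhost", "address": "127.0.0.1"}
effective = resolve(localhost, TEMPLATES)
print(effective["notification_period"])    # value set in linux-server
print(effective["notifications_enabled"])  # value inherited from generic-host
```

The point to notice is that localhost never mentions notification_period itself, yet it ends up with the value set in linux-server rather than the one in generic-host.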

How the parameters below work may not be obvious, so I will talk more about them.

"notification_options" - In which situations should Nagios send out notifications? If we don't specify any, Nagios will send notifications in all situations, which may not be what we want. In the example above, "d,u,r" means "notify me when the host goes DOWN, becomes UNREACHABLE, or RECOVERS from d or u". Flapping means the host/service keeps switching between bad (d, u) and good (r) states; we will probably talk more about that later.

d = DOWN state
u = UNREACHABLE state
r = recoveries (OK state)
f = starts and stops flapping
s = scheduled downtime starts and ends
n = none (no host notifications will be sent out)
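
If it helps, the letter codes can be read as a simple lookup. The Python sketch below is only an illustration of that reading: the function name and the event labels are my own, not anything Nagios exposes.

```python
# Letter codes for host notification options, as listed above.
# The right-hand labels are illustrative names, not Nagios identifiers.
HOST_OPTIONS = {
    "d": "DOWN",
    "u": "UNREACHABLE",
    "r": "RECOVERY",
    "f": "FLAPPING",
    "s": "DOWNTIME",
}


def parse_notification_options(value):
    """Turn a value such as "d,u,r" into the set of events that trigger a notification."""
    codes = [code.strip() for code in value.split(",") if code.strip()]
    if "n" in codes:
        # "n" means no notifications at all.
        return set()
    if not codes:
        # Nothing specified: notify in all situations.
        return set(HOST_OPTIONS.values())
    return {HOST_OPTIONS[code] for code in codes}


print(sorted(parse_notification_options("d,u,r")))
```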

check_command check-host-alive
This is the command Nagios calls to determine the host's state. To find out what it does, we have to look at another configuration file, commands.cfg:

define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}

OK, what is "$USER1$"? What is "check_ping"? Again, we need to look at yet another configuration file, resource.cfg, which is quite simple:
$USER1$=/usr/local/nagios/libexec
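
A rough way to picture what happens to the $...$ macros before the command runs is plain text substitution. The Python sketch below is my own illustration, not Nagios code: expand_macros() is a hypothetical helper, and the two macro values are taken from resource.cfg and the localhost definition shown earlier.

```python
import re

# Assumed macro values for this walkthrough: $USER1$ comes from resource.cfg,
# $HOSTADDRESS$ from the "address" line of the localhost definition.
MACROS = {
    "USER1": "/usr/local/nagios/libexec",
    "HOSTADDRESS": "127.0.0.1",
}


def expand_macros(command_line, macros):
    """Replace each $NAME$ with its value; unknown macros are left untouched (a choice of this sketch)."""
    def substitute(match):
        return macros.get(match.group(1), match.group(0))

    return re.sub(r"\$([A-Z0-9_]+)\$", substitute, command_line)


template = "$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5"
print(expand_macros(template, MACROS))
```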

Let's now run the command: /usr/local/nagios/libexec/check_ping

# /usr/local/nagios/libexec/check_ping
check_ping: Could not parse arguments
Usage:
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>% [-p packets] [-t timeout] [-4|-6]

In Nagios, a check can report WARNING or CRITICAL when it detects a problem, so in most commands -w sets the warning criteria and -c sets the critical criteria. When a time unit is involved, it is usually ms. In check_ping, "rta" is the round-trip average and "pl" is the packet loss. So let's get back to the command_line:

$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5

This means we ping the host 5 times (-p 5) and mark it WARNING if rta > 3000 ms or packet loss reaches 80%, and CRITICAL if rta > 5000 ms or packet loss reaches 100%.
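
The threshold logic can be sketched in a few lines of Python. This is only my reading of the -w/-c pairs, not the plugin's source; classify_ping() is a hypothetical name, and treating a value exactly on the threshold as a breach (>=) is an assumption on my part.

```python
def classify_ping(rta_ms, packet_loss_pct, warn=(3000.0, 80.0), crit=(5000.0, 100.0)):
    """Map a ping result to a state using (rta in ms, packet loss in %) threshold pairs."""
    warn_rta, warn_pl = warn
    crit_rta, crit_pl = crit
    # Check the critical thresholds first so the more severe state wins.
    # Assumption: a value exactly on the threshold counts as a breach.
    if rta_ms >= crit_rta or packet_loss_pct >= crit_pl:
        return "CRITICAL"
    if rta_ms >= warn_rta or packet_loss_pct >= warn_pl:
        return "WARNING"
    return "OK"


print(classify_ping(12.5, 0))     # a fast reply with no loss
print(classify_ping(3500.0, 0))   # slow replies only
print(classify_ping(10.0, 100))   # every packet lost
```

The defaults mirror check-host-alive; the PING service on localhost would pass warn=(100.0, 20.0) and crit=(500.0, 60.0) instead.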

notification_period workhours

This time we go to "timeperiods.cfg". You will find a few examples in this file, such as work hours and specific holidays. The 24x7 period used by generic-host looks like this:

define timeperiod{
timeperiod_name 24x7
alias 24 Hours A Day, 7 Days A Week
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}
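
To see how a time period gates notifications, here is a hedged Python sketch of a "workhours"-style check. The Monday to Friday, 09:00-17:00 hours are an assumption for illustration (look in your own timeperiods.cfg for the real values), and in_timeperiod() is a hypothetical helper, not a Nagios function.

```python
from datetime import datetime, time

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# Assumed work hours for illustration only: Monday to Friday, 09:00-17:00.
WORKHOURS = {day: [(time(9, 0), time(17, 0))] for day in DAY_NAMES[:5]}


def in_timeperiod(moment, period):
    """Return True if the given datetime falls inside one of the period's ranges for that weekday."""
    day = DAY_NAMES[moment.weekday()]  # weekday() is 0 for Monday
    now = moment.time()
    return any(start <= now < end for start, end in period.get(day, []))


print(in_timeperiod(datetime(2018, 5, 21, 10, 0), WORKHOURS))  # a Monday morning
print(in_timeperiod(datetime(2018, 5, 12, 10, 0), WORKHOURS))  # a Saturday morning
```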

contact_groups admins
Contacts are how Nagios notifies you when there are state changes. Let's have a look at contacts.cfg:

define contact{
contact_name nagiosadmin ; Short name of user
use generic-contact ; this is from templates.cfg
alias Nagios Admin ; Full name of user
email your_email_address@your_domain
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members nagiosadmin
}

NOTE that "generic-contact" is in templates.cfg
define contact{
name generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
register 0
}

NOTE that "notify-host-by-email" and "notify-service-by-email" are in commands.cfg. They simply use the "/bin/mail" command that comes with the OS to send out the emails. You can certainly send notifications by means other than email. For instance, we can talk later about how to use Telegram to send out the alarms.

define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}

# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}

Not sure if you are already feeling a bit dizzy from jumping around configuration files... I got that feeling at first too, but once you get used to the template-based style of configuration, it is actually not that difficult. Next, we will use more real examples, as I found that the easiest way to learn Nagios.

Saturday, May 12, 2018

Install Nagios (network monitoring system) in Ubuntu 16.04

Nagios is basically a free and open source monitoring and alerting system. It has many built-in tools to monitor various network services, such as HTTP, ICMP, SNMP, SMTP, POP3, FTP, SSH, etc. Even when a tool is not available in the community, you can easily write a script for Nagios to use for monitoring in your favorite language, such as PHP, Python, or Perl.
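
To give an idea of what such a script looks like, here is a minimal Python sketch. It relies on the plugin convention of exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) and one line of output, optionally followed by "|" and performance data. The load-average check, the default thresholds, and the function names are my own choices for illustration, and os.getloadavg() is only available on Unix-like systems.

```python
import os

# Exit codes a plugin reports back to Nagios.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_load(load1, warn, crit):
    """Return (exit_code, status_line) for a 1-minute load average."""
    if load1 >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif load1 >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after "|" is performance data in label=value;warn;crit form.
    return code, f"LOAD {label} - load1={load1:.2f} | load1={load1:.2f};{warn};{crit}"


def run(warn=4.0, crit=8.0):
    """Read the current load and evaluate it; thresholds are illustrative defaults."""
    try:
        load1 = os.getloadavg()[0]
    except OSError:
        return UNKNOWN, "LOAD UNKNOWN - load average unavailable"
    return evaluate_load(load1, warn, crit)


# As a real plugin, the script would end with something like:
#   code, line = run()
#   print(line)
#   sys.exit(code)   # needs "import sys"
print(evaluate_load(0.5, 4.0, 8.0)[1])
```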

Nagios is powerful, but its free version lacks an easy and intuitive UI to help you configure the system. Although it has quite a learning curve, you can still get the hang of it if you learn it patiently, step by step.

Installation
# install prerequisite packages
sudo -i
sudo apt-get update
sudo apt-get install -y autoconf gcc libc6 make wget unzip apache2 php libapache2-mod-php7.0 libgd2-xpm-dev

# create nagios user and group
useradd nagios
groupadd nagcmd
usermod -a -G nagios,nagcmd www-data

# download nagios source (you may check which is the latest release here: https://www.nagios.org/downloads/nagios-core/)

wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.3.4.tar.gz
# extract the source, e.g. tar xzf nagios-4.3.4.tar.gz
# cd to source dir and compile
./configure --with-nagios-group=nagios --with-command-group=nagcmd --with-httpd-conf=/etc/apache2/sites-enabled

*** Configuration summary for nagios 4.3.4 2017-08-24 ***:

General Options:
-------------------------
        Nagios executable: nagios
        Nagios user/group: nagios,nagios
       Command user/group: nagios,nagcmd
             Event Broker: yes
        Install ${prefix}: /usr/local/nagios
    Install ${includedir}: /usr/local/nagios/include/nagios
                Lock file: /run/nagios.lock
   Check result directory: ${prefix}/var/spool/checkresults
           Init directory: /etc/init.d
Apache conf.d directory: /etc/apache2/sites-enabled
             Mail program: /bin/mail
                  Host OS: linux-gnu
          IOBroker Method: epoll

Web Interface Options:
------------------------
                 HTML URL: http://localhost/nagios/
                  CGI URL: http://localhost/nagios/cgi-bin/
Traceroute (used by WAP):

make all
make install
make install-init
make install-commandmode
make install-config
make install-webconf

# enable apache2 rewrite and cgi modules
a2enmod rewrite
a2enmod cgi

# create password to restrict access to nagios web
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
systemctl restart apache2.service

# copy event handler scripts to nagios dir (this is not a necessary step)
cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
chown -R nagios:nagios /usr/local/nagios/libexec/eventhandlers

# in case you are curious, what is an event handler? It is basically a script that is run when a host or service changes state. For example, when an HTTP service is detected DOWN, you may want to call a script automatically to restart the service or reboot the machine, or just to create a trouble ticket in a helpdesk system to notify front-line colleagues.
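
Here is a hedged Python sketch of the decision an event handler might make. It assumes the handler's command definition passes it the service state, the state type (SOFT or HARD), and the current check attempt, for example through the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros. The function name, the "restart on the 3rd soft attempt" rule, and the returned action labels are all illustrative.

```python
def decide_action(state, state_type, attempt, restart_on_soft_attempt=3):
    """Decide what an event handler should do for a given service state change."""
    if state != "CRITICAL":
        # OK, WARNING and UNKNOWN states need no automatic repair in this sketch.
        return "none"
    if state_type == "HARD":
        # All retries failed: try a restart as a last resort.
        return "restart"
    if state_type == "SOFT" and attempt == restart_on_soft_attempt:
        # Try a restart early, before the problem becomes a HARD state.
        return "restart"
    return "none"


print(decide_action("CRITICAL", "SOFT", 1))  # too early, wait for more retries
print(decide_action("CRITICAL", "SOFT", 3))  # restart attempt
```

A real handler would then run the actual restart command or open the ticket where this sketch returns "restart".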

# install Nagios plugins
wget https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
# extract the source and cd into it, as with Nagios above
./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-openssl
make
make install

# when the standard plugins cannot monitor what you need, search the community first; it may save you a lot of time (https://exchange.nagios.org/directory/Plugins/)

# Now, let's start Apache and Nagios
systemctl start apache2
systemctl enable nagios
systemctl start nagios

# Open a browser and point to http://your_ip/nagios, that's it!

In "Hosts" and "Services" you will now see only localhost and a few local services. Yes, it is monitoring the Ubuntu host itself... Next, let's start monitoring something more interesting...

Friday, April 20, 2018

Converting LibreOffice ODP to Office PPT/PPTX with videos

Since I have not bought Microsoft Office, I use LibreOffice Impress to make presentations. It is easy to export a presentation to PowerPoint format (ppt or pptx) and send it to others who work with PowerPoint.

However, that is only true as long as you do NOT have videos in your presentation. LibreOffice handles videos fine as long as the file stays in odp format. When you export it to ppt or pptx, the videos are simply lost...

After researching for a while, I found two things you should pay attention to when making a presentation with videos, especially if, like me, you are using the free LibreOffice.

(1) Video format - You should probably stick with WMV or AVI if your target is playing the presentation in PowerPoint, because these formats have greater compatibility with older versions of Office/Windows. So you need video converting software when what you have is not wmv/avi. Just google around and you should be able to find some free ones, but note that some free software might leave a watermark on your video. I tried the following two, and both are OK.
Pavtube Free version from here: http://www.multipelife.com/free-video-dvd-converter-ultimate
Any Video Converter Free Edition: http://www.any-video-converter.com/download-avc-free.php

(2) If you have PowerPoint, then fine, just insert your wmv files into the slides. But if you are also using LibreOffice, remember to check the "Link" box (shown below) when inserting a video into a slide.

Also put the video files in the same folder as the ODP/PPT file, then save the presentation in "PPT" format. You can verify that the PPT works by sending it to someone with PowerPoint, or simply by downloading the free PowerPoint viewer from Microsoft. Note: do NOT save as "PPTX"; when you do that, the videos disappear... I still cannot figure out why. It does not matter whether you check "Link" or not, PPTX simply does not work. I guess we might have to wait for an update from LibreOffice for that.

Friday, April 13, 2018

Buying an air purifier (update)

After using the air purifier for a few months, I'm quite sure it actually lowers the PM2.5 level. How do I know? Because I've also bought a PM2.5/10 detector (you can get a cheap one for around US$40-50). Does it help relieve allergy symptoms for my cats and me... hmm, I'm not too sure. Sometimes it seems a little better, but the result is not significant enough to convince me yet. I do not run it 24 hours a day though, usually only 8-12 hours a day when I'm at home.

Anyway, one thing is for sure: the HEPA filter is expensive, so I bought some cheaper filters from 3M and put them in front of it in the hope that the HEPA lasts longer. It looks good so far, as the 3M filter got really dirty after 3 months while the HEPA still looks relatively clean. I also put some sponge, which is really cheap, at the very front to keep large particles from getting inside.

Most budget air purifiers probably have a similar structure: (1) a front filter to block large particles (in my opinion, the holes are usually too large, and a lot of dust still gets in), (2) an active carbon filter, (3) a HEPA filter.

My setup: Front filter -> Sponge (really cheap) -> Active carbon filter -> 3M filter (more costly, but still cheaper than a HEPA replacement) -> HEPA

HEPA with 3M filter:

Sponge
