Monday, May 21, 2018

How to configure Nagios to monitor your systems and network

Before we start monitoring something with Nagios, we need to first understand its configuration structure.

# cd /usr/local/nagios/etc
# ls -l
-rw-rw-r-- 1 nagios nagios 12999 Apr 24 21:55 cgi.cfg
-rw-r--r-- 1 root   root      50 May 12 11:55 htpasswd.users
-rw-rw-r-- 1 nagios nagios 44868 May 12 14:46 nagios.cfg
drwxrwxr-x 2 nagios nagios  4096 May 14 01:22 objects
-rw-rw---- 1 nagios nagios  1312 Apr 24 21:55 resource.cfg

nagios.cfg is the main configuration file of Nagios.  It contains global parameters and is used to include other user customized configuration files. e.g.
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
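
To see at a glance which object files a given nagios.cfg pulls in, a few lines of Python are enough. This is only a sketch: the sample text is copied from the excerpt above, and included_files() is a helper name made up for this example, not part of Nagios.

```python
# Sketch: list the object files that nagios.cfg includes via cfg_file=.
# SAMPLE is copied from the excerpt above; on a real system you would read
# /usr/local/nagios/etc/nagios.cfg instead.
SAMPLE = """\
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
"""

def included_files(text):
    """Return the paths named by cfg_file= lines, skipping comments and blanks."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "cfg_file":
            paths.append(value.strip())
    return paths

for path in included_files(SAMPLE):
    print(path)
```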


Let's get started by example:

First, we define something for Nagios to monitor.  The basic unit is a host, which may have many services.
/usr/local/nagios/etc/objects/localhost.cfg
define host{
        use                     linux-server  ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name         localhost
        alias                   localhost
        address              127.0.0.1
}
define service{
        use                             local-service         ; Name of service template to use
        host_name                 localhost
        service_description    PING
        check_command        check_ping!100.0,20%!500.0,60%
}

The "use" statements tell the host and service to inherit from templates defined in templates.cfg, so let's have a look.  Note that "linux-server" is itself the child of another template, "generic-host".

/usr/local/nagios/etc/objects/templates.cfg
define host{
        name                                        generic-host ; The name of this host template
        notifications_enabled              1                   ; Host notifications are enabled
        event_handler_enabled           1                   ; Host event handler is enabled
        flap_detection_enabled           1                   ; Flap detection is enabled
        process_perf_data                   1                   ; Process performance data
        retain_status_information       1                   ; Retain status information across program restarts
        retain_nonstatus_information 1                   ; Retain non-status information across program restarts
        notification_period                  24x7            ; Send host notifications at any time
        register                                     0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
define host{
        name                          linux-server        ; The name of this host template
        use                             generic-host        ; This template inherits other values from the generic-host template
        check_period             24x7                    ; By default, Linux hosts are checked round the clock
        check_interval           5                          ; Actively check the host every 5 minutes
        retry_interval             1                          ; Schedule host check retries at 1 minute intervals
        max_check_attempts 10                         ; Check each Linux host 10 times (max)
        check_command        check-host-alive ; Default command to check Linux hosts
        notification_period     workhours          ; Linux admins hate to be woken up, so we only notify during the day
         ; Note that the notification_period variable is being overridden from
         ; the value that is inherited from the generic-host template!

        notification_interval   120             ; Resend notifications every 2 hours
        notification_options    d,u,r           ; Only send notifications for specific host states
        contact_groups            admins       ; Notifications get sent to the admins by default
        register                        0                 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

What a template does is define common parameters that would be used over and over again by many hosts and services.  So, instead of repeating these parameters in every host and service definition, we create a template.  The template basically tells Nagios how and how often to check the host or service, and what to do when its state changes.  Most parameters are pretty self-explanatory; for example, "check_period  24x7" and "check_interval  5" say that this host should be monitored 24 hours a day, 7 days a week, and that Nagios should check it every 5 minutes.
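
The way "use" chains resolve can be sketched in a few lines of Python. This is an illustration, not Nagios code: resolve() is a hypothetical helper, the dictionaries hold only a few of the values shown above, and it handles a single template per "use" even though Nagios also accepts a comma-separated list.

```python
# Sketch of "use" inheritance with values copied from the definitions above.
OBJECTS = {
    "generic-host": {
        "notifications_enabled": "1",
        "notification_period": "24x7",
        "register": "0",
    },
    "linux-server": {
        "use": "generic-host",
        "check_period": "24x7",
        "check_interval": "5",
        "notification_period": "workhours",
        "register": "0",
    },
    "localhost": {
        "use": "linux-server",
        "host_name": "localhost",
        "address": "127.0.0.1",
    },
}

def resolve(name, objects):
    """Merge an object with its chain of templates; the child's values win."""
    obj = objects[name]
    merged = {}
    parent = obj.get("use")
    if parent is not None:
        inherited = resolve(parent, objects)
        # "register" applies only to the definition that sets it, so a real
        # host does not pick up "register 0" from its template.
        inherited.pop("register", None)
        merged.update(inherited)
    merged.update({key: value for key, value in obj.items() if key != "use"})
    return merged

host = resolve("localhost", OBJECTS)
print(host["notification_period"])    # overridden by linux-server
print(host["notifications_enabled"])  # inherited from generic-host
```

The point to notice is that "notification_period" ends up as "workhours": the value set in linux-server replaces the "24x7" it would otherwise inherit from generic-host.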

The parameters below may not be obvious, so I will talk more about them.

"notification_options" - In which situations should Nagios send out notifications?  If we don't specify any, Nagios will send out notifications in all situations, but that may not be what we want.  In the example above, "d,u,r" means "notify me when the host goes DOWN, becomes UNREACHABLE, or RECOVERS from either".  Flapping (f) means the host/service keeps bouncing between a bad state (d,u) and a good one (r); we will probably talk more about that later.

d = DOWN state
u = UNREACHABLE state
r = recoveries (OK state)
f = starts and stops flapping
s = scheduled downtime starts and ends
n = none (no host notifications will be sent out)


check_command        check-host-alive
This is the command Nagios calls to determine the host's state.  To find out what it does, we have to look at another configuration file, commands.cfg

define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
 }

Ok, what is "$USER1$"?  What is "check_ping"?  Again, we need to look at yet another configuration file, resource.cfg, which is quite simple:
$USER1$=/usr/local/nagios/libexec

Let's now run the command: /usr/local/nagios/libexec/check_ping

# /usr/local/nagios/libexec/check_ping
check_ping: Could not parse arguments
Usage:
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%  [-p packets] [-t timeout] [-4|-6]

In Nagios, a check can report WARNING or CRITICAL when it detects a problem, so in most plugins -w sets the warning threshold and -c sets the critical threshold.  When a time unit is involved, it is usually milliseconds. In check_ping, "rta" is the round trip average and "pl" is packet loss.  So let's get back to the command_line:
$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
This would mean we ping the host 5 times (-p 5), and mark "warning" if rta is > 3000ms or there is 80% packet loss; mark critical if rta>5000ms or there is 100% packet loss.
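
To make the threshold logic concrete, here is a small Python sketch of how a result could be classified against the -w and -c pairs. It is not the plugin's source: ping_state() is a made-up name, and treating a value exactly on the threshold as a breach is an assumption of this sketch.

```python
def ping_state(rta_ms, loss_pct, warn=(3000.0, 80), crit=(5000.0, 100)):
    """Classify a ping result against (rta in ms, packet loss in %) pairs.

    The defaults mirror -w 3000.0,80% -c 5000.0,100% from check-host-alive.
    """
    warn_rta, warn_loss = warn
    crit_rta, crit_loss = crit
    # Critical is tested first so that it wins when both pairs are breached.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"

print(ping_state(0.5, 0))       # healthy host
print(ping_state(3500.0, 0))    # slow but answering
print(ping_state(10.0, 100))    # every packet lost
```

With thresholds this loose, check-host-alive really only goes CRITICAL when the host stops answering altogether, which is what you want for an "is it alive" check.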


notification_period     workhours
This time we go to "timeperiods.cfg".  You will find a few examples in this file, such as work hours, specific holidays, etc.

define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday        00:00-24:00
        monday      00:00-24:00
        tuesday       00:00-24:00
        wednesday  00:00-24:00
        thursday      00:00-24:00
        friday          00:00-24:00
        saturday      00:00-24:00
}
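
A timeperiod is just a set of time ranges per weekday, and checking whether "now" falls inside one is simple. The sketch below is only an illustration: in_period() is a hypothetical helper, and the WORKHOURS ranges are an assumed example (Monday to Friday, 09:00-17:00), so check your own timeperiods.cfg for the real definition.

```python
from datetime import datetime

DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# The 24x7 period from above, and an assumed "workhours" period for illustration.
ALWAYS = {day: "00:00-24:00" for day in DAYS}
WORKHOURS = {day: "09:00-17:00" for day in DAYS[:5]}

def to_minutes(hhmm):
    """Convert "HH:MM" to minutes since midnight ("24:00" becomes 1440)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def in_period(period, when):
    """True if 'when' falls inside a range listed for its weekday."""
    spec = period.get(DAYS[when.weekday()])  # weekday(): Monday is 0
    if spec is None:
        return False
    now = when.hour * 60 + when.minute
    for time_range in spec.split(","):
        start, end = time_range.split("-")
        if to_minutes(start) <= now < to_minutes(end):
            return True
    return False

print(in_period(WORKHOURS, datetime(2018, 5, 21, 10, 0)))  # a Monday morning
print(in_period(WORKHOURS, datetime(2018, 5, 20, 10, 0)))  # a Sunday morning
```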


contact_groups            admins
Contacts are how Nagios notifies you when there are state changes.  Let's have a look at contacts.cfg

define contact{
        contact_name            nagiosadmin             ; Short name of user
        use                             generic-contact         ; this is from templates.cfg
        alias                           Nagios Admin            ; Full name of user
        email                          your_email_address@your_domain
        }
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
}


NOTE that "generic-contact" is in templates.cfg
define contact{
        name                                            generic-contact    ; The name of this contact template
        service_notification_period         24x7                    ; service notifications can be sent anytime
        host_notification_period              24x7                    ; host notifications can be sent anytime
        service_notification_options        w,u,c,r,f,s           ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options             d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email 
        host_notification_commands        notify-host-by-email 
        register                        0
}

NOTE that "notify-host-by-email" and "notify-service-by-email" are in commands.cfg.  These simply use the "/bin/mail" command that comes with the OS to send out the emails.  You can certainly send notifications by means other than email.  For instance, we could talk about how to use Telegram to send out the alarms.

define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
 }

# 'notify-service-by-email' command definition
define command{
        command_name    notify-service-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
 }
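
All those $NAME$ tokens are macros that Nagios fills in before it runs the command. The Python sketch below imitates that substitution so you can preview what a command_line or mail subject will look like. expand_macros() is a made-up helper and the values are sample data; real Nagios macro handling has more rules than this.

```python
import re

def expand_macros(template, values):
    """Replace $NAME$ tokens with values; unknown macros are left untouched."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return re.sub(r"\$([A-Z0-9_]+)\$", substitute, template)

# Sample values for illustration only.
values = {
    "USER1": "/usr/local/nagios/libexec",
    "HOSTADDRESS": "127.0.0.1",
    "NOTIFICATIONTYPE": "PROBLEM",
    "HOSTNAME": "localhost",
    "HOSTSTATE": "DOWN",
}

print(expand_macros("$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5", values))
print(expand_macros("** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **", values))
```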


Not sure if you are already feeling a bit dizzy, as we keep jumping between configuration files...  I got that feeling at first too, but once you get used to the template style of configuration, it is actually not that difficult.  Next, we will use more realistic examples, as I found that the easier way to learn Nagios.

Saturday, May 12, 2018

Install Nagios (network monitoring system) in Ubuntu 16.04

Nagios is basically a free and open source monitoring and alerting system.  It has many built-in tools to monitor various network services, such as HTTP, ICMP, SNMP, SMTP, POP3, FTP, SSH, etc.  Even when a tool is not available in the community, you can easily write a script for Nagios to use for monitoring in your favorite language, such as PHP, Python, Perl, etc.

Nagios is powerful, but its free version lacks an easy and intuitive UI to help you configure the system.  Although it does have quite a learning curve, you will still be able to get the hang of it if you learn it patiently, step by step.

Installation
# install prerequisites packages
sudo -i
sudo apt-get update
sudo apt-get install -y autoconf gcc libc6 make wget unzip apache2 php libapache2-mod-php7.0 libgd2-xpm-dev


# create nagios user and group
useradd nagios
groupadd nagcmd
usermod -a -G nagios,nagcmd www-data


# download nagios source (you may check which is the latest release here: https://www.nagios.org/downloads/nagios-core/)

wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.3.4.tar.gz
# extract the source, e.g. tar xzf nagios-4.3.4.tar.gz
# cd to source dir and compile
./configure --with-nagios-group=nagios --with-command-group=nagcmd --with-httpd-conf=/etc/apache2/sites-enabled

*** Configuration summary for nagios 4.3.4 2017-08-24 ***:

 General Options:
 -------------------------
        Nagios executable:  nagios
        Nagios user/group:  nagios,nagios
       Command user/group:  nagios,nagcmd
             Event Broker:  yes
        Install ${prefix}:  /usr/local/nagios
    Install ${includedir}:  /usr/local/nagios/include/nagios
                Lock file:  /run/nagios.lock
   Check result directory:  ${prefix}/var/spool/checkresults
           Init directory:  /etc/init.d
  Apache conf.d directory:  /etc/apache2/sites-enabled
             Mail program:  /bin/mail
                  Host OS:  linux-gnu
          IOBroker Method:  epoll

 Web Interface Options:
 ------------------------
                 HTML URL:  http://localhost/nagios/
                  CGI URL:  http://localhost/nagios/cgi-bin/
 Traceroute (used by WAP):

make all
make install
make install-init
make install-commandmode
make install-config
make install-webconf



# enable apache2 rewrite and cgi module 
a2enmod rewrite
a2enmod cgi


# create password to restrict access to nagios web
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
systemctl restart apache2.service

# copy event handler scripts to nagios dir (this is not a necessary step)
cp -R contrib/eventhandlers/ /usr/local/nagios/libexec/
chown -R nagios:nagios /usr/local/nagios/libexec/eventhandlers

# in case you are curious, what is an event handler?  It is basically a script that is run when a host or service changes state.  For example, when an HTTP service is detected DOWN, you may want to automatically call a script to restart the service or reboot the machine, or just to create a trouble ticket in the helpdesk system to notify front-line colleagues.
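
The decision an event handler makes can be sketched in Python. Nagios can pass macros such as $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ to the handler on its command line; what to do with them is up to you. The policy below (act on a HARD CRITICAL, or early on the 3rd SOFT attempt) is an assumption chosen for illustration, and handler_action() is a made-up name.

```python
def handler_action(state, state_type, attempt, soft_attempt_to_act=3):
    """Return "restart" or "ignore" for one state change of a hypothetical HTTP service."""
    if state != "CRITICAL":
        # OK, WARNING and UNKNOWN do not trigger a restart in this sketch.
        return "ignore"
    if state_type == "HARD":
        return "restart"
    if state_type == "SOFT" and attempt == soft_attempt_to_act:
        # Try a restart before the problem turns HARD and people get notified.
        return "restart"
    return "ignore"

print(handler_action("CRITICAL", "SOFT", 1))
print(handler_action("CRITICAL", "SOFT", 3))
print(handler_action("OK", "HARD", 1))
```

A real handler would replace the returned string with the actual restart command, and should be careful not to restart in a loop.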


# install Nagios plugins
wget https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
 ./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-openssl
make
make install

# when something you want to monitor cannot be done by the standard plugins, search the community first, which may save you a lot of time. (https://exchange.nagios.org/directory/Plugins/)
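
If nothing in the community fits, writing your own plugin is not hard: a plugin prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Below is a minimal Python sketch of a disk usage check. The thresholds, the message format and the function names are all made up for this example.

```python
import shutil

# Exit codes that Nagios plugins use.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def classify(used_pct, warn_pct=80.0, crit_pct=90.0):
    """Map a disk usage percentage to (exit code, status line)."""
    if used_pct >= crit_pct:
        return CRITICAL, f"DISK CRITICAL - {used_pct:.1f}% used"
    if used_pct >= warn_pct:
        return WARNING, f"DISK WARNING - {used_pct:.1f}% used"
    return OK, f"DISK OK - {used_pct:.1f}% used"

def check_path(path):
    """Measure usage of the filesystem holding 'path' and classify it."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as err:
        return UNKNOWN, f"DISK UNKNOWN - {err}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - filesystem reports zero size"
    return classify(100.0 * usage.used / usage.total)

code, message = check_path("/")
print(message)
# Saved as a plugin script, the last line would be: sys.exit(code)  (needs "import sys")
```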

# Now, let's start Apache and Nagios
systemctl start apache2
systemctl enable nagios
systemctl start nagios

# Open a browser and point to http://your_ip/nagios, that's it!

You may see now in "Hosts" and "Services", there is only a localhost and a few local services.  Yes, it's monitoring the Ubuntu host itself...  Next, let's start monitoring something more interesting...

Friday, April 20, 2018

Converting LibreOffice ODP to Office PPT/PPTX with videos

Since I have not bought Microsoft Office, I would use Libre Office Impress to make presentations.  It is easy to export it to PowerPoint format (ppt or pptx) and send it to others who work on PowerPoint.

However, the above statement is only true as long as you do NOT have videos in your presentations.  Libre Office can deal with the videos fine as long as the file stays in odp format.  When you export it to ppt or pptx, the videos are simply lost...

After researching for a while, I found two things you should pay attention to when making presentations with videos, especially if, like me, you are using the free Libre Office.

(1) Video format - You probably should stick with WMV (or AVI) if your target is playing the presentation with PowerPoint, because these formats have greater compatibility with older versions of Office/Windows.  So you need video converting software when what you have is not wmv/avi.  Just google around; you should be able to find some free ones, but note that some free software might leave a watermark on your video. I tried the following two, and both are ok.
Pavtube Free version from here: http://www.multipelife.com/free-video-dvd-converter-ultimate
Any Video Converter Free Edition: http://www.any-video-converter.com/download-avc-free.php

(2) If you have PowerPoint, then fine, just insert your wmv files into the slides.  But if you are also using LibreOffice, remember to check the "Link" box when inserting a video into a slide.


Also put the video files in the same folder as the ODP/PPT file.  Then save the presentation in "PPT" format.  Now you can verify that the PPT works by sending it to someone with PowerPoint, or simply by downloading the free PowerPoint viewer from Microsoft.  Note: do NOT save as "PPTX"; when you do that, the videos disappear... I still cannot figure out why.  It does not matter whether you check "Link" or not, PPTX simply does not work.  I guess we might have to wait for an update from LibreOffice for that.

Friday, April 13, 2018

Buying an air purifier (update)

After using the air purifier for a few months, I'm quite sure it can actually lower the PM2.5 level.  How do I know?  Because I've also bought a PM2.5/10 detector (you can get a cheap one for around US$40-50).  Does it help relieve my cats' and my own allergic symptoms... hmm, I'm not too sure.  Sometimes it seems a little better, but the result is not significant enough to convince me yet.  I did not run it 24 hours/day though, usually only 8-12 hours/day when I'm at home.

Anyway, one thing is for sure: the HEPA filter is expensive, so I bought some cheaper filters from 3M and put them in front of it in the hope that the HEPA can last longer.  And it looks good, as the 3M filter got really dirty after 3 months while the HEPA still looks relatively clean.  I also put some sponge, which is really cheap, at the very front to keep large particles from getting inside.

Most economical air purifiers probably have a similar structure: (1) front filter to block large particles (in my opinion, the holes are usually too large, and a lot of dust can still get in), (2) active carbon filter, (3) HEPA filter

My setup: Front filter -> Sponge (really cheap) -> Active carbon filter -> 3M filter (more costly, but still cheaper than a HEPA replacement) -> HEPA

[Photos: HEPA with 3M filter; sponge]

Monday, November 6, 2017

Buying an Air Purifier

Why? Because I have two cats, and I have allergy problems.  Well, nothing I can't live with, but the worst part is that my cat has feline asthma.

After google-ing for a while, it seems those with multiple filters should be better, e.g. front filter for larger particles, active carbon filter in the middle for finer particles, and HEPA filter for PM2.5 particles.  HEPA is the crucial component, but it also comes with a limited lifespan, and it is expensive...  You cannot wash it, as that would ruin its structure and make it useless.  Some people mention you can vacuum it, but that is usually not very effective.

Some brands have claimed their HEPA filters have 10 years lifespan... Obviously the filter would have a limited capacity for PM2.5 particles, and you can't wash it.  Can it really last that long?  Many brands would let you download their operation manual online.  I found one famous brand has the following assumptions:

Filter life is based on smoking 5 cigarettes per day (10 years).   But according to the Specification of The Japan Electrical Manufacturer's Association JEM1467, if based on the condition of smoking 10 cigarettes per day, the replacement period is about 5 years.

Ok, more google-ing finds that 1 cigarette is around 12 mg of PM2.5 (I'm actually not too sure if this is correct: http://www.myhealthbeijing.com/pollution/is-pm2-5-from-air-pollution-the-same-as-from-smoking/)

The air purifier spec should tell you how many cubic meters of air it handles per hour; the particular one I'm looking at is rated at 150 cubic meters/hour.  So let's do some math

Air flow volume: 150 cubic meter/hour
PM2.5 in the place I live: ~ 45 μg/cubic meter

150 x 45 = 6750μg/hour = 6.75mg/hour = 0.5625 cigarette/hour (at 12mg per cigarette)

Rated capacity: 5 cigarettes/day x 365 x 10 years = 18250 cigarettes

open 24 hours/day = 13.5 cigarettes/day =>  3.7 years
75% of theoretical figure ~ 2.8 years
50% of theoretical figure ~ 1.85 years

open 12 hours/day = 6.75 cigarettes/day => 7.4 years
75% of theoretical figure ~ 5.6 years
50% of theoretical figure ~ 3.7 years
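
The same estimate as a small Python sketch, so you can plug in your own airflow and PM2.5 numbers. It assumes the filter traps all PM2.5 that passes through it and reuses the rough 12mg per cigarette figure from above; filter_life_years() is just a name made up for this example.

```python
AIRFLOW_M3_PER_HOUR = 150         # from the purifier spec
PM25_UG_PER_M3 = 45               # typical level where I live
MG_PER_CIGARETTE = 12             # rough figure quoted above, not verified
RATED_CIGARETTES = 5 * 365 * 10   # manual: 5 cigarettes/day for 10 years

def filter_life_years(hours_per_day, derate=1.0):
    """Estimated HEPA life in years; derate scales the theoretical figure."""
    mg_per_hour = AIRFLOW_M3_PER_HOUR * PM25_UG_PER_M3 / 1000
    cigarettes_per_day = mg_per_hour / MG_PER_CIGARETTE * hours_per_day
    return RATED_CIGARETTES / cigarettes_per_day / 365 * derate

for hours in (24, 12):
    for derate in (1.0, 0.75, 0.5):
        years = filter_life_years(hours, derate)
        print(f"{hours:2d} h/day at {derate:.0%}: {years:.2f} years")
```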

Well, the figures don't seem too sad, although they could be quite far off the claimed lifespan under certain conditions and assumptions T_T


I couldn't help working out how much PM2.5 I'm inhaling per day, though it's off topic...

an average person inhales ~ 0.5 cubic meter/hour at rest, 0.7 cubic meter/hour at light activity (maybe walking)
https://en.wikipedia.org/wiki/Respiratory_minute_volume

Take the average: 0.6 x 45 x 24 = 648μg = 0.648mg per day  :D

Friday, March 10, 2017

Cacti - Using Data Query


Cacti's documentation is actually pretty comprehensive, and I would strongly recommend it.  I still write this only because I feel some parts may not be detailed enough in the doc.
Here is the URL for "data queries"
http://docs.cacti.net/manual:100:3a_advanced_topics.3_data_queries#data_queries

Data Query is a very useful and powerful tool to create graphs. There are basically two ways to do it in Cacti - (1) With SNMP, (2) With a script

With an external script, you can virtually do anything with it to accomplish some very complicated tasks.  But whenever it is possible to be done in SNMP query, please do it.  Running spine with pure snmp calls is often much more efficient than calling external scripts.

SNMP Query

(1) XML Template File:
  • Define an index
    <oid_index>, <oid_index_parse>, <oid_suffix>
  • Define input parameters
  • Define output parameters
IF-MIB example:
This is by far the simplest type and is bundled with Cacti in "resource/snmp_queries/interface.xml".  First, it has an OID that lists all its indexes
<oid_index>.1.3.6.1.2.1.2.2.1.1</oid_index>
e.g. in my VM, snmpwalk -v2c -c xxxxxx localhost .1.3.6.1.2.1.2.2.1.1
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2

And then we would use the index obtained from <oid_index> to reference other OIDs
e.g. Description
snmpwalk -v2c -c xxxxx localhost .1.3.6.1.2.1.2.2.1.2
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: enp0s3

e.g. OutOctets
snmpwalk -v2c -c xxxxx localhost .1.3.6.1.2.1.2.2.1.16
IF-MIB::ifOutOctets.1 = Counter32: 2792348  <= this is "lo" out octets
IF-MIB::ifOutOctets.2 = Counter32: 8936144  <= this is "enp0s3" out octets


However, reality is cruel and the MIB world isn't always so nice...  what if the index OID doesn't exist?  Take IF-MIB as an example, let's pretend that "<oid_index>.1.3.6.1.2.1.2.2.1.1</oid_index>" isn't available, what can we do?  If you look at the "Description" OID again, you may have noticed that it already contains the "index" as part of its OID!
snmpwalk -v2c -c xxxxx localhost .1.3.6.1.2.1.2.2.1.2
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: enp0s3

Let's convert it to raw format to have a clearer look (-On)
snmpwalk -v2c -c public -On localhost .1.3.6.1.2.1.2.2.1.2
.1.3.6.1.2.1.2.2.1.2.1 = STRING: lo
.1.3.6.1.2.1.2.2.1.2.2 = STRING: enp0s3

Here comes the "<oid_index_parse>" parameter: we want to extract part of the OID as the index. One thing though, to use this parameter you should have at least some basic knowledge of regular expressions.
e.g.
<oid_index>.1.3.6.1.2.1.2.2.1.2</oid_index>
<oid_index_parse>OID/REGEXP:.*\.([0-9]{1,})$</oid_index_parse>

Now, in interface.xml, if you replace:
        <oid_index>.1.3.6.1.2.1.2.2.1.1</oid_index>
        <oid_num_indexes>.1.3.6.1.2.1.2.1.0</oid_num_indexes>
With:
        <oid_index>.1.3.6.1.2.1.2.2.1.2</oid_index>
        <oid_index_parse>OID/REGEXP:.*\.([0-9]{1,})$</oid_index_parse>

It should just work fine with the same behavior.
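
If regular expressions are not your thing, it helps to try the pattern outside Cacti first. The Python sketch below applies the same pattern to the two OIDs from the -On walk above. Cacti itself runs the regex in PHP, so this only shows what the capture group picks out.

```python
import re

# Same pattern as in <oid_index_parse>, without the "OID/REGEXP:" prefix.
INDEX_RE = re.compile(r".*\.([0-9]{1,})$")

# Numeric OIDs and values copied from the snmpwalk -On output above.
walk = {
    ".1.3.6.1.2.1.2.2.1.2.1": "lo",
    ".1.3.6.1.2.1.2.2.1.2.2": "enp0s3",
}

# The capture group is the last number of the OID, which becomes the index.
indexes = {INDEX_RE.match(oid).group(1): descr for oid, descr in walk.items()}
print(indexes)
```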

Below are some further examples from the Cacti docs; I strongly recommend going through them.
http://docs.cacti.net/howto:data_query_templates

After you are done with the index part, the rest is relatively simple. Input are parameters that would help you create and describe your graph, while output are values that would actually appear in your graphs. Read through interface.xml, and create some graphs then you would fully understand.  To do it, just create a new device in "Management->devices", or simply use "Local Linux Machine" as a starting point.  Go to "Associated Data Queries" at the bottom, select "SNMP - Interface Statistics" and click "Add".  Then click "Create Graphs for this Device" near the top right corner.

One more note on the output parameter, some OIDs may not simply append the "index" part to the end... e.g. 1.2.3.4.5.6.index.1
<Sample>
    <name>Sample1</name>
    <method>get</method>
    <source>value</source>
    <direction>output</direction>
    <oid>.1.2.3.4.5.6</oid>
    <oid_suffix>1</oid_suffix>
</Sample>
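
In other words, the OID that gets polled is the base <oid>, then the index, then the suffix. A tiny Python sketch of that composition (build_oid() is a made-up helper, and the index 7 is just a sample value):

```python
def build_oid(base_oid, index, suffix=None):
    """Join base OID, index and optional suffix with dots."""
    parts = [base_oid.rstrip("."), str(index)]
    if suffix:
        parts.append(str(suffix).strip("."))
    return ".".join(parts)

print(build_oid(".1.2.3.4.5.6", 7, "1"))   # with <oid_suffix>1</oid_suffix>
print(build_oid(".1.3.6.1.2.1.2.2.1.16", 2))  # plain case: ifOutOctets of index 2
```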


Script Data Query

The idea is the same, but we do it in a script rather than the built-in snmp engine.  Take "resource/script_server/host_cpu.xml" and "scripts/ss_host_cpu.php" as example.

script to run
<script_path>|path_cacti|/scripts/ss_host_cpu.php</script_path>

function name
<script_function>ss_host_cpu</script_function>

arguments
<arg_prepend>|host_hostname| |host_id| |host_snmp_version|:|host_snmp_port|:|host_snmp_timeout|:|host_ping_retries|:|host_max_oids|:|host_snmp_community|:|host_snmp_username|:|host_snmp_password|:|host_snmp_auth_protocol|:|host_snmp_priv_passphrase|:|host_snmp_priv_protocol|:|host_snmp_context|</arg_prepend>

These refer to "$cmd" in the script
<arg_index>index</arg_index>
<arg_query>query</arg_query>
<arg_get>get</arg_get>
<arg_num_indexes>num_indexes</arg_num_indexes>

The <query_name> parameters refer to "$arg1" in the script
<hrProcessorFrwID>
    <name>Processor Index Number</name>
    <direction>input</direction>
    <query_name>index</query_name>
</hrProcessorFrwID>
<hrProcessorLoad>
    <name>Processor Usage</name>
    <direction>output</direction>
    <query_name>usage</query_name>
</hrProcessorLoad>



Now look at "scripts/ss_host_cpu.php"; you will find it has a function:
function ss_host_cpu($hostname, $host_id, $snmp_auth, $cmd, $arg1 = '', $arg2 = '')

Let's take a look at the arguments: $hostname refers to |host_hostname|, $host_id to |host_id|, and $snmp_auth to the long colon-separated list of SNMP parameters. $cmd is the action, $arg1 is the "query_name", and $arg2 is the index obtained from the "index" cmd.
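
To see which value in that long colon-separated string is which, here is a Python sketch that pairs each value with its name from <arg_prepend>. split_snmp_auth() is a made-up helper; the names are the |host_...| variables with the "host_" prefix dropped. A plain split like this breaks if the community string or a password contains a colon.

```python
# Order follows <arg_prepend>, after |host_hostname| and |host_id|.
FIELDS = [
    "snmp_version", "snmp_port", "snmp_timeout", "ping_retries", "max_oids",
    "snmp_community", "snmp_username", "snmp_password", "snmp_auth_protocol",
    "snmp_priv_passphrase", "snmp_priv_protocol", "snmp_context",
]

def split_snmp_auth(snmp_auth):
    """Pair each colon-separated value with its field name."""
    values = snmp_auth.split(":")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))

# What the script receives for an SNMP v1 host once the shell strips the quotes.
auth = split_snmp_auth("1:161:500:1:5:my_community::::::")
for name, value in auth.items():
    print(f"{name:22s} {value!r}")
```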

Test by calling from command line:
# php -q scripts/ss_host_cpu.php 127.0.0.1 1 1:161:500:1:5:"my_community":"":"":"":"":"":"" index
0
1
2
3

# php -q scripts/ss_host_cpu.php 127.0.0.1 1 1:161:500:1:5:"my_community":"":"":"":"":"":"" num_indexes
4

Query CPU indexes
# php -q scripts/ss_host_cpu.php 127.0.0.1 1 1:161:500:1:5:"my_community":"":"":"":"":"":"" query index
0!0
1!1
2!2
3!3

Query CPU usage
# php -q scripts/ss_host_cpu.php 127.0.0.1 1 1:161:500:1:5:"my_community":"":"":"":"":"":"" query usage
0!1
1!1
2!1
3!1

Well, does it look right?  Let's make the CPU busy and try again.
run a busy loop by
# while [ 1 ]; do echo 123 > /dev/null; done

# php -q scripts/ss_host_cpu.php 127.0.0.1 1 1:161:500:1:5:"my_community":"":"":"":"":"":"" query usage
0!1
1!1
2!41
3!1

Get individual
# php -q scripts/ss_host_cpu.php 127.0.0.1 1 1:161:500:1:5:"my_community":"":"":"":"":"":"" get usage 0
1
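
The "query" output is simply one "index!value" pair per line. If you are writing your own script, a quick sanity check is to parse its output the same way, as in the Python sketch below (parse_query_output() is a made-up helper; the sample text is the busy-loop output from above).

```python
def parse_query_output(text, delimiter="!"):
    """Turn "index!value" lines into a dict, skipping blank lines."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        index, _, value = line.partition(delimiter)
        result[index] = value
    return result

sample = "0!1\n1!1\n2!41\n3!1\n"
usage = parse_query_output(sample)
print(usage)
print(max(usage, key=lambda cpu: int(usage[cpu])))  # index of the busiest CPU
```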


Once again, do read the official doc if you have time.
http://docs.cacti.net/manual:100:3a_advanced_topics.3d_script_data_query_walkthrough#script_data_query_walkthrough


After you're done with the xml file, go to "Data Queries" and create a new one that points to your xml file.

Unfortunately, we are still a few steps away from creating a graph with "Data Query", but the remaining parts are pretty straightforward. The easiest way to understand it is to go through the network interface examples again, or read through Cacti's official doc.  I will leave out the details as I am lazy...
http://docs.cacti.net/manual:100:3a_advanced_topics.3_data_queries#data_queries

(2) Data Source Template
  • Interface - Errors/Discards
  • Interface - Non-Unicast Packets
  • Interface - Traffic
  • Interface - Unicast Packets


(3) Graph Template
  • Interface - Errors/Discards
  • Interface - Non-Unicast Packets
  • Interface - Traffic (bits/sec, 95th Percentile)
  • Interface - Traffic (bits/sec, Total Bandwidth)
  • Interface - Traffic (bits/sec)
  • Interface - Traffic (bytes/sec, Total Bandwidth)
  • Interface - Traffic (bytes/sec)
  • Interface - Unicast Packets


DEBUG: Use the "Verbose Query" feature in Device, it will help a lot when you are stuck.

Saturday, February 18, 2017

Install Cacti 1.0, spine in Ubuntu 16 with MySQL 5.7

NOTE:
As of Feb 18, 2017, Ubuntu is still packaged with Cacti 0.8.8f and spine 0.8.8b, so I will show here how to install the latest Cacti 1.0.x manually.  Well, it is not that 0.8.8 is no good anymore, but many of its useful plugins are quite broken under PHP 7 and MySQL 5.7... So if you really need to stick with the 0.8.8 version, I would suggest not doing that on Ubuntu 16; otherwise, just install the latest 1.0.x.

Cacti official setup documentation: http://docs.cacti.net/manual:100:1_installation#requirements

Packages required:
gcc
apache2, libapache2-mod-php
snmpd, snmp, snmp-mibs-downloader, libsnmp, libsnmp-dev
libssl-dev
libc-dev
libc6-dev
php, php-snmp, php-xml, php-gd, php-common, php-gmp, php-ldap, php-mbstring, php-mysql,
mysql-server, mysql-client, libmysqlclient-dev
help2man
rrdtool
git

NOTE: make sure your system timezone and php.ini timezone are set up correctly (I carelessly set a wrong system timezone, and it took me a lot of time to find out that the data in my rrd files was landing in the wrong time slot...)
In my case, type the following command "timedatectl set-timezone Asia/Hong_Kong"
vi /etc/php/7.0/apache2/php.ini
  => date.timezone = Asia/Hong_Kong

Download the sources
Visit Cacti official home and check on latest versions and patches: http://www.cacti.net/

wget http://www.cacti.net/downloads/cacti-1.0.3.tar.gz
tar xvzf cacti-1.0.3.tar.gz
mv cacti-1.0.3 /var/www/html/cacti
## make sure apache has full access to the folder
chown -R www-data:www-data /var/www/html/cacti

vi /etc/apache2/conf-enabled/cacti.conf

Alias /cacti /var/www/html/cacti
<Directory /var/www/html/cacti>
        Options +FollowSymLinks
        AllowOverride None
        <IfVersion >= 2.3>
                Require all granted
        </IfVersion>
        <IfVersion < 2.3>
                Order Allow,Deny
                Allow from all
        </IfVersion>

        AddType application/x-httpd-php .php

        <IfModule mod_php.c>
                php_flag magic_quotes_gpc Off
                php_flag short_open_tag On
                php_flag register_globals Off
                php_flag register_argc_argv On
                php_flag track_vars On
                # this setting is necessary for some locales
                php_value mbstring.func_overload 0
                php_value include_path .
        </IfModule>

        DirectoryIndex index.php
</Directory>


## Configure database and install cacti
# set root password
mysqladmin password

cd /var/www/html/cacti
vi include/config.php  ## change username/password as you like, which you would be using for database privileges setup
mysql -p
  > CREATE DATABASE cacti;
  > grant all privileges on cacti.* to cactiuser@'localhost' identified by 'cactiuser';
  > grant select on mysql.time_zone_name to cactiuser@'localhost' identified by 'cactiuser';


## tuning MySQL configurations
## this is just an example, you may need much larger values if you have a large site
## heap table size is specifically for performance enhancement, please refer to the following URL for details
http://logch.blogspot.hk/2017/02/tuning-cacti-10-performance-for-medium.html
/etc/mysql/mysql.conf.d/mysqld.cnf 
[mysqld]
collation-server = utf8_general_ci
character-set-server = utf8
max_heap_table_size = 256M
max_allowed_packet = 16777216
tmp_table_size = 64M
join_buffer_size = 64M
innodb_file_per_table = on
innodb_doublewrite = off
# innodb_additional_mem_pool_size = 80M   (this option was removed in MySQL 5.7; leave it out there, or mysqld will refuse to start)
innodb_flush_log_at_trx_commit = 2


## create tables
mysql -p cacti < cacti.sql
## populate mysql time_zone_name table
mysql_tzinfo_to_sql  /usr/share/zoneinfo | mysql -p mysql

## now open a browser and open "http://your_server_ip/cacti/" and just follow the instructions.  If it says you are missing some modules, just apt install them.
https://www.youtube.com/watch?v=rqK5OnbF1BY (although this video is done under CentOS 7, but the web setup process is mostly the same)

## Compile spine
wget http://www.cacti.net/downloads/spine/cacti-spine-1.0.3.tar.gz
tar xvzf cacti-spine-1.0.3.tar.gz
cd cacti-spine-1.0.3
./configure
make
make install

## Download thold plugin
# chdir to cacti plugins folder
cd /var/www/html/cacti/plugins
# download the source
git clone -b master https://github.com/Cacti/plugin_thold.git
# rename it to thold, yes it matters...
mv plugin_thold thold



Install Thold from the UI and a simple example can be found in video below (start at around 9:15)


Upgrade from Cacti 1.0.3 to Cacti 1.0.x
# cd /var/www/html
# wget http://www.cacti.net/downloads/cacti-1.0.x.tar.gz
# tar xvzf cacti-1.0.x.tar.gz
# mv cacti cacti-1.0.3-old
# mv cacti-1.0.x cacti
Update cacti/include/config.php on the database username and password
Open a web browser and open cacti url, e.g. http://x.x.x.x/cacti/, follow the steps and choose upgrade.

# cp -p cacti-1.0.3-old/rra/* cacti/rra/
Also copy whatever you have added by yourself from cacti-1.0.3-old/scripts/* and cacti-1.0.3-old/resource/* to cacti/scripts and cacti/resource.  That's it.