Monitoring XEN systems with Nagios or ZABBIX

Many of us have wondered how to monitor virtualization systems based on XEN technology with Nagios or ZABBIX.

Here is some documentation, including a few scripts, that can serve as a starting point for monitoring such systems from a central server:

To do and read before you begin

  • Almost everything is based on “xentop -i -b”, but be careful: the output of this command can change depending on your server OS (redhat/centos or debian). In particular, the number of lines before the domain data can change, so you may need to adjust the first sed to get what you want. Below are the settings for Xen 3 on RHEL/CentOS/SELinux 5.x.

  • To make this work, you must allow the zabbix user to run xentop as root. To do this, add the following line with visudo:
zabbix          ALL=NOPASSWD: /usr/sbin/xentop
  • You may find it strange that I pipe the output of xentop through sed 's/^......//'. Some of my domU names are so long that the first character of a line is not always a space. Because the next sed turns every run of spaces into a single ';', a missing leading space shifts every column number by one and leads to bad data. So I remove the first 6 characters from every line to prevent this problem.
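
As an alternative to counting header lines and trimming characters, here is a minimal sketch that selects the domain rows by their content instead of their position. The function name xen_last_rows is hypothetical. It assumes that the column header row starts with NAME and that the second column of every domain row is the 6-character state field (for example -----r); check this against the output of your own xentop before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original UserParameters).
# Reads xentop batch output on stdin and prints only the domain rows
# of the LAST iteration it contains.
# Assumption: the header row starts with NAME, and the 2nd column of a
# domain row is the 6-character state field such as -----r or --b---.
xen_last_rows() {
    awk '
        # A header row marks the start of a new table: forget earlier rows.
        $1 == "NAME" { n = 0; next }
        # Keep lines whose 2nd field looks like a state field.
        length($2) == 6 && $2 ~ /^[-a-z]+$/ { rows[++n] = $0 }
        END { for (i = 1; i <= n; i++) print rows[i] }
    '
}
```

It could be fed with something like sudo /usr/sbin/xentop -i 2 -b -d 1 | xen_last_rows. Since awk ignores leading spaces when it splits fields, a long domU name should no longer shift the columns.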

The UserParameters themselves

Number of running domains

UserParameter=xen.domUs_number,sudo /usr/sbin/xentop -i 1 -b |sed 1,5d|wc -l

Total memory seen by Xen

UserParameter=xen.total_memory,sudo /usr/sbin/xentop -i 1 -b| grep "Mem:"|sed 's/[ ][ ]*/;/g'|cut -d';' -f2 |sed 's/k//g'

Memory currently used by the system

UserParameter=xen.used_memory,sudo /usr/sbin/xentop -i 1 -b| grep "Mem:"|sed 's/[ ][ ]*/;/g'|cut -d';' -f4 |sed 's/k//g'

Free memory for domUs

UserParameter=xen.free_memory,sudo /usr/sbin/xentop -i 1 -b| grep "Mem:"|sed 's/[ ][ ]*/;/g'|cut -d';' -f6 |sed 's/k//g'
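
The three memory items above differ only in the field number passed to cut. As a rough sketch, the same values could be picked by their label instead of their position. The function name xen_mem_field is hypothetical, and the code assumes the summary line has the form "Mem: <n>k total, <n>k used, <n>k free ..." that the commands above already rely on.

```shell
#!/bin/sh
# Hypothetical helper: reads xentop batch output on stdin and prints one
# value from the "Mem:" summary line, without the trailing "k".
# $1 is the label to look for: total, used or free.
# Assumption: each value is immediately followed by its label.
xen_mem_field() {
    awk -v want="$1" '
        $1 == "Mem:" {
            for (i = 2; i < NF; i++) {
                label = $(i + 1)
                sub(/,$/, "", label)        # "total," -> "total"
                if (label == want) {
                    value = $i
                    sub(/k$/, "", value)    # "8388096k" -> "8388096"
                    print value
                    exit
                }
            }
        }
    '
}
```

For example, sudo /usr/sbin/xentop -i 1 -b | xen_mem_field used would print the same number as xen.used_memory, provided the layout assumption holds.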

CPU load on the server in %, the sum of all domUs + dom0 CPU load

  • This is where it gets complicated, because the first iteration of /usr/sbin/xentop does not refresh the CPU figures (/usr/sbin/xentop -i 1 -b will always give you a CPU load of 0.0). You need the second iteration to get correct values. This slows the process down because you need to wait 2 seconds, and it also requires an extra command to work out how many lines to remove from the output of /usr/sbin/xentop -i 2 -b.
  • You can speed up the process by changing the delay between updates with the “-d” parameter, so you can theoretically use /usr/sbin/xentop -i 2 -b -d 0.01. However, I saw that on some servers this returned a very high CPU load, because xentop itself was eating resources. So to be safe, keep it at 1 second: /usr/sbin/xentop -i 2 -b -d 1
  • In short, what I did was:
  1. get the number of cores seen by Xen; we need it at the end to express the result as a % of all CPUs
  2. get the number of domains; this is needed to remove the correct number of lines when we call xentop with “-i 2” (2 iterations)
  3. add up the CPU usage of each domain to get a grand total
  4. divide the grand total by the number of cores to get the % usage across all cores, that is, the % of CPU power used. This can be compared with other servers and is immediately readable because it actually means something
  • I have put it on several lines for readability; put everything on the same line to make it work:
UserParameter=xen.total_load,core=`sudo /usr/sbin/xentop -i 1 -b| grep "CPUs:"|sed 's/[ ][ ]*/;/g'|cut -d';' -f9`;
line=`sudo /usr/sbin/xentop -i 1 -b|grep "domains:"|awk '{print $1}'`;
total=0;for cpu in `sudo /usr/sbin/xentop -i 2 -b -d 1|sed 1,8d|sed 1,${line}d|sed 's/^......//'|sed 's/[ ][ ]*/;/g'|cut -d';' -f4`; do total=`echo $total + $cpu|bc`;done;
echo `echo $total / $core|bc`
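
The four steps above can also be sketched as a single awk pass over one xentop run, which avoids the extra xentop calls for the core count and the domain count. This is only an illustration under assumptions: the function name xen_total_load is hypothetical, the core count is taken from the number after "CPUs:" on the "Mem:" line, and domain rows are recognised by a 6-character state field in the second column.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of a two-iteration xentop batch
# run on stdin and prints the summed CPU(%) of the last iteration divided
# by the number of cores, truncated to an integer (like the bc version).
# Assumptions: "CPUs: <n>" appears on the "Mem:" line, the header row
# starts with NAME, and CPU(%) is the 4th whitespace-separated column.
xen_total_load() {
    awk '
        # Step 1: number of cores seen by Xen.
        $1 == "Mem:" {
            for (i = 2; i < NF; i++)
                if ($i == "CPUs:") cores = $(i + 1) + 0
        }
        # Step 2: every header row starts a new iteration, so restart
        # the sum instead of counting lines to skip.
        $1 == "NAME" { total = 0; next }
        # Step 3: add the CPU(%) of each domain row.
        length($2) == 6 && $2 ~ /^[-a-z]+$/ { total += $4 }
        # Step 4: divide by the number of cores.
        END { if (cores > 0) printf "%d\n", total / cores }
    '
}
```

A possible call would be sudo /usr/sbin/xentop -i 2 -b -d 1 | xen_total_load, for instance from a small script that the UserParameter runs.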

CPU load on the server in %, all domUs, excluding the dom0

  • The process is the same as before, but I delete one additional line to remove the CPU usage of the dom0. Again, put this on one line to make it work:
UserParameter=xen.domUs_load,core=`sudo /usr/sbin/xentop -i 1 -b| grep "CPUs:"|sed 's/[ ][ ]*/;/g'|cut -d';' -f9`;
line=`sudo /usr/sbin/xentop -i 1 -b|grep "domains:"|awk '{print $1}'`;
total=0;for cpu in `sudo /usr/sbin/xentop -i 2 -b -d 1|sed 1,9d|sed 1,${line}d|sed 's/^......//'|sed 's/[ ][ ]*/;/g'|cut -d';' -f4`; do total=`echo $total + $cpu|bc`;done;
echo `echo $total / $core|bc`
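
Deleting one more line works as long as the dom0 is always the first row. A hedged alternative is to skip the dom0 by name. The sketch below assumes the dom0 is listed as Domain-0, that "CPUs: <n>" appears on the "Mem:" line, and that domain rows carry a 6-character state field in the second column; the function name xen_domUs_load is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: same idea as the total load, but rows named
# Domain-0 are ignored, so only the domUs are summed.
# Reads a two-iteration xentop batch run on stdin and prints the
# domU CPU(%) sum of the last iteration divided by the core count.
xen_domUs_load() {
    awk '
        $1 == "Mem:" {
            for (i = 2; i < NF; i++)
                if ($i == "CPUs:") cores = $(i + 1) + 0
        }
        $1 == "NAME" { total = 0; next }     # new iteration: restart
        $1 == "Domain-0" { next }            # skip the dom0 row
        length($2) == 6 && $2 ~ /^[-a-z]+$/ { total += $4 }
        END { if (cores > 0) printf "%d\n", total / cores }
    '
}
```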

Memory used on the server, in KB, excluding the dom0

  • Since I need to compute a grand total, the line is again rather long, so I broke it into several lines. Don’t forget to put everything on the same line to make it work. The UserParameter=<key>, prefix is not shown here, and total must start at 0 or bc receives an incomplete expression on the first pass:
total=0;for mem in `sudo /usr/sbin/xentop -i 1 -b|sed 1,5d|sed 's/^......//'|sed 's/[ ][ ]*/;/g'|cut -d';' -f5`; do total=`echo $total + $mem|bc`;done;
echo $total

Memory used on the server, in %, excluding the dom0

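  • Same as before, but reading the MEM(%) column (field 6) instead of MEM(k) (field 5). Again, total must start at 0:
total=0;for mem in `sudo /usr/sbin/xentop -i 1 -b|sed 1,5d|sed 's/^......//'|sed 's/[ ][ ]*/;/g'|cut -d';' -f6`; do total=`echo $total + $mem|bc`;done;
echo $total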
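
Both memory totals follow the same pattern, so they could share one helper that sums a given column over the domU rows. This is a sketch under assumptions, not a drop-in replacement: the name xen_domUs_sum is hypothetical, the dom0 is assumed to be listed as Domain-0, and columns are counted on whitespace-separated fields (5 for MEM(k), 6 for MEM(%)).

```shell
#!/bin/sh
# Hypothetical helper: reads xentop batch output on stdin and prints the
# sum of one column over all domU rows (rows named Domain-0 are skipped).
# $1 is the column number, e.g. 5 for MEM(k) or 6 for MEM(%).
# Assumption: domain rows have a 6-character state field in column 2.
xen_domUs_sum() {
    awk -v col="$1" '
        $1 == "NAME" { total = 0; next }     # header row: restart the sum
        $1 == "Domain-0" { next }            # exclude the dom0
        length($2) == 6 && $2 ~ /^[-a-z]+$/ { total += $col }
        END { print total + 0 }
    '
}
```

For example, sudo /usr/sbin/xentop -i 1 -b | xen_domUs_sum 5 would give the memory used by the domUs in KB, and the same call with 6 would give it in %.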
