FreeBSD virtual environment management and repository


Export and display jail and bhyve statistic metrics with CBSD, Grafana and Prometheus

It is often useful to know how many resources your containers and virtual machines consume, broken down by component and over time, and to visualize this as graphs (in this article we are talking about FreeBSD jail(8) containers and bhyve(8) virtual machines). The metrics of primary interest are usually the basic ones:

  • CPU usage;
  • Memory usage;
  • Storage I/O bandwidth;
  • Storage IOPS;
  • Traffic bandwidth;

FreeBSD offers several overlapping sources for these metrics: CPU usage can be obtained from bhyve itself, from the rctl(4) framework, or from the top(1) and ps(1) tools (via kvm_getprocs(3) in particular); I/O information is available via GEOM, 'zpool iostat', or vmstat(8)/iostat(8); traffic usage and bandwidth can be read from netstat(1), ipfw(8) counters, pf(4), and so on.

The CBSD scripts can collect and accumulate these statistics and provide an interface for further processing. For example, the trafstat utility shows traffic information in CBSD for a particular container. This article describes how to export the metrics for graphical display. Starting with version 11.1.0, CBSD can export the metrics from the jrctl and trafstat utilities for both jail and bhyve in a unified Prometheus format.

Prometheus Configuration

So, our components are:

  • FreeBSD as an OS of general use, on which everything happens;
  • bhyve and jail as the hypervisor and the container, which run the services whose metrics we will collect;
  • CBSD - the bhyve and jail management framework, which also allows you to obtain statistics on a hypervisor or container in the prometheus format;
  • Prometheus - a system for collecting metrics as key-value pairs that supports selecting data with queries, has an efficient storage engine (leveldb) and built-in integration with Grafana, and provides client libraries for popular programming languages;
  • Grafana - a popular and flexible system of Dashboards with analytics and graphs on any metrics;

To get information about a virtual machine or container by their name in CBSD, you can use the 'cbsd jrctl' command. This script uses the metrics provided by the RACCT framework, so make sure that your /boot/loader.conf has the appropriate setting:

kern.racct.enable=1
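
Before relying on 'cbsd jrctl', it can help to confirm that the setting is actually active on the running kernel. The following is a minimal sketch, not part of CBSD: the helper name racct_enabled is hypothetical, and it assumes the kern.racct.enable sysctl reports the current state (a change in /boot/loader.conf normally takes effect only after a reboot).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when RACCT accounting is enabled.
# With an argument, the given value is checked (handy for testing);
# without one, the live value of the kern.racct.enable sysctl is read.
racct_enabled() {
    if [ $# -gt 0 ]; then
        val="$1"
    else
        val=$(sysctl -n kern.racct.enable 2>/dev/null)
    fi
    [ "${val}" = "1" ]
}
```

It could be used as a guard at the top of a collection script, for example: racct_enabled || echo "RACCT is disabled, check /boot/loader.conf".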

By default, 'cbsd jrctl' shows information in key-value format:

% cbsd jrctl mode=show jname=f11
datasize=2080K
stacksize=264K
coredumpsize=0
memoryuse=63M
memorylocked=0
maxproc=1
openfiles=40
vmemoryuse=1099M
nthr=29
nsemop=0
wallclock=6206
pcpu=0
readbps=0
writebps=0
readiops=0
writeiops=0

If you omit the jname= argument, you will see information on all available environments. If the command is called with the prometheus=1 argument, the output will be in the format that Prometheus expects from its targets.
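
The key-value output shown earlier uses human-readable suffixes (2080K, 63M), while graphing tools usually want plain numbers. If you post-process that output yourself, a conversion along these lines may be useful. This is only a sketch: the function name to_bytes is hypothetical, and it assumes that the suffixes are binary multiples (K = 1024, M = 1024 * 1024) and that the values are integers.

```shell
#!/bin/sh
# Hypothetical helper: convert a value such as 63M or 2080K to bytes.
# Assumption: K, M and G are powers of 1024; values without a suffix
# are passed through unchanged.
to_bytes() {
    case "$1" in
        *K) echo $(( ${1%K} * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}
```

With these assumptions, to_bytes 63M prints 66060288 and to_bytes 2080K prints 2129920.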

Metrics can be exported to Prometheus in different ways. One way is to send the metrics directly to Prometheus from any convenient programming language that has a Prometheus client library.

We implement the second option, in which the Prometheus server itself polls the endpoints and collects the necessary statistics. To do this, you can write a simple daemon that proxies the output of the command 'cbsd jrctl mode=show prometheus=1' on a specific TCP port. Alternatively, to avoid potential security problems (such a daemon must be protected from possible intruders, since the 'cbsd' command requires a privileged user), you can redirect the output of 'cbsd jrctl' into an intermediate file that is served by any web server, for example NGINX.

Schematically, it might look like this:

The collector that generates the index.html file can be a self-written daemon, a script called from cron, or a daemonized sh script like the one described later in this article.
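
For the cron variant, the main concern is that the web server never serves a half-written file. One possible approach, sketched here with a hypothetical function name publish_metrics, is to write the data into a temporary file in the target directory and then rename it over index.html. Note that this uses a rename instead of the symbolic link switch used by the script in this article.

```shell
#!/bin/sh
# Hypothetical helper: atomically replace <dir>/index.html with the
# data read from standard input.
publish_metrics() {
    dir="$1"
    mkdir -p "${dir}" || return 1
    # The temporary file lives in the same directory, so mv is a rename
    tmp=$(mktemp "${dir}/index.XXXXXX") || return 1
    if cat > "${tmp}"; then
        chmod 0644 "${tmp}"
        mv -f "${tmp}" "${dir}/index.html"
    else
        rm -f "${tmp}"
        return 1
    fi
}
```

A wrapper script run from cron could then call: /usr/local/bin/cbsd jrctl mode=show prometheus=1 | publish_metrics /tmp/metrics/rctl. Keep in mind that cron cannot run jobs more often than once a minute, which is coarser than a 15 second scrape interval.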

Each node exports its metrics via http://<FQDN>/rctl/, and a Prometheus job collects the statistics and provides them to Grafana.

We will also install Prometheus and Grafana in a jail. Run the dialog to create the container:

% cbsd jconstruct-tui

As jname, enter an arbitrary name, for example: grafana

Change the remaining parameters at your discretion (if necessary) and press 'Go PROCEED!'

Let's start the jail, log in to it, and install the prometheus and grafana packages:

% cbsd jstart grafana
% cbsd jlogin grafana
% pkg install -y net-mgmt/prometheus www/grafana4

Enable the services at startup:

% sysrc grafana_enable="YES" 
% sysrc prometheus_enable="YES" 

Edit the configuration file /usr/local/etc/prometheus.yml:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
      monitor: 'codelab-monitor'

rule_files:

scrape_configs:
  - job_name: rctl
    metrics_path: /rctl/
    static_configs:
      - targets: ['rctl.olevole.ru:80']

where:

  • rctl.olevole.ru:80 - the FQDN of our host and the nginx port, which we will configure below.
  • scrape_interval: 15s - the interval at which Prometheus polls the metrics. We have chosen 15 seconds.
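
When there are several nodes, the targets line in the configuration above grows accordingly. The sketch below generates that line from a list of host names. The function name gen_targets is hypothetical, and it assumes that every node serves its metrics on port 80, as in this article.

```shell
#!/bin/sh
# Hypothetical helper: print a static_configs "targets" line for the
# given hosts. Assumption: every host exports metrics on port 80.
gen_targets() {
    out=""
    for host in "$@"; do
        # Append ", " only between entries, not before the first one
        out="${out:+${out}, }'${host}:80'"
    done
    printf "      - targets: [%s]\n" "${out}"
}
```

For example, gen_targets rctl.olevole.ru node2.example.org prints a line that lists both hosts, where node2.example.org is a made-up second node used only for illustration.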

Start the services:

% service grafana start
% service prometheus start

That completes the setup of the container and its services.

If you do not have direct access to the container, run the following commands on the host to forward the required ports:

% cbsd expose mode=add in=3000 jname=grafana
% cbsd expose mode=add in=9090 jname=grafana

Next, we'll write a script that runs when the server starts and regenerates the index.html file with metrics from 'cbsd jrctl' at a fixed interval. It makes sense to use the same interval at which Prometheus scrapes the data; earlier we set this to 15 seconds.

The script itself can look like this:

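#!/bin/sh
# Periodically publish 'cbsd jrctl' metrics as <root_dir>/index.html.
# Usage: cbsdrctl -r <root_dir> [-i <interval_seconds>]
while getopts "r:i:" opt; do
        case "$opt" in
                r) root_dir="${OPTARG}" ;;
                i) interval="${OPTARG}" ;;
        esac
done
# Drop the parsed options only after getopts has seen all of them
shift $((OPTIND - 1))

if [ -z "${root_dir}" ]; then
        echo "Empty root_dir, please use $0 -r path"
        exit 1
fi

# mktemp creates its files under TMPDIR, i.e. next to index.html
export TMPDIR="${root_dir}"
unset jname

[ -z "${interval}" ] && interval="15"
[ ! -d "${root_dir}" ] && mkdir -p "${root_dir}"

while true; do
        # Remember the file that the current symlink points to
        INDEX_OLD=$( readlink "${root_dir}/index.html" )
        INDEX_TMP=$( mktemp )

        # Remove the temporary file when a signal arrives or the script exits
        trap "/bin/rm -f ${INDEX_TMP}" HUP INT ABRT BUS TERM EXIT

        chmod 0644 "${INDEX_TMP}"

        # -t0: do not wait if another instance already holds the lock
        /usr/bin/lockf -s -t0 /tmp/cbsd-rctl.lock /usr/local/bin/cbsd jrctl mode=show prometheus=1 > "${INDEX_TMP}"

        # Switch the symlink to the fresh file, then delete the previous one
        ln -sf "${INDEX_TMP}" "${root_dir}/index.html"
        [ -n "${INDEX_OLD}" ] && rm -f "${INDEX_OLD}"
        sleep "${interval}"
done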

The -r parameter specifies the root directory in which index.html is stored, and the -i parameter sets the refresh interval in seconds.

As we can see, the script enters an infinite loop with a pause at the end.

Save this file as /root/bin/cbsdrctl and make it executable:

% chmod +x /root/bin/cbsdrctl

It remains to create an rc.d script that controls the start and stop of /root/bin/cbsdrctl.

Let's write the following script:

#!/bin/sh
#
# PROVIDE: cbsdrctl
# REQUIRE: LOGIN FILESYSTEMS sshd
# KEYWORD: shutdown
#
# cbsdrctl_enable="YES"
#

. /etc/rc.subr

name=cbsdrctl
rcvar=cbsdrctl_enable
load_rc_config $name

start_cmd=${name}_start
stop_cmd=${name}_stop
status_cmd="${name}_status"
restart_cmd=${name}_restart
extra_commands="restart"

# Set defaults
: ${cbsdrctl_enable:="NO"}
: ${cbsdrctl_interval:="15"}
: ${cbsdrctl_root:="/tmp/metrics/rctl"}

cbsdrctl_start()
{
        if [ -r ${pidfile} ]; then
                echo "Already running: `cat ${pidfile}`"
                exit 0
        fi
        /usr/sbin/daemon -f -p ${pidfile} /root/bin/cbsdrctl -r ${cbsdrctl_root} -i ${cbsdrctl_interval}
        echo "STARTED: -r ${cbsdrctl_root} -i ${cbsdrctl_interval}"
}

cbsdrctl_status()
{
        if [ -f "${pidfile}" ]; then
                pids=$( pgrep -F ${pidfile} 2>&1 )
                _err=$?
                if [ ${_err} -eq  0 ]; then
                        echo "Running"
                else
                        echo "Not running"
                fi
        else
                echo "Not running"
        fi
}

cbsdrctl_restart()
{
        cbsdrctl_stop
        cbsdrctl_start
}

cbsdrctl_stop()
{
        if [ -f "${pidfile}" ]; then
                pids=$( pgrep -F ${pidfile} 2>&1 )
                _err=$?
                if [ ${_err} -eq  0 ]; then
                        kill -9 ${pids} && /bin/rm -f ${pidfile}
                else
                        echo "pgrep: ${pids}"
                        return ${_err}
                fi
        fi
}

pidfile=/var/run/$name.pid
run_rc_command "$1"

Save this script as /usr/local/etc/rc.d/cbsdrctl and set its owner and permissions:

% chown root:wheel /usr/local/etc/rc.d/cbsdrctl
% chmod 0555 /usr/local/etc/rc.d/cbsdrctl

The interval and the working directory are controlled by variables:

cbsdrctl_interval (value by default: 15)

cbsdrctl_root (value by default: /tmp/metrics/rctl)

If these values do not suit you, you can redefine them in rc.conf:

% sysrc cbsdrctl_interval="15"
% sysrc cbsdrctl_root="/tmp/metrics/rctl"

Enable the script at startup:

% sysrc cbsdrctl_enable="YES"

And finally, let's start:

% service cbsdrctl start

NGINX Configuration

If the previous step succeeded, the symbolic link index.html in the /tmp/metrics/rctl directory will be updated regularly, pointing to the file with the current metrics.

We need to configure a web server to serve these files. Install nginx on our node:

% pkg install -y nginx
% sysrc nginx_enable=YES

Add a description of our virtual host to the NGINX configuration file /usr/local/etc/nginx/nginx.conf (in this sample, the FQDN of the server is rctl.olevole.ru):

server {
        listen       80;
        listen      [::]:80;
        server_name  rctl.olevole.ru;

        location /rctl {
                root /tmp/metrics;
        }
}

Then run NGINX:

% service nginx start

At this stage, when you open http://<FQDN>/rctl/ you should see the contents of the file /tmp/metrics/rctl/index.html.
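
The same check can be made from the command line. The sketch below counts the sample lines in a Prometheus text page read from standard input, skipping comment lines (which start with #) and blank lines. The function name count_samples is hypothetical, and the check is deliberately loose: it only confirms that the page is not empty and roughly has the shape of "name value" lines.

```shell
#!/bin/sh
# Hypothetical helper: count metric samples in Prometheus text format
# read from stdin. Lines starting with '#' and blank lines are ignored.
count_samples() {
    awk '!/^#/ && NF >= 2 { n++ } END { print n + 0 }'
}
```

On a FreeBSD node it could be combined with fetch(1), for example: fetch -qo - http://<FQDN>/rctl/ | count_samples. A result of 0 suggests that the collector script is not writing any metrics.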

Check the Work

We can make sure that Prometheus receives the metrics correctly. To do this, open the web interface that the prometheus process provides on port 9090.

In the drop-down list next to the Execute button, you will see all the metrics that CBSD provides. Metrics that refer to a container start with the prefix jail_, and metrics for a virtual machine start with the prefix bhyve_.

It remains to put this data to use in Grafana and create a separate dashboard for our virtual environments.
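
The prefixes also make it easy to inspect the exported page outside the browser. The following sketch lists the distinct metric names that start with a given prefix. The function name metric_names is hypothetical, and the input is assumed to be the Prometheus text page read from standard input; the exact metric names and labels that CBSD emits are not shown in this article, so none are assumed here.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct metric names that begin with
# the given prefix (for example jail_ or bhyve_), one per line.
metric_names() {
    prefix="$1"
    awk -v p="${prefix}" '
        /^#/ || NF < 2 { next }            # skip comments and blank lines
        {
            name = $1
            sub(/[{].*/, "", name)         # drop the {label="..."} part, if any
            if (index(name, p) == 1) print name
        }' | sort -u
}
```

For example: fetch -qo - http://<FQDN>/rctl/ | metric_names bhyve_ would list only the virtual machine metrics.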

GRAFANA

To work with Grafana, open the server in the browser on port 3000. By default, a Grafana installation has the user 'admin' with the password 'admin'.

The first screen offers to configure various components: create users, add data sources, and install additional plug-ins.

Since we receive data from Prometheus, we add the appropriate source. In this example, since Prometheus works in the same environment as Grafana, we can set the URL to http://localhost:9090. Click Save & Test and proceed to create the first graph.

To add an individual metric, start typing its name in the Metric lookup field. Autocompletion works here, which makes it easier to find the desired parameter.

Consult the rctl man page to see in which units each statistic is reported (percent, bytes, bytes per second, or arbitrary units) and set the unit of the graph accordingly.
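
As a quick reference, the units of the resources that appear in the 'cbsd jrctl' output can be summarized in a small lookup function. This is a sketch based on the resource descriptions in rctl(8); the function name rctl_unit is hypothetical, and whether CBSD converts any values before exporting them is not covered in this article, so check the actual numbers before choosing a unit in Grafana.

```shell
#!/bin/sh
# Hypothetical helper: print the unit in which rctl(8) accounts the
# given resource. Unknown names yield "unknown".
rctl_unit() {
    case "$1" in
        datasize|stacksize|coredumpsize|memoryuse|memorylocked|vmemoryuse|swapuse)
            echo "bytes" ;;
        readbps|writebps)   echo "bytes/sec" ;;
        readiops|writeiops) echo "ops/sec" ;;
        pcpu)               echo "percent" ;;   # percent of a single CPU core
        cputime|wallclock)  echo "seconds" ;;
        maxproc|openfiles|nthr) echo "count" ;;
        *)                  echo "unknown" ;;
    esac
}
```

For example, rctl_unit pcpu prints percent and rctl_unit memoryuse prints bytes.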

As we can see, it is quite easy to export the available parameters for both jail and bhyve. New containers and virtual machines on a node automatically appear in the metric database, and you do not have to touch the configured scripts. In this example, we take the metrics from FreeBSD RACCT and make them available for collection from each server at the path /rctl. You can extend the set of metrics: for example, the path /trafstat could serve traffic metrics for containers, and /billing could serve metrics based on a tariff schedule for consumed resources. This work is limited only by your imagination and your need for information about the state of your systems and trends in resource usage. Such information lets you keep your finger on the pulse and predict potential problems ahead of time, and with FreeBSD and CBSD you can get it easily enough.