Basic introduction to Prometheus (Part 1)
Disclaimer: This article is effectively nothing more than me jotting down my own discovery of how to use Prometheus and what it is for. If you happen to be in the same situation, where you’ve heard of this tool, but never used it, feel free to read along. Knowledge of PHP and awk is useful for following along, but not essential.
This writeup took place on a Debian system, so keep in mind that some commands here might be distro specific.
What is Prometheus?
It is a systems monitoring and alerting toolkit. The basic premise of operation is that it can be told to “scrape” (make HTTP requests for) data, and store that data in a time series format. This means that each piece of scraped information is stored together with the timestamp at which it was recorded; the internal data storage is not relational and does not use SQL, and likewise one would not use SQL to query and understand the data that lives in its time series database.
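To make the “time series” idea concrete, here is a minimal sketch in plain Python (nothing Prometheus-specific, just an illustration): each sample is a (timestamp, value) pair, and queries select by time range rather than by relational joins.

```python
# A time series is just an ordered list of (unix_timestamp, value) samples.
samples = [
    (1771722742, 25.7),
    (1771722757, 29.0),
    (1771722772, 26.9),
]

def query_range(series, start, end):
    """Return all samples whose timestamp falls within [start, end]."""
    return [(ts, v) for ts, v in series if start <= ts <= end]

print(query_range(samples, 1771722750, 1771722775))
# → [(1771722757, 29.0), (1771722772, 26.9)]
```

Prometheus's actual storage is far more sophisticated, but conceptually every query boils down to this shape of lookup.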
The basic building blocks
The basic idea that needs to be understood here is that data does not get “fed” into Prometheus, but rather that Prometheus goes and makes HTTP requests to “exporters” on a periodic basis.
An exporter can be any script or application that produces output matching Prometheus’s expectations (more on this later). It simply needs to be reachable via HTTP on some port. The endpoints to scrape and the frequency of scraping are configured in the prometheus.yml file that lives in the Prometheus directory.
The neat takeaway here is that, once Prometheus is running, it consumes text-based input (the responses to the HTTP requests it sends towards exporters) and provides text-based output (e.g. simply send an HTTP request to Prometheus via curl), meaning that the services/applications/scripts built around it can be as simple or as complex as you like.
Some exporter (i.e. some script or compiled application)
↓
HTTP /metrics (text format e.g. PHP code exposed via apache, or compiled Golang app with built-in http server)
↓
Prometheus scraper
↓
Prometheus TSDB (internal storage)
↓
PromQL query engine (can play around with this on the basic frontend provided by Prometheus itself)
↓
HTTP JSON API exposed by the Prometheus engine
↓
curl / some custom frontend (e.g. Grafana)
Besides this, Prometheus also contains the building blocks to create alerting systems, but that will be a separate concern for Part 2 of this intro.
Installing and running Prometheus
Installing can be done via tarball if it’s not in your package manager. I’ve opted for the tarball option. For this, simply browse their github page, pick the most recent stable version, wget the relevant tar.gz of the release, move it to a sensible place on your system, and tar xvf to unpack (“install”) it.
This means I’ve got it all installed at:
/opt/prometheus/prometheus-3.9.1.linux-amd64
In this directory, you will see a prometheus.yml file. This file is a configuration file governing how and when prometheus should scrape data. By default, prometheus is configured to scrape data about itself, and the way to get it to scrape (send http requests to) custom applications is to add new entries here:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
        # The label name is added as a label `label_name=<label_value>` to any timeseries scraped from this config.
        labels:
          app: "prometheus"
You will also see a ./data directory. This is basically where Prometheus will save all the data it scrapes. There is usually little reason to manually dive into this directory. As with the prometheus.yml file, these are just the defaults, and other locations can be set when running the application via flags.
Running prometheus:
./prometheus --config.file=prometheus.yml --storage.tsdb.path=./data --storage.tsdb.retention.time=2d --web.listen-address="0.0.0.0:9090"
A quick run-through of these flags:
--config.file=prometheus.yml: what file to use for configuration, configuring global parameters, scrape configurations (what targets to monitor), alerting (where to send alerts once conditions are triggered), and rule files (e.g. pre-calculating complex queries)
--storage.tsdb.path=./data: where Prometheus should store its data
--storage.tsdb.retention.time=2d: for how long Prometheus should keep data around; in real life the value would be something longer than 2d
--web.listen-address="0.0.0.0:9090": how Prometheus should expose itself for data retrieval; this value means that it accepts requests on port 9090 from any IP address
Creating a simple exporter (using PHP)
As established, one does not “push data into” Prometheus. We need to set up some service that Prometheus can hit via http requests. PHP is a simple solution, with apache as the webserver for handling http requests.
Set up a simple script to generate fake data
For the sake of basic understanding, we’ll use rand() to generate some random values, just to see what we can observe about them in Prometheus.
<?php
header('Content-Type: text/plain; version=0.0.4');
$request_count = rand(100, 500);
$temperature = round(rand(200, 300) / 10, 1);
$active_users = rand(0, 50);
echo "# HELP fake_request_count Total number of requests\n";
echo "# TYPE fake_request_count counter\n";
echo "fake_request_count $request_count\n\n";
echo "# HELP fake_temperature A fake temperature gauge\n";
echo "# TYPE fake_temperature gauge\n";
echo "fake_temperature $temperature\n\n";
echo "# HELP fake_active_users Currently active users\n";
echo "# TYPE fake_active_users gauge\n";
echo "fake_active_users $active_users\n";
?>
and place it in
/var/www/html/prometheus_lab/metrics.php
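As a side exercise, the text exposition format the script emits is simple enough to parse by hand. The following Python sketch (a hypothetical helper, not part of Prometheus) pulls metric names and values out of a sample scrape, which is a useful way to internalize the contract the exporter has to honor:

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into {metric_name: value}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        name, value = line.rsplit(None, 1)    # value is always the last field
        metrics[name] = float(value)
    return metrics

# A sample of what the PHP script above would emit.
sample = """\
# HELP fake_temperature A fake temperature gauge
# TYPE fake_temperature gauge
fake_temperature 22.3
fake_active_users 7
"""
print(parse_metrics(sample))
# → {'fake_temperature': 22.3, 'fake_active_users': 7.0}
```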
Exposing the script via apache
Create a basic file like:
<VirtualHost *:80>
    ServerAdmin root
    DocumentRoot /var/www/html/prometheus_lab
    Options -Indexes
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
in /etc/apache2/sites-enabled/
Once that’s done, run systemctl restart apache2 to start exposing the php file we just created. In fact, this will also expose any subsequent php files we might drop into /var/www/html/prometheus_lab. For the sake of learning, there’s no real need to develop some kind of API.
Update prometheus.yml and restart Prometheus
Now we can add this new php “exporter” into prometheus.yml, so that Prometheus can start scraping it. Add:
  - job_name: "php_fake_data"
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:80']
    metrics_path: '/prometheus_lab/metrics.php'
and restart Prometheus.
Verify that it works
Once Prometheus is up and running again, we can simply use curl to check this. On the system where prometheus is installed, we can send the request against localhost and port 9090.
If successful, we should receive a JSON output that contains a timestamp and a value (and some metadata):
root@debian-test:/opt/prometheus/prometheus-3.9.1.linux-amd64# curl -s "http://localhost:9090/api/v1/query?query=fake_temperature"
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"fake_temperature","instance":"localhost:80","job":"php_fake_data"},"value":[1771722682.058,"22.3"]}]}}
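The JSON above is also easy to post-process programmatically. As a sketch (assuming the response shape shown above), this Python snippet pulls the metric name, timestamp, and value out of an instant-query result:

```python
import json

# The instant-query response from the curl above, pasted verbatim.
response = '{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"fake_temperature","instance":"localhost:80","job":"php_fake_data"},"value":[1771722682.058,"22.3"]}]}}'

payload = json.loads(response)
for series in payload["data"]["result"]:
    ts, value = series["value"]  # [unix_timestamp, value-as-a-string]
    print(series["metric"]["__name__"], ts, float(value))
# → fake_temperature 1771722682.058 22.3
```

Note that Prometheus returns sample values as strings inside the JSON, so a cast is needed before doing arithmetic on them.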
If we wait around for a couple of minutes, we can also check for historical values, e.g.:
root@debian-test:/opt/prometheus/prometheus-3.9.1.linux-amd64# curl -s "http://localhost:9090/api/v1/query_range?query=fake_temperature&start=$(date -d '5 minutes ago' +%s)&end=$(date +%s)&step=15"
{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"fake_temperature","instance":"localhost:80","job":"php_fake_data"},"values":[[1771722742,"25.7"],[1771722757,"29"],[1771722772,"26.9"],[1771722787,"22.1"],[1771722802,"22.1"],[1771722817,"26.1"],[1771722832,"29.8"],[1771722847,"20.4"],[1771722862,"24.7"],[1771722877,"21.9"],[1771722892,"29"],[1771722907,"24.4"],[1771722922,"22"],[1771722937,"28.5"],[1771722952,"30"],[1771722967,"25.8"],[1771722982,"27.1"],[1771722997,"23.1"],[1771723012,"21.5"],[1771723027,"29.4"],[1771723042,"23.7"]]}]}}
With this understanding, we are ready to try and replicate the output of some monitoring script with the use of Prometheus.
Quick note on “gauge” and “counter”
Counter: a number that can only increase, or be reset, but can never decrease in value. It’s meant for metrics where a total/aggregate value is useful.
Gauge: a number that can go either up or down, like the number of connected users to a database at any given moment in time.
There are other types of metrics available, but for basic understanding, these two will do fine.
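The practical difference shows up at query time: with a counter you usually care about its rate of increase, not its raw value. Here is a minimal Python sketch of the idea behind PromQL functions like increase() (ignoring Prometheus's extrapolation details, and assuming a reset means the counter restarted from zero):

```python
def counter_increase(samples):
    """Total increase over (timestamp, value) counter samples, handling resets."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr < prev:           # counter was reset; assume it restarted at 0
            total += curr
        else:
            total += curr - prev
    return total

# 100 -> 150 -> reset -> 30 -> 90: the real increase is 50 + 30 + 60 = 140
print(counter_increase([(0, 100), (15, 150), (30, 30), (45, 90)]))
# → 140.0
```

This is exactly why counters must never decrease except via reset: a decrease would be indistinguishable from a restart.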
Converting a basic monitoring script into an exporter
In a previous article, I wrote a basic script to monitor connection usage on MariaDB. Let’s see if we can reproduce this.
In my case, MariaDB is already installed and present as follows:
root@linuxpc:~# maxctrl list servers
┌─────────┬──────────────┬──────┬─────────────┬─────────────────┬──────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │
├─────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────┼─────────────────┤
│ server1 │ 192.168.2.37 │ 3306 │ 0 │ Master, Running │ 0-1-2418 │ MariaDB-Monitor │
├─────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────┼─────────────────┤
│ server2 │ 192.168.2.99 │ 3306 │ 0 │ Slave, Running │ 0-1-2418 │ MariaDB-Monitor │
└─────────┴──────────────┴──────┴─────────────┴─────────────────┴──────────┴─────────────────┘
Since the article is quite lengthy, here’s a summary of what the script does:
root@debian-test:/opt/monitoring/data# awk 'BEGIN{FS=","}{print $1,$2,$3}' file1
user 1771718101 1771718401
maxscaleuser 1 1
system user 2 2
root 2 2
Every time it executes, it counts how many connections each user has, and adds the results as a new column with its own timestamp. This script is bash/awk, so it is not very HTTP friendly.
The same logic as a PHP exporter looks the following way:
<?php
header('Content-Type: text/plain; version=0.0.4; charset=utf-8');

$mysqli = new mysqli("127.0.0.1", "[USER]", "[PASS]", "information_schema");

if ($mysqli->connect_errno) {
    echo "# HELP mariadb_exporter_up 1 if connection succeeded, 0 otherwise\n";
    echo "# TYPE mariadb_exporter_up gauge\n";
    echo "mariadb_exporter_up 0\n";
    exit;
}

echo "# HELP mariadb_exporter_up 1 if connection succeeded, 0 otherwise\n";
echo "# TYPE mariadb_exporter_up gauge\n";
echo "mariadb_exporter_up 1\n\n";

$sql = <<<SQL
SELECT p.USER, s.ATTR_VALUE, p.ID
FROM information_schema.PROCESSLIST p
LEFT JOIN performance_schema.session_connect_attrs s
    ON p.ID = s.PROCESSLIST_ID;
SQL;

$res = $mysqli->query($sql);

if (!$res) {
    echo "# HELP mariadb_exporter_query_ok 1 if query succeeded, 0 otherwise\n";
    echo "# TYPE mariadb_exporter_query_ok gauge\n";
    echo "mariadb_exporter_query_ok 0\n";
    exit;
}

echo "# HELP mariadb_exporter_query_ok 1 if query succeeded, 0 otherwise\n";
echo "# TYPE mariadb_exporter_query_ok gauge\n";
echo "mariadb_exporter_query_ok 1\n\n";

// Deduplicate rows by connection ID (the attrs join can yield several rows per connection).
$byId = [];
while ($row = $res->fetch_row()) {
    [$user, $attr, $id] = $row;
    if ($id === null) {
        continue;
    }
    $id = (string)$id;
    if (isset($byId[$id])) {
        continue;
    }
    $user = $user !== null ? trim($user) : '';
    $attr = $attr !== null ? trim($attr) : '';
    $byId[$id] = ["user" => $user, "attr" => $attr];
}

// Count connections per user.
$counts = [];
$total = 0;
foreach ($byId as $entry) {
    $k = $entry["user"];
    $counts[$k] = ($counts[$k] ?? 0) + 1;
    $total++;
}

echo "# HELP mariadb_userphp_connections Active MariaDB connections\n";
echo "# TYPE mariadb_userphp_connections gauge\n";
foreach ($counts as $user => $count) {
    $label = addcslashes($user, "\\\"\n");
    echo "mariadb_userphp_connections{user=\"$label\"} $count\n";
}

echo "\n# HELP mariadb_userphp_connections_total Total active MariaDB connections counted\n";
echo "# TYPE mariadb_userphp_connections_total gauge\n";
echo "mariadb_userphp_connections_total $total\n";
?>
This simply needs to be dropped into /var/www/html/prometheus_lab, alongside the other script. I’ve named this file mariadb_user_from_php.php.
There’s no need to restart apache for this; the file is exposed as soon as you save it.
On the other hand, for Prometheus, we’ll need to update the prometheus.yml file, and then restart it:
  - job_name: "mariadb_user_from_php"
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:80']
    metrics_path: '/prometheus_lab/mariadb_user_from_php.php'
Comparing the two results:
Latest output of the old script:
root@debian-test:/opt/monitoring/data# cat file2
user,1771725901
maxscaleuser,1
system user,2
root,1
curl against the new exporter:
root@debian-test:/opt/monitoring/data# curl -s "http://localhost:9090/api/v1/query?query=mariadb_userphp_connections" | read_response
{
"status":"success",
"data":
{
"resultType":"vector",
"result":
[
{
"metric":
{
"__name__":"mariadb_userphp_connections",
"instance":"localhost:80",
"job":"mariadb_user_from_php",
"user":"devel"
},
"value":
[
1771725909.253,
"1"
]
},
{
"metric":
{
"__name__":"mariadb_userphp_connections",
"instance":"localhost:80",
"job":"mariadb_user_from_php",
"user":"maxscaleuser"
},
"value":
[
1771725909.253,
"1"
]
},
{
"metric":
{
"__name__":"mariadb_userphp_connections",
"instance":"localhost:80",
"job":"mariadb_user_from_php",
"user":"system user"
},
"value":
[
1771725909.253,
"2"
]
}
]
}
}
We can indeed see that converting some existing script to work with Prometheus is possible, and not too difficult!
Note: read_response is a custom function I use for pretty printing JSON. You may opt for your own solution, as the output otherwise appears on a single line with no spacing.
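If you don’t want to write your own helper, the Python standard library can do the pretty printing; this sketch reformats a compact API response (on the command line, piping curl output through python3 -m json.tool achieves the same thing):

```python
import json

# A compact (single-line) response, as Prometheus returns it.
compact = '{"status":"success","data":{"resultType":"vector","result":[]}}'

pretty = json.dumps(json.loads(compact), indent=2)
print(pretty)
```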
Important Note
Scrape time is very important. Prometheus treats each scrape as an atomic operation: if the exporter does not respond within scrape_timeout, the entire scrape fails. If an SQL query takes longer to respond than the scrape_timeout in prometheus.yml, it will result in failed scrapes and gaps in the data. For this small toy database this isn’t an important consideration, but be mindful on real production datasets.
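When a slow exporter is unavoidable, the timeout can be raised per job in prometheus.yml. This is only a sketch with illustrative values; the one hard rule is that scrape_timeout must not exceed scrape_interval:

```yaml
  - job_name: "mariadb_user_from_php"
    scrape_interval: 30s
    scrape_timeout: 25s   # must be <= scrape_interval
    static_configs:
      - targets: ['localhost:80']
    metrics_path: '/prometheus_lab/mariadb_user_from_php.php'
```

A cleaner long-term fix is usually to speed up the underlying query, or to have the exporter serve a cached result rather than querying the database on every scrape.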
Final Sidenote on the up metric
While Prometheus tracks an up metric for every target out of the box, it can be handy to also include a custom one in the exporter. For example, in this case, it’s possible that the PHP script itself runs fine, but the database is down; the autogenerated up metric will still be 1. A custom metric helps draw a clearer boundary around where the failure is.
Conclusion
Even without diving into how Prometheus works internally, it is possible to set up basic monitoring tasks quickly and conveniently. Prometheus does not care how you generate the data (in terms of script/application language of choice, or complexity), as long as you follow the simple contract of giving it a stable endpoint to hit. It exposes an API for querying its stored time series data as JSON output, with which you can do as you like: log it, visualize it, etc.