MariaDB connection usage metrics

Seeing the connection limit maxed out in a database is always frustrating, especially if the DBA themselves are also locked out by the time the problem is noticed.

Without active monitoring and trend capture, this can happen. Approximately two years ago, I wrote an article on handling things once it has already happened. This one is about seeing it coming, creating a watchful eye, so that there is time to react before the lockout occurs.

Note: This article assumes some basic familiarity with awk. If you are completely new to awk, some sections may be challenging to follow.

The Problem

There are basically two things we need to handle here:

MariaDB doesn’t retain any connection history, so there’s no built-in log of which user connected or disconnected at what time. The only tool we have is to check information_schema.PROCESSLIST to get a current snapshot of connections. This means every time we take a snapshot, we need to append the results to an existing dataset of the previous snapshot(s).

We need to keep in mind that connections are transient: someone might connect to a database, run a few queries, disconnect, and then reconnect again later. If we want to capture this accurately, we need to make sure that the way we join a new snapshot onto the existing dataset takes this into account properly.

Getting the connection data

This is the simple part. The first step is, of course, to query the database for the base information.

SELECT p.USER, s.ATTR_VALUE, p.ID FROM information_schema.PROCESSLIST p LEFT JOIN performance_schema.session_connect_attrs s ON p.ID = s.PROCESSLIST_ID;

This query gives a list of who is connected, along with some details on the connection's attributes, like the connector library, the OS, and the reported program name. These extra details are not essential, but in my situation they are often useful to check, so I left them in my script as an easy copy-pasteable.
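To give an idea of the shape of the result, each connection usually comes back as several rows, one per connection attribute. A hypothetical connection with ID 42 from a user named app_rw might look roughly like this (the attribute values are made up for illustration):

USER    ATTR_VALUE    ID
app_rw  libmariadb    42
app_rw  Linux         42
app_rw  myapp         42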

With a little bit of awk it is trivial to group everything together by connection id. Simply pipe the SQL statement into the mysql client, output it as CSV, and do a simple grouping on column 3 with awk:

# Collapse the attribute rows into one row per connection id.
# Input columns: USER, ATTR_VALUE, ID
BEGIN{
    FS=",";
    OFS=",";
}
# NR>1 skips the header row produced by the mysql client
(NR>1){
    # keep only the first USER/ATTR_VALUE pair seen for each connection id
    if (!($3 in a)) {
        a[$3]=$1","$2;
    }
}
END{
    # print one line per connection id: ID,USER,ATTR_VALUE
    for(x in a){
        print x,a[x];
    }
}

This collapses all the extra rows spawned by the join against session_connect_attrs into a single row per connection id, creating a compact, deduplicated snapshot of active connections.
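As a small sketch of how this wires together, assuming the awk block above is saved as group_by_id.awk (a name used here purely for illustration), the grouped snapshot can be produced like this:

# query the process list, convert the tab-separated output to CSV,
# then collapse it to one row per connection id
echo "SELECT p.USER, s.ATTR_VALUE, p.ID FROM information_schema.PROCESSLIST p LEFT JOIN performance_schema.session_connect_attrs s ON p.ID = s.PROCESSLIST_ID;" \
    | mysql -u root -A \
    | sed 's/\t/,/g' \
    | awk -f group_by_id.awk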

Since it is possible (and common) for large applications to establish many connections at once, we'll need to aggregate this data by user. The following snippet groups all rows by user and counts how many connections each user currently has:

BEGIN{
    FS=",";
    OFS=",";
}
{
    # column 2 is the USER field from the previous step; count rows per user
    a[$2]++;
}
END{
    # header row: the literal "user" plus the epoch timestamp of this snapshot
    print "user,"systime();
    for (x in a){
        print x,a[x];
    }
}

This prints a simple two-column snapshot: the username and the number of active connections, with the timestamp of the snapshot as the header above the counts.

With this background, we can lay out the framework of our monitoring and trend-collecting script:

#!/bin/bash

DATA_DIR=$HOME/monitoring/data
mkdir -p $DATA_DIR
DATE_STR=$(date '+%Y%m%d')

# the session connect attrs stuff is not necessary, but I like to have this query ready as a copy-pasteable, because in my case, inquiries into who/what specifically connects is common
function first_file() {
    echo "SELECT p.USER, s.ATTR_VALUE, p.ID FROM information_schema.PROCESSLIST p LEFT JOIN performance_schema.session_connect_attrs s ON p.ID = s.PROCESSLIST_ID;" | mysql -u root -A | sed 's/\t/,/g' | awk 'BEGIN{FS=","; OFS=",";}(NR>1){if(a[$3]){a[$3]=a[$3]} else {a[$3]=$1","$2} }END{for(x in a){print x,a[x];} }' | awk 'BEGIN{FS=","; OFS=",";}{a[$2]++}END{print "user,"systime(); for (x in a){print x,a[x]}}' > $DATA_DIR/file1
}

function incremential_columns() {
    echo "SELECT p.USER, s.ATTR_VALUE, p.ID FROM information_schema.PROCESSLIST p LEFT JOIN performance_schema.session_connect_attrs s ON p.ID = s.PROCESSLIST_ID;" | mysql -u root -A | sed 's/\t/,/g' | awk 'BEGIN{FS=","; OFS=",";}(NR>1){if(a[$3]){a[$3]=a[$3]} else {a[$3]=$1","$2} }END{for(x in a){print x,a[x];} }' | awk 'BEGIN{FS=","; OFS=",";}{a[$2]++}END{print "user,"systime(); for (x in a){print x,a[x]}}' > $DATA_DIR/file2
    awk -f $HOME/monitoring/incremential.awk $DATA_DIR/file1 $DATA_DIR/file2 > $DATA_DIR/temp
    cat $DATA_DIR/temp > $DATA_DIR/file1
    rm $DATA_DIR/temp
}


function send_result() {
    echo "some mailx command here..."
}

function dummy() {
    echo "..."
}

while getopts fis OPT; do
    case $OPT in
        f) first_file ;;
        i) incremential_columns ;;
        s) send_result ;;
        *) dummy ;;
    esac
done

This simple bash script outlines the framework of what we need:

The idea is to run first_file once at the beginning of the day from cron, then run incremential_columns every x minutes until the end of the day, and call send_result once the final incremental run is over.
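As a rough sketch (the times and paths here are assumptions, adjust to your environment), the cron schedule could look something like this:

# hypothetical schedule: baseline snapshot in the morning, an incremental
# snapshot every 15 minutes during the day, then the summary mail at night
55 6 * * *       /bin/bash $HOME/monitoring/connection_monitor.sh -f
*/15 7-22 * * *  /bin/bash $HOME/monitoring/connection_monitor.sh -i
0 23 * * *       /bin/bash $HOME/monitoring/connection_monitor.sh -s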

Joining the incremental files

Now the final task remains: joining the incremental file back onto the baseline file, while keeping in mind that connections can be, and often are, transient. If someone, or some application, connects and disconnects frequently, we want to make sure that this is captured and allocated accurately in the file.

Unfortunately, the most commonly quoted ways of doing simple file joins with awk will not work in this situation, because they usually assume that we want to keep only the rows that appear in both files, and that is not the case here. The join logic needs to handle new and missing entries, and cannot just discard them.

At this point, think of file1 as our history and file2 as a new snapshot. The job of incremential.awk is to merge them into a timeline.
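To make the desired behaviour concrete, here is a small hypothetical illustration (the usernames and timestamps are made up): file1 holds one earlier snapshot, file2 is the new one, and a user named app_rw appears for the first time in file2.

# file1 (history)
user,1700000000
root,2

# file2 (new snapshot)
user,1700000300
root,2
app_rw,1

# merged result: app_rw gets an empty field for the snapshot it missed
user,1700000000,1700000300
root,2,2
app_rw,,1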

Let's take a look at the incremential.awk script that handles the join as we expect:

#!/bin/awk

BEGIN{
    FS=",";
    OFS=",";
}
# first pass: read the history file (file1) into memory
(NR == FNR) {
    row[$1] = $0;          # full history line, keyed by username
    col[$1] = NF - 1;      # number of snapshot columns this user already has
    max = (NF - 1 > max) ? NF - 1 : max;
    a[$1] = 1;             # mark the user as "not yet seen in the new snapshot"
    next;
}
# second pass: the new snapshot (file2)
{
    k = $1;
    if (k in row) {
        # user already known: append the new count to the history line
        print row[k]","$2;
        delete a[k];
    } else {
        # brand new user: pad the snapshot columns it missed with empty
        # fields, then append the new count
        printf "%s", k;
        for (y = 1; y <= max; y++) {
            printf ",";
        }
        print ","$2;
    }
}
END{
    # users left in a[] were in the history but not in the new snapshot:
    # carry their line over, padded with an empty field for the missed snapshot
    for (k in a) {
        printf "%s", row[k];
        for (z = col[k]; z <= max; z++) {
            printf ",";
        }
        print "";
    }
}

There's quite a lot to unpack here, since this script handles users who were and still are connected, newly connected users, and re-connecting users. So let's break down what happens here:

While we are still reading file1 (the NR == FNR block), every history line is stored keyed by username, the number of snapshot columns that line already has is remembered, and each user is marked as "not yet seen" in the a array. For every line of file2, if the user already exists in the history, the new count is appended to their line and the mark is cleared; if the user is brand new, the snapshot columns they missed are padded with empty fields before the count is appended. Finally, the END block handles users who were in the history but are absent from the new snapshot: their line is carried over with an extra empty field, so that a later reconnect lands in the correct column.

Here’s how it looks in action:

root@debian-test:~/monitoring# bash connection_monitor.sh -f
root@debian-test:~/monitoring# cat data/file1
user,1761421260
devel,1
maxscaleuser,1
root,2
root@debian-test:~/monitoring# bash connection_monitor.sh -i
root@debian-test:~/monitoring# cat data/file1
user,1761421260,1761421278
maxscaleuser,1,1
root,2,2
devel,1
root@debian-test:~/monitoring# bash connection_monitor.sh -i
root@debian-test:~/monitoring# cat data/file1
user,1761421260,1761421278,1761421293
devel,1,,1
maxscaleuser,1,1,1
root,2,2,2

Conclusion

Tracking connection metrics means that you can react to connection usage spikes before you get locked out of the database yourself. If time and resources allow it, you can feed the dataset into a tool such as Grafana to visualize trends and manage connection monitoring in a more intuitive way. In my situation, using simple Unix tools made the most sense, as installing any new off-the-shelf tooling requires an arduous gauntlet of approval and permission processes.

Appendix

If you just need a quick understanding of the database connection totals in a more "let's just eyeball this" fashion, you can sum each column and transpose it into a count against each snapshot timestamp:

#!/bin/awk

BEGIN{
    FS = ","
}

# header row: find the columns that hold epoch timestamps
(NR == 1) {
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^[0-9]+$/) {
            numeric[i] = 1;
            header[i] = $i;
            cols++;
        }
    }
    if (cols == 0) {
        exit 1;
    }
    ncols = NF;    # remember the header width for the END block
    next;
}
# data rows: add up the per-user counts for every timestamp column
{
    for (i = 1; i <= NF; i++) {
        if (i in numeric && $i ~ /^[0-9]+$/) {
            sum[i] += $i;
        }
    }
}
END{
    # strftime() is a gawk extension
    printf "Column totals:\n"
    for (i = 1; i <= ncols; i++) {
        if (i in numeric) {
            print strftime("%F %T", substr(header[i],1,10)),": "sum[i];
        }
    }
}

So this way, you can get an easy overview of the connection totals at each timestamp:

Column totals:
2025-10-25 21:41:00 : 4
2025-10-25 21:41:18 : 3
2025-10-25 21:41:33 : 4
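Assuming the script above is saved as column_totals.awk (a filename I picked purely for illustration), it can be run directly against the accumulated data file; note that strftime() needs gawk rather than mawk or plain POSIX awk:

# sum up the per-user counts for each snapshot in the day's data file
awk -f $HOME/monitoring/column_totals.awk $HOME/monitoring/data/file1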