Intermediate awk: Counting Pattern Matches in Logs with Regex

In the previous article, we explored how a pattern or two can be used to take a slice of logs. But sometimes we don’t just want to inspect log files; we want to analyze them. Whether it’s for KPIs, audits, or incident review, we need a way to count how often certain events have occurred.

A quick recap: all arrays in awk are associative. This means that once we’ve seen something, we can use that something as an array key, and increment the count stored under that key each time we see it again. This is captured by the following awk idiom, which counts every distinct line and reports the totals at the end:

awk '{a[$0]++} END{for (k in a) print k, a[k]}' file

While this is generally useful, we mustn’t forget that $0 refers to the entire line, and in the context of log files we most likely do not want to use an entire line as an array key.
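To see the problem, consider two log lines that record the same event but differ only in their timestamps (the lines below are made up for illustration). Keyed on $0, they land in separate buckets:

```shell
printf '%s\n' \
  '2025-04-17 22:40:53 server2 slave_down' \
  '2025-04-17 23:03:36 server2 slave_down' |
awk '{a[$0]++} END{n = 0; for (k in a) n++; print n " distinct keys"}'
# prints "2 distinct keys" -- the same event was counted under two keys
```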

Let’s say we’re using MaxScale logs to count how many times each replica database has gone down. An easy way to identify a replica uniquely is by its IP address, which makes IP addresses excellent array keys. Therefore, we need a way to extract IP addresses from $0.

Fortunately, GNU awk (gawk) provides a three-argument form of the match() function, which allows us to search a string with a regular expression and save the result to an array:

match(<input string>, /regex/, <output array>);

Here, if we call our array arr, then the text of the first match found by our regex in the input string is stored in arr[0].

Lastly, it’s useful to know that awk allows us to reassign $0 at any time. So once we’ve grabbed the part of the log line we need, we can assign that value to $0 (awk then re-splits the new $0 into fields). This keeps the code cleaner and shorter.
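A minimal, portable illustration of reassigning $0 (the input string here is made up):

```shell
echo 'prefix 192.168.2.99 suffix' |
awk '{
    $0 = $2        # keep only the second field
    print $0, NF   # the new $0 was re-split: it is now a single field
}'
# prints "192.168.2.99 1"
```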

For reference, we’ll be looking to count log lines like

root@linuxpc:/var/log/maxscale# grep "slave_down" maxscale_test.log | head -n2
2025-04-17 22:40:53.712   notice : (log_state_change): Server changed state: server2[192.168.2.99:3306]: slave_down. [Slave, Running] -> [Down]
2025-04-17 23:03:36.176   notice : (log_state_change): Server changed state: server2[192.168.2.99:3306]: slave_down. [Slave, Running] -> [Down]
root@linuxpc:/var/log/maxscale# grep "slave_up" maxscale_test.log | head -n2
2025-04-17 23:04:00.186   notice : (log_state_change): Server changed state: server2[192.168.2.99:3306]: slave_up. [Down] -> [Slave, Running]
2025-04-17 23:08:05.291   notice : (log_state_change): Server changed state: server2[192.168.2.99:3306]: slave_up. [Down] -> [Slave, Running]

While we’re at it, we might as well also count how many times replicas have come back online.

#!/usr/bin/awk -f
{
    # Extract the replica's IP address, make it the new $0, and count it.
    if ($0 ~ /slave_down/) { match($0, /([0-9]+\.){3}[0-9]+/, m); $0 = m[0]; sd[$0]++ }
    if ($0 ~ /slave_up/)   { match($0, /([0-9]+\.){3}[0-9]+/, m); $0 = m[0]; su[$0]++ }
}
END{
    print "Slaves down counter:"
    for (x in sd) {
        print x","sd[x];
    }
    print "Slaves up counter:"
    for (y in su) {
        print y","su[y];
    }
}

Example output:

root@linuxpc:/var/log/maxscale# awk -f log_analysis.awk maxscale.log
Slaves down counter:
192.168.2.94,1
192.168.2.99,13
Slaves up counter:
192.168.2.99,13

This concept is easy to extend to other contexts: counting 404s in Apache logs, failed SSH logins, or even the number of open ports in nmap output.
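For instance, a sketch of the Apache case, using made-up access-log lines in the common log format (where the status code is field 9, so we don’t even need match() here):

```shell
# Count 404 responses per client IP.
printf '%s\n' \
  '10.0.0.5 - - [17/Apr/2025:22:40:53 +0000] "GET /missing HTTP/1.1" 404 196' \
  '10.0.0.5 - - [17/Apr/2025:22:41:10 +0000] "GET /also-missing HTTP/1.1" 404 196' \
  '10.0.0.7 - - [17/Apr/2025:22:42:00 +0000] "GET / HTTP/1.1" 200 1024' |
awk '$9 == 404 { c[$1]++ } END { for (ip in c) print ip "," c[ip] }'
# prints "10.0.0.5,2"
```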

With that, this post concludes.