Exploring a simple deployment pipeline with Terraform
These are working notes on terraform. The goal was to take an existing maxscale cluster and be able to add/remove mariadb instances running in docker containers. The focus was to explore how to build a system around terraform, and to get an idea of what should, and what should not, be done within it.
The Problem
Assuming we already have a basic set-up, we might want to add or remove servers:
root@linuxpc:~# maxctrl list servers
┌─────────┬──────────────┬──────┬─────────────┬─────────────────┬───────────┬─────────────────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │ Monitor │
├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────┼─────────────────┤
│ server1 │ 192.168.2.37 │ 3306 │ 0 │ Master, Running │ 0-1-61043 │ MariaDB-Monitor │
├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────┼─────────────────┤
│ server2 │ 192.168.2.99 │ 3306 │ 0 │ Slave, Running │ 0-1-61043 │ MariaDB-Monitor │
├─────────┼──────────────┼──────┼─────────────┼─────────────────┼───────────┼─────────────────┤
│ server3 │ 192.168.2.98 │ 3306 │ 0 │ Slave, Running │ 0-1-61043 │ MariaDB-Monitor │
└─────────┴──────────────┴──────┴─────────────┴─────────────────┴───────────┴─────────────────┘
We want to make sure that terraform itself will not try to interfere with these existing servers.
This means that we will need a way to tell terraform about “where to start” based on the output of maxscale, and create the server variables dynamically for terraform.
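As a quick sketch of that idea, the highest existing serverN in the maxctrl output can be pulled out with standard text tools (the table's first column is inlined here for illustration, so the snippet runs without maxctrl):

```shell
# Illustration only: the first column of the maxctrl table, inlined.
table='server1 192.168.2.37
server2 192.168.2.99
server3 192.168.2.98'

# Highest existing server number...
last=$(printf '%s\n' "$table" | grep -oE 'server[0-9]+' | sed 's/server//' | sort -n | tail -n1)

# ...and the next free one, i.e. "where to start"
echo "boundary=$((last + 1))"    # → boundary=4
```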
Operational Workflow Plan
- Have a simple command to trigger the workflow
- Discover existing servers
- Compute a boundary, to determine which server number is our last server
- Generate a locals.tf in case we plan to add new servers
- Have an option for terraform apply to add new servers, and an option for terraform destroy to remove them; in either case, we need to update the maxscale config
- Running terraform apply twice with different params should not destroy any resources; this requires us to identify created resources/servers with batching
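Put together, the intended interface is a single make invocation per operation; roughly along these lines (the parameters match the ones introduced later in these notes):

```
make OPERATION=APPLY NUMBER=2 PORT=3313   # discover, generate locals.tf, add two servers
make OPERATION=DESTROY                    # remove the servers this state created
```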
The intended layout of the project is:
project/
├── Makefile
├── config.mk
├── scripts/
│ ├── new_batch.sh
│ ├── deploy_to_new_batch.sh
│ ├── destroy_in_batch.sh
│ ├── delete_batch_dir.sh
│ └── list_batches.sh
├── batches/ (generated)
└── terraform-template/
├── awk_scripts/
│ ├── service_discovery.awk
│ ├── port_check.awk
│ ├── create_maxscale.awk
│ └── destroy_maxscale.awk
├── docker/
│ ├── Dockerfile
│ ├── entrypoint.sh
│ └── my.cnf
├── dump/
│ └── *.sql (generated)
├── create_maxscale.sh
├── destroy_maxscale.sh
├── init.sh
├── Makefile
├── config.mk
├── boundary.mk (generated)
├── boundary.txt (generated)
├── created_servers.txt (generated)
├── main.tf
├── locals.tf (generated)
└── misc files generated by terraform like the terraform state file
By batching, I mean that the top-level Makefile acts as an orchestrator, running terraform in distinct directories located under batches/. This way, it is possible to create resources, and then use terraform again to create new resources, without terraform wanting to delete the existing ones, as each batch has a separate state file, managed separately.
Before we talk further about batching, it is good to walk through what happens during a single batch/simple usage of terraform.
Workflow of a single batch
Since the tasks have a lot of distinct moving parts to support terraform, such as taking and copying database dumps, copying dockerfiles, creating docker images, inspecting maxscale, modifying maxscale configurations, and restarting maxscale, it is imperative to have a way to easily tie commands and scripts together, and orchestrate execution. As a result, Makefile is to be used for this orchestration. It’s mostly helpful in avoiding the creation of some overcomplicated bash spaghetti.
The other advantage is that we can keep all the important variables across the project easily configurable within config.mk. There are only a couple:
root@linuxpc:/opt/terraform/project# cat config.mk.example
ROOT_SRC=$(CURDIR)
TARGET_SRV="{{IP OF SERVER WHERE DOCKER CONTAINERS SHOULD BE CREATED}}"
REMOTE_DOCKER_DIR={{/path/to/directory/with/dockerfile}}
REMOTE_DOCKER_SCRIPTS_DIR=$(REMOTE_DOCKER_DIR)/docker-entrypoint-initdb.d
DOCKER_BUILD="{{docker_image:tag}}"
We will go through explaining each of the steps one by one, and loop back to the orchestration at the end.
Deploying MariaDB with Docker
For the sake of simplicity, the deployment of extra MariaDB servers will be done by spinning up docker containers on server2/192.168.2.99. To make things easier to organize, I’ve put everything that is strictly docker related into a docker directory.
The Dockerfile can be quite barebones, we just need some starting point and install mariadb:
FROM debian:12-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
mariadb-server mariadb-client \
ca-certificates tzdata \
&& rm -rf /var/lib/apt/lists/*
COPY docker-entrypoint-initdb.d/ /docker-entrypoint-initdb.d/
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
VOLUME ["/var/lib/mysql"]
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
CMD ["mariadbd"]
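For reference, building and starting one such container by hand would look roughly like this (the image tag, port, and server-id here are arbitrary examples, not values from the project config):

```
docker build -t mariadb-lab:latest .
docker run -d --name server4 \
  -e MARIADB_PORT=3314 -e MARIADB_SERVER_ID=4 \
  -p 3314:3314 mariadb-lab:latest
```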
In the entrypoint.sh script, we will check if this is the first time the docker container is about to be started, and if it is, we will import some sql files, like the dump from master, set up replication with gtid position from master, and apply a new my.cnf before startup, to allocate a unique server-id to the mariadb in the container:
#!/bin/bash
MARIADB_PORT=${MARIADB_PORT:-3311}
MARIADB_SERVER_ID=${MARIADB_SERVER_ID:-4}
DATADIR="/var/lib/mysql"
RUNDIR="/run/mysqld"
LOCK_FILE=/var/lib/mysql/setup.lock
if ! [ -f "$LOCK_FILE" ]; then
#Don't do all this in case of an accidental docker restart
sed -E -i "s/(server-id=)[0-9]+/\1${MARIADB_SERVER_ID}/" /docker-entrypoint-initdb.d/my.cnf
cat /docker-entrypoint-initdb.d/my.cnf > /etc/mysql/my.cnf
echo "lock" > $LOCK_FILE
mkdir -p "$RUNDIR"
chown -R mysql:mysql "$RUNDIR"
chown -R mysql:mysql "$DATADIR"
if [ ! -d "${DATADIR}/mysql" ]; then
echo "Initializing MariaDB data directory..."
mariadb-install-db --user=mysql --datadir="$DATADIR" >/dev/null
fi
#Initialize DB here for importing SQL files, don't allow connections
echo "Starting temporary MariaDB (no networking) for init..."
mariadbd --user=mysql --datadir="$DATADIR" --skip-networking --socket=/tmp/mysqld.sock &
pid="$!"
#Wait until mariadb server finished initializing before trying to import anything
for i in {1..60}; do
if mariadb-admin --protocol=socket --socket=/tmp/mysqld.sock ping >/dev/null 2>&1; then
break
fi
sleep 0.5
done
echo "Running init scripts in /docker-entrypoint-initdb.d ..."
shopt -s nullglob
for f in /docker-entrypoint-initdb.d/*; do
case "$f" in
*.sql)
echo " -> importing $f"
mariadb --protocol=socket --socket=/tmp/mysqld.sock < "$f"
;;
*.sql.gz)
echo " -> importing $f"
gunzip -c "$f" | mariadb --protocol=socket --socket=/tmp/mysqld.sock
;;
*.sh)
echo " -> running $f"
bash "$f"
;;
*)
echo " -> ignoring $f"
;;
esac
done
echo "Shutting down temporary MariaDB..."
mariadb-admin --protocol=socket --socket=/tmp/mysqld.sock shutdown
wait "$pid" || true
fi
echo "Starting MariaDB on port ${MARIADB_PORT}..."
exec "$@" --user=mysql --datadir="$DATADIR" --bind-address=0.0.0.0 --port="$MARIADB_PORT"
The most important aspect here is to create a lock file the first time this script runs, so that if the docker container is accidentally restarted, we don’t try to load the dump again.
Given that terraform isn’t the best tool to manage the transport and creation of mariadb dumps, and docker images, we’ll also need a small initializer script that can take a dump from master, record the gtid position, and build the docker image:
#!/bin/bash
TARGET_SRV=$1
ROOT_SRC=$2
REMOTE_DOCKER_DIR=$3
REMOTE_DOCKER_SCRIPTS_DIR=$4
DOCKER_BUILD=$5
DUMP_DIR=$ROOT_SRC/dump
DOCKER_DIR=$ROOT_SRC/docker
mkdir -p $DUMP_DIR
mkdir -p $DOCKER_DIR
if [[ $(ps aux | grep -v grep | grep -c mariadbd) -lt 1 ]]; then
echo "Can not find active mariadb instance"
exit 1
fi
sed -E -i "s/(gtid_slave_pos=)'[0-9-]+'/\1'%SLAVE%'/" $DUMP_DIR/01_queries.sql
mysqldump --all-databases --master-data=2 --single-transaction -u root > $DUMP_DIR/00_dump.sql
mysql -u root --skip-column-names -e "SELECT @@gtid_binlog_pos;" > $DUMP_DIR/gtid_pos.txt
GTID=$(cat $DUMP_DIR/gtid_pos.txt)
sed -E -i "s/%SLAVE%/${GTID}/" $DUMP_DIR/01_queries.sql
# The dump captures a consistent snapshot at a point in time, and the GTID position captured alongside it marks exactly where that snapshot ends. Replication can then pick up from this marker.
scp $DUMP_DIR/00_dump.sql root@$TARGET_SRV:$REMOTE_DOCKER_SCRIPTS_DIR
scp $DUMP_DIR/01_queries.sql root@$TARGET_SRV:$REMOTE_DOCKER_SCRIPTS_DIR
scp $DOCKER_DIR/Dockerfile root@$TARGET_SRV:$REMOTE_DOCKER_DIR
scp $DOCKER_DIR/entrypoint.sh root@$TARGET_SRV:$REMOTE_DOCKER_DIR
scp $DOCKER_DIR/my.cnf root@$TARGET_SRV:$REMOTE_DOCKER_SCRIPTS_DIR
ssh root@$TARGET_SRV "docker build -t $DOCKER_BUILD $REMOTE_DOCKER_DIR"
The --master-data=2 flag records the binary log file name and position (as a comment) in the dump; it is needed when setting up replication on the docker container’s mariadb.
The other important thing here is to record a gtid position, so that replication knows where to catch up from once the dump is loaded into the replica. I do not have docker installed on my host PC, which is why I copy over the docker files and build the images on the VM instead.
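The notes never show 01_queries.sql itself, but the sed patterns in init.sh imply its shape: a gtid_slave_pos assignment (whose value gets reset to %SLAVE% and then filled in with the fresh GTID) plus the replication setup. A plausible sketch, where the replication user, password, and master host details are placeholders rather than values from the original:

```sql
-- 01_queries.sql (sketch): runs inside the container on first start,
-- after 00_dump.sql has been imported
SET GLOBAL gtid_slave_pos='%SLAVE%';  -- rewritten to the real GTID by init.sh
CHANGE MASTER TO
  MASTER_HOST='192.168.2.37',         -- the master from the maxctrl table
  MASTER_USER='repl',                 -- placeholder replication user
  MASTER_PASSWORD='secret',           -- placeholder
  MASTER_USE_GTID=slave_pos;
START SLAVE;
```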
Important Caveat
This is a terribly slow way of doing things in practice. For a reasonably big dataset, both the dump and the loading of the dump would take a considerable amount of time. But for the sake of learning terraform, this is OK. In practice, it would be better to work with e.g. ZFS snapshots on VMs.
The main.tf terraform file
terraform {
required_version = ">= 1.5.0"
required_providers {
docker = {
source = "kreuzwerker/docker"
version = "~> 3.0"
}
local = {
source = "hashicorp/local"
version = "~> 2.0"
}
}
}
variable "target_ip" { type = string }
variable "docker_image" { type = string }
variable "batch_id" {type = string}
provider "docker" {
host = "ssh://root@${var.target_ip}"
}
resource "docker_network" "lab" {
name = "tf-mariadb-lab-net${var.batch_id}"
}
resource "docker_container" "server" {
for_each = local.servers
name = each.key
image = "${var.docker_image}"
hostname = each.key
env = [
"MARIADB_PORT=${each.value.port}",
"MARIADB_SERVER_ID=${each.value.server_id}",
]
ports {
internal = each.value.port
external = each.value.port
}
networks_advanced {
name = docker_network.lab.name
}
labels {
label = "deployed_by"
value = "terraform"
}
}
The task here is simple. Set up a docker network on the target server, where each docker container should have a unique network port assigned (remembering that docker containers share kernel resources, such as network ports), as well as a unique IP address each, so that they would be reachable externally.
Note that local.servers is not present here. This is because the number of servers that could be added or removed is not a constant number, so this block needs to be generated on the fly, before terraform itself runs. This can be put into a separate locals.tf file, as terraform will compile all .tf files together anyway.
If we want to add two new servers, locals.tf should look like this:
locals {
servers = {
"server4" = { port = 3314, server_id = 4 },
"server5" = { port = 3315, server_id = 5 }
}
}
Service Discovery
The first problem that we need to tackle is service discovery. Before running any terraform commands, we need to make sure that we actually know what’s running. If we’re adding new servers, we need to make sure that terraform will ignore any server that already exists. If we’re destroying servers, we will assume that working “backwards” is good enough (i.e. destroy only terraform managed resources, in the reverse of the order they were created).
The simplest way to do this, is to parse the output of maxctrl list servers. It outputs a table, as shown in the introduction, so we can parse this by piping it into an awk script like (service_discovery.awk):
#!/bin/awk -f
BEGIN{
FS="│";
max_id = 0;
if (port == "") {port = 3313;}
if (number == "") {number = 2;}
}
{
if($2 ~ /server[0-9]+/){
for (i=1; i<=NF; i++) {
gsub(/^[ \t]+|[ \t]+$/, "", $i);
}
match($2, /([0-9]+)$/, current_id);
if (current_id[1] > max_id) {
max_id = current_id[1];
}
}
}
END{
boundary = max_id+1;
print boundary;
print "locals {";
print " servers = {";
for(i = boundary; i < boundary+number; i++) {
port++;
print " \"server"i"\" = { port = "port", server_id = "i" },";
}
print " }"
print "}";
}
Here we are doing two things:
-> grab how many servers are known to maxscale, and keep track of the highest as max_id. The next value after max_id is our boundary: if we add a new server, its server-id should start at this boundary value. We write the result to the top of the output. If we want to destroy servers, this boundary value is a good starting point for calculations to walk backwards from.
-> write a locals block; this will become our locals.tf once we remove that boundary value from the top. The number of servers created depends on the input “number”.
The output is not perfect, as it leaves a trailing comma after the final server definition, but this can be piped into a simple sed expression sed -e 'H;1h;$!d;x;s/\(.*\),/\1/' to remove that final comma. Quick breakdown of this one-liner:
#sed -e
H; #append the current line from pattern space to hold space
1h; #on line one, copy the line into the hold space, this is just because H otherwise leaves a blank line on line one
$!d; # for every line that is not the last line, delete the pattern space and start the next cycle; basically, don't print anything until the very end, essentially holding the whole file in memory
x; # once the last line is reached, swap everything from hold space back into the pattern space
s/\(.*\),/\1/ #now perform the substitution: greedily match everything up to the last comma into a capture group, match that comma, and replace with just the capture group, dropping the comma
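A quick demonstration of the one-liner on a three-line input; only the comma on the last line is removed:

```shell
# trailing comma survives on all lines except the last
printf 'a,\nb,\nc,\n' | sed -e 'H;1h;$!d;x;s/\(.*\),/\1/'
# prints:
# a,
# b,
# c
```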
With that out of the way, we can put the awk script into an awk_scripts folder, and build the service discovery script (service_discovery.sh):
#!/bin/bash
NUMBER=$1
PORT=$2
OPERATION=$3
ROOT_SRC=$4
TMP=$(mktemp)
PORT_CHECK=$(maxctrl list servers | awk -f $ROOT_SRC/awk_scripts/port_check.awk)
DESIRED_PORT=$(echo "$PORT + 1" | bc)
if [[ $DESIRED_PORT -le $PORT_CHECK && $OPERATION =~ "APPLY" ]]; then
echo "Desired port of $DESIRED_PORT is unavailable, failed check against $PORT_CHECK, try setting a higher port!"
exit 1
fi
maxctrl list servers | awk -v number=$NUMBER -v port=$PORT -f $ROOT_SRC/awk_scripts/service_discovery.awk | sed -e 'H;1h;$!d;x;s/\(.*\),/\1/' > $TMP
if ! head -n1 "$TMP" | grep -Eq '^[0-9]+$'; then
echo "Failed to derive boundary from maxctrl output" >&2
exit 1
fi
cat $ROOT_SRC/boundary.txt > $ROOT_SRC/boundary.txt.last
awk '(NR==1){print $0}' $TMP > $ROOT_SRC/boundary.txt
sed -i '1d' $TMP
cat $ROOT_SRC/locals.tf > $ROOT_SRC/locals.tf.last
cat $TMP > $ROOT_SRC/locals.tf
rm $TMP
Here we tie it all together, we invoke maxctrl, pipe it into awk and sed as described above, quit the script if the boundary value is nonsense. If that check passes, we save the boundary value to another file, and then cut it off, so that we can have our locals.tf with the appropriate server definitions. The boundary.txt file will be turned into a boundary.mk to be usable in Makefile.
Essentially, in the orchestration phase, we need to make sure this script always runs well before we run terraform itself. We need both the updated boundary value to be fresh, as well as the locals.tf file to be refreshed.
When deploying new resources, we also need a check that we aren’t trying to use ports that are already busy (port_check.awk):
#!/bin/awk -f
BEGIN{
FS="│";
max_port = 0;
}
{
if($4 ~ /[0-9]+/) {
match($4, /[0-9]+/, port);
if (port[0] > max_port) {
max_port = port[0];
}
}
}
END{
print max_port;
}
Maxscale
Whenever we create or destroy a mariadb container, we need to update the awareness of the situation for maxscale accordingly. This means we will need one script to update /etc/maxscale.cnf when we add servers, and another script for when we destroy them.
The main assumption I have made here is that /etc/maxscale.cnf keeps its default layout that comes with the package manager. So there’s a big block of comments between each of the sections.
There are three major parts that we want to edit:
- -> Server Definitions
############################################################################
# Server definitions #
# #
# Set the address of the server to the network address of a MariaDB server.#
############################################################################
[server1]
type=server
address=192.168.2.37
port=3306
protocol=MariaDBBackend
[server2]
type=server
address=192.168.2.99
port=3306
protocol=MariaDBBackend
[server3]
type=server
address=192.168.2.98
port=3306
protocol=MariaDBBackend
We want our create script to add more like these, for each new server, or the delete script to delete blocks like these, so long as it will not delete these original ones.
- -> Monitor Settings
servers=server1,server2,server3
We just want to add/remove servers on this line depending on whether we create or destroy servers. Whenever we add a server, we want maxscale to monitor it, and whenever we destroy one, we want maxscale to stop monitoring it.
- -> Service Definitions
servers=server2,server3
All the new servers would be replicas, so they’d just need to be added to the service definition of the read-only listener when creating them, and removed from here when destroying them.
Create
Since this is just some basic text manipulation, awk is an excellent tool to achieve the above goals (create_maxscale.awk):
#!/bin/awk -f
BEGIN{
found = 0;
if(boundary == "") {boundary = 4;}
if(number == "") {number = 2;}
if(port == "") {port = 3313;}
if(address == "") {address="192.168.2.99";}
pat="\\[server"(boundary-1)"\\]";
pat2="";
repl="";
for(i=boundary-2; i<boundary; i++) {
pat2=pat2"server"i",";
}
pat2=substr(pat2, 1, length(pat2) - 1);
for(i=boundary; i<boundary+number; i++) {
repl=repl",server"i;
}
}
{
if($0 ~ pat2) {
sub(pat2, pat2 repl, $0);
print $0;
} else {
print $0;
}
if($0 ~ pat) {
found = 1;
}
if((found == 1) && ($0 ~ /protocol=MariaDBBackend/)) {
for(i = boundary; i < (boundary+number); i++) {
port++;
print "\n[server"i"]";
print "type=server";
print "address="address;
print "port="port;
print "protocol=MariaDBBackend\n";
}
found = 0;
next;
}
}
END{
}
Handles creating the server blocks, and updates the server list references. We can call this from a bash script (create_maxscale.sh), and restart maxscale after the modification:
#!/bin/bash
BOUNDARY=$1
NUMBER=$2
PORT=$3
ROOT_SRC=$4
TARGET_SRV=$5
TMP=$(mktemp)
CREATED_SERVERS_FILE=$ROOT_SRC/created_servers.txt
if [[ $BOUNDARY -lt 4 ]]; then
exit 1
fi
CREATED_SERVERS_VAR=""
ENDPOINT=$(echo "$BOUNDARY + $NUMBER" | bc)
for ((i = $BOUNDARY; i < $ENDPOINT; i++ )); do
CREATED_SERVERS_VAR="${CREATED_SERVERS_VAR}server${i},"
done
echo "$CREATED_SERVERS_VAR" | sed 's/,$//' > $CREATED_SERVERS_FILE
awk -v boundary=$BOUNDARY -v number=$NUMBER -v port=$PORT -v address=$TARGET_SRV -f $ROOT_SRC/awk_scripts/create_maxscale.awk /etc/maxscale.cnf > $TMP
cat /etc/maxscale.cnf > /etc/maxscale.cnf.last
cat $TMP > /etc/maxscale.cnf
systemctl restart maxscale
rm $TMP
We are outputting a created_servers.txt to make deletion logic easier (also useful for the batching mentioned at the beginning). Essentially, we want to have a record of what resources we are creating, so that we can reference it when destroying the resources. This avoids having to rely on boundary calculations for deletions.
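The record-building loop can be sketched in isolation (POSIX form here; create_maxscale.sh itself uses a bash arithmetic for-loop):

```shell
BOUNDARY=4
NUMBER=2
CREATED=""
i=$BOUNDARY
while [ "$i" -lt $((BOUNDARY + NUMBER)) ]; do
  CREATED="${CREATED}server${i},"
  i=$((i + 1))
done
# trailing comma stripped, as in create_maxscale.sh
echo "$CREATED" | sed 's/,$//'    # → server4,server5
```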
Destroy
Similarly, awk can take care of removing things from maxscale config (destroy_maxscale.awk):
#!/bin/awk -f
BEGIN{
if(number == "") {exit 1;}
if(boundary == "") {exit 1;}
startpos = boundary - number;
endpos = boundary - 1;
skip = 0;
blank_seen = 0;
for (i = startpos; i <= endpos; i++) {
managed["[server"i"]"] = 1;
}
}
{
if(skip > 0) {
skip--;
next;
}
if($0 in managed) {
skip = 4;
next;
}
if ($0 ~ /^[[:space:]]*$/) {
if (blank_seen) {
next;
}
blank_seen = 1;
print $0;
next;
}
blank_seen = 0;
print $0;
}
Basically, the script receives the boundary value (what would be the next server number), and computes which server IDs are to be deleted, identifying the relevant [serverX] lines and deleting them together with the next 4 lines. Here we also take care of any stray newlines and collapse them. The server list references are updated in the bash caller (destroy_maxscale.sh):
#!/bin/bash
ROOT_SRC=$1
if [[ ! -s "$ROOT_SRC/created_servers.txt" ]]; then
echo "No created servers file found, aborting"
exit 1
fi
MANAGED_SERVERS=$(awk '(NR==1){print $0}' $ROOT_SRC/created_servers.txt)
echo "DELETING SERVERS ${MANAGED_SERVERS} FROM maxscale.cnf"
LAST_SERVER_ID=$(echo $MANAGED_SERVERS | grep -Eo 'server[0-9]+$' | sed 's/server//')
NEXT_SERVER_ID=$(echo "$LAST_SERVER_ID + 1" | bc)
NUMBER=$(awk 'BEGIN{FS=","}{if(NR==1){print NF}}' $ROOT_SRC/created_servers.txt)
if [[ $LAST_SERVER_ID -lt 4 ]]; then
exit 1
fi
TMP=$(mktemp)
awk -v boundary=$NEXT_SERVER_ID -v number=$NUMBER -f $ROOT_SRC/awk_scripts/destroy_maxscale.awk /etc/maxscale.cnf > $TMP
cat /etc/maxscale.cnf > /etc/maxscale.cnf.last
cat $TMP > /etc/maxscale.cnf
sed -E -i "s/,$MANAGED_SERVERS//g" /etc/maxscale.cnf
systemctl restart maxscale
rm $TMP
Here we can simply reuse the created_servers.txt file from the creation/apply stage, to easily determine what we should be removing from service.
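The servers= list cleanup at the end of destroy_maxscale.sh then reduces to a plain substring removal; for example:

```shell
MANAGED_SERVERS='server4,server5'   # first line of created_servers.txt
# drop the managed servers from a monitor/service definition line
echo 'servers=server1,server2,server3,server4,server5' \
  | sed -E "s/,$MANAGED_SERVERS//g"
# → servers=server1,server2,server3
```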
The Orchestration
Now that all the elements are in place, we can put all of this together into a Makefile. As established earlier, make helps us avoid writing a complex bash script to act as the orchestration and entry point of the project. We can ensure that all distinct elements (e.g. bash, awk, terraform) of the project rely on the same set of initial variables, making the project more portable and configurable.
This way, we can also set up simple commands to operate the project with, e.g. make OPERATION=APPLY/DESTROY without having to think about chaining commands manually based on exit codes etc.
-include config.mk
-include boundary.mk
OPERATION ?=
VALID_OPERATIONS := APPLY DESTROY
NUMBER ?=
PORT ?=
BATCH_ID ?=
ifeq ($(NUMBER),)
NUMBER = 2
endif
ifeq ($(PORT),)
PORT = 3313
endif
ifeq ($(BATCH_ID),)
BATCH_ID = 0
endif
ifneq ($(OPERATION),APPLY)
ifneq ($(OPERATION),DESTROY)
$(error Invalid value for OPERATION: "$(OPERATION)". Must be "APPLY" or "DESTROY")
endif
endif
IS_MAXSCALE_ACTIVE := $(shell systemctl is-active maxscale)
ifneq ($(IS_MAXSCALE_ACTIVE),active)
$(error Maxscale service is not running, can not do service discovery)
endif
ifeq ($(OPERATION),APPLY)
TARGETS := build apply
else ifeq ($(OPERATION),DESTROY)
TARGETS := destroy
endif
operate: service_discovery
$(MAKE) execute_operation
#we need a dynamically generated value from a file, so we must call make from make to refresh state
execute_operation:
$(MAKE) $(TARGETS)
#checks maxctrl, reports what's currently running, counts the current servers, and builds a locals.tf file
service_discovery:
rm -f boundary.mk
bash service_discovery.sh $(NUMBER) $(PORT) $(OPERATION) $(ROOT_SRC)
@echo "BOUNDARY=$$(cat boundary.txt)" > boundary.mk
#builds and transports the docker images to the target server, along with a fresh mysqldump
build:
bash init.sh $(TARGET_SRV) $(ROOT_SRC) $(REMOTE_DOCKER_DIR) $(REMOTE_DOCKER_SCRIPTS_DIR) $(DOCKER_BUILD)
#runs the new docker containers, updates maxscale
apply:
terraform init
terraform apply -var="target_ip=$(TARGET_SRV)" -var="docker_image=$(DOCKER_BUILD)" -var="batch_id=$(BATCH_ID)"
bash create_maxscale.sh $(BOUNDARY) $(NUMBER) $(PORT) $(ROOT_SRC) $(TARGET_SRV)
#destroys the docker containers, updates maxscale
destroy:
bash destroy_maxscale.sh $(ROOT_SRC)
terraform destroy -var="target_ip=$(TARGET_SRV)" -var="docker_image=$(DOCKER_BUILD)" -var="batch_id=$(BATCH_ID)"
.PHONY: service_discovery execute_operation operate build apply destroy
The main key takeaways:
-> ifneq ($(OPERATION),APPLY) / ifneq ($(OPERATION),DESTROY): we want to restrict the user to simple commands, and give a helpful error message
-> IS_MAXSCALE_ACTIVE := $(shell systemctl is-active maxscale): if maxscale isn't running, the operation is fundamentally impossible, so we should bail out in that case
-> operate: service_discovery: we want to ensure service discovery is always done, so we make it a dependency of operate
-> $(MAKE): recursive make; we need this to pick up the updated boundary value. In other words, we generate a configuration file boundary.mk in an early target, and call make again so the rest of the targets use the fresh file
-> even without much knowledge of Makefiles, any user can easily configure the project via config.mk
Batching, aka the top-level orchestration
Now that the lifecycle of a single terraform apply/destroy is clear, we can look at top-level orchestration. The goal is that we should be able to run terraform apply multiple times, and it should not destroy resources created in the past. It should append. We should be able to destroy specific resources, and leave the rest unaffected.
There are five things we might want to do on a top-level:
-> establish a new batch, but not deploy any resource
-> deploy a resource belonging to some batch
-> destroy a resource belonging to some batch
-> delete a batch structure, regardless of what is deployed
-> list all the batches and their members
It makes sense to keep each of these things as a script, and tie operations together with a Makefile.
Establishing a new batch
Basically, we just want the system to be ready for terraform commands to come in, but without actually executing any terraform yet (new_batch.sh):
#!/bin/bash
BATCH_NAME=$1
THIS_BATCH=$2
BATCH_DIR=$3
TEMPLATE_DIR=$4
echo "Creating batch: ${BATCH_NAME}"
if [ -d "${THIS_BATCH}" ]; then
echo "Error: Batch ${BATCH_NAME} already exists"
exit 1
fi
mkdir -p "$BATCH_DIR" && mkdir -p "$THIS_BATCH"
cp -r ${TEMPLATE_DIR} ${THIS_BATCH}
cd ${THIS_BATCH} && rm -f terraform.tfstate* .terraform.lock.hcl boundary.txt boundary.mk boundary.txt.last locals.tf locals.tf.last created_servers.txt dump/00_dump.sql dump/gtid_pos.txt
cd ${THIS_BATCH} && rm -rf .terraform/
echo "Batch created @ ${THIS_BATCH}"
Takes the terraform-template/ directory, creates a copy with the desired batch name, and deletes any files and directories that are generated, so that the batch will start with a clean slate. At this point, it is possible to direct terraform commands at this directory and expect them to work.
Deploying resources into a batch
If we have created a batch, then this is where we want to be able to cd into the batch directory, and execute a make OPERATION=APPLY on this batch. This means that terraform will manage only the resources belonging to this batch, and it will not attempt to manage/destroy/etc resources belonging to other batches. A clear separation of concerns (deploy_to_new_batch.sh).
#!/bin/bash
BATCH_NAME=$1
THIS_BATCH=$2
WORKING_DIR=$3
NUMBER=$4
PORT=$5
TEMPLATE_DIR=$6
BATCH_REGISTRY=$7
if [ ! -d "${THIS_BATCH}" ]; then
echo "Error: Batch ${BATCH_NAME} not found"
exit 1
fi
if [ ! -d "${WORKING_DIR}" ]; then
echo "Error: Batch ${BATCH_NAME} has no template dir!"
exit 1
fi
echo "Deploying batch: ${BATCH_NAME} with ${NUMBER} servers"
cd ${WORKING_DIR} && make OPERATION=APPLY NUMBER=${NUMBER} PORT=${PORT} BATCH_ID=${BATCH_NAME}
echo "${BATCH_NAME}: ${NUMBER} servers" >> ${BATCH_REGISTRY}
echo "Batch deployed successfully!"
We’ll also keep a top-level registry of batches, so that we can have a simple bird’s eye view of what’s been deployed per batch.
Destroying resources in a batch
Basically, if we create resources in a batch, we should be able to destroy them too (destroy_in_batch.sh):
#!/bin/bash
BATCH_NAME=$1
THIS_BATCH=$2
WORKING_DIR=$3
NUMBER=$4
TEMPLATE_DIR=$5
BATCH_REGISTRY=$6
if [ ! -d "${THIS_BATCH}" ]; then
echo "Error: Batch ${BATCH_NAME} not found"
exit 1
fi
if [ ! -d "${WORKING_DIR}" ]; then
echo "Error: Batch ${BATCH_NAME} has no template dir!"
exit 1
fi
echo "Destroying batch: ${BATCH_NAME}"
cd ${WORKING_DIR} && make OPERATION=DESTROY NUMBER=${NUMBER}
sed -i "/^${BATCH_NAME}:/d" ${BATCH_REGISTRY}
echo "Batch destroyed successfully!"
Once again, good to update the top level registry.
Deleting a batch structure
Self-explanatory. This script deletes the batch structure, but not the resources (delete_batch_dir.sh):
#!/bin/bash
BATCH_NAME=$1
THIS_BATCH=$2
TEMPLATE_DIR=$3
BATCH_REGISTRY=$4
if [ ! -d "${THIS_BATCH}" ]; then
echo "Error: Batch ${BATCH_NAME} not found!"
exit 1;
fi
echo "Removing batch directory: ${THIS_BATCH}"
read -p "Are you sure? [y/N] " CONFIRM;
if [ "$CONFIRM" = "y" ] || [ "$CONFIRM" = "Y" ]; then
rm -rf ${THIS_BATCH};
sed -i "/^${BATCH_NAME}:/d" ${BATCH_REGISTRY}
echo "Batch removed!";
else
echo "Cancelled";
fi
Once again, good to keep the top level registry informed.
Displaying a simple top level registry (list)
Without this, we would have to rely purely on maxscale, or /etc/maxscale.cnf, to see this information (list_batches.sh):
#!/bin/bash
BATCH_NAME=$1
THIS_BATCH=$2
BATCH_DIR=$3
TEMPLATE_DIR=$4
BATCH_REGISTRY=$5
echo "[=== Deployed Batches ===]"
cat $BATCH_REGISTRY | awk -v dir="$(echo $BATCH_DIR/)" 'BEGIN{FS=":"}{print "echo "$1" && cat "dir$1"/terraform-template/created_servers.txt"}' | bash
The output is very simple:
root@linuxpc:/opt/terraform/part2# make list
[=== Deployed Batches ===]
batch-1
server4,server5
batch-2
server6,server7
Makefile
Given that we’ve offloaded the logic to bash, the Makefile holding all this together becomes very simple, and largely self-explanatory:
-include config.mk
BATCH_NAME ?= batch-$(shell date +%Y%m%d-%H%M%S)
THIS_BATCH := $(BATCH_DIR)/$(BATCH_NAME)
WORKING_DIR := $(THIS_BATCH)/terraform-template
NUMBER ?= 2
PORT ?= 3313
help:
@echo "Batch Deployment Orchestrator"
	@echo ""
@echo "Targets:"
@echo " make new [BATCH_NAME=name] - Create new batch"
@echo " make deploy BATCH_NAME=name [NUMBER=2] - Deploy batch"
@echo " make destroy BATCH_NAME=name [NUMBER=2] - Destroy batch"
@echo " make list - List all batches"
@echo " make clean BATCH_NAME=name - Delete batch directory"
	@echo ""
@echo "Examples:"
@echo " make new BATCH_NAME=batch-1"
@echo " make deploy BATCH_NAME=batch-1"
@echo " make destroy BATCH_NAME=batch-1"
new:
@bash $(SCRIPTS_DIR)/new_batch.sh $(BATCH_NAME) $(THIS_BATCH) $(BATCH_DIR) $(TEMPLATE_DIR)
deploy:
@bash $(SCRIPTS_DIR)/deploy_to_new_batch.sh $(BATCH_NAME) $(THIS_BATCH) $(WORKING_DIR) $(NUMBER) $(PORT) $(TEMPLATE_DIR) $(BATCH_REGISTRY)
destroy:
@bash $(SCRIPTS_DIR)/destroy_in_batch.sh $(BATCH_NAME) $(THIS_BATCH) $(WORKING_DIR) $(NUMBER) $(TEMPLATE_DIR) $(BATCH_REGISTRY)
list:
@bash $(SCRIPTS_DIR)/list_batches.sh $(BATCH_NAME) $(THIS_BATCH) $(BATCH_DIR) $(TEMPLATE_DIR) $(BATCH_REGISTRY)
clean:
@bash $(SCRIPTS_DIR)/delete_batch_dir.sh $(BATCH_NAME) $(THIS_BATCH) $(TEMPLATE_DIR) $(BATCH_REGISTRY)
.PHONY: new deploy destroy list clean help
.DEFAULT_GOAL := help
We just need to add targets that map to our bash scripts and push through the relevant variables. The config file just keeps a few variables that I felt might be useful to toggle (config.mk):
ROOT_SRC=$(CURDIR)
TEMPLATE_DIR=$(ROOT_SRC)/terraform-template
BATCH_DIR=$(ROOT_SRC)/batches
BATCH_REGISTRY=$(BATCH_DIR)/batch_registry.txt
SCRIPTS_DIR=$(ROOT_SRC)/scripts
And that brings us to the end here. It is now possible to keep deploying new resources of the same type in a way such that terraform will not try to destroy existing resources.
Conclusion
This exercise helps to see that while terraform is an amazing tool for resource provisioning, it is not intended as a tool to implement and configure application deployments, nor is it a good orchestrator; in practice it is used in tandem with other tools to achieve end-to-end deployments.