Basic differences between MariaDB vs PostgreSQL, from a MariaDB PoV
CONFIG FILES AND CONNECTION BASICS
postgresql.conf (~ my.cnf / mariadb.cnf)
# Linux:
cat /etc/postgresql/15/main/postgresql.conf
Key sections to understand:
shared_buffers = 128MB # MariaDB: innodb_buffer_pool_size
Shared memory for all connections, RoT ~25% of system RAM
work_mem = 4MB # MariaDB: sort_buffer_size (per operation)
Memory for sorts, hash joins, etc. this value is per operation, so 100 connections doing sorts = 100 times work_mem Start conservative on this one, increase if needed
maintenance_work_mem = 64MB # MariaDB: no direct equivalent
No MariaDB equivalent, this is memory for things like VACUUM, CREATE INDEX, these are relatively rare operations
effective_cache_size = 4GB # Hint to planner (not actual allocation)
Tells the query planner how much OS cache is available, RoT ~75% of RAM
max_connections = 100 # MariaDB: max_connections
Very different implications in PostgreSQL! (See section on CONNECTIONS)
pg_hba.conf ( ~ mysql.user table)
# Linux:
cat /etc/postgresql/15/main/pg_hba.conf
PostgreSQL manages host-based authentication in a dedicated configuration file, MariaDB manages these rules directly within its internal grant tables.
MariaDB: Every user account is explicitly tied to a host or IP range at the time of creation, use GRANT/REVOKE statements to determine privileges related to database objects. Basically:
CREATE USER 'username'@'192.168.1.100' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON my_schema.* TO 'username'@'192.168.1.100'
PostgreSQL:
PostgreSQL separates identity and access. You create a role (user) inside the database, but you must define which hosts that role can connect from in the external pg_hba.conf file. E.g. To allow the equivalent of @’%’ in MariaDB:
# TYPE DATABASE USER ADDRESS METHOD
host my_database my_user 0.0.0.0/0 scram-sha-256
Don’t forget to do a SELECT pg_reload_conf(); after updating this file.
Rules are processed from TOP to BOTTOM, first match wins.
Main difference is that, in PostgreSQL you don’t create users with a host, you just create it like this:
-- Create the user with a password
CREATE USER my_user WITH PASSWORD 'secure_password';
-- 1. Grant connection to the specific database
GRANT CONNECT ON DATABASE my_database TO my_user;
-- 2. Grant USAGE on the schema (allows the user to "see" it)
GRANT USAGE ON SCHEMA my_schema TO my_user;
-- 3. Grant privileges on all existing tables in that schema
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA my_schema TO my_user;
-- 4. Grant privileges on all future tables automatically
ALTER DEFAULT PRIVILEGES IN SCHEMA my_schema
GRANT SELECT, INSERT, UPDATE ON TABLES TO my_user;
The 5 Columns:
-> TYPE: local - Unix socket ~ root@localhost in MariaDB, host - TCP/IP (a la root@%, i.e. needs password)
-> DATABASE: all - ALL, mydb - specific schema, mysdb,otherdb - list of schemas, sameuser - database matching username (the oracle style), replication - Special keyword for replication connections
-> USER: all - ANYONE, postgres - Specific user, appuser - Application user, @admins - All users in admins file (group)
-> ADDRESS: (empty for local type), 127.0.0.1/32 - localhost only, 10.0.1.0/24 - Subnet (10.0.1.0 - 10.0.1.255), 10.0.1.50/32 - specific IP, 0.0.0.0/0 - ANY
-> METHOD: trust - No password, reject - Always deny, md5 - Password hashed with MD5, scram-sha-256 - stronger hash method, peer - Use OS username (unix socket)
CONNECTIONS
PostgreSQL connections work differently from MySQL, new processes are created for each connection.
MariaDB: uses a single process with multiple threads to handle its connections (mysqld), all database operations from query execution to buffer management happen within one unified process space
This makes MariaDB comparatively lightweight since it doesn’t fork new processes for new connections. All threads are in the same memory space and share resources like the buffer pool, query cache, and table metadata, enabling efficient resource utilization, great for concurrency
PostgreSQL: Each connection is completely isolated from the others, meaning higher memory usage since each process needs its own space.
A supervisor process listens for incoming connections on the configured port (default 5432) and acts as the primary coordinator for all database activity, after authentication, uses the unix fork() to create a new process dedicated specifically to that connection, with its own memory space and execution context. The newly forked process handles all queries until this connection is closed/disconnecte, maintaining session state, transaction context, and query execution buffers independently from all other connections.
Though each connection gets an isolated space in memory for general query execution, a shared common space for caching data pages, write ahead logging, lock tables is still needed.
NOTE: PostgreSQL also runs some helper processes for WAL, checkpointing to flush dirty buffers, autovacuuming, and query statistics.
Comparatively to MariaDB:
- -> Connections: Each connection is its own thing, if something causes a fatal crash, it is more likely to just crash that connection only, but not the entire database. In MariaDB a query encountering a fatal error can take down the entire database (have first hand experience!).
- -> Connections Speed: Creating a thread is much faster than forking a new process.
- -> Connection Pooling: Can get away without it in MariaDB in many cases, this is much less true in PostgreSQL.
- -> Communication: Communication between processes is slower than communication between threads (less system calls)
- -> Memory: Each process needs a few MB of memory, a much bigger overhead than for MariaDB. In MariaDB, threads share memory so the overhead of having many connections is smaller, meaning it can handle more concurrent connections.
Connection Pooling notes: this can be done either application side, many languages come with connection pooler libraries (e.g. HikariCP), basically fire up a bunch of connections, keep them alive, and the application manages them for itself.
Or, a proxy can be used, where a middleman catches all incoming requests, and hands out readied connections. E.g. pgBouncer.
The idea is that pgBouncer multiplexes e.g. 500 app connections through 50 DB connections so PostgreSQL only spawns 50 processes leading to much more efficient resource usage.
Application connects to 6432 (pgBouncer) and NOT 5432 (PostgreSQL directly)
Sidenote on Connection Pooling
pgBouncer stats:
-- Connect to pgBouncer admin console
psql -h localhost -p 6432 -U pgbouncer pgbouncer
Then:
-- Show pools
SHOW POOLS;
-- Shows: database, user, active connections, waiting clients
-- Show stats
SHOW STATS;
-- Shows: queries per second, bytes in/out
-- Show clients
SHOW CLIENTS;
-- Shows: connected clients, their state
PostgreSQL side:
-- See actual connections
SELECT count(*) FROM pg_stat_activity;
-- See connection states
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;
Alternatively there’s PgPool-II, which is more feature rich, and complex, comparable to MaxScale:
PgPool-II:
- -> Connection pooling is the PRIMARY feature
- -> Other features (load balancing, failover) are the bonus
- -> Deploy it mainly for pooling, get HA as extra
MaxScale:
- -> Load balancing/routing is the PRIMARY feature
- -> HA is a nice optional bonus
- -> Can do connection pooling between itself and the db servers, but this is less critical in MariaDB land
WAL Settings (vs binlogs)
wal_level = replica # Needed for replication
wal_buffers = 16MB # MariaDB: innodb_log_buffer_size
checkpoint_timeout = 5min # How often to checkpoint
max_wal_size = 1GB # MariaDB: innodb_log_file_size
What is WAL:
- -> Write-Ahead Log (like the redo logs of MariaDB)
- -> Changes written to WAL before data files
- -> Guarantees durability (if server crashes, replay WAL)
- -> Also used for replication (stream WAL to replicas)
PostgreSQL uses Write-Ahead Logging for crash recovery and replication. Changes are written to WAL first, then asynchronously to data files. If the server crashes, it replays the WAL to recover. For replication, replicas can stream WAL changes from the primary.
REPLICATION AND PITR
PostgreSQL:
- -> WAL: The WAL is the replication stream itself. It’s possible to stream physical bytes to another server for an exact replica.
- -> PITR: Use a backup + postgresql.conf ->
restore_commande.g.(cp /path/to/archive/%f %p)andrecovery_target_time, create a recovery.signal file to tell the DB that it should start in recovery mode - -> Replication: PostgreSQL streams the raw, low-level disk changes, the configuration is owned by the MASTER, the replica tracks log sequence numbers
MariaDB:
- -> binlogs : In MariaDB the redo log itself (~ WAL) is for storage engine level recovery in case of a system crash. The binary log (binlog) is a separate log used to replica statements or row changes to other servers, and it is storage engine independent.
- -> PITR: Use a backup + binlog combo with mysqlbinlog commands (
--start-positionand--stop-positionflags), can literally copy the binlogs to another server and replay them, very portable - -> Replication: The replica server reads and applies the binlogs sent over from the master, the configuration is owned by the REPLICA, the replica reads the binlog, which contains discrete events (statements or rows), and the replica tracks position (file and offset)
TL;DR: WAL is a physical, page-level change stream used for both recovery and replication, whereas binlog is a logical change stream (row or statement based) used for replication, MariaDB uses redo logs (mechanism at different layer, the storange engine) for crash recovery
For quick operational comparisons on setting up REPLICATION, see ANNEX I
STORAGE ENGINES
No pluggable storage engines in PostgreSQL!
MariaDB:
- -> Multiple storage engines per database
- -> InnoDB (default), MyISAM, MEMORY, Aria, ColumnStore, RocksDB, etc.
- -> Different tables can use different engines
- -> Choose engine based on workload
- -> Example: InnoDB for transactions, MEMORY for sessions
PostgreSQL:
- -> Single storage system (heap-based)
- -> No ENGINE clause in CREATE TABLE
- -> All tables use same underlying storage
- -> Optimize through other mechanisms: Partitioning (time-series, large tables), Temporary tables (session data), Materialized views (aggregations) also don’t forget about th e vast options of Extensions
TRANSACTION HANDLING
MySQL/MariaDB:
- -> ACID is engine dependent, InnoDB does full ACID, while other engines might not
- -> REPEATABLE READ - if a transaction reads the same data again, it should read the same data again
- -> MVCC - old versions of rows needed for transactions in progress are stored in an undo log, separate from table space, automatic cleanup, low manual maintenance, be mindful of pileups of long running transactions, may cause the undo log to bloat
- -> Phantom Reads - Gap locking (at default settings), strong protection, but may cause lock timeouts if queries are not well optimized
PostgreSQL:
- -> ACID - No pluggable storage engines, so ACID compliant by default
- -> READ COMMITTED - the second SELECT may return any data. A concurrent transaction may update the record, delete it, insert new records. The second select will see data committed by other transactions since the start of the current statement. Each statement within the transaction gets a new snapshot in READ COMMITTED mode.
- -> MVCC - old records are kept in table space, with versioning, requires VACUUM maintenance
- -> Phantom Reads - Snapshot isolation (if set to REPEATABLE READ), no gap locking is needed, since the engine can simply check the row versions in the table
NOTE ON SERIALIZABLE
MariaDB: Basically REPEATABLE READ + lock all reads, every SELECT gets shared locks
PostgreSQL: Basically tries to put every query through, errors out if a concurrent update was found, no locks on reads like MariaDB
MVCC SIDENOTE
The idea is the same for both MariaDB and PostgreSQL: keep multiple versions of data rows so that readers can access old versions, while writers can create new ones. MVCC is the solution that allows databases to handle concurrent access without locking readers out. Instead of overwriting data directly, the database keeps old versions of rows so that ongoing transactions can still see the data as it existed when they started.
This needs roughly these four things to happen:
- -> should keep old versions of data instead of simply overwriting it
- -> on row update, should preserve the original data as an older version while creating a new version separately
- -> should have a mechanism to manage and locate the different row versions
- -> should cleanup old versions of rows once no active transaction needs them
That’s where the similarities end. The actual implementation of this is totally different between the two systems.
MySQL/MariaDB (InnoDB):
Uses a file based approach. Old records do not remain in the main table, they are written to a file based log (undo log).
-> updates are in-place, meaning that whenever a row is updated, it is done directly in the main table. The current row in the actual table always represents the latest committed version (for updates to PK, delete + insert happens, but the point stands: the latest committed version remains)
-> InnoDB uses a few internal (hidden) fields:
DB_TRX_ID: ID of trx that last inserted or updated this rowDB_ROLL_PTR: pointer that points to the undo log record containing the previous versionDB_ROW_ID: row ID that is used internally if no PK exists, not referencable by usual functions accessible to users-> The old version of the data is written to a separate undo log tablespace, not the main table. In the actual table, the only reference to the undo log would be the aforementioned
DB_ROLL_PTR.-> When a trx needs to see an older version of a row, InnoDB will use this
DB_ROLL_PTRchain in the undo log to locate the row as it existed at the required point in time (actually, the undo log only contains deltas, not full rows)-> A row update only affects the relevant index(es) on that particular column
-> Cleanup is automatic, there are background purge threads that clean the undo log once no active trx needs these old versions.
PostgreSQL:
PostgreSQL doesn’t have an undo log, or any other log equivalent to it. Instead, it stores multiple versions of the rows directly in the main table itself. Basically, it keeps all revisions in the same table, with markers indicating which version is current, and which versions are old.
-> updates mean that PostgreSQL creates a completely new copy of the entire row in the table. FULL copy with all columns, not just the changed values
-> Both the old version and the new version live in the same table file, side by side, and will remain so, until a VACUUM process removes the old version
-> For this reason, rows all have a few internal hidden fields, to facilitate this process:
xmin: ID of the trx that inserted this rowxmax: ID of trx that deleted or updated this row, this is 0 for the currently latest rowt_ctid: Pointer to the newer version of this row, or to itself if this row is the currently latest row
These fields are then used to determine which transaction can see which row version and isolation level-> This way of managing row versions means that an update triggers all indexes to be updated, since they now need to point to a new location, unnecessary/not well thought out indexes are a bigger performance hit than they are in InnoDB
-> VACUUM - eventually, the old row versions should be removed, to avoid bloat and performance atrophy, the idea is to identify all rows where the
xmaxvalue stored on that row is older than the oldest active transaction in the database
REPEATABLE READ SIDENOTE
MariaDB : Uses gap locks to block any inserts that would create new rows while this transaction is being processed.
SELECT * FROM products WHERE product_id >= 100 AND product_id <= 200 FOR UPDATE;
What this does:
- -> locks rows with
product_id100 to 200 INCLUDING ALL GAPS BETWEEN these existing rows!! - -> locks the gap before
product_id = 100till the next lowest real value (e.g. 95 or something) - -> locks the gap after
product_id = 200till the next highest real value (e.g. 210 or something)
This can block inserts that are unrelated to this query, tables with huge gaps due to frequent deletes will observe this in particular.
PostgreSQL:
Does NOT use this logic at all. No such thing as gap locking. PostgreSQL matches the original WHERE clause of the query. So only the rows that really exist for WHERE product_id >= 100 AND product_id <= 200 will be locked. Nothing else. This is very nice for concurrency
INDEXES
Default behaviour is very different, though the syntax is similar, what happens under the hood is not exactly the same.
Clustered vs Non-Clustered Indexes
MariaDB/MySQL (InnoDB):
Every table MUST have a clustered index:
- -> E.g. the PRIMARY KEY
- -> Table data is physically ordered by the clustered index
- -> Secondary indexes store primary key values (not row pointers)
- -> The row pointers are stored by in the clustered index
- -> This ordering is MAINTAINED on INSERT/UPDATE
Example:
CREATE TABLE users (
id INT PRIMARY KEY, -- This becomes the clustered index
email VARCHAR(255),
created_date DATE,
INDEX idx_email (email) -- Secondary index (stores id + email)
);
Physical layout on disk:
- -> Data rows are stored in PRIMARY KEY order (id 1, 2, 3, 4…)
- -> Secondary index (email) stores: email value pointing to the primary key id
- -> To lookup by email: email index, get id, then seek to clustered index
Implications:
- -> Range scans on PRIMARY KEY are FAST (sequential disk reads)
- -> Secondary index lookups can be two-step (index, then PK lookup)
- -> Choosing wrong PRIMARY KEY has major consequences, leading to bad physical layout, and therefore slow queries
PostgreSQL:
There are NO clustered indexes:
- -> ALL indexes (including the PRIMARY KEY) are non-clustered
- -> Table data is stored in heap (in order of insertion, roughly)
- -> ALL indexes store physical row pointers
- -> Physical ordering is NOT maintained, this can have implications for range scans
Example:
CREATE TABLE users (
id INT PRIMARY KEY, -- Non-clustered index
email VARCHAR(255),
created_date DATE
);
CREATE INDEX idx_email ON users(email); -- Also non-clustered
Physical layout on disk:
- -> Data rows stored in heap (no guaranteed order)
- -> PRIMARY KEY index: id value (pointer to physical location)
- -> Email index: email value (pointer to physical location)
- -> Both indexes point directly to rows
Implications:
- -> All indexes are equal (no “primary” vs “secondary” distinction)
- -> Index lookups are one-step (index to row location)
- -> No guaranteed physical ordering (random I/O for range scans)
The CLUSTER Command (One-Time Operation)
It is possible to ask PostgreSQL to order (cluster) a table, but it won’t stick.
PostgreSQL has CLUSTER command:
-- Physically reorder table by index (one-time)
CLUSTER users USING users_pkey;
-- Or just:
CLUSTER users; -- Uses most recently clustered index
What this does:
- -> Physically rewrites the table in index order
- -> Creates a new copy of the table, ordered
- -> Drops old table, renames new one
- -> This is a ONE TIME operation, NOT PERMANENT
What it DOES NOT do:
- -> Maintain ordering on future INSERTs
- -> Auto-cluster on updates
- -> Keep table clustered over time
After CLUSTER:
INSERT INTO users VALUES (1000, 'new@email.com', '2024-01-01');
-- This row goes at the END of the heap (not in id order)
-- Clustering is now broken
To maintain clustering:
- -> Re-run CLUSTER periodically (maintenance operation)
- -> Or accept that clustering degrades over time
Operational Perspective:
MariaDB:
- -> PK choice matters A LOT (it governs the physical layout)
- -> Sequential PK = good performance (auto-increment)
- -> Random PK (e.g. UUID) = fragmentation issues
- -> Secondary indexes have overhead (two-step lookup)
Best practices:
- -> Use
AUTO_INCREMENTfor PRIMARY KEY (sequential inserts) - -> Avoid UUIDs as PK if possible (random inserts = fragmentation)
- -> Consider clustering when choosing PRIMARY KEY
PostgreSQL:
Different considerations:
- -> PK choice matters LESS (because it is just another index)
- -> No clustering by default (heap storage)
- -> Can use UUIDs as PK without fragmentation issues
- -> All indexes have same lookup cost (one-step)
But:
- -> Range scans less efficient (data not physically ordered)
- -> Can manually run
CLUSTERfor read-heavy workloads - -> Need to repeat periodically if you do
TL ; DR:
PostgreSQL:
- -> No clustering overhead means consistent insert performance, especially for concurrent insert patterns, since we don’t maintain physical ordering
- -> All indexes are completely separate from data storage, each index gets its own BTREE structure, larger disk space usage, but you get a fast index -> data lookup in general
- -> Indexes require active maintenance, since a row update means the row is marked as dead, and a new version is inserted with a new physical address: all indexes on the table must be updated to point to this new location, VACUUM operations are eventually required to purge these dead rows
MariaDB:
- -> Clustering means direct access to data on PK without extra lookup step, physical ordering benefits range scans, but non-sequential PKs such as UUIDs cause frequent page splits and reorganization (not good for performance)
- -> Secondary indexes store PK values (not the data), and since data is stored at the PK index lead level, the storage requirements are smaller than what PostgreSQL asks for, though large PKs can be storage intensive this way
- -> No special index configuration needed once table is set, uses undo logs instead of dead rows, where the old row versions are moved to an undo log in a separate space. A Purge Thread continuously cleans up these old versions from the Undo Log as soon as they are no longer needed by active transactions. (just watch out for pile up of long running write transactions!)
OPERATIONAL SPECIFICS
PostgreSQL has some indexing features out of the box that MariaDB does not have:
PostgreSQL:
- -> Expression Indexes (index pre-computed values):
CREATE INDEX idx_test ON tbl1 (lower(description) text_pattern_ops); - -> Partial Indexes (when large% of the table is irrelevant):
CREATE INDEX idx_test ON tbl1 (stamp_created) WHERE status = 'active';
MariaDB:
- -> To achieve a similar effect, we would use a virtual columns, and then add the index to the virtual column
Covering Indexes also work differently:
PostgreSQL:
- -> Uses an
INCLUDEclause, meaning the extra columns are stored in the leaf nodes of the BTREE, but not used for constraints:CREATE UNIQUE INDEX idx_user_email ON users (email) INCLUDE (user_id);, now a WHERE clause can include bothemailanduser_id, but only email is enforced to be unique
MariaDB:
- -> Uses a multi-column BTREE index, meaning that all columns are part of the uniqueness constraint of the index
DATABASE vs SCHEMA
MariaDB:
DATABASE and SCHEMA are synonyms, and every DATABASE/SCHEMA on the same server can be cross-joined, or referenced within a single query. No hard container.
PostgreSQL:
A DATABASE is a hard container, one query can not cros-join across these. DATABASES can contain SCHEMAS which are the containers for TABLES.
QUERY ANALYSIS (EXPLAIN)
Both MariaDB and PostgreSQL have an EXPLAIN statement for looking into the optimizer plan for execution. The outputs between MariaDB and PostgreSQL are very different, as well as there are differences in the optimizer approach for certain types of queries.
For this comparison, a simple three table structure is used, where tables 2 and 3 have a relation parent_id towards table 1.
root@debian-test:/root/tmp# awk -v tables=3 -v digits=8 -v gen_date=1 -v join_mode=1 -f gen_multi_table_with_dates.awk
----- TABLE 1 -----
id,stamp_created,col1,col2,col3
65581476,2023-12-10,80145896,24821729,53160890
27314164,2017-02-03,63534806,10386889,23100733
59139326,2020-02-24,31937496,79548467,82410078
10436107,2019-10-08,92748791,97199300,1144696
2437634,2016-11-13,86843132,94984995,69678641
----- TABLE 2 -----
id,parent_id,col1,col2,col3
31902428,65581476,8423775,37554689,57320794
99580925,2437634,22555129,79131929,32832900
79908125,59139326,20794347,24204127,97918800
25845521,27314164,37929588,79558799,34137298
97333279,27314164,67683785,40012518,15993387
----- TABLE 3 -----
id,parent_id,col1,col2,col3
59391619,27314164,94162607,29144644,89148135
89757852,27314164,61512988,32166739,44081521
35507451,27314164,8542435,35867322,8488144
15136464,59139326,11057072,46909228,9855358
51130852,2437634,26040480,76389303,66981533
This setup mirrors a common star shaped schema. There are no indexes on these tables.
Let’s examine the following query:
SELECT
t1.id,
t1.col1,
t2.col3,
t3.col1,
t3.col3
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.parent_id
JOIN table3 t3 ON t1.id = t3.parent_id
WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
MARIADB
(root@linuxpc) [test]> EXPLAIN SELECT t1.id, t1.col1, t2.col3, t3.col1, t3.col3 FROM tbl1 t1 JOIN tbl2 t2 ON t1.id = t2.parent_id JOIN tbl3 t3 ON t1.id = t3.parent_id WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+------+-------------+-------+--------+---------------+---------+---------+-------------------+-------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+---------------+---------+---------+-------------------+-------+-------------------------------------------------+
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 77012 | Using where |
| 1 | SIMPLE | t1 | eq_ref | PRIMARY | PRIMARY | 4 | test.t2.parent_id | 1 | Using where |
| 1 | SIMPLE | t3 | ALL | NULL | NULL | NULL | NULL | 88133 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+--------+---------------+---------+---------+-------------------+-------+-------------------------------------------------+
3 rows in set (0.000 sec)
This EXPLAIN is saying that:
t2 (ALL) - Full table scan of tbl2 (77k rows), Using where (filtering by
parent_id: using eacht2.parent_idto probe t1 by primary key)t1 (eq_ref, PRIMARY) - For each t2 row, lookup in t1 by PK, Using where (filtering by
stamp_created)t3 (ALL) - Full table scan of tbl3 (88k rows), Using join buffer (BNL - Block Nested Loop)
Join order: t2, t1, t3.
This type of query often requires optimization in MariaDB, because it does not scale well to bigger table sizes.
PostgreSQL
test=# EXPLAIN SELECT
t1.id,
t1.col1,
t2.col3,
t3.col1,
t3.col3
FROM tbl1 t1
JOIN tbl2 t2 ON t1.id = t2.parent_id
JOIN tbl3 t3 ON t1.id = t3.parent_id
WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Hash Join (cost=3674.37..5597.07 rows=3647 width=20)
Hash Cond: (t3.parent_id = t1.id)
-> Seq Scan on tbl3 t3 (cost=0.00..1544.98 rows=90998 width=12)
-> Hash (cost=3632.07..3632.07 rows=3384 width=16)
-> Hash Join (cost=2057.31..3632.07 rows=3384 width=16)
Hash Cond: (t2.parent_id = t1.id)
-> Seq Scan on tbl2 t2 (cost=0.00..1370.00 rows=78000 width=8)
-> Hash (cost=2011.52..2011.52 rows=3663 width=8)
-> Seq Scan on tbl1 t1 (cost=0.00..2011.52 rows=3663 width=8)
Filter: ((stamp_created >= '2025-06-01'::date) AND (stamp_created <= '2025-10-31'::date))
(10 rows)
Key differences on reading a PostgreSQL explain vs MariaDB explain:
- -> Cost: not exposed in MariaDB, but it is exposed in PostgreSQL and it is a measurement on the
unit of work: By default, cost=1 is the work required for reading a single 8kB page of data sequentially from disk. Other operations are calculated relative to this, see ANNEX II for TL;DR on common cost patterns - -> the
EXPLAINneeds to be read from bottom up, and not top to bottom - -> follow the
->arrows, this indicates that this step is consuming the output of the step below it - -> more explicit about mechanics of how tables connect, whereas MariaDB’s EXPLAIN focuses more heavily on the join order
- -> this distinction is important, because MariaDB tends to prefer starting with tables that have indexes for joining, as the optimizer tries to arrange joins to make inner lookups efficient
- -> PostgreSQL has ways to compensate in these situations, like hash joins, so it’s less dependent on join ordering, which can be seen in this EXPLAIN
- -> lack of indexes in MariaDB often leads to full scans and join buffering, whereas PostgreSQL may utilize hash joins to keep performance acceptable, and reinforcing that index optimization works very differently here, good indexes are still important in PostgreSQL, but expect different types of problems to manifest relative to MariaDB
The EXPLAIN format is drastically different:
Format: (cost=startup_cost..total_cost rows=X width=Y)
The EXPLAIN is saying that:
Seq Scan (~ type: ALL) on tbl1, Filter by
stamp_createdBETWEEN dates, Estimated: 3663 rows (out of ~100k total)Build Hash of t1 results (3663 rows)
Seq Scan on tbl2 (78k rows), Hash Join with t1 hash (match on
parent_id), Estimated: ~3384 rowsBuild Hash of join result
Seq Scan on tbl3 (90k rows), Hash Join with previous hash, Estimated final result: ~3647 rows
Join order: t1, t2, t3
Basically, the insight is that MariaDB and PostgreSQL will experience different pain points around query and index optimization.
In MariaDB, a missing index can lead to nested loops, and bad join orders to catastrophically slow queries.
In PostgreSQL, a missing index might not have these catastrophic outcomes, and the pain points are centered around index maintenance (bloat/unnecessary indexes on write heavy tables, insufficient VACUUMing)
Adding indexes for parent_id in tables 2 and 3 would result in a dramatic improvement for MariaDB, while it wouldn’t necessarily do the same over in PostgreSQL.
Long Running Transactions
MariaDB (InnoDB):
Often causes various lock based problems:
- -> Row locks: transactions block waiting for locks, queries pile up, applications suffer, because slow UPDATE/INSERT/DELETE is waiting on another slow UPDATE/INSERT/DELETE on the master DB
- -> Metadata locks: DDL (like ALTER, or TRUNCATE) sent to replication thread blocked by slow running SELECT query in replica server, replica lag will build up and queries pile up until the DDL can execute
PostgreSQL:
Rather than lock based problems, like in MariaDB, the situation is entirely different: VACUUM can not clean up dead rows that are still visible to long running transactions
- -> Indexes bloat, as they still point to dead rows, results in slower SELECT/JOIN performance (worse index scan performance)
- -> Disk space bloats, as the dead rows continue to occupy space
- -> xmin/xmax are 32 bit, so if dead rows are frozen for 2 billion transaction, wraparound happens, and rows that SHOULD be visible, aren’t, a lot needs to go wrong for this to happen, but it is very bad when it does
ANNEX I - REPLICATION SET UP BASICS
MariaDB:
Configuration is mostly owned by the replica server, but do need to make sure binlogging is enabled, and that a replication user on the master exists.
MASTER: make sure the basics are configured:
[mysqld]
server-id=1
gtid_strict_mode=1
log_bin = /var/lib/mysql/bin-mariadb.log
expire_logs_days=8
sync_binlog = 1
binlog_format = row
log_slave_updates = 1
TL;DR - we need binlogs, row format is safest for consistency, need a unique server id, gtid is can be a good idea, if we want things like automated failovers
Then, restart with systemctl restart mariadb, and create a user with GRANTS
(root@linuxpc) [(none)]> CREATE USER 'repl'@'%' IDENTIFIED BY 'secret';
Query OK, 0 rows affected (0.011 sec)
(root@linuxpc) [(none)]> GRANT REPLICATION SLAVE ON *.* to 'repl'@'%';
Query OK, 0 rows affected (0.012 sec)
Then take a dump:
root@linuxpc:~# mysqldump --all-databases --master-data=2 --single-transaction -u root > dump.sql
TL;DR –master-data=2 needed if using GTID, as it will include the GTID position
REPLICA:
Make sure we have a unique server id here:
[mysqld]
server_id=2
...
Import the dump taken from the master as a starting point:
root@debian-test:~# mysql -u root -p < /tmp/dump.sql
Once done, set up the replication:
MariaDB [(none)]> CHANGE MASTER TO
-> MASTER_HOST='192.168.2.37',
-> MASTER_PORT=3306,
-> MASTER_USER='repl',
-> MASTER_PASSWORD='secret',
-> MASTER_USE_GTID=slave_pos;
Query OK, 0 rows affected (0.013 sec)
MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.022 sec)
Check:
MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
...
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
...
Indicates that everything is well.
PostgreSQL
The configuration mostly happens on the master.
MASTER:
First, check into postgresql.conf to set up some basic requirements:
listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
- ->
listen_addresses = '*'~bind_address = 0.0.0.0for MariaDB - ->
wal_level~ how much history for others to read - ->
max_wal_senders~ how many seats are available for replicas to connect, this is also used bypg_basebackupso it is not recommended to really occupy all of these slots consistently
Then, in pg_hba.conf, create a rule that allows the replication user to connect:
host replication all 192.168.2.99/24 scram-sha-256
Then, login to the Database, create the user, and reload configs:
postgres=# CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';
CREATE ROLE
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
Don’t forget to check locale, because a mismatch of locales would prevent the replica server from starting up:
postgres=# SHOW lc_collate;
lc_collate
-------------
en_US.UTF-8
(1 row)
postgres=# SHOW lc_ctype;
lc_ctype
-------------
en_US.UTF-8
(1 row)
REPLICA:
Repeat the SHOW statements, if there’s not an exact match, make sure to fix this (as root) by:
apt install locales
sed -i 's/^# *en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen
locale-gen
Then, it should be possible to stop the database, import the dataset over from the master DB, and set up replication:
root@debian-test:~# sudo -u postgres rm -rf /var/lib/postgresql/15/main/*
root@debian-test:~# sudo -u postgres pg_basebackup -h 192.168.2.37 -D /var/lib/postgresql/15/main -U replicator -P -R -X stream -C -S replica1
could not change directory to "/root": Permission denied
Password:
23043/23043 kB (100%), 1/1 tablespace
Flag breakdown:
- -> -P = show progress
- -> -R = write recovery.conf (creates standby.signal)
- -> -X stream = stream WAL during backup
- -> -C = create replication slot (skip this if slot already exists!)
- -> -S replica1 = slot name
Start up the database:
root@debian-test:~# systemctl start postgresql
Check if it’s working:
postgres=# SELECT pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
Can check on the MASTER as well:
postgres=# SELECT * FROM pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
---------+----------+------------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
2241757 | 16388 | replicator | 15/main | 192.168.2.99 | | 34984 | 2026-05-02 23:05:47.787436+02 |
| streaming | 0/F000060 | 0/F000060 | 0/F000060 | 0/F000060 | | | | 0 | async
| 2026-05-02 23:07:27.900017+02
(1 row)
ANNEX II - Common Optimizer Cost Patterns
Postgres always shows costs as a range: (startup..total)
-> Startup Cost (the first number): The estimated work required to return the first row. For a simple scan, this is often 0.00, but for a SORT or HASH operation, it will be high because the database must process the entire data set before it can output anything.
-> Total Cost (the second number): The estimated work to complete the entire operation and return all rows
- Sequential Scan (large table):
Seq Scan on users (cost=0.00..15000.00 rows=500000)
High total cost = many pages to read
- Index Scan (selective):
Index Scan using idx_email (cost=0.42..8.44 rows=1)
Low total cost = few pages needed
- Index Only Scan (best case):
Index Only Scan using idx_email_name (cost=0.42..4.44 rows=1)
Even lower cost = no heap access needed
- Bitmap Index Scan (medium selectivity):
Bitmap Index Scan on idx_age (cost=0.00..500.00 rows=5000)
-> Bitmap Heap Scan (cost=500.00..2500.00 rows=5000)
Medium cost = many matches, but organized
- Hash Join (large datasets):
Hash Join (cost=1500.00..8000.00 rows=10000)
High startup (building hash), but efficient
- Nested Loop (small datasets):
Nested Loop (cost=0.29..100.00 rows=10)
Low startup, but multiplies for each outer row
Note: PostgreSQL allows its cost model to be configured, e.g. for faster drives, it’s possible to lower the costs associated with random page accesses, to favour indexes)
In postgresql.conf:
- seq_page_cost = 1.0
- random_page_cost = 4.0
- cpu_tuple_cost = 0.01
- cpu_operator_cost = 0.0025