PostgreSQL EXPLAIN notes (from a MariaDB point of view)

Introduction

This article is a roadmap to build a basic mental model between MariaDB explain output and PostgreSQL output. This article is not intended as a “which is better” comparison, nor is it intended to be a comprehensive list of ecosystem differences.

The key conceptual differences in outputs:

MariaDB:

PostgreSQL:

Nested Loops

Indexed Nested Loop:

Nested Loop
  -> Seq Scan on orders o
  -> Index Scan using customers_pkey on customers c
       Index Cond: (id = o.customer_id)

Indicators:

This is O(n * log(m))

The MariaDB analogue to Index Scan is Type=ref/eq_ref whith ref={{some key}}. If the extra is empty, or contains Using index condition, then the index is used to fetch the result, but the table also needs to be visited to get more data.

Index Only Scan, is the same as above, that’s roughly analogous to seeing MariaDB’s Using index in the extra, which means that the index covers all of the columns needed for this query. It means that PostgreSQL in this case can answer the needed columns from the index itself.

Simple Nested Loop:

Nested Loop
  -> Seq Scan on orders o
  -> Seq Scan on customers c
       Filter: (id = o.customer_id)

Indicators:

This is O(n * m)

This should be relatively rare in PostgreSQL because for unindexed joins, it can fall back to hash joins.

So if this appears, it indicates either that the tables are tiny, or something has gone severely off-track.

The MariaDB analogue to Seq Scan => Type=ALL, ref=NULL (+ key=NULL), should probably see Using where in the extra. Seeing Filter: ~ seeing Using where in the extra field in MariaDB.

Simple Nested Loop with Materialize:

Nested Loop
  -> Seq Scan on orders o
  -> Materialize
       -> Seq Scan on customers c
            Filter: (id = o.customer_id)

Indicators:

This is still O(n * m), but inner table only scanned once instead of n times

Materialize here means that PostgreSQL caches the inner result set for reuse during the processing of the Nested Loop. In MariaDB a similar process may not be directly reflected in the EXPLAIN.

Sidenote on Simple Nested Loops

When joining two tables, PostgreSQL considers:

    Nested Loop with Index Scan (if inner table has usable index)
        Cost: outer_rows × log(inner_rows)
        Chosen when: Small outer table, indexed inner table

    Hash Join (if no good index, or large result set)
        Cost: outer_rows + inner_rows
        Chosen when: No suitable index, or both tables large

    Merge Join (if both inputs are sorted)
        Cost: outer_rows × log(outer_rows) + inner_rows × log(inner_rows)
        Chosen when: Pre-sorted inputs or sort is cheap

    Nested Loop with Sequential Scan (RARE - usually worst option)
        Cost: outer_rows × inner_rows
        Chosen when: Inner table is TINY (like 5-10 rows) and seq scan is cheaper than index overhead

So that is why an actual Nested Loop with a Seq Scan is a reason to consider looking into the underlying query.

Hash Joins

Basic Hash Join:

Hash Join  (cost=150..2000 rows=5000)
  Hash Cond: (o.product_id = p.id)
  -> Seq Scan on orders o  (cost=0..1000 rows=5000)
  -> Hash  (cost=100..100 rows=1000)
       -> Seq Scan on products p  (cost=0..100 rows=1000)

Indicators:

This is O(n + m), as it scans each table once

Be mindful, that if the startup cost of the hash operation is high, then it means that building the hash table itself is expensive, there may not be enough space in the working memory to one shot it.

The MariaDB analogue in practice is most likely a (Block) Nested Loop, because the direction of thinking in MariaDB is that tables should have good/relevant indexes for JOINs to be performant. As a result, there is generally no direct equivalent of optimizer preferring (or falling back) to hash joins in MariaDB. If you really need a hash join in a MariaDB based system anyway, break up the join, export the data and do a file based join (if your script/application server can handle it… ). This is not a general advice, but it may sometimes be faster or more approachable than trying to fiddle with the optimizer.