Understanding Block Nested Loop Joins in MariaDB
When reading an EXPLAIN of slow queries with joins, one particularly common sight is Using join buffer (flat, BNL join). It is worth understanding what a Block Nested Loop (BNL) actually is, and how it affects query execution speed/performance.
Context
Consider the join of two tables t1 and t2. If t2 has no indexes, then MariaDB must check every row in t2 for each row from t1. In other words, it will need to perform ~ rows(t1) times rows(t2) from t2 comparisons. If t2 is a small table (say <50k rows), this is going to be fast, but when t2 is large, MariaDB would end up re-reading a lot of rows. In order to avoid this from happening, MariaDB will use the BNL algorithm to make use of a join buffer to cache portions of the outer table in memory.
What that means in practice, that with a BNL, MariaDB will not automatically join t2. Instead, it will insert as many rows from t1 that it can into a join buffer and then scan the appropriate range of t2 once, in order to match each record in t2 to the join buffer. Then, each matched row is sent to the next join.
This is fine for a single query (other than the fact that this is very slow), but the major downside is that this is a very RAM intensive way of querying data. A join buffer is created for each join in a query, so if an application ends up slamming the database with BNL queries, it can quickly escalate. Therefore, it is often worth trying to optimize queries that trigger BNL in the execution plan.
As a sidenote, BNL uses the parameter join_buffer_size to decide the buffer size its algorithm uses for caching data from the outer table. Setting this higher can speed up BNL execution plans, but it should be considered carefully, and at best only changed as a session value for one-off analytical queries.
Getting an example BNL
Now with the understanding established, let’s see what difference optimizing a BNL away makes. For that, the first task is to create some kind of data that can be joined together. I’ve opted to use a simple awk script for this, but there’s probably more advanced tools out there for generating test data:
root@debian-test:/root/tmp# awk -v tables=3 -v digits=8 -v gen_date=1 -v join_mode=1 -f gen_multi_table_with_dates.awk
----- TABLE 1 -----
id,stamp_created,col1,col2,col3
65581476,2023-12-10,80145896,24821729,53160890
27314164,2017-02-03,63534806,10386889,23100733
59139326,2020-02-24,31937496,79548467,82410078
10436107,2019-10-08,92748791,97199300,1144696
2437634,2016-11-13,86843132,94984995,69678641
----- TABLE 2 -----
id,parent_id,col1,col2,col3
31902428,65581476,8423775,37554689,57320794
99580925,2437634,22555129,79131929,32832900
79908125,59139326,20794347,24204127,97918800
25845521,27314164,37929588,79558799,34137298
97333279,27314164,67683785,40012518,15993387
----- TABLE 3 -----
id,parent_id,col1,col2,col3
59391619,27314164,94162607,29144644,89148135
89757852,27314164,61512988,32166739,44081521
35507451,27314164,8542435,35867322,8488144
15136464,59139326,11057072,46909228,9855358
51130852,2437634,26040480,76389303,66981533
Here, tables 2 and 3 both have a parent_id column, that reference id of table 1. This setup mirrors a common star shaped schema, and provides a perfect opportunity for a BNL join to appear. Now, we can write a query that filters on table 1, and joins via the two parent_id’s to the other tables, and we should see a BNL in the execution plan.
The SQL statements needed to create the test tables will be provided in the appendix.
I’ve opted to fill each table with about 1 million rows, this ensures that there’s enough data for the optimizer to really work with.
With the tables populated, we can try a query like:
SELECT
t1.id,
t1.col1,
t2.col3,
t3.col1,
t3.col3
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.parent_id
JOIN table3 t3 ON t1.id = t3.parent_id
WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
Checking the EXPLAIN of this query indeed nets us the BNL:
MariaDB [test]> EXPLAIN SELECT
-> t1.id,
-> t1.col1,
-> t2.col3,
-> t3.col1,
-> t3.col3
-> FROM table1 t1
-> JOIN table2 t2 ON t1.id = t2.parent_id
-> JOIN table3 t3 ON t1.id = t3.parent_id
-> WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+------+-------------+-------+--------+---------------------------+---------+---------+-------------------+---------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+---------------------------+---------+---------+-------------------+---------+-------------------------------------------------+
| 1 | SIMPLE | t3 | ALL | NULL | NULL | NULL | NULL | 966763 | Using where |
| 1 | SIMPLE | t1 | eq_ref | PRIMARY,stamp_created_idx | PRIMARY | 4 | test.t3.parent_id | 1 | Using where |
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 1028246 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+--------+---------------------------+---------+---------+-------------------+---------+-------------------------------------------------+
3 rows in set (0.001 sec)
The EXPLAIN output confirms that the optimizer chose the Block Nested Loop as a fallback, since there’s no indexes it can utilize for this join. The join buffer does help to reduce repetitive reads, but it’s still a brute force approach. every combination must be compared, so this scales very poorly.
At one million rows per table, I already did not have patience to try and actually execute this query. However, I felt that some kind of benchmark might be interesting, so check the appendix for a benchmark with 100k rows/table, where the query got orders of magnitude faster going from 23s to 0.012s.
Fixing the example BNL
Fortunately, there is a simple fix in this case. Since I’ve used a simple random number generator for all columns bar stamp_created, every column has a high enough cardinality for an index. Since our JOIN is on parent_id of tables 2 and 3, that’s exactly what we need to target. We need to add an index on those columns:
MariaDB [test]> ALTER TABLE table2 ADD INDEX idx1(`parent_id`);
Query OK, 0 rows affected (1.322 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [test]> ALTER TABLE table3 ADD INDEX idx1(`parent_id`);
Query OK, 0 rows affected (1.571 sec)
Records: 0 Duplicates: 0 Warnings: 0
Retrying the same explain, we should see the BNL vanish:
MariaDB [test]> EXPLAIN SELECT
-> t1.id,
-> t1.col1,
-> t2.col3,
-> t3.col1,
-> t3.col3
-> FROM table1 t1
-> JOIN table2 t2 ON t1.id = t2.parent_id
-> JOIN table3 t3 ON t1.id = t3.parent_id
-> WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+------+-------------+-------+-------+---------------------------+-------------------+---------+------------+-------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------------------+-------------------+---------+------------+-------+-----------------------+
| 1 | SIMPLE | t1 | range | PRIMARY,stamp_created_idx | stamp_created_idx | 6 | NULL | 86564 | Using index condition |
| 1 | SIMPLE | t2 | ref | idx1 | idx1 | 5 | test.t1.id | 1 | |
| 1 | SIMPLE | t3 | ref | idx1 | idx1 | 5 | test.t1.id | 1 | |
+------+-------------+-------+-------+---------------------------+-------------------+---------+------------+-------+-----------------------+
3 rows in set (0.001 sec)
The Block Nested Loop has vanished, the optimizer no longer wants to do a full scan, and has found a relevant index it can use for an indexed lookup. The only slow part is because of the stamp_created filter, which is a range scan handled by the stamp_created_idx and had no bearing on the BNL problem itself.
Conclusion
I hope this article helped give a basic insight into what a Block Nested Loop is, and how much difference eliminating one can make. There’s not always such a simple fix for it, but a BNL execution plan is usually worth chasing down, even if it requires a more involved approach.
As a final final note, sometimes it may even be fine to leave a BNL alone: if you’re joining a few tiny configuration tables, you’re not going to see much benefit in eliminating it. There’s also always going to be that one weird unoptimizable ad-hoc analytical query that you can feed to the bottom of your replication chain.
Appendix
The benchmark with ~100k rows/table:
- -> BNL
MariaDB [test]> EXPLAIN SELECT COUNT(*)
-> FROM table1 t1
-> JOIN table2 t2 ON t1.id = t2.parent_id
-> JOIN table3 t3 ON t1.id = t3.parent_id
-> WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+------+-------------+-------+--------+---------------------------+---------+---------+-------------------+--------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+---------------------------+---------+---------+-------------------+--------+-------------------------------------------------+
| 1 | SIMPLE | t3 | ALL | NULL | NULL | NULL | NULL | 101781 | Using where |
| 1 | SIMPLE | t1 | eq_ref | PRIMARY,stamp_created_idx | PRIMARY | 4 | test.t3.parent_id | 1 | Using where |
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 115792 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+--------+---------------------------+---------+---------+-------------------+--------+-------------------------------------------------+
3 rows in set (0.000 sec)
MariaDB [test]> SELECT COUNT(*)
-> FROM table1 t1
-> JOIN table2 t2 ON t1.id = t2.parent_id
-> JOIN table3 t3 ON t1.id = t3.parent_id
-> WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+----------+
| COUNT(*) |
+----------+
| 4131 |
+----------+
1 row in set (23.011 sec)
- -> Without BNL
MariaDB [test]> EXPLAIN SELECT COUNT(*)
-> FROM table1 t1
-> JOIN table2 t2 ON t1.id = t2.parent_id
-> JOIN table3 t3 ON t1.id = t3.parent_id
-> WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+------+-------------+-------+-------+---------------------------+-------------------+---------+------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------------------+-------------------+---------+------------+------+--------------------------+
| 1 | SIMPLE | t1 | range | PRIMARY,stamp_created_idx | stamp_created_idx | 6 | NULL | 4195 | Using where; Using index |
| 1 | SIMPLE | t2 | ref | idx1 | idx1 | 5 | test.t1.id | 1 | Using index |
| 1 | SIMPLE | t3 | ref | idx1 | idx1 | 5 | test.t1.id | 1 | Using index |
+------+-------------+-------+-------+---------------------------+-------------------+---------+------------+------+--------------------------+
3 rows in set (0.001 sec)
MariaDB [test]> SELECT COUNT(*)
-> FROM table1 t1
-> JOIN table2 t2 ON t1.id = t2.parent_id
-> JOIN table3 t3 ON t1.id = t3.parent_id
-> WHERE t1.stamp_created BETWEEN '2025-06-01' AND '2025-10-31';
+----------+
| COUNT(*) |
+----------+
| 4131 |
+----------+
1 row in set (0.012 sec)
The SQL Queries needed to re-create the test tables:
CREATE TABLE table1 (
id INT PRIMARY KEY,
col1 INT,
col2 INT,
col3 INT,
col4 INT
);
ALTER TABLE table1 ADD COLUMN `stamp_created` DATETIME AFTER id;
ALTER TABLE table1 ADD INDEX stamp_created_idx(`stamp_created`);
CREATE TABLE table2 (
id INT PRIMARY KEY,
parent_id INT,
col1 INT,
col2 INT,
col3 INT,
col4 INT
);
CREATE TABLE table3 (
id INT PRIMARY KEY,
parent_id INT,
col1 INT,
col2 INT,
col3 INT,
col4 INT
);