SQL Tuning Series · Episode 30
Choosing the Right Join Method
Nested Loop, Sort Merge, or Hash Join
Learn how nested loop join, sort merge join, and hash join work, and how indexes, row counts, and memory influence join method selection.
Joins are central to relational databases.
They allow SQL to combine data from multiple tables, but they can also become major performance bottlenecks.
The three physical join methods every SQL tuner should understand are nested loop join, sort merge join, and hash join.
01 Nested Loop Join
A nested loop join reads rows from one table and repeatedly looks up matching rows in another table.
The first table is often called the outer or driving table. The second table is the inner or driven table.
Nested loop join works well when:
- The outer result set is small.
- The inner table has a useful index.
- The query can return rows quickly.
It becomes inefficient when the outer result set is large and the inner lookup repeats many times.
02 Sort Merge Join
Sort merge join reads both inputs, sorts them by the join key, and then merges the sorted results.
It can be useful when:
- Join ranges are large.
- Indexes are not available.
- The join condition is not only equality.
The downside is sorting cost. If both inputs are large, sorting can consume significant memory and temporary space.
03 Hash Join
Hash join builds a hash table from one input, then probes it with rows from the other input.
It is commonly used for large equality joins.
Hash join works well when:
- The join condition is equality.
- Index access would cause too much random I/O.
- Sorting would be expensive.
- The smaller build input fits in memory.
If the hash area is too small, the database may use temporary disk space, which can slow the join.
04 How the Optimizer Chooses
The optimizer considers row counts, indexes, join predicates, memory, statistics, and estimated cost.
For example:
- Small outer table plus indexed inner table often favors nested loop.
- Large unsorted inputs may favor hash join.
- Large inputs that can be sorted or already ordered may favor sort merge.
05 The Tuner’s Role
A tuner should understand why a join method was chosen.
Ask:
- Which table is driving the join?
- How many rows are produced before the join?
- Does the inner table have a useful index?
- Is the join result small or large?
- Is memory sufficient for hash or sort work?
- Are statistics accurate?
06 Practical Example
Nested loop join can be effective when the driving set is small:
EXAMPLE
SELECT o.order_id, c.customer_name FROM orders o JOIN customers c ON c.customer_id = o.customer_id WHERE o.order_id = :order_id;
Hash join may be better for large equality joins:
EXAMPLE
SELECT o.order_id, p.payment_amount FROM orders o JOIN payments p ON p.order_id = o.order_id WHERE o.order_date >= DATE '2026-01-01';
The best join method depends on row counts, indexes, memory, and statistics.
07 Key Takeaway
There is no single best join method.
Nested loop, sort merge, and hash join each fit different data volumes and access conditions. SQL tuning means helping the optimizer choose the join strategy that matches the real workload.