🚀 Try Zilliz Cloud, the fully managed Milvus, for free—experience 10x faster performance! Try Now>>

Milvus
Zilliz

What is a lateral join in SQL?

A lateral join in SQL allows a subquery in the FROM clause to reference columns from tables listed earlier in the same query. This differs from standard joins, where each subquery operates independently. By using keywords like LATERAL (in PostgreSQL) or CROSS APPLY/OUTER APPLY (in SQL Server), you can create dependencies between the subquery and the outer table, enabling row-by-row processing. For example, a lateral join can compute a result for each row of the main table, such as fetching the most recent order for each customer, where the subquery depends on the current customer’s ID.

A common use case is when working with table-valued functions or correlated subqueries. Suppose you have a users table and an orders table. To get each user’s latest order, you might write:

SELECT u.id, o.order_date
FROM users u
LEFT JOIN LATERAL (
 SELECT order_date 
 FROM orders 
 WHERE user_id = u.id 
 ORDER BY order_date DESC 
 LIMIT 1
) o ON true;

Here, the subquery inside LATERAL references u.id from the outer users table. Without LATERAL, the subquery couldn’t access u.id, since standard joins process tables independently. This approach efficiently retrieves per-row results, avoiding complex window functions or nested queries.

Lateral joins are particularly useful when combining data with JSON/array expansions or hierarchical queries. For instance, if a products table stores JSON arrays of tags, you could use:

SELECT p.id, tag_data->>'name' AS tag
FROM products p
LEFT JOIN LATERAL jsonb_to_recordset(p.tags) AS tag_data(tag_id INT, name TEXT) ON true;

This expands the JSON array into rows, with each row depending on the tags column of the current product. However, performance can vary: lateral joins may execute the subquery repeatedly for each row, so indexing or limiting results within the subquery is advisable. Use them when row-specific calculations or transformations are necessary, but test for efficiency in large datasets.

Like the article? Spread the word