Lateral joins are joins in which sub-selects and expressions on the right side of LATERAL keyword can use the scope of the join on the left side of LATERAL keyword. E.g. (trivial):
SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
which is equivalent to:
SELECT * FROM foo, bar WHERE bar.id = foo.bar_id;
See PostgreSQL docs.
A less-trivial example would be:
select
month,
signups,
visitors,
p as avg,
low,
high
from signups_by_month,
lateral (select signups / visitors as p) probability,
lateral (select sqrt(p * (1 - p) / visitors) as se) std_error,
lateral (select p - 1.94 * se as low) lower_bound,
lateral (select p + 1.94 * se as high) upper_bound
(source: Reuse Calculations in the Same Query with Lateral Joins)
and a truly non-trivial one would be:
SELECT r.name, i.inspected_at, i.score, COUNT(v.id) violations
FROM restaurants r
LEFT OUTER JOIN LATERAL (SELECT *
FROM inspections i2
WHERE i2.restaurant_id = r.id
ORDER BY i2.inspected_at DESC
LIMIT 1) i
ON true
LEFT OUTER JOIN violations v
ON v.inspection_id = i.id
GROUP BY r.name, i.inspected_at, i.score
;
(source: Postgres LATERAL JOIN)
(assumes development based on #5824)
LATERAL keyword in syntax, make Join have a isLateral propertylateral Join have its scope extend the scope of Join's LHS (similar to how UNNEST is implemented now)lateral inner Join-sleft outer lateral joins:EmptyResultToNullRow as right child for lateral left outer JoinNode-sThe above rewrites would additionally provide support for a class of non-scalar correlated subqueries, which is a nice follow-up for #5824.
The 'truly non-trivial' example would have to be investigated on if and how it could be rewritten, and is out of scope of the initial LATERAL support. It might turn out that the only reasonable way to implement it would be by implementing the Iterative Approach for direct execution of ApplyNode-s.
Note that right lateral joins are well defined only when they're not correlated (the RHS does not refer to LHS).
A right lateral join with a correlation should result in an error message. :
postgres=# SELECT * FROM T1 RIGHT JOIN LATERAL (SELECT * FROM T2 WHERE x = y) ss ON TRUE;
ERROR: invalid reference to FROM-clause entry for table "t1"
LINE 1: ...ROM T1 RIGHT JOIN LATERAL (SELECT * FROM T2 WHERE x = y) ss ...
^
DETAIL: The combining JOIN type must be INNER or LEFT for a LATERAL reference.
We're opening the issue to let you know we've started planning the work on that :)
@kokosing please review the problem statement :)
@martint @electrum @cberner do you have any comments on this issue before we start? I asked @kokosing and it looks good to him.
The proposal looks good. Go for it!
At teradata we had a hackathon in which we wanted to implement this feature. Here is a branch https://github.com/Teradata/presto/tree/feature_lateral were I collected our work. I did some testing and cleaning. LATERAL is working there but it breaks IN predicate with correlated subquqery, to fix that https://github.com/prestodb/presto/issues/6637 has to be fixed before.
Inner and LEFT OUTER lateral joins are now supported. I consider it as good enough solution for this issue. Obviously there is still a huge room for improvements but these should be track as separate issues (if needed).
Most helpful comment
We're opening the issue to let you know we've started planning the work on that :)
@kokosing please review the problem statement :)