Strange behavior in optimize function #3851
-
I am trying to use the optimizer to optimize some complex queries. However, I met the 'could not be resolved' error with correlated subqueries. Here is a simplified way to reproduce it: print(optimize('''
SELECT a.g
FROM a
WHERE a.e < ( SELECT MAX(u) FROM (
SELECT SUM(c.b) AS u
FROM c
WHERE c.d = a.f
GROUP BY c.e
) w);
''', schema={"a": {"g": "INT", "e": "INT", "f": "INT"}, "c" : { "d": "INT", "e": "INT", "b": "INT" }}).sql(pretty=True)) The sqlglot throws:
I know that such a query is hard to optimize into a JOIN statement. However, the error message here is strange. Maybe the expected behavior is to ignore this subquery and apply other optimizations. Any ideas about this error message? sqlglot version: 25.8.0 Anyway, sqlglot is a wonderful tool. It helps me a lot. Thanks :) |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
Hey @fuboat, thanks for reporting this. It looks like there's a gap in the optimizer where we don't correctly detect correlation if the column is referenced within the context of a derived table that is nested under a subquery. We'll take a look when we find some bandwidth and update this thread 👍 |
Beta Was this translation helpful? Give feedback.
PR that fixes this: #3872