This doesn't seem right. What if the field is not a column but an expression?
+1, moreover if the original query had both `table1.col` and `table2.col` in the projections, this would produce two `col`s in the new one.
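
To make the collision concrete, here is a toy sketch in plain Python. It is not sqlglot code: the `projections` list and the qualifier-stripping step are assumptions made only to illustrate the point above.

```python
from collections import Counter

# Hypothetical inner projections, written as "table.column" strings.
projections = ["table1.col", "table2.col"]

# Dropping the table qualifier, as an outer select over the subquery would need to.
outer_names = [p.split(".")[-1] for p in projections]

# Any name that appears more than once can no longer be told apart in the outer query.
duplicates = [name for name, count in Counter(outer_names).items() if count > 1]

print(outer_names)
print(duplicates)
```

Both projections collapse to the same output name, so the outer query has no way to say which one it means.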
If I'm understanding correctly, under the current implementation this can only work where outer_selects are unique columns or *, and this isn't changed by this PR.
Currently this function takes an existing query, adds a row-number window function to deduplicate rows, and converts it to a subquery aliased as `_t`. The outer query then selects from this subquery where `_row_number = 1`.
Selecting anything other than unique column names without table identifiers, or *, in the outer select won't resolve. For example, an expression in the outer select won't work, as presumably the expression relates to the inner query's tables, not the outer query. Likewise, a column qualified with table1 or table2 won't work, as that reference is invalid where the subquery is aliased as "_t".
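
The behavior described above can be reproduced outside the transpiler. The following is a minimal sketch using Python's built-in `sqlite3` module (it needs an SQLite build with window function support). The table, its rows, and the `PARTITION BY`/`ORDER BY` choices are assumptions for illustration; only the `_t` alias and the `_row_number = 1` filter come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, col TEXT);
INSERT INTO table1 VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

# Inner query with an added row-number window, to be wrapped as subquery _t.
inner = """
SELECT table1.id, table1.col,
       ROW_NUMBER() OVER (PARTITION BY table1.id ORDER BY table1.col) AS _row_number
FROM table1
"""

# Unqualified column names resolve against the subquery's output columns.
ok_rows = conn.execute(
    f"SELECT id, col FROM ({inner}) AS _t WHERE _row_number = 1 ORDER BY id"
).fetchall()

# A table-qualified name refers to a table that is not visible outside _t.
try:
    conn.execute(f"SELECT table1.col FROM ({inner}) AS _t WHERE _row_number = 1")
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

print(ok_rows)
print(qualified_error)
```

The first query returns one row per `id`. The second fails at name resolution, because `table1` only exists inside the subquery.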
This PR is solely intended to fix situations where the table identifier is added to the outer query selects, but is referencing the table identifiers from inside the subquery rather than the identifier of the subquery itself.
Obviously let me know if I am misunderstanding this or you think this requires a different solution.
I wonder if we can just do `SELECT *` instead for the outer query... Or, try to give unique (e.g. auto-generated) names to the inner projections as a best-effort approach, and then reference those in the outer query.
The `SELECT *` is probably the simplest approach, albeit introducing non-deterministic behavior, but maybe that's fine for transpilation purposes.
A more sophisticated approach could be to only project columns in the inner query, giving them unique aliases (preserving existing ones or those that have valid "output names" where applicable), and then reproducing whatever complex expression in the outer query by referencing those unique names.
So, if you had a projection `t1.x + t2.x + 1`, you'd project `t1.x` and `t2.x`, giving them unique aliases, and then in the outer query you'd reference them to reconstruct that sum.
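
A rough sketch of what the aliasing approach might produce, run through Python's built-in `sqlite3` module to check that it resolves. The aliases `_c0` and `_c1`, the join key `k`, the sample rows, and the window's partitioning are all hypothetical; this is not the output of any existing transform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (k INTEGER, x INTEGER);
CREATE TABLE t2 (k INTEGER, x INTEGER);
INSERT INTO t1 VALUES (1, 10), (2, 20);
INSERT INTO t2 VALUES (1, 100), (1, 200), (2, 300);
""")

# The inner query projects only plain columns under unique aliases (_c0, _c1);
# the outer query rebuilds the original expression t1.x + t2.x + 1 from them.
query = """
SELECT _c0 + _c1 + 1 AS total
FROM (
    SELECT t1.x AS _c0,
           t2.x AS _c1,
           ROW_NUMBER() OVER (PARTITION BY t1.k ORDER BY t2.x) AS _row_number
    FROM t1
    JOIN t2 ON t1.k = t2.k
) AS _t
WHERE _row_number = 1
ORDER BY total
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Because the outer query only references `_c0` and `_c1`, it no longer depends on table names from inside the subquery, and two source columns that share a name cannot collide.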
But re: what happens today, you're probably right. The transformation seems a bit naive.