Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(duckdb): StructColumns are compiled to untyped ROW, which aren't CAST correctly #10417

Open
1 task done
NickCrews opened this issue Nov 2, 2024 · 1 comment
Open
1 task done
Labels
bug Incorrect behavior inside of ibis

Comments

@NickCrews
Copy link
Contributor

What happened?

import ibis

s1 = ibis.struct({"i": 1, "s": "foo"})
s2 = ibis.struct({"i": ibis.literal(1), "s": ibis.literal("foo")})
e1 = s1.cast("struct<s: string, i: int32>").name("x")
e2 = s2.cast("struct<s: string, i: int32>").name("x")

e1.execute()  # Works
s1.op()  # Literal
print(ibis.to_sql(e1))
# SELECT CAST(CAST(ROW(1, 'foo') AS STRUCT("i" TINYINT, "s" TEXT)) AS STRUCT("s" TEXT, "i" INT)) AS "x"

e2.execute()  # ConversionException: Conversion Error: Could not convert string 'foo' to INT32
s1.op()  # StructColumn
print(ibis.to_sql(e2))
# SELECT CAST(ROW(1, 'foo') AS STRUCT("s" TEXT, "i" INT)) AS "x"

I think the solution would be to add the explicit CAST to the StructColumn compilation, to match how the Literal is compiled? It obviously isn't always needed, based on how all tests are currently passing, but it appears to be context-dependent on where the val is getting used, and I think I would like to just remove any possibility of bugs, even if it sometimes more verbose than needed.

It also could be considered a bug in sqlglot, because in the sqlglot IR, there is enough info to see what the correct thing to do is:

Cast(
    this=Struct(
        expressions=[
            PropertyEQ(
                this=Identifier(this="i", quoted=True),
                expression=Literal(this=1, is_string=False),
            ),
            PropertyEQ(
                this=Identifier(this="i", quoted=True),
                expression=Literal(this="foo", is_string=True),
            ),
        ]
    ),
    to=DataType(
        this=Type.STRUCT,
        expressions=[
            ColumnDef(
                this=Identifier(this="s", quoted=True), kind=DataType(this=Type.VARCHAR)
            ),
            ColumnDef(
                this=Identifier(this="i", quoted=True), kind=DataType(this=Type.INT)
            ),
        ],
        nested=True,
    ),
    copy=False,
)

eg is sqlglot's implementation for compiling Cast was really clever, they would see this reordering and fix it.

What version of ibis are you using?

main

What backend(s) are you using, if any?

duckdb, IDK if it is present in others.

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the bug Incorrect behavior inside of ibis label Nov 2, 2024
@cpcloud
Copy link
Member

cpcloud commented Nov 2, 2024

With any sqlglot related issues, please show your sqlglot version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior inside of ibis
Projects
Status: backlog
Development

No branches or pull requests

2 participants