Open
Description
Some expressions set a different output field from the one produced by the query engine. In other words: `LazyFrame.collect_schema()` must match `LazyFrame.collect().schema` for any expression. Each mismatch is technically a bug, and may impact execution (depending on the execution context).
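A minimal sketch of how this invariant could be checked for a given query. The helper names `diff_schemas` and `check_lazyframe` are hypothetical and not part of Polars; the only Polars calls assumed are `LazyFrame.collect_schema()` and `LazyFrame.collect().schema`, both named above.

```python
from collections.abc import Mapping


def diff_schemas(planned: Mapping[str, object], actual: Mapping[str, object]) -> list[str]:
    """Describe every difference between two name -> dtype mappings."""
    problems = []
    # Output field names (and their order) must agree.
    if list(planned) != list(actual):
        problems.append(f"column names differ: {list(planned)} != {list(actual)}")
    # For names present on both sides, the dtypes must agree as well.
    for name in planned:
        if name in actual and planned[name] != actual[name]:
            problems.append(f"{name}: planned {planned[name]!r}, got {actual[name]!r}")
    return problems


def check_lazyframe(lf) -> list[str]:
    """Compare the planned schema with the executed one.

    `lf` is assumed to be a polars.LazyFrame; nothing else is required of it.
    """
    return diff_schemas(lf.collect_schema(), lf.collect().schema)
```

An empty list from `check_lazyframe(lf)` means the planned and executed schemas agree for that query; each entry otherwise names one mismatching field.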
Initial scope: `ApplyExpr`.
Detailed scope: see the mismatch baseline at #23649.
Known problematic cases:
- `str.to_integer` (fix: Fix output field dtype for `ToInteger` #23664)
- `business_day_count` should use the LHS convention for the output field name (fix: Match output field name to lhs for `BusinessDaycount` #23679)
- `cum_sum_horizontal` (fix: Match output dtype to engine for `cum_sum_horizontal` #23686)
- `get_arithmetic_field` with struct and numeric
- `str.extract_groups` with the empty string pattern (fix: Fix output for `str.extract_groups` with empty string pattern #23698)
- `rolling_map` (fix: Match output type to engine for `rolling_map` #23702)
- `interpolate()` on `pl.Decimal` (fix: Match output type to engine for `interpolate` on `Decimal` #23706)
- `over(pl.struct("x"))`
- `when_then_over`
- `str.to_decimal`: Scale is resolved based on data. Should get a `scale` parameter and return unknown if not given.
- `str.to_datetime` / `str.strptime(pl.Datetime)`: Resolves time zone based on data if format is not given. Time unit is not resolved based on data, but has some complex rules to it.
- `shrink_dtype`: Output datatype is resolved based on data. Should return unknown.
- `when_then`: Output type mismatch in `when-then` #23733
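To illustrate why the data-dependent cases above (`str.to_decimal`, `shrink_dtype`) cannot report a concrete dtype before execution, here is a toy model in plain Python. It is not Polars code: `narrowest_int_dtype` is a hypothetical stand-in for any rule that derives the output type from the values themselves.

```python
def narrowest_int_dtype(values: list[int]) -> str:
    """Toy rule: name the narrowest signed integer type that holds every value."""
    for bits in (8, 16, 32, 64):
        lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if all(lo <= v <= hi for v in values):
            return f"Int{bits}"
    raise OverflowError("value does not fit in 64 bits")


# Two inputs with the same declared type yield different output types,
# so the result cannot be known from the input schema alone.
small = narrowest_int_dtype([1, 2, 3])
large = narrowest_int_dtype([1, 2, 70_000])
assert small != large
```

Under such a rule the planner only sees the input dtype, which is the same for both inputs, so the honest answer at planning time is an unknown output dtype, as the list suggests.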