Open
Labels
bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), python (Related to Python Polars)
Description
Checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Add the following to test_search_sorted.py:
def test_search_sorted_categorical() -> None:
    # Sorting will be based on the order in which entries were added:
    series = pl.Series(["c", "b", "b", "a", "c", "b"], dtype=pl.Categorical).sort()
    series2 = pl.Series(["c", "b", "a"], dtype=series.dtype)
    assert series.search_sorted(series2).to_list() == [0, 2, 5]
    assert series.search_sorted("c") == 0
    assert series.search_sorted("b") == 2
    assert series.search_sorted("a") == 5


def test_search_sorted_enum() -> None:
    E = pl.Enum(["a", "b", "c"])
    series = pl.Series(["c", "b", "b", "a", "c", "b"], dtype=E).sort()
    series2 = pl.Series(["c", "b", "a"], dtype=E)
    assert series.search_sorted(series2).to_list() == [4, 1, 0]
    assert series.search_sorted("c") == 4
    assert series.search_sorted("b") == 1
    assert series.search_sorted("a") == 0
Log output
______________________________ test_search_sorted_categorical ______________________________
tests/unit/operations/test_search_sorted.py:86: in test_search_sorted_categorical
assert series.search_sorted("c") == 0
polars/series/series.py:3466: in search_sorted
df = F.select(F.lit(self).search_sorted(element, side))
polars/functions/lazy.py:1952: in select
return empty_frame.select(*exprs, **named_exprs)
polars/dataframe/frame.py:9275: in select
return self.lazy().select(*exprs, **named_exprs).collect(_eager=True)
polars/lazyframe/frame.py:2030: in collect
return wrap_df(ldf.collect(callback))
E polars.exceptions.InvalidOperationError: got invalid or ambiguous dtypes: '[cat, str]' in expression 'search_sorted'
E
E Consider explicitly casting your input types to resolve potential ambiguity.
E
E Resolved plan until failure:
E
E ---> FAILED HERE RESOLVING 'select' <---
E SELECT [Series.search_sorted([String(c)])] FROM
E DF []; PROJECT */0 COLUMNS; SELECTION: None
______________________________ test_search_sorted_enum ______________________________
tests/unit/operations/test_search_sorted.py:96: in test_search_sorted_enum
assert series.search_sorted("c") == 4
polars/series/series.py:3466: in search_sorted
df = F.select(F.lit(self).search_sorted(element, side))
polars/functions/lazy.py:1952: in select
return empty_frame.select(*exprs, **named_exprs)
polars/dataframe/frame.py:9275: in select
return self.lazy().select(*exprs, **named_exprs).collect(_eager=True)
polars/lazyframe/frame.py:2030: in collect
return wrap_df(ldf.collect(callback))
E polars.exceptions.InvalidOperationError: got invalid or ambiguous dtypes: '[enum, str]' in expression 'search_sorted'
E
E Consider explicitly casting your input types to resolve potential ambiguity.
E
E Resolved plan until failure:
E
E ---> FAILED HERE RESOLVING 'select' <---
E SELECT [Series.search_sorted([String(c)])] FROM
E DF []; PROJECT */0 COLUMNS; SELECTION: None
Issue description
The supertype casting logic doesn't handle casting individual strings to Categoricals or Enums.
Expected behavior
The tests should pass. If a fix is implemented after #19894 is merged, the relevant commented-out tests in test_index_of.py
should also be uncommented and pass (or a new issue should be filed covering them specifically).
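For reference, the expected indices in the tests can be reproduced with a pure-Python model that does not use Polars at all. This is only an illustration of the intended semantics, assuming left-sided insertion points and the category orders the tests rely on (first-appearance order for the Categorical, declared order for the Enum); `model_search_sorted` is a made-up name.

```python
from bisect import bisect_left


def model_search_sorted(values, order, needles):
    # Rank each string by its position in `order`, sort the ranks,
    # then find the leftmost insertion point for each needle.
    rank = {cat: i for i, cat in enumerate(order)}
    sorted_ranks = sorted(rank[v] for v in values)
    return [bisect_left(sorted_ranks, rank[n]) for n in needles]


values = ["c", "b", "b", "a", "c", "b"]
needles = ["c", "b", "a"]

# Categorical as assumed in the test: categories ordered by first appearance.
categorical_result = model_search_sorted(values, ["c", "b", "a"], needles)
# Enum: categories ordered as declared in pl.Enum(["a", "b", "c"]).
enum_result = model_search_sorted(values, ["a", "b", "c"], needles)

print(categorical_result)  # [0, 2, 5]
print(enum_result)  # [4, 1, 0]
```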
Installed versions
Git version of polars as of Dec 5, 2024.