BUG: DataFrame[StringDtype].where(DataFrame[bool], list[str]) returns object type instead of StringDtype. #63842

@mroeschke

Description

In [1]: import pandas as pd

In [2]: pdf = pd.DataFrame({"A": ["a", "bc", "cde", "fghi"]})

In [3]: pdf.dtypes
Out[3]: 
A    str
dtype: object

In [4]: pdf_mask = pd.DataFrame({"A": [True, False, True, False]})

In [6]: result = pdf.where(pdf_mask, ["cudf"])

In [7]: result
Out[7]: 
      A
0     a
1  cudf
2   cde
3  cudf

In [11]: pdf.dtypes.iloc[0]
Out[11]: <StringDtype(na_value=nan)>

In [12]: result.dtypes.iloc[0]
Out[12]: dtype('O')

I would expect the result to preserve StringDtype instead of returning object dtype.
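
Until the dtype is preserved, one possible workaround is to cast the result back to the dtypes of the input. The sketch below is illustrative only: `where_keep_dtypes` is a hypothetical helper name, not a pandas API, and it assumes every replacement value can be converted back to the original column dtype.

```python
import pandas as pd


def where_keep_dtypes(df, cond, other):
    """Hypothetical helper: call DataFrame.where, then cast each column
    back to the dtype it had in ``df``."""
    result = df.where(cond, other)
    # astype accepts a {column: dtype} mapping, so every column is cast
    # back to its original dtype (assumes the values are convertible).
    return result.astype(df.dtypes.to_dict())


# Same frames as in the report above.
pdf = pd.DataFrame({"A": ["a", "bc", "cde", "fghi"]})
pdf_mask = pd.DataFrame({"A": [True, False, True, False]})

result = where_keep_dtypes(pdf, pdf_mask, ["cudf"])

# True when the column dtype survived the round trip.
dtype_preserved = result.dtypes.iloc[0] == pdf.dtypes.iloc[0]
```

This only hides the symptom: the intermediate result of `where` is still object dtype, so the extra `astype` costs a copy and does not replace a fix in `where` itself.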

Metadata

Assignees

No one assigned

    Labels

    Bug
    Strings: String extension data type and string data
