BUG: QueryCompilerCaster breaks NamedTuple arguments #7594

@sfc-gh-joshi

Description

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

import modin.pandas as pd
from modin.pandas.api.extensions import register_dataframe_accessor

from typing import NamedTuple

class CustomTuple(NamedTuple):
    a: str
    b: int


@register_dataframe_accessor("custom_method")
def custom_method(self, custom_arg: CustomTuple):
    print(custom_arg.a + str(custom_arg.b))


pd.DataFrame().custom_method(CustomTuple("a", 1))

Issue Description

The above raises TypeError: CustomTuple.__new__() missing 1 required positional argument: 'b'.

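When the query compiler caster walks a function's arguments, it converts each tuple to a list, visits the elements, and then rebuilds the tuple by calling the original type on that list. However, NamedTuple classes have a different constructor signature from the built-in tuple: they take one positional argument per field rather than a single iterable: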

tuple(["a", 1])  # ('a', 1)
CustomTuple(["a", 1])  # raises TypeError because it tries to use the whole list as the first field

To fix this, we need to modify this block of code:

return (
    # ValuesView, which we might get from dict.values(), is immutable,
    # but not constructable, so we convert it to a tuple. Otherwise,
    # we return an object of the same type as the input.
    tuple
    if issubclass(args_type, ValuesView)
    else args_type
)(visit_nested_args(list(arguments), fn))

to either use the named tuple class's _make classmethod (for example, CustomTuple._make), or handle passing of collections differently.
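As a rough illustration of the _make option, here is a minimal standalone sketch. It does not import Modin; `rebuild_like` is a hypothetical helper that stands in for the type-reconstruction step in the block above, and the `hasattr(args_type, "_make")` check is an assumed way of detecting named tuple classes, not necessarily the approach a real fix should take.

```python
from collections.abc import ValuesView
from typing import Any, NamedTuple


def rebuild_like(original: Any, items: list) -> Any:
    """Rebuild a collection of the same type as `original` from `items`.

    Hypothetical helper for illustration only; not part of Modin.
    """
    args_type = type(original)
    # dict.values() views cannot be constructed directly, so fall back to tuple.
    if issubclass(args_type, ValuesView):
        return tuple(items)
    # Named tuple classes take one positional argument per field, but they
    # expose a `_make` classmethod that accepts a single iterable.
    if issubclass(args_type, tuple) and hasattr(args_type, "_make"):
        return args_type._make(items)
    # Plain tuple, frozenset, etc. accept a single iterable.
    return args_type(items)


class CustomTuple(NamedTuple):
    a: str
    b: int


rebuilt = rebuild_like(CustomTuple("a", 1), ["a", 1])
print(rebuilt)  # CustomTuple(a='a', b=1)
```

With this approach the rebuilt argument keeps its named tuple type, so attribute access such as custom_arg.a keeps working inside the extension method.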

Expected Behavior

Should not error. The call in the example should print a1.

Error Logs

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/joshi/code/modin/modin/core/storage_formats/pandas/query_compiler_caster.py", line 986, in f_with_argument_casting
    visit_nested_args(args, register_query_compilers)
  File "/Users/joshi/code/modin/modin/core/storage_formats/pandas/query_compiler_caster.py", line 428, in visit_nested_args
    )(visit_nested_args(list(arguments), fn))
  File "/Users/joshi/code/modin/modin/core/storage_formats/pandas/query_compiler_caster.py", line 436, in visit_nested_args
    visit_nested_args(arguments[i], fn)
  File "/Users/joshi/code/modin/modin/core/storage_formats/pandas/query_compiler_caster.py", line 421, in visit_nested_args
    return (
TypeError: CustomTuple.__new__() missing 1 required positional argument: 'b'

Installed Versions

INSTALLED VERSIONS

commit : 8600760
python : 3.10.13.final.0
python-bits : 64
OS : Darwin
OS-release : 24.5.0
Version : Darwin Kernel Version 24.5.0: Tue Apr 22 19:54:25 PDT 2025; root:xnu-11417.121.6~2/RELEASE_ARM64_T6020
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

Modin dependencies

modin : 0.32.0+69.g86007603
ray : 2.34.0
dask : 2024.8.1
distributed : 2024.8.1

pandas dependencies

pandas : 2.2.2
numpy : 2.2.6
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.3
Cython : None
pytest : 8.3.2
hypothesis : None
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 5.3.0
html5lib : None
pymysql : None
psycopg2 : 2.9.9
jinja2 : 3.1.4
IPython : 8.17.2
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat : None
fastparquet : 2024.5.0
fsspec : 2024.6.1
gcsfs : None
matplotlib : 3.9.2
numba : None
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.5
pandas_gbq : 0.23.1
pyarrow : 17.0.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : 2024.6.1
scipy : 1.14.1
sqlalchemy : 2.0.32
tables : 3.10.1
tabulate : 0.9.0
xarray : 2024.7.0
xlrd : 2.0.1
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

Metadata

    Labels

    P1 (Important tasks that we should complete soon), bug 🦗 (Something isn't working), hybrid-execution
