BUG: DataFrame constructor not compatible with array-like classes that have a 'name' attribute #61443

Open
2 of 3 tasks
user27182 opened this issue May 14, 2025 · 2 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

user27182 commented May 14, 2025

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd
import vtk

poly = vtk.vtkPolyData(points=np.eye(3))
pd.DataFrame(poly.points)
ValueError: Per-column arrays must each be 1-dimensional

Originally posted in pyvista/pyvista#7519

Issue Description

Constructing a DataFrame from the array-like object above raises an unexpected ValueError. The cause is the line below, which assumes that any input with a non-None 'name' attribute must be a Series or Index.

pandas/pandas/core/frame.py

Lines 798 to 799 in 41968a5

elif getattr(data, "name", None) is not None:
# i.e. Series/Index with non-None name

This assumption fails for the VTKArray poly.points, which also has a 'name' attribute.
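The same code path can likely be exercised without installing vtk. The sketch below assumes that what matters is an ndarray subclass exposing a non-None 'name'; NamedArray and try_construct are hypothetical names introduced here, not part of pandas or vtk. On the versions listed in this report the second call is expected to hit the ValueError, but the helper reports either outcome so it also runs on versions where the behavior differs.

```python
import numpy as np
import pandas as pd


class NamedArray(np.ndarray):
    """Hypothetical stand-in for VTKArray: an ndarray subclass with a 'name'."""

    name = "points"


def try_construct(data):
    """Return ('ok', shape) or ('error', message) for pd.DataFrame(data)."""
    try:
        return "ok", pd.DataFrame(data).shape
    except ValueError as exc:
        return "error", str(exc)


plain = np.eye(3)
named = plain.view(NamedArray)  # same 3x3 data, now carrying a 'name'

print(try_construct(plain))  # plain ndarray has no 'name' attribute
print(try_construct(named))  # getattr(named, "name", None) is "points"
```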

Expected Behavior

No error should be raised, and the array-like input should be wrapped correctly by DataFrame.

Installed Versions

INSTALLED VERSIONS

commit : 0691c5c
python : 3.12.2
python-bits : 64
OS : Darwin
OS-release : 23.4.0
Version : Darwin Kernel Version 23.4.0: Fri Mar 15 00:19:22 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T8112
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_CA.UTF-8
pandas : 2.2.3
numpy : 1.26.4
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 25.1.1
Cython : None
sphinx : 8.1.3
IPython : 8.36.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.4
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : 6.131.9
gcsfs : None
jinja2 : 3.1.6
lxml.etree : None
matplotlib : 3.10.1
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : 8.3.5
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.14.1
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None

@user27182 user27182 added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 14, 2025
iabhi4 commented May 15, 2025

Hi!

I'm a beginner contributor and spent some time digging into this — here's what I found:

When an array-like object (such as VTKArray) that has a .name attribute is passed to pd.DataFrame(), the constructor currently assumes it is a Series or Index and wraps it in a {name: data} dict. This routes to dict_to_mgr() and then _extract_index(), which attempts to treat the 2D array-like as a single 1D column and eventually raises the error.

This behavior is unexpected because the array-like input is valid (2D, convertible to DataFrame), but it's being misinterpreted solely due to the presence of .name.

I'd like to work on this issue. I'm happy to follow any guidance or suggestions!

@user27182
Author

Yes exactly - the check for a 'name' attribute as a proxy for the input being Series or Index type is the issue. The fix could be as simple as doing a proper isinstance check instead, e.g.:

- elif getattr(data, "name", None) is not None: 
+ elif isinstance(data, (Series, Index)):

though there may be some historical or other reason for why 'name' is used here. But this is what I would try first.
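To make the difference between the two checks concrete, here is a toy model of the dispatch in plain Python. Everything in it is a hypothetical stand-in: the Series and ForeignArray classes and the two route_* functions are not pandas code. The sketch also keeps the non-None name test alongside the isinstance check, which the one-line diff above does not; that is an assumption of this sketch, and whether it is needed in frame.py is not established here.

```python
class Series:
    """Toy stand-in for pandas.Series (not the real class)."""

    def __init__(self, values, name=None):
        self.values = values
        self.name = name


class ForeignArray:
    """Toy stand-in for a third-party 2D array-like that has a 'name'."""

    def __init__(self, values, name):
        self.values = values
        self.name = name


def route_by_attribute(data):
    # Mirrors the current check: any object with a non-None 'name' qualifies.
    if getattr(data, "name", None) is not None:
        return "named-column path"
    return "ndarray path"


def route_by_type(data):
    # Type check first, so foreign objects with a 'name' are not misrouted.
    # Keeping the name test is this sketch's assumption (see lead-in).
    if isinstance(data, Series) and data.name is not None:
        return "named-column path"
    return "ndarray path"


points = ForeignArray([[1, 0], [0, 1]], name="points")
print(route_by_attribute(points))  # misrouted as if it were a named Series
print(route_by_type(points))       # falls through to the generic array path
```

A named Series still takes the named-column path under route_by_type, so only objects that merely look like a Series are rerouted.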
