Description
Describe the enhancement requested
It would be good to have a utility function that creates an Arrow table directly, so that some of our pyarrow tests no longer have to go through pandas. The existing utility function that uses pandas is:
arrow/python/pyarrow/tests/parquet/common.py
Lines 98 to 121 in bb33493
def _test_dataframe(size=10000, seed=0):
    import pandas as pd

    np.random.seed(seed)
    df = pd.DataFrame({
        'uint8': _random_integers(size, np.uint8),
        'uint16': _random_integers(size, np.uint16),
        'uint32': _random_integers(size, np.uint32),
        'uint64': _random_integers(size, np.uint64),
        'int8': _random_integers(size, np.int8),
        'int16': _random_integers(size, np.int16),
        'int32': _random_integers(size, np.int32),
        'int64': _random_integers(size, np.int64),
        'float32': np.random.randn(size).astype(np.float32),
        'float64': np.arange(size, dtype=np.float64),
        'bool': np.random.randn(size) > 0,
        'strings': [util.rands(10) for i in range(size)],
        'all_none': [None] * size,
        'all_none_category': [None] * size
    })

    # TODO(PARQUET-1015)
    # df['all_none_category'] = df['all_none_category'].astype('category')
    return df
This issue would move some of the tests that use _test_dataframe over to a new utility function and remove the @pytest.mark.pandas marker in those cases.
Component(s)
Python
rok