Commit

Add special orc test data: timestamp interspersed with null values (#17713)

Authors:
  - Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #17713
kingcrimsontianyu authored Jan 10, 2025
1 parent fb2413e commit dc2a75c
Showing 3 changed files with 6 additions and 0 deletions.
Binary file not shown.
Binary file not shown.
6 changes: 6 additions & 0 deletions python/cudf/cudf/tests/test_orc.py
@@ -1975,8 +1975,14 @@ def test_row_group_alignment(datadir):
 @pytest.mark.parametrize(
     "inputfile",
     [
+        # These sample data have a single column my_timestamp of the TIMESTAMP type,
+        # 2660 rows, and 1536 rows per row group.
         "TestOrcFile.timestamp.desynced.uncompressed.RLEv2.orc",
         "TestOrcFile.timestamp.desynced.snappy.RLEv2.orc",
+        # These two data are the same with the above, except that every 100 rows start
+        # with a null value.
+        "TestOrcFile.timestamp.desynced.uncompressed.RLEv2.hasNull.orc",
+        "TestOrcFile.timestamp.desynced.snappy.RLEv2.hasNull.orc",
     ],
 )
 def test_orc_reader_desynced_timestamp(datadir, inputfile):
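
The layout described in the comments above can be checked with a little arithmetic. The sketch below is not part of the commit and does not read the ORC files. It only models the stated numbers (2660 rows, 1536 rows per row group) and assumes that "every 100 rows start with a null value" means rows 0, 100, 200, ... are null. The helper names are hypothetical.

```python
# Numbers taken from the comments in the diff.
ROWS = 2660
ROWS_PER_GROUP = 1536
# Assumption: rows 0, 100, 200, ... are the null rows in the hasNull files.
NULL_STRIDE = 100


def row_group_sizes(num_rows, rows_per_group):
    """Split num_rows into full row groups plus one trailing partial group."""
    full, rest = divmod(num_rows, rows_per_group)
    return [rows_per_group] * full + ([rest] if rest else [])


def null_mask(num_rows, stride):
    """True where the row is null under the assumed every-stride-rows pattern."""
    return [i % stride == 0 for i in range(num_rows)]


def nulls_per_group(mask, sizes):
    """Count null rows that fall inside each row group."""
    counts, start = [], 0
    for size in sizes:
        counts.append(sum(mask[start:start + size]))
        start += size
    return counts


sizes = row_group_sizes(ROWS, ROWS_PER_GROUP)
mask = null_mask(ROWS, NULL_STRIDE)
print(sizes)                         # [1536, 1124]
print(nulls_per_group(mask, sizes))  # [16, 11]
```

Under these assumptions the file has two row groups, and the second one does not begin on a multiple of 100, so the null rows sit at different offsets within each row group.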