Skip from_json overflow tests for [databricks] 14.3 #11719

Open: mythrocks wants to merge 3 commits into base branch branch-24.12

mythrocks
Collaborator

Fixes #11533.

This commit addresses the test failures reported in #11533, for the following tests:

  • json_matrix_test.py::test_from_json_long_structs()
  • json_matrix_test.py::test_scan_json_long_structs()

These failures are a result of #11711. When the JSON parser reads integral struct members from a JSON file and a value overflows the target type, the entire STRUCT column value becomes null on Databricks 14.3, even without spark-rapids active. This behaviour differs from that of Apache Spark versions later than 3.4.1.
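To make the overflow case concrete, here is a minimal pure-Python sketch. It is only a model of the behaviour described above, not the Spark or Databricks parser: the column name `data`, the member names, and both helper functions are assumptions made for illustration, and it does not attempt to model what Apache Spark does instead.

```python
import json

# Range of a signed 64-bit integer, which is what Spark's LONG type holds.
LONG_MIN = -(2 ** 63)
LONG_MAX = 2 ** 63 - 1


def overflows_long(value):
    """Return True if value is an integer outside the signed 64-bit range."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return not (LONG_MIN <= value <= LONG_MAX)


def read_struct_nulling_on_overflow(json_line, column="data"):
    """Model of the Databricks 14.3 behaviour described in this PR:
    a single overflowing integral member makes the whole struct null.
    The column name is a hypothetical default."""
    struct = json.loads(json_line).get(column)
    if not isinstance(struct, dict):
        return None
    if any(overflows_long(member) for member in struct.values()):
        return None
    return struct


in_range = '{"data": {"A": 1, "B": 9223372036854775807}}'  # B == LONG_MAX
too_big = '{"data": {"A": 1, "B": 9223372036854775808}}'   # B == LONG_MAX + 1

print(read_struct_nulling_on_overflow(in_range))  # {'A': 1, 'B': 9223372036854775807}
print(read_struct_nulling_on_overflow(too_big))   # None
```

Note that the in-range member `A` is lost along with `B` in the second row, which is why a row-by-row comparison against other Spark versions fails.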

This commit moves the problematic JSON test rows into a separate file. The test that reads this file is marked xfail on Databricks 14.3; the remaining rows are tested on all versions.
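The split described above could be laid out roughly as follows with pytest. This is a sketch under stated assumptions, not the code in this PR: `ON_DATABRICKS_14_3`, `check_rows`, both row lists, and the test names are hypothetical stand-ins, and the real tests read JSON files and compare results rather than inspecting strings.

```python
import pytest

# Assumption: hard-coded so the sketch is self-contained. A real suite
# would derive this flag from the runtime it is running on.
ON_DATABRICKS_14_3 = False

# Hypothetical rows; in the PR these live in two separate JSON files.
SAFE_ROWS = ['{"data": {"A": 1, "B": 2}}']
OVERFLOW_ROWS = ['{"data": {"A": 1, "B": 9223372036854775808}}']


def check_rows(rows):
    """Hypothetical placeholder for the real read-and-compare check."""
    assert all(row.startswith("{") for row in rows)


def test_safe_rows():
    # Rows without overflow run unconditionally on every version.
    check_rows(SAFE_ROWS)


@pytest.mark.xfail(
    condition=ON_DATABRICKS_14_3,
    reason="overflowing members null the whole struct on Databricks 14.3, see #11711",
)
def test_overflow_rows():
    # Only expected to fail where the condition above is true.
    check_rows(OVERFLOW_ROWS)
```

Keeping the overflow rows in their own test means that only that test is expected to fail on Databricks 14.3, while coverage of the remaining rows is unaffected.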

A proper fix for #11711 will follow later.

Signed-off-by: MithunR <[email protected]>
@mythrocks
Collaborator Author

Build

@sameerz sameerz added the test Only impacts tests label Nov 13, 2024
@mythrocks
Collaborator Author

The CI runs are fine. The failure in (400, 17) is on account of #11722.

Development

Successfully merging this pull request may close these issues.

Fix JSON Matrix tests on Databricks 14.3
3 participants