Skip to content

Conversation

@kmuehlbauer
Copy link
Collaborator

@kmuehlbauer kmuehlbauer commented Oct 2, 2025

Description

Reading (partly) uninitialized vlen string datasets breaks because NULL values are not properly handled when fetching from global heap. Also fillvalue's need proper distribution.

Closes #109

Before you get started

Checklist

  • This pull request has a descriptive title and labels
  • This pull request has a minimal description (most was discussed in the issue, but a two-liner description is still desirable)
  • Unit tests have been added (if codecov test fails)
  • Any changed dependencies have been added or removed correctly (if need be)
  • If you are working on the documentation, please ensure the current build passes
  • All tests pass

@kmuehlbauer
Copy link
Collaborator Author

OK, first I've changed the test to have uninitialized values to really see it break. Then, I've added the fix (and some more fixes 😬).

Getting this in would help for the h5netcdf integration. I somehow deactivated vlen datasets over there and found it breaking with uninitialized data after reinstating the tests proper.

@codecov
Copy link

codecov bot commented Oct 2, 2025

Codecov Report

❌ Patch coverage is 95.23810% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 73.31%. Comparing base (4512046) to head (20f70b6).
⚠️ Report is 16 commits behind head on main.

Files with missing lines Patch % Lines
pyfive/misc_low_level.py 90.00% 0 Missing and 1 partial ⚠️

❌ Your patch status has failed because the patch coverage (95.23%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #110      +/-   ##
==========================================
+ Coverage   72.73%   73.31%   +0.57%     
==========================================
  Files          11       11              
  Lines        2505     2548      +43     
  Branches      382      394      +12     
==========================================
+ Hits         1822     1868      +46     
+ Misses        580      576       -4     
- Partials      103      104       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kmuehlbauer
Copy link
Collaborator Author

I've added a similar fix for vlen attributes (found via h5netcdf testsuite).

@kmuehlbauer kmuehlbauer changed the title fix handling of uninitialized vlen string datasets fix handling of uninitialized vlen strings Oct 2, 2025
@valeriupredoi
Copy link
Collaborator

@kmuehlbauer cheers - just let me know when this one's good to go in, mate 🍺
Also if @bnlawrence has a minute to look at it, that'd be fab

@kmuehlbauer
Copy link
Collaborator Author

just let me know when this one's good to go in, mate 🍺

I'll add a test for the uninitialized attribute for completeness. Hang on some minutes 😁

@kmuehlbauer
Copy link
Collaborator Author

I've had to recreate the breaking test from the h5netcdf testsuite. It used some mismatched dimension scales to trigger the error. I've put it in the test_references.py.

@kmuehlbauer
Copy link
Collaborator Author

Think this is ready from my side, when the lights get green. 💚

@valeriupredoi valeriupredoi merged commit ae10cdc into NCAS-CMS:main Oct 2, 2025
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reading vlen string dataset/attribute break with unitialized data

2 participants