HDF5 Testing Framework
- Regression Testing: Verify that updates, bug fixes, or enhancements to the HDF5 library maintain compatibility with previous versions and do not alter the expected behavior of existing features, APIs, or data formats.
- Performance Evaluation: Evaluate the efficiency, scalability, and reliability of HDF5 operations under various workloads.
- Installation and setup
- API & Functional testing
- Compatibility testing:
  - Backward compatibility with the HDF5 file format
  - Backward compatibility with library versions (version functions tested via GitHub; see the version-check sketch after this list)
- Platforms & Compilers
- Read/write throughput
- Latency
- Memory usage (Currently not monitored)
- I/O patterns across different configurations and environments
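For the library-version item above, a minimal sketch of how the runtime library version could be queried with `h5py` and compared against the version the bindings were built with. The major/minor assertion policy shown here is an illustrative assumption, not part of this plan:

```python
# Minimal version-check sketch (assumes h5py is installed).
import h5py

# Version of the HDF5 library loaded at runtime.
runtime_version = h5py.h5.get_libversion()        # e.g. (1, 14, 4)
# Version of the HDF5 headers h5py was built against.
built_version = h5py.version.hdf5_version_tuple

print(f"runtime HDF5: {runtime_version}, built against: {built_version}")

# Example policy (an assumption): major.minor must match for the
# installation to be considered consistent for compatibility testing.
assert runtime_version[:2] == built_version[:2], "HDF5 version mismatch"
```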
Set up a controlled testing environment that reflects the target deployment scenario:
- Hardware specifications (CPU, RAM, storage type)
- Operating system and file system details
- HDF5 library version and configuration
- Network setup for distributed or parallel I/O testing
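A short sketch, assuming Python with `h5py` and `numpy` installed, of how the environment details listed above might be captured automatically alongside test results; the field names are illustrative:

```python
# Record the test environment so results can be reproduced and compared.
import json
import platform

import h5py
import numpy as np

environment = {
    # Hardware / OS details (storage type and file system usually need
    # to be filled in manually or from site-specific tooling).
    "machine": platform.machine(),
    "processor": platform.processor(),
    "os": platform.platform(),
    "python": platform.python_version(),
    # Library versions and build configuration.
    "hdf5": h5py.version.hdf5_version,
    "h5py": h5py.version.version,
    "numpy": np.__version__,
    "mpi_enabled": h5py.get_config().mpi,
}

print(json.dumps(environment, indent=2))
```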
- HDF5 command-line utilities: `h5perf`, `h5dump`, `h5stat`
- Custom benchmarking scripts using `h5py` or C APIs (a sketch follows this tool list)
- Other benchmarks to be determined later
- H5Bench suite:
  - Simulates common HDF5 usage patterns
  - Supports parallel I/O
  - Evaluates I/O overhead and observed I/O rate
  - Includes patterns for synchronous/asynchronous operations, caching, logging, and metadata stress
  - GitHub
  - Documentation
- Profiling tools: Grafana
- Monitoring tools: CDash (optional)
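As an illustration of the custom benchmarking scripts mentioned above, a minimal write-throughput sketch using `h5py`; the dataset shape, file name, and timing approach are assumptions, not a prescribed benchmark:

```python
# Minimal write-throughput sketch using h5py (illustrative only).
import time

import h5py
import numpy as np

data = np.random.default_rng(0).random((1024, 1024))  # ~8 MiB of float64

start = time.perf_counter()
with h5py.File("bench_write.h5", "w") as f:
    f.create_dataset("data", data=data)
elapsed = time.perf_counter() - start

mib = data.nbytes / 2**20
print(f"wrote {mib:.1f} MiB in {elapsed:.3f} s ({mib / elapsed:.1f} MiB/s)")
```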
- NOTE: It is the responsibility of the test authors to address the metrics below; testing only verifies pass or fail across the various configurations.
| Metric | Description |
|---|---|
| Backward Compatibility | Ensure older HDF5 files can still be read and written correctly |
| API Stability | Confirm that public APIs behave consistently across versions |
| Data Integrity | Validate that data stored and retrieved remains unchanged |
| Performance Consistency | Detect any regressions in read/write performance |
| Cross-Platform Consistency | Ensure consistent behavior across supported platforms and compilers |
| Error Handling | Confirm that known error conditions are still handled correctly |
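A hedged sketch of how the backward-compatibility and data-integrity metrics could be exercised with `h5py`: write a file constrained to the earliest file-format bound, read it back, and verify the data is unchanged. The `libver='earliest'` setting and the round-trip check are illustrative choices, not the framework's prescribed procedure:

```python
# Round-trip check: write with the earliest file-format bound, read back,
# and confirm data integrity (illustrative, not the official test).
import h5py
import numpy as np

original = np.arange(1000, dtype=np.float64)

# libver='earliest' asks the library to produce objects readable by
# the oldest supported HDF5 releases.
with h5py.File("compat_check.h5", "w", libver="earliest") as f:
    f.create_dataset("values", data=original)
    f["values"].attrs["units"] = "seconds"

with h5py.File("compat_check.h5", "r") as f:
    restored = f["values"][...]
    assert f["values"].attrs["units"] == "seconds"

np.testing.assert_array_equal(original, restored)
print("round-trip data integrity check passed")
```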
- NOTE: It is the responsibility of the test authors to address the metrics below; testing only verifies pass or fail across the various configurations.
| Metric | Description |
|---|---|
| Throughput Measurement | Assess read/write speeds for different dataset sizes and access patterns |
| File Size and Layout | Compare performance between contiguous and chunked layouts |
| Chunking Strategies | Evaluate impact of chunk sizes and compression methods |
| Parallel I/O | Test performance with MPI-enabled HDF5 and scalability |
| Metadata Access | Measure time to read/write attributes and nested group structures |
| Dataset Access Patterns | Benchmark selection methods and data type performance |
| Caching Behavior | Analyze effects of chunk cache settings and flushing |
| Additional Considerations | Latency, CPU/memory utilization |
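For the layout and chunking rows above, a minimal comparison sketch between contiguous and chunked (compressed) layouts; the array size, chunk shape, and gzip level are assumptions chosen only to make the example runnable:

```python
# Compare write time for contiguous vs. chunked/compressed layouts.
import time

import h5py
import numpy as np

data = np.random.default_rng(1).random((2048, 2048))  # ~32 MiB of float64

def timed_write(path, **dataset_kwargs):
    start = time.perf_counter()
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=data, **dataset_kwargs)
    return time.perf_counter() - start

t_contig = timed_write("layout_contiguous.h5")
t_chunked = timed_write("layout_chunked.h5",
                        chunks=(256, 256), compression="gzip",
                        compression_opts=4)

print(f"contiguous: {t_contig:.3f} s, chunked+gzip: {t_chunked:.3f} s")
```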
- Sequential and random read/write operations
- Chunked and compressed dataset access
- Parallel I/O using MPI (see the sketch after this list)
- Large-scale dataset handling
- Metadata access and update performance
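For the parallel I/O scenario above, a sketch assuming an MPI-enabled HDF5 build with `h5py` compiled against it and `mpi4py` installed; run under an MPI launcher, e.g. `mpiexec -n 4 python parallel_write.py`. The dataset shape and file name are illustrative:

```python
# Collective parallel write sketch (requires parallel HDF5 + mpi4py).
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rows_per_rank = 1000
with h5py.File("parallel_write.h5", "w", driver="mpio", comm=comm) as f:
    # Dataset creation is collective: every rank makes the same call.
    dset = f.create_dataset("data", (size * rows_per_rank, 16), dtype="f8")
    # Each rank writes its own contiguous slab of rows.
    start = rank * rows_per_rank
    dset[start:start + rows_per_rank] = np.full((rows_per_rank, 16), rank,
                                                dtype="f8")

if rank == 0:
    print(f"{size} ranks wrote {size * rows_per_rank} rows")
```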
Document test results with:
- Summary of test configurations (Larry, Pull from CDash)
- Tabulated performance metrics
- Observations and anomalies
- Recommendations for optimization
- Comparison with baseline or previous versions
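A small sketch of how tabulated metrics and a baseline comparison might be recorded; the CSV schema and the 10% regression threshold are assumptions for illustration only:

```python
# Append a benchmark result to a CSV log and flag regressions vs. a baseline.
import csv
from pathlib import Path

RESULTS = Path("performance_results.csv")
REGRESSION_THRESHOLD = 0.10  # 10% slower than baseline counts as a regression

def record(config: str, metric: str, value: float, baseline: float) -> None:
    is_regression = value > baseline * (1 + REGRESSION_THRESHOLD)
    new_file = not RESULTS.exists()
    with RESULTS.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["config", "metric", "value", "baseline",
                             "regression"])
        writer.writerow([config, metric, value, baseline, is_regression])

# Example: write time in seconds for a chunked-layout configuration.
record("chunked-gzip4", "write_time_s", 1.42, 1.30)
```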