HDF5 Testing Framework

Objectives

  • Regression Testing: Verify that updates, bug fixes, or enhancements to the HDF5 library maintain compatibility with previous versions and do not alter the expected behavior of existing features, APIs, or data formats.
  • Performance Evaluation: Evaluate the efficiency, scalability, and reliability of HDF5 operations under various workloads.

Key Areas

Regression Testing:

  • Installation and setup
  • API and functional testing
  • Compatibility testing (see the sketch after this list):
    • Backward compatibility with the file format
    • Backward compatibility with earlier library versions (version functions tested via GitHub)
  • Platforms and compilers
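
As a concrete example of the file-format compatibility item above, the sketch below (assuming h5py; the file name compat.h5 and the "v108" bound are illustrative, and named bounds depend on the h5py/HDF5 build) writes a file under old library-version bounds and verifies that a current library reads it back unchanged. It is a proxy for the stronger test of reading files produced by an actual older release:

```python
import h5py
import numpy as np

# Pinning libver bounds to an old version forces the writer to use only
# file-format features that release understands ("v108" = HDF5 1.8.x).
with h5py.File("compat.h5", "w", libver=("v108", "v108")) as f:
    f.create_dataset("data", data=np.arange(100))

# A current library must still read the old-format file back unchanged.
with h5py.File("compat.h5", "r") as f:
    assert np.array_equal(f["data"][...], np.arange(100))
```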

Performance Evaluation:

  • Read/write throughput (see the sketch after this list)
  • Latency
  • Memory usage (currently not monitored)
  • I/O patterns across different configurations and environments
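
A minimal throughput sketch, assuming h5py and roughly 256 MiB of scratch space (the file name and sizes are illustrative). A single pass like this measures OS-cached reads, so serious runs should use files larger than RAM or drop caches between the write and read phases:

```python
import time
import h5py
import numpy as np

N = 32 * 1024 * 1024          # 32 Mi float64 values = 256 MiB
data = np.random.rand(N)

# Time a full-dataset write.
t0 = time.perf_counter()
with h5py.File("throughput.h5", "w") as f:
    f.create_dataset("x", data=data)
write_s = time.perf_counter() - t0

# Time a full-dataset read back.
t0 = time.perf_counter()
with h5py.File("throughput.h5", "r") as f:
    _ = f["x"][...]
read_s = time.perf_counter() - t0

mib = data.nbytes / 2**20
print(f"write: {mib / write_s:.1f} MiB/s, read: {mib / read_s:.1f} MiB/s")
```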

Environment Setup

Set up a controlled testing environment that reflects the target deployment scenario, and record the following alongside every result set (a capture sketch follows the list):

  • Hardware specifications (CPU, RAM, storage type)
  • Operating system and file system details
  • HDF5 library version and configuration
  • Network setup for distributed or parallel I/O testing
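
These details can be captured automatically; a sketch using Python's standard platform module plus h5py's version metadata (the output file name environment.json is illustrative):

```python
import json
import platform
import h5py

# Record the environment next to each result set so runs stay comparable.
env = {
    "machine": platform.machine(),
    "processor": platform.processor(),
    "os": platform.platform(),
    "python": platform.python_version(),
    "h5py": h5py.__version__,
    "hdf5": h5py.version.hdf5_version,
}
with open("environment.json", "w") as f:
    json.dump(env, f, indent=2)
```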

Recommended Tools and Utilities

  • HDF5 command-line utilities: h5perf, h5dump, h5stat (example invocation after this list)
  • Custom benchmarking scripts using h5py or the C API
    • Additional benchmarks to be determined
  • h5bench suite:
    • Simulates common HDF5 usage patterns
    • Supports parallel I/O
    • Evaluates I/O overhead and observed I/O rate
    • Includes patterns for synchronous/asynchronous operations, caching, logging, and metadata stress
    • See the h5bench GitHub repository and documentation for details
  • Profiling tools: Grafana
  • Monitoring tools: CDash (optional)
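
The command-line utilities can also be driven from benchmarking scripts; a sketch that collects h5stat output for a file produced by an earlier run (assumes the HDF5 tools are on PATH; throughput.h5 is illustrative):

```python
import subprocess

# h5stat with no options prints the full set of file statistics.
result = subprocess.run(
    ["h5stat", "throughput.h5"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```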

Testing Metrics

Regression Testing (Lead: Larry Knox)

  • NOTE: It is the responsibility of the test authors to address these metrics; testing only verifies pass or fail across the supported configurations.
| Metric | Description |
| --- | --- |
| Backward Compatibility | Ensure older HDF5 files can still be read and written correctly |
| API Stability | Confirm that public APIs behave consistently across versions |
| Data Integrity | Validate that data stored and retrieved remains unchanged |
| Performance Consistency | Detect any regressions in read/write performance |
| Cross-Platform Consistency | Ensure consistent behavior across supported platforms and compilers |
| Error Handling | Confirm that known error conditions are still handled correctly |
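
As an example of the Data Integrity row above, a minimal round-trip check assuming h5py (integrity.h5 is illustrative): whatever is written must come back bit-for-bit, including through a compression filter:

```python
import h5py
import numpy as np

original = np.random.default_rng(0).integers(0, 2**31, size=10_000, dtype=np.int64)

# gzip implies a chunked layout, so this also exercises the filter pipeline.
with h5py.File("integrity.h5", "w") as f:
    f.create_dataset("values", data=original, compression="gzip")

with h5py.File("integrity.h5", "r") as f:
    restored = f["values"][...]

assert restored.dtype == original.dtype
assert np.array_equal(restored, original)
```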

Performance Testing (Lead: Joe Lee)

  • NOTE: It is the responsibility of the test authors to address these metrics; testing only verifies pass or fail across the supported configurations.
| Metric | Description |
| --- | --- |
| Throughput Measurement | Assess read/write speeds for different dataset sizes and access patterns |
| File Size and Layout | Compare performance between contiguous and chunked layouts |
| Chunking Strategies | Evaluate impact of chunk sizes and compression methods |
| Parallel I/O | Test performance with MPI-enabled HDF5 and scalability |
| Metadata Access | Measure time to read/write attributes and nested group structures |
| Dataset Access Patterns | Benchmark selection methods and data type performance |
| Caching Behavior | Analyze effects of chunk cache settings and flushing |
| Additional Considerations | Latency, CPU/memory utilization |
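
For the File Size and Layout and Chunking Strategies rows, a sketch comparing contiguous and chunked layouts under a row-wise read pattern, assuming h5py (the file name and chunk shape are illustrative; timings depend heavily on the chunk cache and the OS page cache):

```python
import time
import h5py
import numpy as np

data = np.random.rand(4096, 4096)

# Same data, two layouts: the chunk shape matches the read pattern below.
with h5py.File("layout.h5", "w") as f:
    f.create_dataset("contiguous", data=data)
    f.create_dataset("chunked", data=data, chunks=(64, 4096))

for name in ("contiguous", "chunked"):
    t0 = time.perf_counter()
    with h5py.File("layout.h5", "r") as f:
        for i in range(0, 4096, 64):
            _ = f[name][i:i + 64, :]   # read 64 rows at a time
    print(f"{name}: {time.perf_counter() - t0:.3f} s")
```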

Test Scenarios

  • Sequential and random read/write operations
  • Chunked and compressed dataset access
  • Parallel I/O using MPI (see the sketch after this list)
  • Large-scale dataset handling
  • Metadata access and update performance
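
For the parallel I/O scenario, a sketch of disjoint writes to one shared dataset; it assumes h5py built against a parallel HDF5 with mpi4py available, and the file name and per-rank size are illustrative:

```python
# Launch with something like: mpiexec -n 4 python parallel_write.py
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.rank, comm.size
n = 1_000_000                      # elements written per rank

# The mpio driver opens the file collectively across all ranks.
with h5py.File("parallel.h5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("x", shape=(size * n,), dtype="f8")
    # Each rank writes its own non-overlapping slice.
    dset[rank * n:(rank + 1) * n] = np.full(n, rank, dtype="f8")
```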

Reporting and Analysis

Document test results with:

  • Summary of test configurations (Larry; pulled from CDash)
  • Tabulated performance metrics
  • Observations and anomalies
  • Recommendations for optimization
  • Comparison with baseline or previous versions (a comparison sketch follows this list)
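
For the baseline comparison, a sketch that flags slowdowns beyond a tolerance; the file names baseline.json and results.json and the metric keys are hypothetical, and the check assumes higher-is-better metrics such as throughput:

```python
import json

TOLERANCE = 0.10  # flag a >10% drop in a higher-is-better metric

with open("baseline.json") as f:   # e.g. {"write_mib_s": 850.0, "read_mib_s": 1200.0}
    baseline = json.load(f)
with open("results.json") as f:
    current = json.load(f)

for metric, base in baseline.items():
    cur = current[metric]
    change = (cur - base) / base
    status = "REGRESSION" if change < -TOLERANCE else "ok"
    print(f"{metric}: {base:.1f} -> {cur:.1f} ({change:+.1%}) {status}")
```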
