@lazappi (Collaborator) commented Dec 1, 2025

Related to: Fixes #128

Description

Add support for backed objects using DelayedArray matrices:

  • Add an HDF5File class that stores file handles and paths, with convenience methods for opening and closing them
  • Use the {HDF5Array} package for reading dense and sparse arrays from HDF5 files
  • Add a backed argument to HDF5AnnData
  • Add the backed option to InMemoryAnnData
  • Support conversion of backed AnnData objects to SingleCellExperiment/Seurat
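
To illustrate what "backed" means here, the following is a minimal sketch of reading a dense HDF5 dataset as a DelayedArray with the {HDF5Array} package. It uses a plain temporary HDF5 file with a dataset named `X`, not a real `.h5ad` file, and it does not show how anndataR wires this into `HDF5AnnData`; the file layout and variable names are assumptions made for illustration only.

```r
library(HDF5Array) # also attaches DelayedArray

# Write a small dense matrix to a temporary HDF5 file (illustrative layout,
# not the h5ad on-disk format)
path <- tempfile(fileext = ".h5")
x <- matrix(as.numeric(1:12), nrow = 3)
writeHDF5Array(x, filepath = path, name = "X")

# Create a disk-backed array; the values stay on disk at this point
backed_x <- HDF5Array(path, name = "X")

# Operations on the backed array are recorded and delayed
scaled <- backed_x * 2

# Realize the result in memory only when it is actually needed
in_memory <- as.matrix(scaled)
```

The point of the sketch is that `backed_x` and `scaled` only reference the file, so memory use stays small until `as.matrix()` is called.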

Other features that could be added:

  • Add support for writing DelayedArray matrices
  • Add SparseArray as a format for in-memory matrices

Checklist

Before review

  • Update and regenerate man pages
  • Add/update tests
  • Add/update examples in vignettes
  • Pass CI checks

Before merge

  • Update NEWS
  • Bump devel version

@lazappi (Collaborator, Author) commented Dec 2, 2025

/style

@lazappi (Collaborator, Author) commented Dec 3, 2025

/style

@lazappi marked this pull request as ready for review December 3, 2025 11:55

Copilot AI left a comment

Pull request overview

This pull request adds DelayedArray support for backed HDF5 objects in the anndataR package. The implementation introduces a new HDF5File R6 class to manage HDF5 file handles and adds a backed argument throughout the codebase to enable disk-backed matrix operations using the HDF5Array package.

Key Changes

  • New HDF5File class: Manages HDF5 file handles with automatic open/close functionality using withr's deferred execution
  • Backed mode support: Adds backed parameter to read_h5ad(), HDF5AnnData$new(), and as_HDF5AnnData() to return DelayedArray matrices instead of loading data into memory
  • Refactored file handling: All HDF5 read/write operations now use the HDF5File class instead of raw file handles, improving resource management and enabling backed array support
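
The deferred open/close pattern described above can be sketched roughly as follows. This is not the `HDF5File` class from the pull request: the class name `HDF5FileSketch`, the `with_handle()` method, and the choice of {rhdf5} as the HDF5 backend are all assumptions made for illustration. Only `withr::defer()` and the {rhdf5} functions used are real, existing calls.

```r
library(R6)

# Hypothetical class for illustration; not the PR's actual HDF5File
HDF5FileSketch <- R6Class(
  "HDF5FileSketch",
  public = list(
    path = NULL,
    initialize = function(path) {
      self$path <- path
    },
    # Open the file read-only, run `fun(handle)`, and close the handle
    # when this method exits, even if `fun` throws an error
    with_handle = function(fun) {
      handle <- rhdf5::H5Fopen(self$path, flags = "H5F_ACC_RDONLY")
      withr::defer(rhdf5::H5Fclose(handle))
      fun(handle)
    }
  )
)

# Create a small HDF5 file to read from
path <- tempfile(fileext = ".h5")
rhdf5::h5createFile(path)
rhdf5::h5write(1:5, path, "values")

# The handle only exists for the duration of the callback
file <- HDF5FileSketch$new(path)
values <- file$with_handle(function(h) rhdf5::h5read(h, "values"))
```

Storing the path rather than a long-lived handle means callers never have to remember to close the file, which is the resource-management benefit the review summary refers to.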

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.

Summary per file

| File | Description |
| --- | --- |
| R/HDF5File.R | New R6 class for managing HDF5 file handles with deferred open/close operations |
| R/HDF5AnnData.R | Updated to use HDF5File class, added backed mode support, improved mode handling (read-only vs read-write) |
| R/read_h5ad_helpers.R | Added backed parameter to reading functions, implemented DelayedArray returns via HDF5Array package |
| R/write_h5ad_helpers.R | Refactored to use HDF5File objects instead of raw handles |
| R/write_hdf5_helpers.R | Refactored to use HDF5File objects, added helper functions for file operations |
| R/read_h5ad.R | Added backed parameter and explicit file open/close for conversions |
| R/as_SingleCellExperiment.R | Added .as_SCE_process_pairs_mapping() to convert DelayedArrays for SelfHits compatibility |
| R/as_Seurat.R | Updated to handle DelayedArray conversions for graphs |
| R/utils.R | Updated to_R_matrix() to handle DelayedArray inputs with allow_backed parameter |
| tests/testthat/test-*.R | Updated all tests to use HDF5File objects, added comprehensive tests for backed mode |
| man/*.Rd | Updated documentation for new backed parameter |
| DESCRIPTION | Added DelayedArray and HDF5Array to Suggests |

Development

Successfully merging this pull request may close these issues.

Add DelayedArray support