Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zarr support #190

Open
wants to merge 31 commits into
base: main
Choose a base branch
from
Open

Zarr support #190

wants to merge 31 commits into from

Conversation

keller-mark
Copy link

@keller-mark keller-mark commented Nov 5, 2024

Fixes #91

These changes are from both me and @Artur-man

The main public-facing changes here are:

  • The ZarrAnnData class
  • read_zarr and write_zarr top-level functions
  • Support for from_Seurat(output_class="ZarrAnnData")
  • Support for from_SingleCellExperiment(output_class="ZarrAnnData")

Internally:

  • read_zarr_helpers.R is the zarr analog of read_h5ad_helpers.R
  • write_zarr_helpers.R is the zarr analog of write_h5ad_helpers.R
  • Test fixtures within inst/extdata/example.zarr (this makes the diff noisy, apologies)
  • Lots of tests:
    • test-Zarr-read.R (35 new tests)
    • test-Zarr-write.R (70)
    • test-ZarrAnnData.R (26)
    • test-h5ad-zarr.R (17)

A number of these functions generate warnings in the R console that are intended to be followed up on to improve the code (and should probably be resolved as end users may not appreciate them), but the tests still pass despite these warnings.

Known things that are not implemented here:

  • support for recarrays
  • usage of mode = c("r", "r+", "a", "w", "w-", "x") parameter value

Copy link
Collaborator

@rcannood rcannood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work @keller-mark and @Artur-man !

I went through the PR for a first time and left some minor comments. I will review the code by running it a couple of times next :)

R/read_zarr.R Outdated Show resolved Hide resolved
attrs <- g$get_attrs()$to_list()

if (!all(c("encoding-type", "encoding-version") %in% names(attrs))) {
path <- name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are a lot of linting issues in this file -- could you run lintr::lint_package() and fix any issues that pop up?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done ... did a full lint_package check and corrected some R check issues too!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should probably be removed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Artur-man
Copy link

@rcannood @keller-mark should we add pizzarr only under Suggests in DESCRIPTION and no need to add a Remotes list here ?

Suggests:
    ...
    ...
    pizzarr
Remotes:
    keller-mark/pizzarr

@keller-mark
Copy link
Author

Good point, at minimum we should add a note to the Github README here https://github.com/scverse/anndataR/tree/main?tab=readme-ov-file#installation

@Artur-man
Copy link

Artur-man commented Nov 6, 2024

Ah, this is already added here, nevermind then only Suggests should work.
https://github.com/keller-mark/anndataR/blob/d192e68ed312f212476a5490c973619f08e7c8de/R/ZarrAnnData.R#L240-L246.

but will add this to README as well!!

@lazappi
Copy link
Collaborator

lazappi commented Nov 7, 2024

Not a proper review of the code but I wanted to raise that we plan to submit {anndataR} to Bioconductor (or maybe CRAN) at some point (hopefully soonish). This means we won't be able to depend on GitHub only packages. Do you have plans to submit {pizzarr} somewhere or could you rewrite this to use one of the public Zarr packages?

@Artur-man
Copy link

@lazappi thanks for the notification, pizzarr could be submitted to CRAN soon. keller-mark/pizzarr#93

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement zarr backend
4 participants