Draft
Changes from 77 commits
93 commits
19698c3
WIP
keller-mark May 6, 2024
5100a40
More tests passing
keller-mark May 6, 2024
4c9d01f
Fix df read bug
keller-mark May 6, 2024
eb8b69c
More tests passing after fixing zero-dimensional get bug in pizzarr
keller-mark May 7, 2024
58f5ce4
WIP: writing
keller-mark May 7, 2024
a15513a
Fix more tests
keller-mark May 7, 2024
1ef1be6
Zarr df writing
keller-mark May 7, 2024
77a7ee2
WIP: ZarrAnnData class
keller-mark May 7, 2024
27ce6c1
Tests passing
keller-mark May 7, 2024
8790715
Tests that compare h5ad to zarr
keller-mark May 8, 2024
bb0c6c7
Use Rarr to read full numeric arrays
keller-mark May 17, 2024
0456ecd
Fix bugs. Add test for from_SingleCellExperiment with Zarr
keller-mark Jun 18, 2024
acac772
Add a to_dense param to ZarrAnnData constructor. Add overwrite params…
keller-mark Jun 20, 2024
30316ed
Update
keller-mark Jun 20, 2024
c6b4d89
Backwards dense/sparse
keller-mark Jun 20, 2024
9cadc26
Merge branch 'keller-mark/zarr' of https://github.com/keller-mark/ann…
Artur-man Nov 1, 2024
a40a618
Merge branch 'keller-mark/zarr' into zarr
Artur-man Nov 1, 2024
1afd6eb
Simplify how obs and var names handled in ZarrAnnData (similar to #171)
Artur-man Nov 1, 2024
7f53049
update extdata and documentation
Artur-man Nov 1, 2024
3800f38
fix set/get zarr _index, update text example.zarr and update tests si…
Artur-man Nov 2, 2024
4a7d4c2
Merge pull request #5 from Artur-man/zarr
keller-mark Nov 5, 2024
c881bb9
Merge
keller-mark Nov 5, 2024
4a1bbde
Fix test
keller-mark Nov 5, 2024
15dfbde
Revert unnecessary changes
keller-mark Nov 5, 2024
438809a
Formatting
keller-mark Nov 5, 2024
2215402
Merge pull request #6 from keller-mark/keller-mark/zarr-2
keller-mark Nov 5, 2024
087ffb7
Add comments
keller-mark Nov 5, 2024
37d1ae5
Merge pull request #7 from keller-mark/keller-mark/comments
keller-mark Nov 5, 2024
357a8d7
remove unnecessary example zarr store
Artur-man Nov 6, 2024
d192e68
lintr and R check for zarr related utilities and functions, updated s…
Artur-man Nov 6, 2024
1e0e868
add pizzarr to Suggests and README
Artur-man Nov 6, 2024
fe07028
proj
Artur-man Mar 10, 2025
7ef94f8
Merge branch 'main' into keller-mark/zarr
Artur-man Mar 10, 2025
bf8e797
add keller-mark/pizzarr to Remotes
Artur-man Mar 10, 2025
5abcc75
zip example.zarr
Artur-man Mar 14, 2025
c5ec1c0
Merge branch 'main' into keller-mark/zarr
Artur-man Apr 10, 2025
84ad61f
Merge branch 'main' into keller-mark/zarr
Artur-man Apr 12, 2025
c3cb8aa
adapt read_zarr to Rarr
Artur-man Apr 12, 2025
63f102c
adapt write_zarr to Rarr
Artur-man Apr 12, 2025
a31ff9b
update to most recent anndataR
Artur-man Nov 10, 2025
e98f877
remove old scripts
Artur-man Nov 23, 2025
b0bfad4
update write_zarr
Artur-man Nov 23, 2025
7ebe151
initial update to ZarrAnnData
Artur-man Nov 23, 2025
96c5824
update ZarrAnnData, documentation, and implement read_zarr_rec_array
Artur-man Nov 23, 2025
c41a042
review read zarr helpers, and update tests
Artur-man Nov 23, 2025
370ac17
update read_zarr, read tests pass
Artur-man Nov 23, 2025
ddb5271
some updates for writing zarr
Artur-man Nov 23, 2025
755904d
update write_empty_zarr
Artur-man Nov 24, 2025
4290aed
remove pizzarr, update documentation
Artur-man Nov 24, 2025
e43c819
remove pizzarr from tests
Artur-man Nov 24, 2025
a98b58f
fix test-ZarrAnnData
Artur-man Nov 24, 2025
2d551d8
update ZarrAnnData to imitate HDF5AnnData
Artur-man Nov 24, 2025
42fcbb1
check redundant files, correct lines
Artur-man Nov 24, 2025
205dee4
update example_h5ad.py, add zarr and change to example_files.py
Artur-man Nov 24, 2025
f7638eb
add new test example
Artur-man Nov 25, 2025
acede3c
some linting changes
Artur-man Nov 26, 2025
07c92f7
remove read/write_zattrs since implemented in Rarr
Artur-man Nov 26, 2025
2b672ab
access read/write_zarr_attr
Artur-man Nov 26, 2025
555a634
Merge branch 'main' into keller-mark/zarr
Artur-man Dec 1, 2025
dcaf157
add some missing tests
Artur-man Dec 1, 2025
b10faa5
Merge branch 'main' into keller-mark/zarr
Artur-man Dec 4, 2025
e3d08f8
update readers, update tests
Artur-man Dec 4, 2025
1e2addc
correct nullable string zarr array write/read, introduce ordering in …
Artur-man Dec 5, 2025
570325b
do some linting, fix commented out code
Artur-man Dec 5, 2025
0fac149
update some zarr writers and classes
Artur-man Dec 5, 2025
79023b4
fix documentation
Artur-man Dec 5, 2025
bece447
fix compression interface for zarr
Artur-man Dec 5, 2025
a46c9e1
full lint check
Artur-man Dec 5, 2025
f90d70a
fix examples
Artur-man Dec 5, 2025
73934a7
check, biocheck and lintr
Artur-man Dec 5, 2025
f42a6df
fix development status
Artur-man Dec 5, 2025
a373973
air format
Artur-man Dec 5, 2025
2f73501
air format test
Artur-man Dec 5, 2025
540852d
update example.zarr.zip, skip some test (waiting for Rarr)
Artur-man Dec 5, 2025
a22d007
update example.zarr, fix some read_zarr_
Artur-man Dec 6, 2025
1cc5ff6
fix examples
Artur-man Dec 6, 2025
2499d5c
remove overwrite
Artur-man Dec 6, 2025
bd6238a
R code styling
Artur-man Dec 9, 2025
aaf9801
fixes from @lazappi
Artur-man Dec 12, 2025
43d4f1f
Merge branch 'main' into keller-mark/zarr
Artur-man Dec 12, 2025
73ee0e3
air format
Artur-man Dec 12, 2025
d70a011
update some documentation
Artur-man Dec 20, 2025
ccb0cdf
fix some tests
Artur-man Dec 21, 2025
fe8f196
more fixes on anndata-zarr integration
Artur-man Dec 21, 2025
e6efbf2
update ZarrAnnData$initialize
Artur-man Jan 1, 2026
8b0d1b0
update zarr compression
Artur-man Jan 1, 2026
bafae8e
fix column-order here, C based ordering for arrays
Artur-man Jan 2, 2026
eead040
implement roundtrip tests for anndata-zarr
Artur-man Jan 2, 2026
64e4289
add zarr to vignettes
Artur-man Jan 2, 2026
23f8ac5
update README and software_design.rmd
Artur-man Jan 2, 2026
0852908
update AnnData-usage
Artur-man Jan 2, 2026
8677fad
update write_zarr documentation
Artur-man Jan 2, 2026
efa2ca0
update write_zarr_null
Artur-man Jan 8, 2026
2 changes: 1 addition & 1 deletion .gitignore
@@ -60,4 +60,4 @@ rsconnect/
vignettes/data/*.h5ad
/doc/
/Meta/
/data/
/data/
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -49,6 +49,7 @@ Suggests:
knitr,
processx,
rhdf5 (>= 2.52.1),
Rarr,
rmarkdown,
S4Vectors,
Seurat,
6 changes: 6 additions & 0 deletions NAMESPACE
@@ -14,10 +14,15 @@ S3method(r_to_py,AbstractAnnData)
export(AnnData)
export(AnnDataView)
export(as_AnnData)
export(create_zarr)
export(create_zarr_group)
export(generate_dataset)
export(get_generator_types)
export(is_zarr_empty)
export(read_h5ad)
export(read_zarr)
export(write_h5ad)
export(write_zarr)
importFrom(Matrix,as.matrix)
importFrom(Matrix,sparseMatrix)
importFrom(Matrix,t)
@@ -35,3 +40,4 @@ importFrom(reticulate,r_to_py)
importFrom(rlang,`%||%`)
importFrom(rlang,caller_env)
importFrom(stats,setNames)
importFrom(utils,tail)
39 changes: 39 additions & 0 deletions R/AbstractAnnData.R
@@ -287,6 +287,28 @@ AbstractAnnData <- R6::R6Class(
)
},
#' @description
#' Convert to an [`ZarrAnnData`]
Collaborator:
Suggested change
#' Convert to an [`ZarrAnnData`]
#' Convert to a [`ZarrAnnData`]

fixed

#'
#' See [as_ZarrAnnData()] for more details on the conversion
#'
#' @param file See [as_ZarrAnnData()]
#' @param compression See [as_ZarrAnnData()]
#' @param mode See [as_ZarrAnnData()]
#'
#' @return An [`ZarrAnnData`] object
Collaborator:
Suggested change
#' @return An [`ZarrAnnData`] object
#' @return A [`ZarrAnnData`] object

fixed

as_ZarrAnnData = function(
file,
compression = c("none", "gzip"),
mode = c("w-", "r", "r+", "a", "w", "x")
) {
as_ZarrAnnData(
adata = self,
file = file,
compression = compression,
mode = mode
)
},
#' @description
#' Write the `AnnData` object to an H5AD file
#'
#' See [write_h5ad()] for details
@@ -302,6 +324,23 @@ AbstractAnnData <- R6::R6Class(
mode = c("w-", "r", "r+", "a", "w", "x")
) {
write_h5ad(object = self, path, compression = compression, mode = mode)
},
#' @description
#' Write the `AnnData` object to an H5AD file
Collaborator:
Suggested change
#' Write the `AnnData` object to an H5AD file
#' Write the `AnnData` object to a Zarr file

#'
#' See [write_zarr()] for details
#'
#' @param path See [write_zarr()]
#' @param compression See [write_zarr()]
#' @param mode See [write_zarr()]
#'
#' @return `path` invisibly
write_zarr = function(
path,
compression = c("none", "gzip"),
mode = c("w-", "r", "r+", "a", "w", "x")
) {
write_zarr(object = self, path, compression = compression, mode = mode)
}
),
private = list(
75 changes: 75 additions & 0 deletions R/Rarr_utils.R
@@ -0,0 +1,75 @@
#' create_zarr_group
#'
#' create zarr groups
#'
#' @param store the location of (zarr) store
#' @param name name of the group
#' @param version zarr version
#' @importFrom utils tail
Collaborator:
Imports from common packages can go in anndataR-package.R

fixed

#' @examples
#' store <- tempfile(fileext = ".zarr")
#' create_zarr(store)
#' create_zarr_group(store, "gp")
#' @export
Collaborator:
These are internal functions that shouldn't be exported, use @noRd instead

Collaborator:
Please be more consistent with the formatting of the function documentation, something like:

Suggested change
#' create_zarr_group
#'
#' create zarr groups
#'
#' @param store the location of (zarr) store
#' @param name name of the group
#' @param version zarr version
#' @importFrom utils tail
#' @examples
#' store <- tempfile(fileext = ".zarr")
#' create_zarr(store)
#' create_zarr_group(store, "gp")
#' @export
#' Create Zarr group
#'
#' Create a new group in a Zarr store
#'
#' @param store the location of the Zarr store
#' @param name name of the group to create
#' @param version Zarr version
#'
#' @noRd
#'
#' @examples
#' store <- tempfile(fileext = ".zarr")
#' create_zarr(store)
#' create_zarr_group(store, "gp")

#' @return `NULL`
create_zarr_group <- function(store, name, version = "v2") {
split.name <- strsplit(name, split = "\\/")[[1]]
It'll be slightly faster if regexes are not used:

Suggested change
split.name <- strsplit(name, split = "\\/")[[1]]
split.name <- strsplit(name, split = "/", fixed = TRUE)[[1]]

done
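
As a quick check of the suggestion above, here is a minimal sketch showing that the regex split and the fixed-string split give the same pieces. The group name `obsm/X_pca/values` is made up for illustration and does not come from the PR.

```r
# Hypothetical nested group name, used only for this comparison
name <- "obsm/X_pca/values"

# Regex-based split, as in the original line
regex_split <- strsplit(name, split = "\\/")[[1]]

# Fixed-string split, as in the suggested change (no regex engine involved)
fixed_split <- strsplit(name, split = "/", fixed = TRUE)[[1]]

# Both approaches should yield the same character vector
same_result <- identical(regex_split, fixed_split)
print(same_result)
```

Since `/` has no special meaning in a regular expression, switching to `fixed = TRUE` should not change behaviour here; it only skips the regex machinery.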

if (length(split.name) > 1) {
split.name <- vapply(
seq_along(split.name),
function(x) paste(split.name[seq_len(x)], collapse = "/"),
FUN.VALUE = character(1)
)
split.name <- rev(tail(split.name, 2))
if (!dir.exists(file.path(store, split.name[2]))) {
create_zarr_group(store = store, name = split.name[2])
}
}
dir.create(file.path(store, split.name[1]), showWarnings = FALSE)
switch(
Collaborator:
Should this check be in create_zarr() (or wherever it will be first triggered)?

Hmm, good idea, but I don't think it matters where it goes. The check should be performed either in create_zarr or in create_zarr_group. Since the root of a Zarr store is itself a group, create_zarr calls create_zarr_group.

However, I like how you think. These functions are auxiliary and will be deprecated once @Bisaloo implements them in Rarr.
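
To illustrate the point that the store root is a group like any other, here is a minimal base-R sketch. `write_zgroup()` is a hypothetical helper written only for this example (it is not part of anndataR or Rarr); it assumes a Zarr v2 directory store, where a group is marked by a `.zgroup` file.

```r
# Hypothetical helper (not from anndataR or Rarr): mark a directory as a
# Zarr v2 group by writing its `.zgroup` metadata file.
write_zgroup <- function(path) {
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  writeLines("{\"zarr_format\":2}", file.path(path, ".zgroup"))
  invisible(path)
}

store <- tempfile(fileext = ".zarr")
write_zgroup(store)                   # the root of the store is a group
write_zgroup(file.path(store, "gp"))  # so is any nested group

# The root and the nested group carry the same metadata
root_meta <- readLines(file.path(store, ".zgroup"))
group_meta <- readLines(file.path(store, "gp", ".zgroup"))
print(identical(root_meta, group_meta))
```

Because both levels go through the same code path, a version check placed in the group-creation step also covers store creation.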

version,
v2 = {
write(
"{\"zarr_format\":2}",
file = file.path(store, split.name[1], ".zgroup")
)
},
v3 = {
Collaborator:
What are your thoughts on being able to support v3?

@Bisaloo is working on it; I think he wants to have it done by March or April.

stop("Currently only zarr v2 is supported!")
Collaborator:
Use cli_abort() for errors

fixed
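
For reference, one possible shape of the `cli_abort()` version of this check is sketched below. `check_zarr_version()` is a hypothetical helper name and the message wording is an assumption; the code actually committed in the PR may differ. It requires the cli package to be installed.

```r
# Hypothetical helper illustrating cli::cli_abort() for the version check.
# The function name and message text are assumptions, not the PR's code.
check_zarr_version <- function(version) {
  if (!identical(version, "v2")) {
    cli::cli_abort(c(
      "Only Zarr v2 is currently supported.",
      "x" = "Got {.arg version} = {.val {version}}."
    ))
  }
  invisible(version)
}

# A supported version passes through silently
check_zarr_version("v2")

# An unsupported version raises a classed error that can be caught
err <- tryCatch(check_zarr_version("v3"), error = function(e) e)
print(inherits(err, "error"))
```

Compared with `stop()`, `cli_abort()` formats the message with inline markup and raises an rlang error, which keeps error output consistent with the rest of a package that already uses cli.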

},
stop("only zarr v2 is supported. Use version = 'v2'")
)
}

#' create_zarr
#'
#' create zarr store
#'
#' @param store the location of zarr store
#' @param version zarr version
#' @examples
#' store <- tempfile(fileext = ".zarr")
#' create_zarr(store)
#' @export
#' @return `NULL`
create_zarr <- function(store, version = "v2") {
prefix <- basename(store)
dir <- gsub(paste0(prefix, "$"), "", store)
create_zarr_group(store = dir, name = prefix, version = version)
}

#' is_zarr_empty
#'
#' check if zarr store is empty
#'
#' @param store the location of zarr store
#' @examples
#' store <- tempfile(fileext = ".zarr")
#' create_zarr(store)
#' is_zarr_empty(store)
#' @export
#' @return returns TRUE if zarr store is not empty
Collaborator:
If the function is named is_zarr_empty() it should return TRUE when the store is empty. I'm not sure if it's the name or the documentation that is wrong but they don't match.

fixed
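
The naming convention discussed above can be sketched as follows. `store_is_empty()` is a hypothetical stand-in, not the PR's `is_zarr_empty()`; it assumes a Zarr v2 directory store and treats a store as empty when it holds nothing except the metadata files `.zarray`, `.zattrs` and `.zgroup`.

```r
# Hypothetical stand-in for the function under review: returns TRUE when the
# store contains nothing besides Zarr v2 metadata files.
store_is_empty <- function(store) {
  # all.files = TRUE lists dotfiles too; no.. = TRUE drops "." and ".."
  entries <- list.files(store, all.files = TRUE, no.. = TRUE)
  all(entries %in% c(".zarray", ".zattrs", ".zgroup"))
}

store <- tempfile(fileext = ".zarr")
dir.create(store)
writeLines("{\"zarr_format\":2}", file.path(store, ".zgroup"))

# Only metadata is present, so the store counts as empty
empty_before <- store_is_empty(store)

# Adding a child entry makes the store non-empty
dir.create(file.path(store, "X"))
empty_after <- store_is_empty(store)

print(c(before = empty_before, after = empty_after))
```

With this reading, the `is_*_empty` name and the return value agree: `TRUE` means empty, `FALSE` means the store has content.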

is_zarr_empty <- function(store) {
files <- list.files(store, recursive = FALSE, full.names = FALSE)
all(files %in% c(".zarray", ".zattrs", ".zgroup"))
}