anndata dataframes and matrices #14

@GreenGilad

Hi there,

When I store a matrix with row names and column names in the uns slot, the names are dropped. I assume this is to align with the Python NumPy implementation, where row and column names are not supported.

To work around losing the row and column names, I can convert the R matrix to an R data frame. However, when I do so, the anndata object stores it as a pandas DataFrame. Why does the R anndata object store an R data frame as a pandas DataFrame instead of keeping it in the R format? Couldn't the conversion be kept transparent to the user, so that it happens only when reading and writing the h5ad file, and the object has the class of an R data frame once loaded? Currently, every time I want to use such a data frame I must call reticulate::py_to_r, and I still lose the row and column names when doing so.

Couldn't it be the same as anndata$X?

A related case is when the matrix contains character values. Here I cannot cleanly get the matrix back with its names and in a proper matrix shape: it comes back flattened, even if I try to reshape it.

The scenario I am working on is a square symmetric correlation matrix, together with matrices of p-values, multiple-testing-corrected p-values, and significance asterisks.

# Pairwise Spearman correlations plus matching p-values for the columns of X.
data$uns$ss.cor <- list(
  names = colnames(data$X),
  corr = stats::cor(data$X, use = "pairwise.complete.obs", method = "spearman"),
  # cor.test() drops incomplete pairs itself, so it needs no `use` argument.
  pval = outer(seq_len(ncol(data$X)), seq_len(ncol(data$X)), Vectorize(function(i, j)
    stats::cor.test(data$X[, i], data$X[, j], method = "spearman")[["p.value"]]))
)
# Benjamini-Hochberg adjustment; p.adjust() returns a vector, so reshape it.
data$uns$ss.cor$adj.pval <- matrix(p.adjust(data$uns$ss.cor$pval, method = "BH"), nrow = nrow(data$uns$ss.cor$pval))
# Significance asterisks as a character matrix.
data$uns$ss.cor$sig <- matrix(as.character(cut(data$uns$ss.cor$adj.pval, c(-.1, 0.001, 0.01, 0.05, Inf), c("***", "**", "*", ""))), nrow = nrow(data$uns$ss.cor$pval))
data$uns$ss.cor$params <- list(cor.method = "spearman",
                               cor.use = "pairwise.complete.obs",
                               p.adjust.method = "BH")

To keep it simple, I am showing the above using data$X, but in reality I am using a matrix whose shape differs from X, which is why I am using uns and not varp.
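
For illustration, the names stored alongside the matrices could be re-attached after the values are read back. This is only a minimal base-R sketch, not anndata behavior: `restore_square` is a hypothetical helper, and it assumes the values come back without dimnames (possibly flattened) and that the matrix is square and symmetric, so the element order of the flat vector does not matter.

```r
# Hypothetical helper (not part of anndata): rebuild a square matrix from
# values that came back without dimnames, possibly flattened to a vector.
restore_square <- function(values, names) {
  n <- length(names)
  stopifnot(length(values) == n * n)
  matrix(as.vector(values), nrow = n, ncol = n, dimnames = list(names, names))
}

# Toy stand-in for a stored significance matrix (symmetric, character values).
vars <- c("a", "b", "c")
flat_sig <- c("", "*", "", "*", "", "***", "", "***", "")
sig <- restore_square(flat_sig, vars)

# Look up one entry by name once the dimnames are back.
sig["b", "c"]
```

With the example above this would be called as restore_square(data$uns$ss.cor$sig, data$uns$ss.cor$names), assuming both entries survive the round trip as plain vectors or matrices.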

Thanks!
