Description
Custom dataloaders currently don't support advanced capabilities such as scArches or cell-type prediction in scANVI. To support them, we have to create a registry without calling setup_anndata that contains the same elements as one created by setup_anndata (see below).
https://github.com/chanzuckerberg/cellxgene-census/blob/222efddf2ce82f93f76329aa353962c1dc2400ac/api/python/notebooks/experimental/pytorch_loader_scvi.ipynb is the first working example. Currently, they use the following code to save the model:
user_attributes = model._get_user_attributes()
user_attributes = {a[0]: a[1] for a in user_attributes if a[0][-1] == "_"}
user_attributes.update(
    {
        "n_batch": datamodule.n_batch,
        "n_extra_categorical_covs": 0,
        "n_extra_continuous_covs": 0,
        "n_labels": 1,
        "n_vars": datamodule.n_vars,
    }
)
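The same logic can be sketched as a standalone helper, which makes it easier to see what is being stored. This is only an illustration: the helper name collect_init_attributes is hypothetical, the attribute list and the datamodule are stand-ins, and the only assumption carried over from the snippet above is that _get_user_attributes() yields (name, value) pairs.

```python
from types import SimpleNamespace


def collect_init_attributes(user_attributes, datamodule):
    """Keep attributes whose name ends in "_" and add dataset summary stats.

    Hypothetical helper mirroring the notebook snippet; not part of scvi-tools.
    """
    # endswith() also handles an empty name, unlike indexing with [-1].
    attrs = {name: value for name, value in user_attributes if name.endswith("_")}
    # These values are hard-coded in the notebook because no registry exists yet.
    attrs.update(
        {
            "n_batch": datamodule.n_batch,
            "n_extra_categorical_covs": 0,
            "n_extra_continuous_covs": 0,
            "n_labels": 1,
            "n_vars": datamodule.n_vars,
        }
    )
    return attrs


# Stand-in inputs (assumptions for illustration only).
fake_attributes = [("is_trained_", False), ("history_", None), ("device", "cpu")]
fake_datamodule = SimpleNamespace(n_batch=3, n_vars=2000)

attrs = collect_init_attributes(fake_attributes, fake_datamodule)
```

The hard-coded counts are exactly the values a registry would normally provide, which is why this workaround cannot describe labels or extra covariates.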
We want to create a new function that fills out the registry and passes it to the model at:
model = scvi.model.SCVI(n_layers=n_layers, n_latent=n_latent, gene_likelihood="nb", encode_covariates=False)
You can see all necessary entries and the structure at:
scvi.adata_manager.get_state_registry(scvi.REGISTRY_KEYS.X_KEY).to_dict()
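A rough sketch of what such a registry-filling function might look like is given here. The function names build_registry and summary_stats are hypothetical, and the dictionary layout (keys such as "field_registries" and "summary_stats") is an assumption for illustration; the authoritative key names and nesting should be taken from the to_dict() output mentioned above. Only datamodule.n_batch and datamodule.n_vars come from the existing notebook code.

```python
from types import SimpleNamespace


def build_registry(datamodule, model_name="SCVI"):
    """Assemble a registry-like dict from a datamodule.

    Hypothetical layout; verify every key against a real registry
    produced by setup_anndata before relying on it.
    """
    return {
        "model_name": model_name,
        "setup_args": {},
        "field_registries": {
            "X": {"summary_stats": {"n_vars": datamodule.n_vars}},
            "batch": {"summary_stats": {"n_batch": datamodule.n_batch}},
            "labels": {"summary_stats": {"n_labels": 1}},
            "extra_categorical_covs": {
                "summary_stats": {"n_extra_categorical_covs": 0}
            },
            "extra_continuous_covs": {
                "summary_stats": {"n_extra_continuous_covs": 0}
            },
        },
    }


def summary_stats(registry):
    """Flatten the per-field summary stats into a single dict."""
    stats = {}
    for field in registry["field_registries"].values():
        stats.update(field["summary_stats"])
    return stats


# Stand-in datamodule (assumption for illustration only).
datamodule = SimpleNamespace(n_batch=3, n_vars=2000)
registry = build_registry(datamodule)
stats = summary_stats(registry)
```

With a registry of this kind available at construction time, the counts that are currently patched into user_attributes by hand could be derived from the registry instead.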
After fixing this, all uses of _module_init_on_train throughout the codebase should be removed, as they are not necessary anymore.