scPoli dimensionality and other parameters #260

Open
officialprofile opened this issue Dec 19, 2024 · 0 comments

officialprofile commented Dec 19, 2024

Hello everyone,

I am trying to integrate around 500,000 cells from around a dozen GSE datasets. I'm not sure, however, how to assess the optimal parameter values; the results I get are more or less OK, but not great. I would be very grateful if you could clarify several things and suggest what to do and what to avoid.

  1. I assume we should rely only on HVGs (around 2,000-5,000, I guess), not on all genes? Should the count matrix be normalized?
  2. Does the number of total and pre-training epochs significantly affect the final embedding? For example, will changing the defaults of 100 total and 70 pre-training epochs to 200 and 150 noticeably improve the result?
  3. Is there any rule of thumb for choosing the number of embedding and latent dimensions? Is 50 and 20 a reasonable choice, or maybe 50 and 50? I have some intuition for regular PC or Harmony dimensionality selection, but your neural network is fundamentally different in nature; the latent dimensionality in particular is a mystery to me.
  4. I don't want to transfer any labels, so I removed the cell_type_keys parameter from the scPoli constructor and set labeled_indices to []. This is not the correct approach, right? We need to set labeled_indices anyway (scPoli Model for Unsupervised Use #224)?
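For context on question 1, the basic idea of restricting to highly variable genes can be sketched in plain NumPy. This is only a toy variance-based ranking on log-normalized counts, not scPoli's or scanpy's actual HVG method, and all names here are illustrative:

```python
import numpy as np

def top_hvg_indices(counts: np.ndarray, n_top: int) -> np.ndarray:
    """Toy HVG selection: rank genes (columns) by the variance of
    log1p library-size-normalized counts and keep the n_top most variable."""
    # Library-size normalize each cell (row) to 10,000 counts, then log-transform.
    lib = counts.sum(axis=1, keepdims=True)
    norm = np.log1p(counts / np.clip(lib, 1, None) * 1e4)
    var = norm.var(axis=0)
    return np.argsort(var)[::-1][:n_top]

# Tiny deterministic example: 4 cells x 5 genes; gene 2 varies the most.
X = np.ones((4, 5))
X[:, 2] = [0.0, 50.0, 0.0, 50.0]
idx = top_hvg_indices(X, n_top=2)  # gene 2 is ranked most variable
```

In practice one would of course use an established method (e.g. a dispersion- or Seurat-style selection) on the raw count matrix, then subset the AnnData object before passing it to scPoli.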

My code looks like this:

scpoli_model = scPoli(
    adata=new_adata,
    condition_keys=['GSE'],
    # cell_type_keys=cell_type_key,  # removed: no label transfer wanted
    embedding_dims=50,
    latent_dim=20,
    recon_loss='nb',
)

scpoli_model.train(
    n_epochs=100,
    pretraining_epochs=70,
    early_stopping_kwargs=early_stopping_kwargs,  # defined earlier (not shown)
    eta=5,
)

scpoli_query = scPoli.load_query_data(
    adata=new_adata,
    reference_model=scpoli_model,
    labeled_indices=[],
)

data_latent = scpoli_query.get_latent(new_adata, mean=True)
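As a point of comparison for question 3, the PCA-style heuristic I have intuition for (keep enough components to explain, say, 90% of the variance) can be sketched in NumPy. This is only an analogy for choosing a dimensionality, not how scPoli's latent space works, and the function name is made up for illustration:

```python
import numpy as np

def n_components_for_variance(X: np.ndarray, threshold: float = 0.9) -> int:
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `threshold`."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values of centered data
    ratio = (s ** 2) / np.sum(s ** 2)        # explained-variance ratios
    return int(np.searchsorted(np.cumsum(ratio), threshold) + 1)

# Toy data: 6 features whose standard deviations are 10, 5, 3, 0.1, 0.1, 0.1,
# so almost all variance lives in the first two or three directions.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6)) * np.array([10.0, 5.0, 3.0, 0.1, 0.1, 0.1])
k = n_components_for_variance(X, threshold=0.9)
# with these scales, two components already explain >90% of the variance
```

For a nonlinear encoder there is no such clean criterion, which is presumably why the latent dimensionality feels like a mystery; in practice people seem to compare a few settings empirically.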