Using counts matrix with known labels as reference #35

@wmacnair

Hi

I would love to try out Symphony for label transfer, but with a set of labels defined by a method other than Harmony. For example, starting from an sce object whose colData includes columns like sample_id and cluster. This feels like it could be quite a common use case, for example where a dataset is published with just the counts matrix and a list of cluster annotations.

Is this currently possible? My guess is that it could theoretically work, with some tweaking to the code. You would need to implement a new function called something like createHarmonyObjWithDefinedClusters. This would calculate the variable genes, PCA loadings, etc., but would skip the clustering step. It would then use the pre-defined clusters to estimate the mixture model components. The Symphony steps would then work as normal.
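To make the idea a bit more concrete, here is a rough language-agnostic sketch (written in plain Python rather than R) of what "use the pre-defined clusters to estimate the mixture model components" might look like. It assumes the PCA embedding has already been computed, and that the components are centroids of unit-scaled cells with soft assignments based on cosine distance, loosely in the spirit of soft k-means. The function names, the `sigma` value, and the toy data are all hypothetical; none of this is Symphony's or Harmony's actual API.

```python
import math
from collections import defaultdict


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def centroids_from_labels(embedding, labels):
    """One centroid per pre-defined cluster: mean of unit-scaled cells, re-normalized."""
    groups = defaultdict(list)
    for cell, label in zip(embedding, labels):
        groups[label].append(l2_normalize(cell))
    centroids = {}
    for label, cells in groups.items():
        dim = len(cells[0])
        mean = [sum(c[d] for c in cells) / len(cells) for d in range(dim)]
        centroids[label] = l2_normalize(mean)
    return centroids


def soft_assign(cell, centroids, sigma=0.1):
    """Soft cluster membership from cosine distance; sigma is an assumed bandwidth."""
    z = l2_normalize(cell)
    weights = {}
    for label, y in centroids.items():
        cos = sum(a * b for a, b in zip(z, y))
        weights[label] = math.exp(-2.0 * (1.0 - cos) / sigma)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Toy data: four cells in a 2-D "PCA" space with known cluster labels.
embedding = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = ["A", "A", "B", "B"]

centroids = centroids_from_labels(embedding, labels)
probs = soft_assign([0.95, 0.15], centroids)  # a new cell close to cluster A
```

In this sketch the clustering step is skipped entirely: the labels decide which cells contribute to each centroid, and the soft assignments are only computed afterwards. Whether the downstream Symphony steps would accept components built this way is exactly the part I'm unsure about.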

Perhaps you could comment on whether this would actually work, or whether I've misunderstood something fundamental...! ;)

I've seen a couple of other issues that are related (#9, #15, #17, #32, I think), so it feels like a generic solution could be valuable.

Thanks for your efforts :)
Will
