Home
ALLSorts is a B-Cell Acute Lymphoblastic Leukemia (B-ALL) subtype classifier that takes gene expression counts and makes predictions across 20 molecular subtypes and 5 meta-subtypes! It is a Python-based implementation built on the incredible scikit-learn.
What is a B-Cell ALL? B-ALL is a form of Acute Lymphoblastic Leukemia (ALL), the most common paediatric cancer. It occurs when the maturation of B-Cell lymphoblasts is arrested, leading to their gradual accumulation.
Eep! What are subtypes then? It turns out that B-ALL can arise through a variety of causal mechanisms, known as subtypes, with some conveying a higher risk than others. The World Health Organisation (WHO) has outlined 9 subtypes that encapsulate these distinct mechanisms (2 of which are provisional entries) [1]. However, a recent study from the St. Jude Children's Research Hospital has revealed the existence of perhaps 23 [2]!
And a classifier helps... how? Given that treatment can be adjusted based on which subtype of B-ALL a patient has, it would be very useful to have some way of identifying which! With RNA Sequencing (RNA-Seq) we can quantify the activity of genes. And, as it turns out, there are distinct patterns across genes that are indicative of different subtypes. If we can learn the pattern that defines each subtype, we can build a pipeline for identifying the subtype of a new sample. A classifier is a supervised machine learning method that attempts to do just that. In short, it attempts to learn a model from labelled examples, after which it can perform a predictive task: in this case, assigning a subtype to a sample.
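To make the idea of "learning from true examples" concrete, here is a minimal sketch of a supervised classifier built with scikit-learn. Everything in it is an assumption for illustration: the counts are randomly generated, the labels `subtype_A` and `subtype_B` are made up, and the log transform plus logistic regression is just one plausible pipeline, not a description of how ALLSorts itself is implemented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

rng = np.random.default_rng(0)

# Toy training set: 40 samples x 4 genes of made-up expression counts.
# Hypothetical "subtype_A" over-expresses gene 0, "subtype_B" gene 1.
n_per_class = 20
counts = rng.poisson(lam=20, size=(2 * n_per_class, 4)).astype(float)
counts[:n_per_class, 0] += rng.poisson(lam=500, size=n_per_class)
counts[n_per_class:, 1] += rng.poisson(lam=500, size=n_per_class)
labels = np.array(["subtype_A"] * n_per_class + ["subtype_B"] * n_per_class)

# Learn a model from labelled examples: log-transform the counts,
# standardise each gene, then fit a logistic regression.
model = make_pipeline(
    FunctionTransformer(np.log1p),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)
model.fit(counts, labels)

# Predictive task: assign a subtype to a new, unlabelled sample.
new_sample = np.array([[480.0, 25.0, 18.0, 22.0]])
predicted = model.predict(new_sample)[0]
probabilities = dict(zip(model.classes_, model.predict_proba(new_sample)[0]))
print(predicted, probabilities)
```

The new sample has high counts for gene 0, so the fitted model assigns it to the made-up `subtype_A`. The per-class probabilities give a sense of how confident that call is.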
...Meta-subtypes? One interesting feature of subtypes is that some are more closely related than others: some are phenocopies of another established group. For example, Ph and Ph-like differ only in which causal mechanism creates the similar phenotype (hence the *-like). ALLSorts groups these similar subtypes into 5 meta-subtypes and performs classification hierarchically, i.e. B-ALL Sample > Ph Group > Ph / Ph-like.
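The hierarchical idea can be sketched in a few lines of plain Python: a first-stage classifier picks the meta-subtype, and a second-stage classifier, specific to that meta-subtype, picks the final subtype. The function names, the feature names (`ph_group_score`, `ph_score`), the thresholds and the `Other` label are all hypothetical stand-ins for trained models. This shows the routing pattern only, not ALLSorts' internals.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]
Classifier = Callable[[Sample], str]


def classify_hierarchically(
    sample: Sample,
    meta_classifier: Classifier,
    subtype_classifiers: Dict[str, Classifier],
) -> List[str]:
    """Return the path taken through the hierarchy for one sample."""
    path = ["B-ALL Sample"]
    meta_subtype = meta_classifier(sample)
    path.append(meta_subtype)
    # Only meta-subtypes that group several subtypes need a second stage.
    second_stage = subtype_classifiers.get(meta_subtype)
    if second_stage is not None:
        path.append(second_stage(sample))
    return path


# Hypothetical stand-ins for trained models (thresholds are made up).
def toy_meta_classifier(sample: Sample) -> str:
    return "Ph Group" if sample["ph_group_score"] > 0.5 else "Other"


def toy_ph_classifier(sample: Sample) -> str:
    return "Ph" if sample["ph_score"] > 0.5 else "Ph-like"


second_stage_models = {"Ph Group": toy_ph_classifier}

example = {"ph_group_score": 0.9, "ph_score": 0.2}
result = classify_hierarchically(example, toy_meta_classifier, second_stage_models)
print(" > ".join(result))
```

One appeal of this layout is that the second-stage model only has to separate closely related subtypes from each other, rather than from every other subtype at once.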
Fine. You can click on the various Wiki pages in the sidebar to get you started.
For now! It's a part of my PhD project... it will be updated periodically. But it's exciting for it to exist publicly for you to use, no?
[1] Arber, D. A., Orazi, A., Hasserjian, R., Thiele, J., Borowitz, M. J., Le Beau, M. M., … Vardiman, J. W. (2016). The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood, 127(20), 2391–2405.
[2] Gu, Z., Churchman, M. L., Roberts, K. G., Moore, I., Zhou, X., Nakitandwe, J., … Mullighan, C. G. (2019). PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nature Genetics, 51(2), 296–307.