Here we review some recent ML tools that, when applied to single-cell data, could help disentangle cell heterogeneity in AML by identifying distinct core molecular signatures of leukemic cell subsets.
The main disadvantage of this approach is that the number of clusters depends on a resolution parameter assigned by the user (higher values lead to a greater number of clusters), and thus the resulting clusters may not faithfully reflect cell types.
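The dependence on the resolution parameter can be illustrated with a small sketch. It assumes that the approach in question is modularity-based graph clustering, and it uses the commonly cited resolution-weighted modularity, Q(γ) = Σ_c [L_c/m − γ (d_c/2m)²], where L_c is the number of edges inside cluster c, d_c is the summed degree of its nodes and m is the total number of edges. The toy graph, the two candidate partitions and all function names are hypothetical. The code only scores two fixed partitions; it does not run a clustering algorithm.

```python
def modularity(edges, labels, gamma=1.0):
    """Resolution-weighted modularity of a partition of an undirected graph."""
    m = len(edges)
    internal = {}  # edges with both ends in the same cluster
    degree = {}    # summed node degree per cluster
    for u, v in edges:
        for node in (u, v):
            degree[labels[node]] = degree.get(labels[node], 0) + 1
        if labels[u] == labels[v]:
            internal[labels[u]] = internal.get(labels[u], 0) + 1
    return sum(internal.get(c, 0) / m - gamma * (d / (2 * m)) ** 2
               for c, d in degree.items())

# Toy graph (hypothetical): four triangles of "cells"; triangles A-B and C-D
# are linked by two edges each, and B-C by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5),
         (6, 7), (6, 8), (7, 8), (9, 10), (9, 11), (10, 11),
         (2, 3), (1, 4), (8, 9), (7, 10), (5, 6)]

partitions = {
    "coarse": [0] * 6 + [1] * 6,                     # 2 clusters
    "fine": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],    # 4 clusters
}

def preferred(gamma):
    """Name of the candidate partition with the higher modularity at this gamma."""
    return max(partitions, key=lambda name: modularity(edges, partitions[name], gamma))

low = preferred(0.5)   # low resolution favours the coarse partition
high = preferred(2.0)  # high resolution favours the fine partition
```

On this graph the same data support either two or four clusters, depending only on the user-chosen γ, which is the concern raised above.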
The identification of cell types using typical workflows has several drawbacks: first, rare cell types are easily missed and grouped together with more prevalent ones; second, cell identity is often not discrete but lies on a continuum (for instance, cells with mixed identities or in transition); and third, the clustering can reflect other sources of variability unrelated to cell type (41). To address these issues, ML tools have recently been developed that allow quantitative identification and probabilistic assignment of cell types, thus aiding the identification of rare and heterogeneous cell populations.
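As a minimal illustration of what a probabilistic assignment looks like, the sketch below turns distances to reference profiles into membership probabilities with a softmax. This is one simple choice (equivalent to equal-variance spherical Gaussians with equal priors) and is not the method of any particular tool cited here. The two-gene profiles, the cell type names and the `scale` parameter are hypothetical.

```python
import math

def soft_assign(cell, centroids, scale=1.0):
    """Membership probabilities from a softmax over negative squared distances."""
    scores = {}
    for name, centre in centroids.items():
        d2 = sum((x - y) ** 2 for x, y in zip(cell, centre))
        scores[name] = -d2 / (2.0 * scale ** 2)
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical reference profiles over two marker genes.
reference = {"progenitor_like": (4.0, 0.0), "monocyte_like": (0.0, 4.0)}

committed = soft_assign((3.8, 0.3), reference)     # close to one reference profile
transitional = soft_assign((2.0, 2.0), reference)  # equidistant from both profiles
```

A hard clustering would force the transitional cell into one group, whereas the probabilities keep its mixed identity visible: it is split evenly between the two types, while the committed cell is assigned almost entirely to one.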
In the context of AML and other cancers, transcriptionally similar malignant cells are expected to group together, and can be unambiguously identified by the expression of certain feature genes that can be used as biomarkers for designing personalised treatments.
The recently developed Single-Cell Clustering Assessment Framework (SCCAF) (24) generates an optimal number of clusters automatically.
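The text does not describe how the cluster number is chosen. To our understanding, SCCAF uses a self-projection strategy: a classifier is trained on part of the cells to predict their cluster labels, it is evaluated on the remaining cells, and clusters that the classifier cannot tell apart are merged until accuracy is high enough. The toy sketch below follows that general idea under strong simplifications and is not SCCAF's implementation or API. It swaps in a nearest-centroid classifier, uses one-dimensional values and a fixed even/odd split, and all names and the 0.9 threshold are assumptions.

```python
from collections import defaultdict

def centroids(values, labels):
    """Mean value of each cluster."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, lab in zip(values, labels):
        sums[lab] += v
        counts[lab] += 1
    return {lab: sums[lab] / counts[lab] for lab in sums}

def self_projection(values, labels):
    """Train on even-indexed cells, test on odd-indexed cells.
    Returns (accuracy, confusion counts per unordered label pair)."""
    pairs = list(zip(values, labels))
    train, test = pairs[0::2], pairs[1::2]
    cents = centroids([v for v, _ in train], [lab for _, lab in train])
    correct, confusion = 0, defaultdict(int)
    for v, true in test:
        pred = min(cents, key=lambda lab: abs(v - cents[lab]))
        if pred == true:
            correct += 1
        else:
            confusion[frozenset((true, pred))] += 1
    return correct / len(test), confusion

def merge_until_accurate(values, labels, threshold=0.9):
    """Merge the most-confused pair of clusters until accuracy reaches the threshold."""
    labels = list(labels)
    while len(set(labels)) > 1:
        accuracy, confusion = self_projection(values, labels)
        if accuracy >= threshold or not confusion:
            break
        keep, drop = sorted(max(confusion, key=confusion.get))
        labels = [keep if lab == drop else lab for lab in labels]
    return labels

# Toy data (hypothetical): one homogeneous population split arbitrarily into
# clusters 0 and 1, plus a clearly distinct population (cluster 2).
values = [0.0, 0.1, 0.3, 0.2, 0.1, 0.0, 0.2, 0.3, 10.0, 10.1, 9.9, 10.2]
start = [0, 1, 1, 0, 0, 1, 1, 0, 2, 2, 2, 2]

final = merge_until_accurate(values, start)
n_clusters = len(set(final))
```

Starting from three clusters, the over-split population is merged because its two halves cannot be predicted from one another, and the loop stops at two clusters without the user choosing a resolution value.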
A disadvantage of supervised methods is that they rely on known markers or accurate cell type annotations to build classification models. Often, markers for rare cell populations, such as LSCs, are unknown, not robust (51) or can be expressed by more than one cell type (15). Further, in many cases, annotation of single-cell datasets requires additional standardisation (29).
A non-genetic source of heterogeneity
Multiomic single-cell technologies quantifying both surface proteins and transcriptomes of individual cells (e.g. CITE-seq) could be ideally applied to the identification of surface targets for the design of cell-based immunotherapies (46).