1. Define context(s)
reveal new biological insights
Primary goal of the model/tool/database
The use of single-cell transcriptomics has become a major approach to delineate cell subpopulations and the transitions between them. While various computational tools using different mathematical methods have been developed to infer clusters, marker genes, and cell lineage, none yet integrate these within a mathematical framework to perform multiple tasks coherently. Such coherence is critical for the inference of cell–cell communication, a major remaining challenge. Here, we present similarity matrix-based optimization for single-cell data analysis (SoptSC), in which unsupervised clustering, pseudotemporal ordering, lineage inference, and marker gene identification are inferred via a structured cell-to-cell similarity matrix. SoptSC then predicts cell–cell communication networks, enabling reconstruction of complex cell lineages that include feedback or feedforward interactions. Application of SoptSC to early embryonic development, epidermal regeneration, and hematopoiesis demonstrates robust identification of subpopulations, lineage relationships, and pseudotime, and prediction of pathway-specific cell communication patterns regulating processes of development and differentiation.
Biological domain of the model
scRNA-seq data of various tissues
Structure(s) of interest in the model
cellular trajectories, cell–cell communication
Spatial scales included in the model
cellular to tissue
Time scales included in the model
seconds to weeks
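To make the idea of deriving a pseudotemporal ordering from a cell-to-cell similarity matrix concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the SoptSC implementation: the Gaussian kernel, the `bandwidth` and `min_similarity` parameters, the function names, and the toy coordinates are all hypothetical. The sketch orders cells by their shortest-path distance from a chosen root cell on a graph built from the similarity matrix.

```python
import heapq
import math


def similarity_matrix(points, bandwidth=1.0):
    """Gaussian-kernel similarity between every pair of cells (toy stand-in)."""
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            sim[i][j] = math.exp(-d2 / (2 * bandwidth ** 2))
    return sim


def pseudotime(sim, root, min_similarity=0.05):
    """Shortest-path distance from the root cell on the similarity graph.

    An edge exists where similarity >= min_similarity; its length is
    -log(similarity), so highly similar cells are close together.
    """
    n = len(sim)
    dist = [math.inf] * n
    dist[root] = 0.0
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in range(n):
            if v == u or sim[u][v] < min_similarity:
                continue
            nd = d - math.log(sim[u][v])
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical cells lying along a one-dimensional differentiation axis.
cells = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 0.2), (4.0, 0.0)]
sim = similarity_matrix(cells)
pt = pseudotime(sim, root=0)
order = sorted(range(len(cells)), key=lambda i: pt[i])
```

Cells unreachable from the root keep an infinite distance, which makes disconnected subpopulations easy to spot in this toy setting.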
2. Data for building and validating the model
Data for building the model | Published? | Private? | How is credibility checked? | Current Conformance Level / Target Conformance Level |
in vitro (primary cells, cell lines, etc.) | | | | |
ex vivo (excised tissues) | | | | |
in vivo pre-clinical (lower-level organism or small animal) | | | | |
in vivo pre-clinical (large animal) | Yes | No | The model was built in an unsupervised way on unbiased single-cell RNA sequencing data. | Extensive |
Human subjects/clinical | | | | |
Other: ________________________ | | | | |
Data for validating the model | Published? | Private? | How is credibility checked? | Current Conformance Level / Target Conformance Level |
in vitro (primary cells, cell lines, etc.) | | | | |
ex vivo (excised tissues) | | | | |
in vivo pre-clinical (lower-level organism or small animal) | | | | |
in vivo pre-clinical (large animal) | Yes | No | By comparing the model-derived pseudotime and clustering with knowledge-based cell type annotations and the real time points in the data. | Adequate |
Human subjects/clinical | | | | |
Other: ________________________ | | | | |
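The credibility check described for the validation data (comparing clusters with annotated cell types, and pseudotime with real time points) can be sketched as follows. This is a hedged illustration only: the form does not state which agreement metrics were used, so the adjusted Rand index and Spearman rank correlation are assumptions, and all labels and values below are made up.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same cells."""
    n = len(labels_a)
    sum_pairs = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # requires at least two cells
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_pairs - expected) / (max_index - expected)


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical model output and reference annotations for six cells.
clusters = [0, 0, 1, 1, 2, 2]
annotation = ["a", "a", "b", "b", "c", "c"]
inferred_pseudotime = [0.1, 0.4, 0.5, 0.9, 1.3, 1.2]
sampling_day = [0, 0, 2, 2, 4, 4]

ari = adjusted_rand_index(clusters, annotation)
rho = spearman(inferred_pseudotime, sampling_day)
```

An index near 1 indicates that the clusters match the annotation up to relabeling, and a correlation near 1 indicates that the inferred ordering follows the sampled time points.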
3. Validate within context(s)
| Who does it? | When does it happen? | How is it done? | Current Conformance Level / Target Conformance Level |
Verification | Students/postdocs/investigators | Throughout the project | 1) Convergence of the solution is guaranteed by formal theoretical analysis. 2) The clustering spectrum agrees with prior knowledge of the number of cell types. 3) The inferred cellular trajectories agree with known developmental paths. | Extensive |
Validation | Students/postdocs/investigators | As the unsupervised model was established | 1) The clustering results are validated against knowledge-based cell type annotations. 2) The temporal ordering is validated using datasets with multiple real time points. 3) The method was compared with several other popular methods on various benchmarks and achieved top performance. | Extensive |
Uncertainty quantification | | | | |
Sensitivity analysis | Students/postdocs/investigators | As the unsupervised model was established | By tuning key parameters and comparing the results with annotated data. | Adequate |
Other:__________ | | | | |
Additional Comments | | | | |
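The sensitivity analysis entry above (tuning key parameters and comparing with annotated data) can be illustrated with a small parameter sweep. Everything in this sketch is hypothetical: the toy gap-based clustering stands in for the real method, `gap` stands in for a tunable key parameter, and the Rand index is one assumed choice of agreement score.

```python
from itertools import combinations


def gap_clusters(values, gap):
    """Toy clustering: sort values and start a new cluster when the jump exceeds `gap`."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if values[nxt] - values[prev] > gap:
            current += 1
        labels[nxt] = current
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of cell pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical one-dimensional expression values with known annotations.
values = [0.1, 0.2, 0.3, 2.0, 2.1, 2.3, 5.0, 5.2]
annotation = ["A", "A", "A", "B", "B", "B", "C", "C"]

# Sweep the parameter and record agreement with the annotation at each setting.
sweep = {
    gap: rand_index(gap_clusters(values, gap), annotation)
    for gap in (0.05, 0.5, 1.0, 2.0, 3.0)
}
```

A range of parameter values over which the agreement score stays flat suggests that the result is insensitive to the exact setting within that range.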
4. Limitations
Disclaimer statement (explain key limitations) | Who needs to know about this disclaimer? | How is this disclaimer shared with that audience? | Current Conformance Level / Target Conformance Level |
Technical noise in single-cell RNA sequencing data may reduce the accuracy of the results. | Members of the scientific community who intend to apply this method to raw scRNA-seq data. | | Adequate |
| | | |
| | | |
| | | |
| | | |
5. Version control
Current Conformance Level / Target Conformance Level | Extensive |
| Naming Conventions? | Repository? | Code Review? |
individual modeler | Yes | Yes | peers |
within the lab | Yes | Yes | peers |
collaborators | Yes | Yes | via regular meetings |
6. Documentation
| Current Conformance Level / Target Conformance Level |
Code commented? | Extensive |
Scope and intended use described? | Extensive |
User’s guide? | Extensive |
Developer’s guide? | Partial |
7. Dissemination
Current Conformance Level / Target Conformance Level | Extensive |
Target Audience(s): | “Inner circle” | Scientific community | Public |
Simulations | | | |
Models | | | |
Software | R package: https://mkarikom.github.io/RSoptSC/; MATLAB package: https://github.com/WangShuxiong/SoptSC | R package: https://mkarikom.github.io/RSoptSC/; MATLAB package: https://github.com/WangShuxiong/SoptSC | |
Results | Shared folders | Paper and tutorials | |
Implications of results | | | |
8. Independent reviews
Current Conformance Level / Target Conformance Level | Partial |
Reviewer(s) name & affiliation: | Ruan, H., Liao, Y., Ren, Z. et al. Single-cell reconstruction of differentiation trajectory reveals a critical role of ETS1 in human cardiac lineage commitment. BMC Biol 17, 89 (2019). https://doi.org/10.1186/s12915-019-0709-6 |
When was review performed? | 2019 |
How was review performed and outcomes of the review? | The tool has been used by the scientific community for the analysis of single-cell RNA sequencing datasets. |
9. Test competing implementations
Current Conformance Level / Target Conformance Level | Adequate |
| Yes or No (briefly summarize) |
Were competing implementations tested? | Yes. The method has been compared with several other commonly used methods on benchmark datasets. |
Did this lead to model refinement or improvement? | Yes. |