Software
Here is some of the software I have created and maintain.
VITAE
VITAE (Variational Inference for Trajectory Analysis by AutoEncoder) is a method for inferring developmental trajectories from single-cell RNA-seq data. It integrates and aligns single-cell data from different modalities, such as chromatin accessibility and gene expression.
- Trajectory Inference: Accurately infers developmental trajectories and pseudotime in various datasets.
- Data Integration: Integrates multiple single-cell datasets while adjusting for batch effects and other confounders.
- Accelerated Gaussian Version: An accelerated version that approximates distributions with Gaussian assumptions for computational efficiency.
- Differential Gene Expression Analysis: Effective in identifying differentially expressed genes along inferred trajectories.
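
As a rough intuition for what a pseudotime is, the toy sketch below orders cells by projecting their latent coordinates onto the segment between two cluster centers. It is not VITAE's model or API: the function name `project_pseudotime`, the two-center setup, and the example coordinates are all assumptions made purely for illustration.

```python
import numpy as np


def project_pseudotime(z, start, end):
    """Toy pseudotime: relative position of each cell along the segment start -> end.

    Hypothetical helper for illustration only; not part of VITAE.
    """
    z = np.asarray(z, dtype=float)
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    direction = end - start
    # Scalar projection of each cell onto the segment, scaled so start=0 and end=1.
    t = (z - start) @ direction / (direction @ direction)
    # Cells that fall beyond either center are clamped to the segment ends.
    return np.clip(t, 0.0, 1.0)


# Made-up 2-D latent coordinates for four cells.
start = np.array([0.0, 0.0])
end = np.array([4.0, 0.0])
cells = np.array([[0.0, 0.5], [2.0, -0.3], [4.0, 0.1], [5.0, 0.0]])

pseudotime = project_pseudotime(cells, start, end)
order = np.argsort(pseudotime, kind="stable")  # cells sorted from early to late
```

A real trajectory has branches and uncertainty, which a single straight segment cannot represent; the sketch only shows how a position along a path becomes an ordering of cells.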
scVAEIT
scVAEIT (single-cell Variational AutoEncoder for integration and transfer learning) is a Python module that utilizes a variational autoencoder (VAE) for single-cell mosaic integration and transfer learning. It aims to integrate and impute single-cell data from different modalities, such as gene expression, protein abundance, and chromatin accessibility.
- Multimodal Data Integration: Integrates single-cell data from multiple modalities, such as scRNA-seq, scATAC-seq, and CITE-seq, even when the observations do not share the same set of features.
- Imputation: Imputes missing values in single-cell datasets by leveraging information from other modalities.
- Transfer Learning and Cross-modality Translation: Enables transfer learning by training on a reference dataset and readily transferring the learned knowledge to new sources for cross-modality translation and imputation.
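
To make the mosaic setting concrete, the sketch below stacks two toy datasets that measure different feature sets into one matrix with an observation mask, and computes a reconstruction error over the observed entries only. It uses plain NumPy and does not call scVAEIT; the feature names, the `masked_mse` helper, and the all-zero "reconstruction" are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical union of features across modalities.
features = ["gene_a", "gene_b", "protein_x", "peak_1"]

# Dataset 1 measures genes + protein; dataset 2 measures genes + peak.
n_cells_1, cols_1 = 3, [0, 1, 2]
n_cells_2, cols_2 = 2, [0, 1, 3]

n_cells = n_cells_1 + n_cells_2
X = np.zeros((n_cells, len(features)))
mask = np.zeros_like(X, dtype=bool)  # True where a value was actually measured

rows_1 = np.arange(n_cells_1)
rows_2 = np.arange(n_cells_1, n_cells)
X[np.ix_(rows_1, cols_1)] = rng.normal(size=(n_cells_1, len(cols_1)))
mask[np.ix_(rows_1, cols_1)] = True
X[np.ix_(rows_2, cols_2)] = rng.normal(size=(n_cells_2, len(cols_2)))
mask[np.ix_(rows_2, cols_2)] = True


def masked_mse(x, x_hat, observed):
    """Mean squared error over observed entries only (hypothetical helper)."""
    sq_err = (x - x_hat) ** 2
    return float((sq_err * observed).sum() / observed.sum())


x_hat = np.zeros_like(X)  # stand-in for a model's reconstruction
loss = masked_mse(X, x_hat, mask)
```

Entries where `mask` is `False` are the ones an imputation model would be asked to fill in; they contribute nothing to the loss above.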
sklearn_ensemble_cv
sklearn_ensemble_cv is a Python module that implements accurate and efficient ensemble cross-validation methods from several projects.
- Flexibility: The module builds on scikit-learn (sklearn), so it works with a wide range of base predictors.
- Risk estimation and hyperparameter tuning for ensemble learning: The module includes functions for creating ensembles of models, training the ensembles using cross-validation, and making predictions with the ensembles.
- Evaluation utilities: The module also includes utilities for evaluating the performance of the ensembles and the individual models that make up the ensembles.
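
The idea of estimating an ensemble's risk without a separate validation set can be sketched with scikit-learn alone. The example below is an assumption-laden illustration, not sklearn_ensemble_cv's own API: it uses the out-of-bag predictions of `BaggingRegressor` as the risk estimate and tunes the subsample fraction over a small, arbitrarily chosen grid on synthetic data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data, for illustration only.
X, y = make_regression(n_samples=300, n_features=10, noise=1.0, random_state=0)


def oob_mse(max_samples, n_estimators=50):
    """Out-of-bag mean squared error of a bagged tree ensemble."""
    model = BaggingRegressor(
        DecisionTreeRegressor(random_state=0),  # base predictor
        n_estimators=n_estimators,
        max_samples=max_samples,  # fraction of rows drawn for each tree
        bootstrap=True,
        oob_score=True,  # keep out-of-bag predictions after fitting
        random_state=0,
    )
    model.fit(X, y)
    # Each sample is predicted only by trees that did not see it during training.
    return float(np.mean((y - model.oob_prediction_) ** 2))


grid = [0.3, 0.5, 0.8]  # candidate subsample fractions (arbitrary choice)
risks = {m: oob_mse(m) for m in grid}
best = min(risks, key=risks.get)  # setting with the lowest estimated risk
```

Because every tree leaves some samples out, the risk estimate comes from the same fit that trains the ensemble, which is what makes this style of tuning cheap compared with refitting on K folds.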