scMulan

A 368M-parameter multitask generative pre-trained model for single-cell analysis.

scMulan is a foundation model for single-cell RNA-seq analysis, pre-trained on approximately 10 million cells. A single model supports zero-shot cell-type annotation, batch correction, and conditional generation.

I contributed to the latent-variable model design (joint-VAE universal coordinate systems supporting cross-platform and cross-modality alignment) and to dataset curation and evaluation. RECOMB / ISMB 2024.

Project repo

References