Motivation: Unsupervised clustering of single-cell RNA sequencing (scRNA-seq) data holds the promise of characterizing known and novel cell types in various biological and clinical contexts. However, clustering resolution is intrinsically multi-scale, which makes it challenging to deal with multiple sources of variability in high-dimensional, noisy data.

Results: We present ClusterMatch, a stable match optimization model that aligns scRNA-seq data at the cluster level. On the one hand, ClusterMatch leverages the mutual correspondence obtained from canonical correlation analysis and multi-scale Louvain clustering to identify clusters at optimized resolutions. On the other hand, it uses a stable matching framework to align scRNA-seq data in the latent space while maintaining interpretability through overlapping marker gene sets. Through extensive experiments, we demonstrate the efficacy of ClusterMatch in data integration, cell type annotation, and cross-species/timepoint alignment scenarios. Our results show that ClusterMatch uses both global and local information in scRNA-seq data, sets an appropriate resolution for multi-scale clustering, and offers interpretability through marker genes.

Availability and implementation: The code of the ClusterMatch software is freely available at https://github.com/AMSSwanglab/ClusterMatch.
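To make the idea of stable matching between clusters concrete, the following is a minimal sketch using the classic Gale-Shapley algorithm. It is an illustration only and not the optimization model from the paper: the function name `stable_match`, the square `similarity` matrix, and the toy scores are all assumptions introduced here, and the abstract does not specify how cluster-to-cluster similarity is computed.

```python
def stable_match(similarity):
    """Gale-Shapley stable matching on a square similarity matrix.

    similarity[i][j] is a hypothetical score between cluster i of
    dataset A and cluster j of dataset B (higher means more similar).
    Returns a dict mapping each A cluster index to its matched B index.
    """
    n = len(similarity)
    # Each A cluster ranks the B clusters by decreasing similarity.
    prefs = [sorted(range(n), key=lambda j: -similarity[i][j]) for i in range(n)]
    next_choice = [0] * n        # position of the next B cluster each A will propose to
    partner_of_b = [None] * n    # current A partner of each B cluster
    free = list(range(n))        # A clusters that are not yet matched

    while free:
        a = free.pop()
        b = prefs[a][next_choice[a]]
        next_choice[a] += 1
        current = partner_of_b[b]
        if current is None:
            # b is unmatched, so it accepts the proposal.
            partner_of_b[b] = a
        elif similarity[a][b] > similarity[current][b]:
            # b prefers the new proposer; the previous partner becomes free.
            partner_of_b[b] = a
            free.append(current)
        else:
            # b rejects a, which will propose to its next choice later.
            free.append(a)

    return {a: b for b, a in enumerate(partner_of_b)}


# Toy example with three clusters per dataset (made-up scores).
toy_similarity = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.2, 0.4, 0.6],
]
matching = stable_match(toy_similarity)
```

In this toy case, cluster 1 of dataset A scores highest against cluster 0 of dataset B, but that B cluster scores even higher against A's cluster 0, so cluster 1 settles for its second choice. The resulting matching is stable in the sense that no unmatched pair of clusters would both score higher with each other than with their assigned partners.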
Publication:
Bioinformatics, Volume 40, Issue 8, August 2024
https://doi.org/10.1093/bioinformatics/btae480
Authors:
Teer Ba
School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
School of Mathematical Sciences, Inner Mongolia University, Hohhot 010021, China
Hao Miao
CEMS, NCMIS, HCMS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
School of Mathematics, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
Lirong Zhang
School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
Caixia Gao
School of Mathematical Sciences, Inner Mongolia University, Hohhot 010021, China
Email: gaocx0471@163.com
Yong Wang
CEMS, NCMIS, HCMS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
School of Mathematics, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China
Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 330106, China
Email: ywang@amss.ac.cn