Project author: ttgump

Project description:
scDeepCluster for Single Cell RNA-seq data
Primary language: Jupyter Notebook
Repository URL: git://github.com/ttgump/scDeepCluster.git
Created: 2018-07-09T05:11:11Z
Project page: https://github.com/ttgump/scDeepCluster

License: Apache License 2.0


scDeepCluster

scDeepCluster is a model-based deep embedding clustering method for single-cell RNA-seq data. See details in our paper, “Clustering single-cell RNA-seq data with a model-based deep learning approach”, published in Nature Machine Intelligence: https://www.nature.com/articles/s42256-019-0037-0.

Network diagram

[Figure: scDeepCluster network diagram]

Requirements

Python 3.6.3

Keras 2.1.4

TensorFlow 1.1.0

Scanpy 1.0.4

NVIDIA Tesla K80 (12 GB)

Please note that with different versions, the results reported in our paper may not be exactly reproducible.

Usage

    python scDeepCluster.py --data_file data.h5 --n_clusters 10

Set data_file to the path of the data file (stored in h5 format with two components, X and Y, where X is the cell-by-gene count matrix and Y holds the true labels), and n_clusters to the number of clusters.
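To make the expected input layout concrete, here is a minimal sketch that writes and reads an h5 file with the two components X and Y, using h5py. The helper names (write_example_h5, read_h5), the dtypes, and the random placeholder counts are assumptions for illustration only; they are not taken from the scDeepCluster code, and real data should replace the synthetic matrix.

```python
import h5py
import numpy as np


def write_example_h5(path, n_cells=200, n_genes=50, n_clusters=10, seed=0):
    """Write a toy file with datasets "X" (cells x genes) and "Y" (labels).

    The values are random placeholders, not real scRNA-seq counts.
    """
    rng = np.random.RandomState(seed)
    # X: cell-by-gene count matrix (rows are cells, columns are genes).
    X = rng.poisson(lam=1.0, size=(n_cells, n_genes)).astype(np.float32)
    # Y: one integer label per cell (assumed encoding: 0 .. n_clusters - 1).
    Y = rng.randint(0, n_clusters, size=n_cells).astype(np.int64)
    with h5py.File(path, "w") as f:
        f.create_dataset("X", data=X)
        f.create_dataset("Y", data=Y)


def read_h5(path):
    """Read the "X" and "Y" datasets back as NumPy arrays."""
    with h5py.File(path, "r") as f:
        X = np.array(f["X"])
        Y = np.array(f["Y"])
    return X, Y
```

Calling write_example_h5("data.h5") produces a file with this layout, which can then be checked with read_h5 before being passed via --data_file.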

The final output reports the clustering performance. Here is an example on 10X PBMC scRNA-seq data:

Final: ACC= 0.8100, NMI= 0.7736, ARI= 0.7841
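For readers unfamiliar with these metrics, the following is a minimal sketch of how clustering accuracy (ACC) is commonly computed: predicted cluster IDs are matched one-to-one to true labels with the Hungarian algorithm, and the fraction of correctly matched cells is reported. This is an illustration under the assumption that labels are non-negative integers; the function name cluster_acc is hypothetical and the code is not claimed to be the exact implementation used in scDeepCluster. NMI and ARI are available in scikit-learn as normalized_mutual_info_score and adjusted_rand_score.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def cluster_acc(y_true, y_pred):
    """Clustering accuracy under the best one-to-one cluster/label matching.

    Assumes both inputs hold non-negative integer labels of equal length.
    """
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    n = int(max(y_true.max(), y_pred.max())) + 1
    # Contingency table: w[p, t] = number of cells in cluster p with label t.
    w = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        w[p, t] += 1
    # linear_sum_assignment minimizes cost, so negate to maximize matches.
    rows, cols = linear_sum_assignment(-w)
    return w[rows, cols].sum() / y_true.size
```

Because the matching is found over all cluster-to-label assignments, ACC does not depend on how the cluster IDs happen to be numbered.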

PyTorch version

I recommend the PyTorch version, which adds some new features: 1. automatic estimation of the number of clusters after pretraining; 2. clustering of datasets from different batches.

See details at https://github.com/ttgump/scDeepCluster_pytorch

Raw data

The raw data used in the paper can be found at: https://figshare.com/articles/dataset/scDeepCluster_supporting_data/17158025

Online app

Online app website: https://app.superbio.ai/apps/107

Contact

Tian Tian: tiantianwhu@163.com