# A scalable SCENIC workflow for single-cell gene regulatory network analysis
This repository describes how to run a pySCENIC gene regulatory network inference analysis alongside a basic "best practices" expression analysis for single-cell data.
This includes:
* Standalone Jupyter notebooks for an interactive analysis
* A Nextflow DSL1 workflow, which provides a semi-automated and streamlined method for running these steps
* Details on pySCENIC installation, usage, and downstream analysis

See also the associated publication in **Nature Protocols**: https://doi.org/10.1038/s41596-020-0336-2.

For an advanced implementation of the steps in this protocol, see **[VSN Pipelines](https://github.com/vib-singlecell-nf/vsn-pipelines)**, a Nextflow DSL2 implementation of pySCENIC with comprehensive and customizable pipelines for expression analysis.
VSN Pipelines includes additional pySCENIC features (multi-runs, integrated motif- and track-based regulon pruning, loom file generation).

## Overview
* [Quick start](#quick-start)
* [Requirements](#general-requirements-for-this-workflow)
* [Installation](docs/installation.md)
* Case studies
    * PBMC 10k dataset (10x Genomics)
        * Full SCENIC analysis, plus filtering, clustering, visualization, and SCope-ready loom file creation:
            * [Jupyter notebook](notebooks/PBMC10k_SCENIC-protocol-CLI.ipynb) | [HTML render](http://htmlpreview.github.io/?https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/PBMC10k_SCENIC-protocol-CLI.html)
        * Extended analysis post-SCENIC:
            * [Jupyter notebook](notebooks/PBMC10k_downstream-analysis.ipynb) | [HTML render](http://htmlpreview.github.io/?https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/PBMC10k_downstream-analysis.html)
        * To run the same dataset through the VSN Pipelines DSL2 workflow, see [this tutorial](https://vsn-pipelines-examples.readthedocs.io/en/latest/PBMC10k.html).
    * Cancer data sets
        * [Jupyter notebook](notebooks/SCENIC%20Protocol%20-%20Case%20study%20-%20Cancer%20data%20sets.ipynb) | [HTML render](http://htmlpreview.github.io/?https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/SCENIC%20Protocol%20-%20Case%20study%20-%20Cancer%20data%20sets.html)
    * Mouse brain data set
        * [Jupyter notebook](notebooks/SCENIC%20Protocol%20-%20Case%20study%20-%20Mouse%20brain%20data%20set.ipynb) | [HTML render](http://htmlpreview.github.io/?https://github.com/aertslab/SCENICprotocol/blob/master/notebooks/SCENIC%20Protocol%20-%20Case%20study%20-%20Mouse%20brain%20data%20set.html)
* [References and more information](#references-and-more-information)
<p align="center">
<img src="docs/figs/Figure01.png" width="600" alt="SCENIC workflow diagram">
</p>
---
## Quick start
### Running the pySCENIC pipeline in a Jupyter notebook
We recommend using
[this notebook](notebooks/PBMC10k_SCENIC-protocol-CLI.ipynb)
as a template for running an interactive analysis in Jupyter.
See the
[installation instructions](docs/installation.md)
for information on setting up a kernel with pySCENIC and other required packages.
### Running the Nextflow pipeline on the example dataset
#### Requirements (Nextflow/containers)
The following tools are required to run the steps in this Nextflow pipeline:
* [Nextflow](https://www.nextflow.io/)
* A container system, one of:
    * [Docker](https://docs.docker.com/)
    * [Singularity](https://www.sylabs.io/singularity/)

The following container images will be pulled by Nextflow as needed:
* Docker: [aertslab/pyscenic:latest](https://hub.docker.com/r/aertslab/pyscenic).
* Singularity: [aertslab/pySCENIC:latest](https://www.singularity-hub.org/collections/2033).
* See also the [pySCENIC notes on Docker and Singularity images](https://github.com/aertslab/pySCENIC#docker-and-singularity-images).
#### Using the test profile
A quick test can be run using the `test` profile, which automatically pulls the testing dataset (described in full below):

    nextflow run aertslab/SCENICprotocol \
        -profile docker,test

This small test dataset takes approximately 70 seconds to run using 6 threads on a standard desktop computer.
#### Download testing dataset
Alternatively, the same data can be run step by step, which better illustrates how to substitute other data into the pipeline.
First, download a minimal set of SCENIC database files for a human dataset (approximately 78 MB):

    mkdir example && cd example/
    # Transcription factors:
    wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/test_TFs_tiny.txt
    # Motif to TF annotation database:
    wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/motifs.tbl
    # Ranking databases:
    wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/genome-ranking.feather
    # Finally, get a tiny sample expression matrix (loom format):
    wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/expr_mat_tiny.loom

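
Before launching the pipeline, it can help to confirm that all four downloads completed. The following is a minimal sketch and not part of this repository: `check_inputs` is a hypothetical helper name, and it only tests that each file exists and is non-empty, not that its contents are valid.

```shell
# check_inputs DIR FILE...: hypothetical helper (not part of SCENICprotocol).
# Prints one line per file that is missing or empty in DIR and
# returns 1 if any such file was found, 0 otherwise.
check_inputs() {
    dir="$1"
    shift
    status=0
    for name in "$@"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $name"
            status=1
        fi
    done
    return "$status"
}
```

For the example data, this could be called from inside the `example/` directory as `check_inputs . test_TFs_tiny.txt motifs.tbl genome-ranking.feather expr_mat_tiny.loom`, launching Nextflow only if it returns successfully.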
#### Running the example pipeline
Either Docker or Singularity images can be used by specifying the appropriate profile (`-profile docker` or `-profile singularity`).
Please note that for the tiny test dataset to run successfully, the default thresholds need to be lowered.
##### Using loom input
    nextflow run aertslab/SCENICprotocol \
        -profile docker \
        --loom_input expr_mat_tiny.loom \
        --loom_output pyscenic_integrated-output.loom \
        --TFs test_TFs_tiny.txt \
        --motifs motifs.tbl \
        --db *feather \
        --thr_min_genes 1

By default, this pipeline uses the container specified by the `--pyscenic_container` parameter.
This is currently set to `aertslab/pyscenic:0.9.19`, a container image with both pySCENIC and Scanpy `1.4.4.post1` installed.
A custom container can be used (e.g. one built on a local machine) by passing the name of this container to the `--pyscenic_container` parameter.
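
As an illustration of passing a custom container, the sketch below wraps the example command in a small shell function. The function name `run_scenic`, the `DRY_RUN` convention, and the image name `my-pyscenic:local` are all hypothetical; only the Nextflow parameters are taken from this README.

```shell
# run_scenic IMAGE: hypothetical wrapper (not part of SCENICprotocol) that
# runs the example command with a custom container image.
# If DRY_RUN is set to a non-empty value, the command is printed, not executed.
run_scenic() {
    image="$1"
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty
    ${DRY_RUN:+echo} nextflow run aertslab/SCENICprotocol \
        -profile docker \
        --loom_input expr_mat_tiny.loom \
        --loom_output pyscenic_integrated-output.loom \
        --TFs test_TFs_tiny.txt \
        --motifs motifs.tbl \
        --db *feather \
        --thr_min_genes 1 \
        --pyscenic_container "$image"
}
```

With `DRY_RUN=1` set, calling `run_scenic my-pyscenic:local` prints the full command for inspection; with `DRY_RUN` unset, the same call launches the pipeline.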
##### Expected output
The output of this pipeline is a loom-formatted file (by default: `output/pyscenic_integrated-output.loom`) containing:
* The original expression matrix
* The pySCENIC-specific results:
    * Regulons (TFs and their target genes)
    * AUCell matrix (cell enrichment scores for each regulon)
    * Dimensionality reduction embeddings based on the AUCell matrix (t-SNE, UMAP)
* Results from the parallel best-practices analysis using highly variable genes:
    * Dimensionality reduction embeddings (t-SNE, UMAP)
    * Louvain clustering annotations
## General requirements for this workflow
* Python version 3.6 or greater
* Tested on various Unix-like systems (Ubuntu 18.04, CentOS 7.6.1810, macOS 10.14.5)
---
## References and more information
### SCENIC
* [SCENIC (R) on GitHub](https://github.com/aertslab/SCENIC)
* [SCENIC website](http://scenic.aertslab.org/)
* [SCENIC publication](https://doi.org/10.1016/j.cell.2018.05.057)
* [pySCENIC on GitHub](https://github.com/aertslab/pySCENIC)
* [pySCENIC documentation](https://pyscenic.readthedocs.io/en/latest/)
* [VSN Pipelines](https://github.com/vib-singlecell-nf/vsn-pipelines), a repository of pipelines for single-cell data in Nextflow DSL2, including an implementation of pySCENIC.
### SCope
* [SCope webserver](http://scope.aertslab.org/)
* [SCope on GitHub](https://github.com/aertslab/SCope)
* [SCopeLoomR](https://github.com/aertslab/SCopeLoomR)
* [SCopeLoomPy](https://github.com/aertslab/SCopeLoomPy)
### Scanpy
* [Scanpy on GitHub](https://github.com/theislab/scanpy)
* [Scanpy documentation](https://scanpy.readthedocs.io/)
* [Scanpy publication](https://doi.org/10.1186/s13059-017-1382-0)