# Histology single-cell identification pipeline
Deep learning pipeline repository for our paper "Geospatial immune variability illuminates differential evolution of lung adenocarcinoma" published in Nature Medicine.
In addition to a combination of Python, MATLAB and R scripts, this repository includes example H&E images with their final outputs, and single-cell annotation data for external cohort testing.
The pipeline accepts a standard H&E whole-slide image (e.g. in ndpi format) and outputs a spatial map in which all cancer, lymphocyte and stromal cells can be recognized. The SCCNN method was first published in doi.org/10.1109/TMI.2016.2525803 but is re-implemented here in Python-TensorFlow with different parameters. Tissue segmentation is based on MicroNet: doi.org/10.1016/j.media.2018.12.003.
## Citation
If you use this pipeline or some of its steps, or if you use the attached annotation data, please cite:
* AbdulJabbar, K. et al. Geospatial immune variability illuminates differential evolution of lung adenocarcinoma. Nature Medicine (2020). doi: 10.1038/s41591-020-0900-x
## Highlights
<p align="center">
<img width="800" src="https://github.com/qalid7/compath/blob/master/common/images/pipeline.png">
</p>
The steps can be further explained as follows:
* Tiling: converts a raw microscopy image into 2000x2000 JPEG tiles.
* Tissue segmentation: segments the viable tissue area from an H&E slide.

The above two steps can be skipped, e.g. if you already have small sections of an H&E slide as JPEG tiles, or if you see no need to segment tissue areas. Note, however, that tissue segmentation is a fast step that removes unwanted tiles from a standard H&E slide, which saves time in the next two steps.

* Cell detection: identifies cell nuclei.
* Cell classification: predicts the class of each identified cell (cancer, stromal, lymphocyte, other).

Both the cell detection and the cell classification algorithms include pre-processing routines. You can turn these on or off, or modify them, from the main run script or the matlab subdirectory.
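
As a rough illustration of the tiling step, the sketch below computes the pixel boxes of 2000x2000 tiles covering an image. The function name `tile_boxes` and the edge handling (clipping the last row and column to the image boundary) are assumptions made for this example; the tiling code in this repository may treat edges and tile ordering differently.

```python
def tile_boxes(width, height, tile=2000):
    """Return (x, y, w, h) boxes covering a width x height image, row by row."""
    boxes = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Edge tiles are clipped so they never extend past the image.
            boxes.append((x, y, min(tile, width - x), min(tile, height - y)))
    return boxes


if __name__ == "__main__":
    # Hypothetical image size, chosen only to show clipped edge tiles.
    for box in tile_boxes(5000, 3000):
        print(box)
```

Each box could then be passed to whichever image reader you use to extract and save the corresponding tile.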
To run the pipeline, you need the Conda virtual environments below.
## Python-TensorFlow virtual envs (Linux)
* For cell detection and classification:
```
module load anaconda/3/4.4.0
conda create -n tfdavrosCPU1p3 python=3.5.4
conda activate tfdavrosCPU1p3
conda install scipy=0.19 pandas=0.20 numpy=1.13.1
pip install /apps/tensorflow/tensorflow-1.3.0-cp35-cp35m-linux_x86_64.whl
cd /apps/MATLAB/R2018b/extern/engines/python
# replace /home/dir/tmp with your own build directory:
python setup.py build --build-base="/home/dir/tmp" install
pip install pillow==4.2.1 h5py==2.7.1
conda deactivate
#check by running python then 'import tensorflow as tf'
```
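
The import check mentioned in the last comment can also be scripted. The following is a minimal sketch, meant to be run while the environment is still active; the helper name `module_version` is hypothetical, and the list of import names simply mirrors the packages installed above (pillow is imported as `PIL`).

```python
import importlib


def module_version(name):
    """Return the module's version string, 'unknown' if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # Import names for the packages installed in the environment above.
    for name in ("tensorflow", "numpy", "scipy", "pandas", "h5py", "PIL"):
        version = module_version(name)
        print(name, version if version is not None else "NOT INSTALLED")
```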
* For tiling raw ndpi files:
```
module load anaconda/3/4.4.0
conda create -n CWS python=3.5
source activate CWS
conda install numpy
module load java/sun8/1.8.0u66
pip install 'python-bioformats<=1.3.0'
module load openjpeg/2.1.2
module load openslide/3.4.1
pip install openslide-python
source deactivate CWS
```
## Example data
Under the example directory we provide sample tiles. The aim is to run both cell detection and cell classification on them and replicate the results under example/results.
* example/data: raw tiled JPEGs, ready for cell detection and cell classification.
* example/results: the output of this pipeline in the form of annotated images and cell coordinates.
## Post-processing
You will likely see many false detections outside the tissue regions. This happens because our algorithm has not seen enough 'negative non-cell' examples from cohorts other than Lung TRACERx. Tissue segmentation should remove most of these false detections, but we also provide a simple MATLAB script for post-processing (cleaning) under: post_proc. This script should also create a single summary table for all slides: the number and relative percentage of cells identified for each class.
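
To make the summary table concrete, here is an illustrative Python sketch of the per-slide counts and percentages. It is not the MATLAB script in post_proc; the function name `summarise`, the label strings and the column names are assumptions made for this example.

```python
from collections import Counter

# The four classes listed in this README; the exact label strings are assumed.
CLASSES = ("cancer", "stromal", "lymphocyte", "other")


def summarise(labels_by_slide):
    """Return one row per slide with cell counts and percentages for each class."""
    rows = []
    for slide, labels in sorted(labels_by_slide.items()):
        counts = Counter(labels)
        total = sum(counts[c] for c in CLASSES)
        row = {"slide": slide, "total": total}
        for c in CLASSES:
            row[c + "_n"] = counts[c]
            # Guard against slides with no detected cells.
            row[c + "_pct"] = 100.0 * counts[c] / total if total else 0.0
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical labels for two slides, for illustration only.
    demo = {
        "slide_A": ["cancer", "cancer", "lymphocyte", "stromal"],
        "slide_B": [],
    }
    for row in summarise(demo):
        print(row)
```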
## Test data (LATTICe-A annotations)
<p align="center">
<img height="150" src="https://github.com/qalid7/compath/blob/master/common/images/ann_data.png">
</p>
Single-cell expert pathology annotations from the LATTICe-A cohort are provided under: test_data. This test dataset represents one of several external validations performed in the paper.
R code is provided to regenerate the single-cell accuracy results. You should be able to replicate Table S3 from the paper using:
* latticea_test_data/imgs: the original raw H&E tiles used for single-cell pathology annotations.
* latticea_test_data/gt_celllabels: expert pathology annotations in the form of class, and x,y coordinates.
* latticea_test_data/dl_celllabels: our final cell predictions from this pipeline.
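
For readers who prefer Python, the sketch below shows one simple way to compare annotated and predicted classes. It assumes the cells have already been matched one-to-one, so that `truth[i]` and `predicted[i]` refer to the same cell; the function name `accuracy_by_class` is hypothetical, and the metrics reported in Table S3 may be defined differently, so the provided R code remains the reference.

```python
def accuracy_by_class(truth, predicted):
    """Return overall accuracy and per-class recall for paired label lists."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    correct = {}
    total = {}
    for t, p in zip(truth, predicted):
        # Count every annotated cell, and separately those predicted correctly.
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    overall = sum(correct.values()) / len(truth) if truth else 0.0
    per_class = {c: correct[c] / total[c] for c in total}
    return overall, per_class


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = ["cancer", "cancer", "lymphocyte", "stromal"]
    predicted = ["cancer", "stromal", "lymphocyte", "stromal"]
    print(accuracy_by_class(truth, predicted))
```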
## Multiplex IHC
By and large, this pipeline is designed for H&E images, as they make up the bulk of our paper. For multiplex IHC images (CD8-CD4-FOXP3), refer to the Methods in the paper. Depending on your IHC images (combination of colors, cytoplasmic/nuclear staining), the pipeline may need some modification.
## Training
Training code is available for each step of this pipeline. We aim to update this repo with a more recent version (updated code, TensorFlow version 1.13).