`methpype` is a Python package for processing Illumina methylation array data.
[![Readthedocs](https://readthedocs.com/projects/life-epigenetics-methpype-dev/badge/?version=latest)](https://life-epigenetics-methpype-dev.readthedocs-hosted.com/en/latest/) [![image](https://img.shields.io/pypi/l/pipenv.svg)](https://python.org/pypi/pipenv)
Linux/OSX
[![CircleCI](https://circleci.com/gh/LifeEGX/methpype-dev.svg?style=shield&circle-token=28f0bca658e0752a3096432063f2c2ef260d3a84)](https://circleci.com/gh/LifeEGX/methpype-dev) Windows [![Build status](https://ci.appveyor.com/api/projects/status/7vdji73odyc2cate/branch/master?svg=true)](https://ci.appveyor.com/project/life_epigenetics/methpype/branch/master)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/9e4e03c5cbf54c8aa16dd2cf1a440e2f)](https://www.codacy.com?utm_source=github.com&utm_medium=referral&utm_content=LifeEGX/methpype&utm_campaign=Badge_Grade)
[![Coverage Status](https://coveralls.io/repos/github/LifeEGX/methpype/badge.svg?t=mwigt8)](https://coveralls.io/github/LifeEGX/methpype)
# MethPype Package
The MethPype package contains both high-level APIs for processing data from local files and low-level functionality allowing you to customize the flow of data and how it is processed.
# Installation
MethPype maintains configuration files for your Python package manager of choice: [conda](https://conda.io), [pipenv](https://pipenv.readthedocs.io/en/latest/), and [pip](https://pip.pypa.io/en/stable/).
```Shell
pip install methpype
```
---
## High-Level Processing
The primary MethPype API provides methods for the most common data processing and file retrieval functionality.
### `run_pipeline`
Run the complete methylation processing pipeline for the given project directory, optionally exporting the results to file.
Returns: A collection of DataContainer objects for each processed sample
```python
from methpype import run_pipeline
data_containers = run_pipeline(data_dir, array_type=None, export=False, manifest_filepath=None, sample_sheet_filepath=None, sample_names=None)
```
Argument | Type | Default | Description
--- | --- | --- | ---
`data_dir` | `str`, `Path` | - | Base directory of the sample sheet and associated IDAT files
`array_type` | `str` | `None` | Code of the array type being processed. Possible values are `custom`, `450k`, `epic`, and `epic+`. If not provided, the package will attempt to determine the array type based on the number of probes in the raw data.
`export` | `bool` | `False` | Whether to export the processed data to CSV
`manifest_filepath` | `str`, `Path` | `None` | File path for the array's manifest file. If not provided, this file will be downloaded from a Life Epigenetics archive.
`sample_sheet_filepath` | `str`, `Path` | `None` | File path of the project's sample sheet. If not provided, the package will try to find one based on the supplied data directory path.
`sample_names` | `str` collection | `None` | List of sample names to process. If provided, only those samples specified will be processed. Otherwise all samples found in the sample sheet will be processed.
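
As a rough illustration of the call above, the sketch below wraps `run_pipeline` in a small helper that checks the project directory before handing it over. The helper name `process_project` and the directory check are assumptions made for this example and are not part of MethPype; only the `run_pipeline` arguments come from the table above.

```python
from pathlib import Path


def process_project(data_dir, sample_names=None, export=False):
    """Hypothetical helper: validate the directory, then run the pipeline."""
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise NotADirectoryError(f"Not a project directory: {data_dir}")

    # Imported lazily so this module can be loaded without methpype installed.
    from methpype import run_pipeline

    # array_type=None asks the package to infer the array type from the data.
    return run_pipeline(
        data_dir,
        array_type=None,
        export=export,
        sample_names=sample_names,
    )
```

Calling `process_project("path/to/project", sample_names=["Sample_1"])` would then process only the named sample, where both the path and the sample name are placeholders.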
### `get_sample_sheet`
Find and parse the sample sheet for the provided project directory path.
Returns: A SampleSheet object containing the parsed sample information from the project's sample sheet file
```python
from methpype import get_sample_sheet
sample_sheet = get_sample_sheet(dir_path, filepath=None)
```
Argument | Type | Default | Description
--- | --- | --- | ---
`dir_path` | `str`, `Path` | - | Base directory of the sample sheet and associated IDAT files
`filepath` | `str`, `Path` | `None` | File path of the project's sample sheet. If not provided, the package will try to find one based on the supplied data directory path.
### `get_manifest`
Find and parse the manifest file for the processed array type.
Returns: A Manifest object containing the parsed probe information for the processed array type
```python
from methpype import get_manifest
manifest = get_manifest(raw_datasets, array_type=None, manifest_filepath=None)
```
Argument | Type | Default | Description
--- | --- | --- | ---
`raw_datasets` | `RawDataset` collection | - | Collection of RawDataset objects containing probe information from the raw IDAT files.
`array_type` | `str` | `None` | Code of the array type being processed. Possible values are `custom`, `450k`, `epic`, and `epic+`. If not provided, the package will attempt to determine the array type based on the provided RawDataset objects.
`manifest_filepath` | `str`, `Path` | `None` | File path for the array's manifest file. If not provided, this file will be downloaded from a Life Epigenetics archive.
### `get_raw_datasets`
Find and parse the IDAT files for samples within a project's sample sheet.
Returns: A collection of RawDataset objects for each sample's IDAT file pair.
```python
from methpype import get_raw_datasets
raw_datasets = get_raw_datasets(sample_sheet, sample_names=None)
```
Argument | Type | Default | Description
--- | --- | --- | ---
`sample_sheet` | `SampleSheet` | - | A SampleSheet instance from a valid project sample sheet file.
`sample_names` | `str` collection | `None` | List of sample names to process. If provided, only those samples specified will be processed. Otherwise all samples found in the sample sheet will be processed.
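
The three loaders above feed into one another: the sample sheet locates the IDAT files, and the raw datasets let the package pick a manifest. A minimal sketch of that chain follows, assuming the signatures shown in the snippets above. The helper name `load_project` is hypothetical and exists only for this example.

```python
def load_project(dir_path, sample_names=None):
    """Hypothetical helper chaining the three documented loaders."""
    # Imported lazily so this module can be loaded without methpype installed.
    from methpype import get_manifest, get_raw_datasets, get_sample_sheet

    # 1. Find and parse the sample sheet under the project directory.
    sample_sheet = get_sample_sheet(dir_path, filepath=None)

    # 2. Read the IDAT file pairs for the requested samples (all if None).
    raw_datasets = get_raw_datasets(sample_sheet, sample_names=sample_names)

    # 3. Let the package infer the array type and fetch the matching manifest.
    manifest = get_manifest(raw_datasets, array_type=None, manifest_filepath=None)

    return sample_sheet, raw_datasets, manifest
```

Keeping the steps separate like this is useful when you want to inspect or filter the raw datasets before any further processing, which `run_pipeline` does not expose.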
---
## Low-Level Processing
### MethPype CLI
MethPype provides a command line interface (CLI) so the package can be used directly in bash/batchfile scripts as part of building your custom processing pipeline.
All invocations of the MethPype CLI provide contextual help, listing the arguments and options available for the invoked command. If you specify verbose logging, the package will emit log output at the DEBUG level and above.
```Shell
>>> python -m methpype
usage: methpype [-h] [-v] {process,sample_sheet} ...
Utility to process methylation data from Illumina IDAT files
positional arguments:
{process,sample_sheet}
process process help
sample_sheet sample sheet help
optional arguments:
-h, --help show this help message and exit
-v, --verbose Enable verbose logging
```
---
## Commands
The MethPype CLI provides two top-level commands:
- `process` to process methylation data
- `sample_sheet` to find/read a sample sheet and output its contents
### `process`
Process the methylation data for a group of samples listed in a single sample sheet.
If you do not provide the file path for the project's sample sheet, the module will try to find one based on the supplied data directory path.
You must supply either the name of the array being processed or the file path for the array's manifest file. If you only specify the array type, the array's manifest file will be downloaded from a Life Epigenetics archive.
```Shell
>>> python -m methpype process
usage: methpype idat [-h] -d DATA_DIR [-a {custom,450k,epic,epic+}]
[-m MANIFEST] [-s SAMPLE_SHEET]
[--sample_name [SAMPLE_NAME [SAMPLE_NAME ...]]]
[--export]
Process Illumina IDAT files
optional arguments:
-h, --help show this help message and exit
-d, --data_dir Base directory of the sample sheet and associated IDAT
files
-a, --array_type Type of array being processed
Choices: {custom,450k,epic,epic+}
-m, --manifest File path of the array manifest file
-s, --sample_sheet File path of the sample sheet
--sample_name Sample(s) to process
--export Export data to csv
```
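When the CLI is driven from a Python script rather than a shell script, it can help to assemble the argument list in one place. The sketch below builds the `process` invocation from the flags listed in the help output above and enforces the rule that either an array type or a manifest path must be supplied. The function name `build_process_command` is an assumption made for this example and is not part of MethPype.

```python
import sys

# Array type codes accepted by -a/--array_type, per the CLI help above.
ARRAY_TYPES = ("custom", "450k", "epic", "epic+")


def build_process_command(data_dir, array_type=None, manifest=None,
                          sample_sheet=None, sample_names=(),
                          export=False, verbose=False):
    """Hypothetical helper: build argv for `python -m methpype process`."""
    if array_type is None and manifest is None:
        raise ValueError("Supply an array type or a manifest file path")
    if array_type is not None and array_type not in ARRAY_TYPES:
        raise ValueError(f"Unknown array type: {array_type!r}")

    command = [sys.executable, "-m", "methpype"]
    if verbose:
        command.append("-v")  # top-level flag, so it precedes the subcommand
    command += ["process", "-d", str(data_dir)]
    if array_type is not None:
        command += ["-a", array_type]
    if manifest is not None:
        command += ["-m", str(manifest)]
    if sample_sheet is not None:
        command += ["-s", str(sample_sheet)]
    if sample_names:
        command += ["--sample_name", *sample_names]
    if export:
        command.append("--export")
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)`; building a list instead of a single string avoids shell quoting problems with paths that contain spaces.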
### `sample_sheet`
Find and parse the sample sheet in a given directory and emit the details of each sample.
```Shell
>>> python -m methpype sample_sheet
usage: methpype sample_sheet [-h] -d DATA_DIR
Process Illumina sample sheet file
optional arguments:
-h, --help show this help message and exit
```