# Qualifyr
Qualifyr provides an extensible framework for parsing multiple text files that contain QC information. Each QC file is assigned a pass/warning/failure status, and these are combined into an overall QC status for the sample. The overall status is the 'worst' status found in any QC file: for example, if most of the QC files pass but one is deemed a failure, the overall status will be FAILURE.
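The 'worst status wins' rule can be illustrated with a minimal Python sketch. This is not Qualifyr's own code: the `Status` enum and `overall_status` function are hypothetical names chosen for illustration.

```python
from enum import IntEnum


class Status(IntEnum):
    """QC statuses ordered from best to worst (hypothetical helper)."""
    PASS = 0
    WARNING = 1
    FAILURE = 2


def overall_status(file_statuses):
    """Return the worst status among the per-file statuses."""
    return max(file_statuses.values())


# Illustrative per-file results for one sample
statuses = {
    "quast": Status.PASS,
    "fastqc": Status.WARNING,
    "confindr": Status.FAILURE,
}
print(overall_status(statuses).name)  # FAILURE
```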
The command is invoked as follows:
```
usage: qualifyr [-h] {check,report} ...

A package to check quality files and assess overall pass/fail

positional arguments:
  {check,report}  The following commands are available. Type qualifyr
                  <COMMAND> -h for more help on a specific commands
    check         Check multiple quality metric files based on conditions and
                  produce overall result
    report        Produce a html report based on the qualifyr output from
                  multiple sampels

optional arguments:
  -h, --help      show this help message and exit
```
There are two subcommands:
1. check. Specify multiple QC files to assess and assign an individual and overall QC status.
```
usage: qualifyr check [-h] [-q QUAST_FILE] [-f FASTQC_FILE [FASTQC_FILE ...]]
                      [-c CONFINDR_FILE] -y CONDITIONS_YAML_FILE [-j] -s
                      SAMPLE_NAME [-o OUTPUT_DIR]

required arguments:
  At least one of the following quality files

  -q QUAST_FILE, --quast_file QUAST_FILE
                        quast file path
  -f FASTQC_FILE [FASTQC_FILE ...], --fastqc_file FASTQC_FILE [FASTQC_FILE ...]
                        fastqc summary file path
  -c CONFINDR_FILE, --confindr_file CONFINDR_FILE
                        confindr report file path
  -y CONDITIONS_YAML_FILE, --conditions_yaml_file CONDITIONS_YAML_FILE
                        conditions yaml file path
  -s SAMPLE_NAME, --sample_name SAMPLE_NAME
                        The name of the sample from which the quality files
                        are derived

optional arguments:
  -j, --json_output_format
                        Output the check results as JSON rather than TSV
                        (default)
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to output directory. If specified output will be
                        written to file in format
                        sample_name.qualifyr.{tsv,json}
  -h, --help            show this help message and exit
```
Currently, QC files from the following tools are supported:
- [fastqc](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
- [quast](http://bioinf.spbau.ru/quast)
- [confindr](https://lowandrew.github.io/ConFindr/)

Multiple fastqc summary files can be supplied, e.g. from read 1 and read 2.
The overall sample status is written to STDOUT.
The possible statuses are:
- PASS
- WARNING
- FAILURE
If there are any warnings or failures, the reason for each status, along with the file concerned, is written to STDERR as tab-separated lines.
2. report. Generate an HTML report from multiple qualifyr outputs in JSON format.
```
usage: qualifyr report [-h] -i INPUT_DIR [-o OUTPUT_DIR] [-c EXTRA_COLUMNS]
                       [-t REPORT_TITLE]

required arguments:
  -i INPUT_DIR, --input_dir INPUT_DIR
                        Path to input directory containing multiple qualifyr
                        json outputs

optional arguments:
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to output directory. If not supplied this will be
                        the same as the input directory
  -c EXTRA_COLUMNS, --extra_columns EXTRA_COLUMNS
                        Extra columns to add to the report provided as a quote
                        enclosed comma separated list e.g 'quast.N50,quast.#
                        contigs (>= 1000 bp),confindr.contam_status'
  -t REPORT_TITLE, --report_title REPORT_TITLE
                        Title for the report
  -h, --help            show this help message and exit
```
The command requires an input directory that contains multiple qualifyr JSON output files, such as those written by `qualifyr check` with the `-j` and `-o` options.
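As a rough sketch of how the two subcommands might be chained for a batch of samples, the Python below only assembles the command lines, using the flags from the usage text above. The helper function names, sample names and file paths are all hypothetical.

```python
import shlex


def build_check_command(sample_name, conditions_yaml, output_dir,
                        quast_file=None, fastqc_files=(), confindr_file=None):
    """Assemble a `qualifyr check` command that writes JSON to output_dir."""
    cmd = ["qualifyr", "check", "-y", conditions_yaml,
           "-s", sample_name, "-j", "-o", output_dir]
    if quast_file:
        cmd += ["-q", quast_file]
    if fastqc_files:
        cmd += ["-f", *fastqc_files]
    if confindr_file:
        cmd += ["-c", confindr_file]
    return cmd


def build_report_command(input_dir, title=None):
    """Assemble a `qualifyr report` command reading JSON files from input_dir."""
    cmd = ["qualifyr", "report", "-i", input_dir]
    if title:
        cmd += ["-t", title]
    return cmd


# Hypothetical samples and file layout, for illustration only
for sample in ("sample_A", "sample_B"):
    check_cmd = build_check_command(
        sample, "conditions.yml", "qc_json",
        quast_file=f"{sample}.quast.txt",
        fastqc_files=[f"{sample}_R1.fastqc.txt", f"{sample}_R2.fastqc.txt"],
    )
    print(shlex.join(check_cmd))

print(shlex.join(build_report_command("qc_json", title="QC report")))
```

Each command list could then be handed to `subprocess.run`; the sketch stops short of that so it can be read and run without Qualifyr installed.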
## Supplying conditions for the warning and failure criteria
These are supplied in a YAML file that is specified by the `-y` argument to the `qualifyr check` command. The basic format is:
```
<FILE TYPE>:
  '<METRIC NAME>':
    <warning or failure>:
      condition_type: <One of gt, lt, lt_or_gt, gt_and_lt, eq, ne, any>
      condition_value: <VALUE>
```
A specific example for quast output is
```
quast:
  '# contigs (>= 1000 bp)':
    warning:
      condition_type: gt
      condition_value: 75
    failure:
      condition_type: gt
      condition_value: 150
```
In this case a sample will be given a WARNING status if there are more than 75 contigs of size 1000 bp or more, and a FAILURE status if the same value is greater than 150.
An example of a full conditions file can be found [here](example_qc_conditions.yml)
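To make the quast example above concrete, here is a minimal sketch of how such conditions could be evaluated against a metric value. It is not Qualifyr's implementation: `evaluate_metric` is a hypothetical function, and only the `gt`, `lt`, `eq` and `ne` condition types are covered because the semantics of the others are not described here. Checking the failure condition first is also an assumption, made so that the worse status takes precedence.

```python
import operator

# Only the simple comparison types are sketched (assumed semantics)
COMPARATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "eq": operator.eq,
    "ne": operator.ne,
}


def evaluate_metric(value, conditions):
    """Return 'FAILURE', 'WARNING' or 'PASS' for one metric value.

    `conditions` mirrors the YAML structure under a metric name, e.g.
    {'warning': {'condition_type': 'gt', 'condition_value': 75}}.
    """
    for level in ("failure", "warning"):  # the worse status is checked first
        condition = conditions.get(level)
        if condition is None:
            continue
        compare = COMPARATORS[condition["condition_type"]]
        if compare(value, condition["condition_value"]):
            return level.upper()
    return "PASS"


contig_conditions = {
    "warning": {"condition_type": "gt", "condition_value": 75},
    "failure": {"condition_type": "gt", "condition_value": 150},
}

for n_contigs in (50, 100, 200):
    print(n_contigs, evaluate_metric(n_contigs, contig_conditions))
```

With these thresholds, 50 contigs pass, 100 contigs give a warning and 200 contigs give a failure.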
## Installation
```
pip3 install qualifyr
```
## Installation from source
Clone the git repository and install using `setup.py`:
```
git clone https://gitlab.com/cgps/qualifyr.git
cd qualifyr
python setup.py install
```
## Tests
```
python setup.py test
```
## Test in dev
- clone repo
- cd into directory
- to see pass result run
```
qualifyr check -s SAMPLE_NAME -y tests/test_data/pass_conditions.yml -q tests/test_data/quast_valid.txt -f tests/test_data/fastqc_valid.txt tests/test_data/fastqc_fail.txt -c tests/test_data/confindr_pass.csv
```
- to see fail result run
```
qualifyr check -s SAMPLE_NAME -y tests/test_data/fail_conditions.yml -q tests/test_data/quast_valid.txt -f tests/test_data/fastqc_valid.txt tests/test_data/fastqc_fail.txt -c tests/test_data/confindr_pass.csv
```
## Adding other QC file types
The framework is designed to be extensible by subclassing the [QualityFile](qualifyr/quality_file.py) class. See the [quast_file.py](qualifyr/quast_file.py) for an example.
The subclass must implement two methods:
1. validate: This method should check that the file is in the expected format and return a list of lines containing just the metrics from the file.
2. parse: This method should call validate and then use the returned list to populate an instance variable `metrics`, a dict with the metric names as keys and the associated values as values.

In addition, the class should specify a class variable `file_type`.
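The shape of such a subclass might look like the sketch below. To keep it runnable on its own, it defines a simplified stand-in for `QualityFile` rather than importing the real class, so the constructor signature is an assumption. The `KeyValueFile` type and its tab-separated input format are invented for illustration.

```python
import os
import tempfile


class QualityFile:
    """Simplified stand-in for qualifyr's QualityFile (NOT the real class)."""
    file_type = None

    def __init__(self, file_path):  # assumed constructor signature
        self.file_path = file_path
        self.metrics = {}


class KeyValueFile(QualityFile):
    """Hypothetical QC file: a 'metric<TAB>value' header, then one metric per line."""
    file_type = "key_value"

    def validate(self):
        """Check the header and return only the metric lines."""
        with open(self.file_path) as handle:
            lines = [line.rstrip("\n") for line in handle if line.strip()]
        if not lines or lines[0] != "metric\tvalue":
            raise ValueError(f"{self.file_path} is not a valid key_value file")
        return lines[1:]

    def parse(self):
        """Populate self.metrics from the validated metric lines."""
        for line in self.validate():
            name, value = line.split("\t", 1)
            self.metrics[name] = value


# Usage with a temporary example file
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("metric\tvalue\nN50\t250000\n")

qc_file = KeyValueFile(tmp.name)
qc_file.parse()
print(qc_file.file_type, qc_file.metrics)
os.remove(tmp.name)
```

The real base class in `qualifyr/quality_file.py` may expect more than this, so `quast_file.py` remains the reference to follow.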