<div align="center">
<img src="https://github.com/HallLab/pandas-genomics/raw/master/docs/_static/logo.png" alt="pandas_genomics logo"/>
</div>
<br/>
<div align="center">
<!-- Python version -->
<a href="https://pypi.python.org/pypi/pandas-genomics">
<img src="https://img.shields.io/badge/python-3.7+-blue.svg?style=flat-square" alt="PyPI version"/>
</a>
<!-- PyPi -->
<a href="https://pypi.org/project/pandas-genomics/">
<img src="https://img.shields.io/pypi/v/pandas-genomics.svg?style=flat-square" alt="pypi" />
</a><br>
<!-- Build status -->
<a href="https://github.com/HallLab/pandas-genomics/actions?query=workflow%3ACI">
<img src="https://img.shields.io/github/workflow/status/HallLab/pandas-genomics/CI?style=flat-square" alt="Build Status" />
</a>
<!-- Docs -->
<a href="https://pandas-genomics.readthedocs.io/en/latest/">
<img src="https://img.shields.io/readthedocs/pandas-genomics?style=flat-square" alt="Read the Docs" />
</a>
<!-- Test coverage -->
<a href="https://codecov.io/gh/HallLab/pandas-genomics/">
<img src="https://img.shields.io/codecov/c/gh/HallLab/pandas-genomics.svg?style=flat-square" alt="Coverage Status"/>
</a><br>
<!-- License -->
<a href="https://opensource.org/licenses/BSD-3-Clause">
<img src="https://img.shields.io/pypi/l/pandas-genomics?style=flat-square" alt="license"/>
</a>
<!-- Black -->
<a href="https://github.com/psf/black">
<img src="https://img.shields.io/badge/code%20style-Black-black?style=flat-square" alt="code style: black"/>
</a>
</div>
<br/>
Pandas ExtensionDtypes and ExtensionArray for working with genomics data
Quickstart
----------
`Variant` objects hold information about a particular variant:
```python
from pandas_genomics.scalars import Variant
variant = Variant('12', 112161652, id='rs12462', ref='A', alt=['C', 'T'])
print(variant)
```
```
rs12462[chr=12;pos=112161652;ref=A;alt=C,T]
```
Each variant should have a unique ID, and a random ID is generated if one is not specified.
`Genotype` objects are associated with a particular `Variant`:
```python
gt = variant.make_genotype("A", "C")
print(gt)
```
```
A/C
```
The `GenotypeArray` stores genotypes with an associated variant and has useful methods and properties:
```python
from pandas_genomics.scalars import Variant
from pandas_genomics.arrays import GenotypeArray
variant = Variant('12', 112161652, id='rs12462', ref='A', alt=['C'])
gt_array = GenotypeArray([variant.make_genotype_from_str(s) for s in ["C/C", "A/C", "A/A"]])
print(gt_array)
```
```
<GenotypeArray>
[Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=1, allele2=1),
Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=0, allele2=1),
Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=0, allele2=0)]
Length: 3, dtype: genotype[12; 112161652; rs12462; A; C]
```
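The `allele1` and `allele2` values in the repr above appear to be indices into the variant's allele list, with 0 for the reference allele and 1 onward for the alternate alleles. The following is a minimal pure-Python sketch of that mapping under that assumption. It is not the library's implementation, and `allele_indices` is a hypothetical helper introduced only for illustration.

```python
def allele_indices(genotype_str, ref, alt):
    """Map a genotype string such as "A/C" to a tuple of allele indices.

    Index 0 is the reference allele; indices 1..n are the alternate alleles.
    Hypothetical helper for illustration, not part of pandas_genomics.
    """
    alleles = [ref] + list(alt)
    return tuple(alleles.index(a) for a in genotype_str.split("/"))


# Same variant and genotypes as the GenotypeArray example (ref=A, alt=C)
for s in ["C/C", "A/C", "A/A"]:
    print(s, allele_indices(s, ref="A", alt=["C"]))
```

With `ref="A"` and `alt=["C"]`, the three genotypes map to `(1, 1)`, `(0, 1)`, and `(0, 0)`, which lines up with the `allele1`/`allele2` pairs shown in the repr.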
```python
print(gt_array.astype(str))
```
```
['C/C' 'A/C' 'A/A']
```
```python
print(gt_array.encode_dominant())
```
```
<IntegerArray>
[1.0, 1.0, 0.0]
Length: 3, dtype: float
```
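To make the result above easier to read, here is a rough pure-Python sketch of what a dominant encoding computes, assuming it means "1 if the genotype carries at least one alternate allele, otherwise 0". The function name `dominant_sketch` and the use of allele-index pairs as input are assumptions for illustration. This is not how `GenotypeArray.encode_dominant` is implemented.

```python
def dominant_sketch(allele_index_pairs):
    """Return 1.0 for genotypes with at least one alternate allele, else 0.0.

    Assumes allele index 0 is the reference allele.
    Hypothetical helper for illustration, not part of pandas_genomics.
    """
    return [1.0 if any(a > 0 for a in pair) else 0.0 for pair in allele_index_pairs]


# C/C, A/C, A/A for a variant with ref=A and alt=C
print(dominant_sketch([(1, 1), (0, 1), (0, 0)]))
```

Under this assumption both `C/C` and `A/C` encode to `1.0`, because a single copy of the alternate allele is enough, and only `A/A` encodes to `0.0`.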
There are also `genomics` accessors for Series and DataFrame:
```python
import pandas as pd
print(pd.Series(gt_array).genomics.encode_codominant())
```
```
0 Hom
1 Het
2 Ref
Name: rs12462_C, dtype: category
Categories (3, object): ['Ref' < 'Het' < 'Hom']
```
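The category labels in this output can be read as a count of alternate alleles. The sketch below assumes a biallelic variant where `Ref` means no alternate copies, `Het` means one, and `Hom` means two. `codominant_sketch` is a hypothetical name used only here, and the code illustrates the idea rather than the accessor's actual implementation.

```python
# Labels ordered by number of alternate-allele copies: 0, 1, 2
LABELS = ("Ref", "Het", "Hom")


def codominant_sketch(allele_index_pairs):
    """Label each genotype by how many of its two alleles are non-reference.

    Assumes allele index 0 is the reference allele.
    Hypothetical helper for illustration, not part of pandas_genomics.
    """
    labels = []
    for allele1, allele2 in allele_index_pairs:
        alt_count = int(allele1 > 0) + int(allele2 > 0)
        labels.append(LABELS[alt_count])
    return labels


# C/C, A/C, A/A for a variant with ref=A and alt=C
print(codominant_sketch([(1, 1), (0, 1), (0, 0)]))
```

Keeping the labels in a fixed order mirrors the ordered categories (`'Ref' < 'Het' < 'Hom'`) shown in the Series output.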