<div align="center">
<img src="https://github.com/HallLab/pandas-genomics/raw/master/docs/_static/logo.png" alt="pandas_genomics logo"/>
</div>
<br/>
<div align="center">
<!-- Python version -->
<a href="https://pypi.python.org/pypi/pandas-genomics">
<img src="https://img.shields.io/badge/python-3.7+-blue.svg?style=flat-square" alt="Python version"/>
</a>
<!-- PyPi -->
<a href="https://pypi.org/project/pandas-genomics/">
<img src="https://img.shields.io/pypi/v/pandas-genomics.svg?style=flat-square" alt="pypi" />
</a><br>
<!-- Build status -->
<a href="https://github.com/HallLab/pandas-genomics/actions?query=workflow%3ACI">
<img src="https://img.shields.io/github/workflow/status/HallLab/pandas-genomics/CI?style=flat-square" alt="Build Status" />
</a>
<!-- Docs -->
<a href="https://pandas-genomics.readthedocs.io/en/latest/">
<img src="https://img.shields.io/readthedocs/pandas-genomics?style=flat-square" alt="Read the Docs" />
</a>
<!-- Test coverage -->
<a href="https://codecov.io/gh/HallLab/pandas-genomics/">
<img src="https://img.shields.io/codecov/c/gh/HallLab/pandas-genomics.svg?style=flat-square" alt="Coverage Status"/>
</a><br>
<!-- License -->
<a href="https://opensource.org/licenses/BSD-3-Clause">
<img src="https://img.shields.io/pypi/l/pandas-genomics?style=flat-square" alt="license"/>
</a>
<!-- Black -->
<a href="https://github.com/psf/black">
<img src="https://img.shields.io/badge/code%20style-Black-black?style=flat-square" alt="code style: black"/>
</a>
</div>
<br/>
Pandas ExtensionDtypes and ExtensionArrays for working with genomics data.

Quickstart
----------
`Variant` objects hold information about a particular variant:
```python
from pandas_genomics.scalars import Variant
variant = Variant('12', 112161652, id='rs12462', ref='A', alt=['C', 'T'])
print(variant)
```
rs12462[chr=12;pos=112161652;ref=A;alt=C,T]
Each variant should have a unique ID, and a random ID is generated if one is not specified.
`Genotype` objects are associated with a particular `Variant`:
```python
gt = variant.make_genotype("A", "C")
print(gt)
```
```
A/C
```
The `GenotypeArray` stores genotypes with an associated variant and has useful methods and properties:
```python
from pandas_genomics.scalars import Variant
from pandas_genomics.arrays import GenotypeArray
variant = Variant('12', 112161652, id='rs12462', ref='A', alt=['C'])
gt_array = GenotypeArray([variant.make_genotype_from_str(s) for s in ["C/C", "A/C", "A/A"]])
print(gt_array)
```
```
<GenotypeArray>
[Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=1, allele2=1),
Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=0, allele2=1),
Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=0, allele2=0)]
Length: 3, dtype: genotype[12; 112161652; rs12462; A; C]
```
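In the output above, `allele1` and `allele2` are printed as integers rather than letters. A minimal sketch of one way to read them, assuming index 0 is the reference allele and indices 1 and up are the alternate alleles in order. The helper `to_allele_indices` is hypothetical and is not part of `pandas_genomics`:

```python
def to_allele_indices(genotype_str, ref, alt):
    """Hypothetical helper: map a string like 'A/C' to a pair of allele indices.

    Assumes index 0 is the reference allele and 1.. are the alternate alleles.
    """
    alleles = [ref] + list(alt)
    first, second = genotype_str.split("/")
    return alleles.index(first), alleles.index(second)


indices = [to_allele_indices(s, ref="A", alt=["C"]) for s in ["C/C", "A/C", "A/A"]]
print(indices)  # [(1, 1), (0, 1), (0, 0)]
```

Under that assumption, the pairs match the `allele1` and `allele2` values shown for each genotype.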
```python
print(gt_array.astype(str))
```
```
['C/C' 'A/C' 'A/A']
```
```python
print(gt_array.encode_dominant())
```
```
<IntegerArray>
[1.0, 1.0, 0.0]
Length: 3, dtype: float
```
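The idea behind dominant encoding can be sketched in plain Python. This is an illustration only, not the library's implementation: it assumes genotypes are given as pairs of allele indices (0 = reference), ignores missing genotypes, and uses a hypothetical function name `dominant_codes`:

```python
def dominant_codes(allele_pairs):
    """Hypothetical sketch: 1 if a genotype carries any alternate allele, else 0."""
    return [1 if (a1 > 0 or a2 > 0) else 0 for a1, a2 in allele_pairs]


# Allele index pairs for C/C, A/C and A/A with ref='A' and alt=['C']
pairs = [(1, 1), (0, 1), (0, 0)]
dominant = dominant_codes(pairs)
print(dominant)  # [1, 1, 0]
```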
There are also `genomics` accessors for Series and DataFrame:
```python
import pandas as pd
print(pd.Series(gt_array).genomics.encode_codominant())
```
```
0 Hom
1 Het
2 Ref
Name: rs12462_C, dtype: category
Categories (3, object): ['Ref' < 'Het' < 'Hom']
```
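The ordered categories in that output can be reproduced with plain pandas. The following is a rough sketch under stated assumptions, not the library's own code: it handles only biallelic genotypes given as allele index pairs (0 = reference, 1 = alternate), and `codominant_labels` is a hypothetical name:

```python
import pandas as pd


def codominant_labels(allele_pairs):
    """Hypothetical sketch: label biallelic genotypes by their number of alternate alleles."""
    names = ["Ref", "Het", "Hom"]  # 0, 1 or 2 alternate alleles
    labels = [names[int(a1 > 0) + int(a2 > 0)] for a1, a2 in allele_pairs]
    # An ordered categorical keeps the Ref < Het < Hom ordering
    return pd.Categorical(labels, categories=names, ordered=True)


# Allele index pairs for C/C, A/C and A/A with ref='A' and alt=['C']
codes = codominant_labels([(1, 1), (0, 1), (0, 0)])
print(list(codes))  # ['Hom', 'Het', 'Ref']
```

Using an ordered categorical means comparisons and sorting follow the number of alternate alleles rather than alphabetical order.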