<div align="center">
<img src="https://github.com/HallLab/pandas-genomics/raw/master/docs/_static/logo.png" alt="pandas_genomics logo"/>
</div>
<br/>
<div align="center">
<!-- Python version -->
<a href="https://pypi.python.org/pypi/pandas-genomics">
<img src="https://img.shields.io/badge/python-3.7+-blue.svg?style=flat-square" alt="Python version"/>
</a>
<!-- PyPi -->
<a href="https://pypi.org/project/pandas-genomics/">
<img src="https://img.shields.io/pypi/v/pandas-genomics.svg?style=flat-square" alt="pypi" />
</a><br>
<!-- Build status -->
<a href="https://github.com/HallLab/pandas-genomics/actions?query=workflow%3ACI">
<img src="https://img.shields.io/github/workflow/status/HallLab/pandas-genomics/CI?style=flat-square" alt="Build Status" />
</a>
<!-- Docs -->
<a href="https://pandas-genomics.readthedocs.io/en/latest/">
<img src="https://img.shields.io/readthedocs/pandas-genomics?style=flat-square" alt="Read the Docs" />
</a>
<!-- Test coverage -->
<a href="https://codecov.io/gh/HallLab/pandas-genomics/">
<img src="https://img.shields.io/codecov/c/gh/HallLab/pandas-genomics.svg?style=flat-square" alt="Coverage Status"/>
</a><br>
<!-- License -->
<a href="https://opensource.org/licenses/BSD-3-Clause">
<img src="https://img.shields.io/pypi/l/pandas-genomics?style=flat-square" alt="license"/>
</a>
<!-- Black -->
<a href="https://github.com/psf/black">
<img src="https://img.shields.io/badge/code%20style-Black-black?style=flat-square" alt="code style: black"/>
</a>
</div>
<br/>
Pandas ExtensionDtypes and ExtensionArrays for working with genomics data.
Quickstart
----------
`Variant` objects hold information about a particular variant:
```python
from pandas_genomics.scalars import Variant
variant = Variant('12', 112161652, id='rs12462', ref='A', alt=['C', 'T'])
print(variant)
```
rs12462[chr=12;pos=112161652;ref=A;alt=C,T]
Each variant should have a unique ID, and a random ID is generated if one is not specified.
`Genotype` objects are associated with a particular `Variant`:
```python
gt = variant.make_genotype("A", "C")
print(gt)
```
```
A/C
```
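The genotype above is printed as allele strings, while the `GenotypeArray` output later in this Quickstart shows integer `allele1` and `allele2` fields. As a rough illustration of how the two relate, the sketch below assumes that index 0 is the reference allele and that the alternate alleles follow in order. The helper `allele_indices` is hypothetical and is not part of pandas-genomics:

```python
from typing import List, Tuple


def allele_indices(genotype: str, ref: str, alt: List[str]) -> Tuple[int, int]:
    """Map a genotype string such as "A/C" to allele indices (hypothetical helper)."""
    # Assumption: index 0 is the reference allele, 1+ are alternate alleles in order
    alleles = [ref] + list(alt)
    first, second = genotype.split("/")
    return alleles.index(first), alleles.index(second)


print(allele_indices("A/C", ref="A", alt=["C", "T"]))  # (0, 1)
```

An allele that is not listed for the variant raises `ValueError` in this sketch; how the library itself reports that case is not covered here.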
The `GenotypeArray` stores genotypes with an associated variant and has useful methods and properties:
```python
from pandas_genomics.scalars import Variant
from pandas_genomics.arrays import GenotypeArray
variant = Variant('12', 112161652, id='rs12462', ref='A', alt=['C'])
gt_array = GenotypeArray([variant.make_genotype_from_str(s) for s in ["C/C", "A/C", "A/A"]])
print(gt_array)
```
```
<GenotypeArray>
[Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=1, allele2=1),
Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=0, allele2=1),
Genotype(variant=rs12462[chr=12;pos=112161652;ref=A;alt=C], allele1=0, allele2=0)]
Length: 3, dtype: genotype[12; 112161652; rs12462; A; C]
```
```python
print(gt_array.astype(str))
```
```
['C/C' 'A/C' 'A/A']
```
```python
print(gt_array.encode_dominant())
```
```
<IntegerArray>
[1.0, 1.0, 0.0]
Length: 3, dtype: float
```
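To make the result above easier to read, here is a plain-Python sketch of what a dominant encoding computes for a biallelic variant: 1 when at least one allele is non-reference, otherwise 0. The function `dominant_code` is hypothetical and only illustrates the idea; it is not the library's implementation, and missing genotypes are not handled:

```python
def dominant_code(allele1: int, allele2: int) -> int:
    """Return 1 if either allele is non-reference (index > 0), else 0."""
    return int(allele1 > 0 or allele2 > 0)


# Allele indices for "C/C", "A/C", "A/A", as shown in the GenotypeArray output above
genotypes = [(1, 1), (0, 1), (0, 0)]
print([dominant_code(a1, a2) for a1, a2 in genotypes])  # [1, 1, 0]
```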
There are also `genomics` accessors for Series and DataFrame:
```python
import pandas as pd
print(pd.Series(gt_array).genomics.encode_codominant())
```
```
0    Hom
1    Het
2    Ref
Name: rs12462_C, dtype: category
Categories (3, object): ['Ref' < 'Het' < 'Hom']
```
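The labels in this output follow the number of alternate alleles in each genotype. A minimal sketch of that mapping for a biallelic variant, assuming allele index 0 is the reference allele; `LABELS` and `codominant_label` are hypothetical names used only for illustration and are not part of pandas-genomics:

```python
# Ordered from fewest to most alternate alleles: Ref < Het < Hom
LABELS = ("Ref", "Het", "Hom")


def codominant_label(allele1: int, allele2: int) -> str:
    """Label a biallelic genotype by how many of its alleles are non-reference."""
    alt_count = int(allele1 > 0) + int(allele2 > 0)
    return LABELS[alt_count]


# Allele indices for "C/C", "A/C", "A/A"
genotypes = [(1, 1), (0, 1), (0, 0)]
print([codominant_label(a1, a2) for a1, a2 in genotypes])  # ['Hom', 'Het', 'Ref']
```

Because the labels are ordered, comparisons such as "at least heterozygous" can be expressed through the label's position in `LABELS`, which mirrors the ordered categories shown in the Series output.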