# Pandas pipeline in graphviz
A Python package that draws an explanatory diagram of a pandas data processing pipeline.
It is heavily inspired by [dask's `.visualize` method](https://docs.dask.org/en/latest/graphviz.html) and adds two features:
- column names are shown in the data nodes
- columns created by each task are highlighted
Here is an example from the [examples folder](examples):
![](examples/03_apply_pandas_pipeline_decorator.png)
## Installation
### Pip
Install with pip:
```bash
$ pip install pandas-pipeline-graphviz
```
### Manual installation
Install manually:
- clone the repository with `git clone`
- run `python setup.py install` from the repository root
## Usage
### Disclaimer
#### ⚠️ WARNING — it's a hack!
Python offers no reliable way to recover the name of a variable, whether it is passed into a function or receives the function's result. The methods used in this package are quite _hacky_, as discussed in this [stackoverflow thread](https://stackoverflow.com/questions/2749796/how-to-get-the-original-variable-name-of-variable-passed-to-a-function).
To build the graph, this package makes use of:
- `globals()` **to get the names of the input dataframes**, by comparing each input dataframe against every variable available in the global namespace.
- `inspect.stack()` **to get the name of the output dataframe**, by retrieving the source line that calls the function and parsing it to find the assignment target. Only single-output transformations are currently supported.
Both methods should be considered experimental. Expect the decorator to break easily if it is not used as shown in the [examples](examples).
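The two lookups described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the helper names (`find_input_names`, `parse_output_name`, `calling_line`, `traced`) are hypothetical, the comparison is assumed to be an identity check (`is`), and the sketch reads `func.__globals__` rather than calling `globals()` directly.

```python
import functools
import inspect
import re

# Matches a simple "name = ..." assignment, but not "name == ...".
_ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*=(?!=)")


def find_input_names(obj, namespace):
    """Return every public name in `namespace` bound to exactly this object."""
    return [
        name
        for name, value in namespace.items()
        if value is obj and not name.startswith("_")
    ]


def parse_output_name(source_line):
    """Return the target of a simple `name = call(...)` line, else None."""
    match = _ASSIGNMENT.match(source_line)
    return match.group(1) if match else None


def calling_line(depth=2):
    """Return the source line `depth` frames up, or None if unavailable."""
    context = inspect.stack()[depth].code_context
    return context[0] if context else None


def traced(func):
    """Hypothetical decorator recording input and output names of a call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Frame 0 is calling_line, frame 1 is wrapper, frame 2 is the caller.
        line = calling_line(depth=2)
        inputs = [find_input_names(arg, func.__globals__) for arg in args]
        output = parse_output_name(line) if line else None
        wrapper.last_call = (inputs, output)
        return func(*args, **kwargs)

    wrapper.last_call = None
    return wrapper
```

The sketch also shows why the approach is fragile. Two global names bound to the same dataframe are both returned by `find_input_names`, and any extra wrapper between the caller and `wrapper` adds a stack frame, so a fixed `depth` then points at the wrong source line.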
#### Conditions for use
- apply only this decorator to your function. Stacking other decorators on top breaks the detection of the output dataframe name through `inspect.stack()`.
- use only single-output transformation functions, i.e. functions that return exactly one dataframe.
### Examples
See [examples folder](examples) in the repository.