# Pandas Profiling
![Pandas Profiling Logo Header](https://pandas-profiling.ydata.ai/docs/assets/logo_header.png)
[![Build Status](https://github.com/ydataai/pandas-profiling/actions/workflows/tests.yml/badge.svg?branch=master)](https://github.com/ydataai/pandas-profiling/actions/workflows/tests.yml)
[![Code Coverage](https://codecov.io/gh/ydataai/pandas-profiling/branch/master/graph/badge.svg?token=gMptB4YUnF)](https://codecov.io/gh/ydataai/pandas-profiling)
[![Release Version](https://img.shields.io/github/release/ydataai/pandas-profiling.svg)](https://github.com/ydataai/pandas-profiling/releases)
[![Python Version](https://img.shields.io/pypi/pyversions/pandas-profiling)](https://pypi.org/project/pandas-profiling/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/python/black)
<p align="center">
<a href="https://pandas-profiling.ydata.ai/docs/master/rtd/">Documentation</a>
|
<a href="https://slack.ydata.ai">Slack</a>
|
<a href="https://stackoverflow.com/questions/tagged/pandas-profiling">Stack Overflow</a>
|
<a href="https://pandas-profiling.ydata.ai/docs/master/rtd/pages/changelog.html#changelog">Latest changelog</a>
</p>
Generates profile reports from a pandas `DataFrame`.
The pandas `df.describe()` function is great but a little basic for serious exploratory data analysis.
`pandas_profiling` extends the pandas DataFrame with `df.profile_report()` for quick data analysis.
For each column, the following statistics, where relevant for the column type, are presented in an interactive HTML report:
* **Type inference**: detect the [types](#types) of columns in a dataframe
* **Essentials**: type, unique values, missing values
* **Quantile statistics**: minimum value, Q1, median, Q3, maximum, range, interquartile range
* **Descriptive statistics**: mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
* **Most frequent values**
* **Histogram**
* **Correlations**: highlighting of highly correlated variables; Spearman, Pearson and Kendall matrices
* **Missing values**: matrix, count, heatmap and dendrogram of missing values
* **Text analysis**: learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data
* **File and Image analysis**: extract file sizes, creation dates and dimensions, and scan for truncated images or those containing EXIF information
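To make the quantile and descriptive statistics listed above concrete, here is a minimal sketch that computes several of them for a single numeric column using only the Python standard library. It is an illustration of what the numbers mean, not the package's implementation: `describe_column` is a hypothetical helper, and the choice of the sample standard deviation and of the `"inclusive"` quantile method are assumptions made for this example.

```python
import statistics


def describe_column(values):
    """Summarise one numeric column; None marks a missing value."""
    # Hypothetical helper for illustration only (not part of pandas_profiling).
    present = sorted(v for v in values if v is not None)
    # Quartiles by linear interpolation over the observed values (assumed method).
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    mean = statistics.fmean(present)
    std = statistics.stdev(present)  # sample standard deviation (assumption)
    return {
        "missing": len(values) - len(present),
        "unique": len(set(present)),
        "min": present[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": present[-1],
        "range": present[-1] - present[0],
        "iqr": q3 - q1,
        "sum": sum(present),
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        # Median absolute deviation around the median.
        "mad": statistics.median(abs(v - median) for v in present),
    }


column = [2, 4, None, 4, 5, 7, 9, 11]
summary = describe_column(column)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this toy column the sketch reports one missing value, quartiles of 4, 5 and 8, and an interquartile range of 4. The report presents figures of this kind per column, alongside the histograms, correlations and alerts that a sketch like this does not cover.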
## Announcements
**Spark backend in progress**: We are happy to announce that we are nearing v1 of the Spark backend for generating profile reports.
Beta testers wanted! The Spark backend will be released as a pre-release for this package.
**Monitoring time series?**: I'd like to draw your attention to [popmon](https://github.com/ing-bank/popmon). Whereas pandas-profiling allows you to explore patterns in a single dataset, popmon allows you to uncover temporal patterns. It's worth checking out!
---
_Contents:_ **[Examples](#examples)** |
**[Installation](#installation)** | **[Documentation](#documentation)** |
**[Large datasets](#large-datasets)** | **[Command line usage](#command-line-usage)** |
**[Advanced usage](#advanced-usage)** | **[Support](#support)** | **[Go beyond](#go-beyond)** |
**[Support the project](#supporting-open-source)** | **[Types](#types)** | **[How to contribute](#contributing)** |
**[Editor Integration](#editor-integration)** | **[Dependencies](#dependencies)**
---
## Examples
The following examples can give you an impression of what the package can do:
* [Census Income](https://pandas-profiling.ydata.ai/examples/master/census/census_report.html) (US Adult Census data relating to income)
* [NASA Meteorites](https://pandas-profiling.ydata.ai/examples/master/meteorites/meteorites_report.html) (comprehensive set of meteorite landings) [![Open In Colab](https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/github/pandas-profiling/pandas-profiling/blob/master/examples/meteorites/meteorites.ipynb) [![Binder](https://camo.githubusercontent.com/483bae47a175c24dfbfc57390edd8b6982ac5fb3/68747470733a2f2f6d7962696e6465722e6f72672f62616467655f6c6f676f2e737667)](https://mybinder.org/v2/gh/pandas-profiling/pandas-profiling/master?filepath=examples%2Fmeteorites%2Fmeteorites.ipynb)
* [Titanic](https://pandas-profiling.ydata.ai/examples/master/titanic/titanic_report.html) (the "Wonderwall" of datasets) [![Open In Colab](https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/github/pandas-profiling/pandas-profiling/blob/master/examples/titanic/titanic.ipynb) [![Binder](https://camo.githubusercontent.com/483bae47a175c24dfbfc57390edd8b6982ac5fb3/68747470733a2f2f6d7962696e6465722e6f72672f62616467655f6c6f676f2e737667)](https://mybinder.org/v2/gh/pandas-profiling/pandas-profiling/master?filepath=examples%2Ftitanic%2Ftitanic.ipynb)
* [NZA](https://pandas-profiling.ydata.ai/examples/master/nza/nza_report.html) (open data from the Dutch Healthcare Authority)
* [Stata Auto](https://pandas-profiling.ydata.ai/examples/master/stata_auto/stata_auto_report.html) (1978 Automobile data)
* [Vektis](https://pandas-profiling.ydata.ai/examples/master/vektis/vektis_report.html) (Vektis Dutch Healthcare data)
* [Colors](https://pandas-profiling.ydata.ai/examples/master/colors/colors_report.html) (a simple colors dataset)
* [UCI Bank Dataset](https://pandas-profiling.ydata.ai/examples/master/bank_marketing_data/uci_bank_marketing_report.html) (banking marketing dataset)
* [RDW](https://pandas-profiling.ydata.ai/examples/master/rdw/rdw.html) (RDW, the Dutch DMV's vehicle registration: 10 million rows, 71 features)
Specific features:
* [Russian Vocabulary](https://pandas-profiling.ydata.ai/examples/master/features/russian_vocabulary.html) (demonstrates text analysis)
* [Cats and Dogs](https://pandas-profiling.ydata.ai/examples/master/features/cats-and-dogs.html) (demonstrates image analysis from the file system)
* [Celebrity Faces](https://pandas-profiling.ydata.ai/examples/master/features/celebrity-faces.html) (demonstrates image analysis with EXIF information)
* [Website Inaccessibility](https://pandas-profiling.ydata.ai/examples/master/features/website_inaccessibility_report.html) (demonstrates URL analysis)
* [Orange prices](https://pandas-profiling.ydata.ai/examples/master/features/united_report.html) and [Coal prices](https://pandas-profiling.ydata.ai/examples/master/features/flatly_report.html) (showcases report themes)
Tutorials:
* [Tutorial: report structure using Kaggle data (advanced)](https://pandas-profiling.ydata.ai/examples/master/tutorials/modify_report_structure.ipynb) (modify the report's structure) [![Open In Colab](https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/github/pandas-profiling/pandas-profiling/blob/master/examples/tutorials/modify_report_structure.ipynb) [![Binder](https://camo.githubusercontent.com/483bae47a175c24dfbfc57390edd8b6982ac5fb3/68747470733a2f2f6d7962696e6465722e6f72672f62616467655f6c6f676f2e737667)](https://mybinder.org/v2/gh/pandas-profiling/pandas-profiling/master?filepath=examples%2Ftutorials%2Fmodify_report_structure.ipynb)
## Installation
### Using pip
[![PyPi Downloads](https://pepy.tech/badge/pandas-profiling)](https://pepy.tech/project/pandas-profiling)
[![PyPi Monthly Downloads](https://pepy.tech/badge/pandas-profiling/month)](https://pepy.tech/project/pandas-profiling/month)
[![PyPi Version](https://badge.fury.io/py/pandas-profiling.svg)](https://pypi.org/project/pandas-profiling/)