# PandasAI 🐼
[PyPI release](https://pypi.org/project/pandasai/)
[CI status](https://github.com/gventuri/pandas-ai/actions/workflows/ci.yml/badge.svg)
[CD status](https://github.com/gventuri/pandas-ai/actions/workflows/cd.yml/badge.svg)
[Coverage](https://codecov.io/gh/gventuri/pandas-ai)
[Documentation status](https://pandas-ai.readthedocs.io/en/latest/?badge=latest)
[Discord](https://discord.gg/kF7FqH2FwS)
[Downloads](https://pepy.tech/project/pandasai) [License: MIT](https://opensource.org/licenses/MIT)
[Open in Colab](https://colab.research.google.com/drive/1ZnO-njhL7TBOYPZaqvMvGtsjckZKrv2E?usp=sharing)

PandasAI is a Python library that adds Generative AI capabilities to [pandas](https://github.com/pandas-dev/pandas), the popular data analysis and manipulation tool. It is designed to be used in conjunction with pandas, and is not a replacement for it.

## 🔧 Quick install
```bash
pip install pandasai
```
## 🔍 Demo
Try out PandasAI in your browser:
[Open in Colab](https://colab.research.google.com/drive/1ZnO-njhL7TBOYPZaqvMvGtsjckZKrv2E?usp=sharing)
## 📖 Documentation
The documentation for PandasAI can be found [here](https://pandas-ai.readthedocs.io/en/latest/).
## 💻 Usage
> Disclaimer: GDP data was collected from [this source](https://ourworldindata.org/grapher/gross-domestic-product?tab=table), published by World Development Indicators - World Bank (2022.05.26) and collected at National accounts data - World Bank / OECD. It relates to the year 2020. Happiness indexes were extracted from [the World Happiness Report](https://ftnnews.com/images/stories/documents/2020/WHR20.pdf). Another useful [link](https://data.world/makeovermonday/2020w19-world-happiness-report-2020).

PandasAI is designed to be used in conjunction with pandas. It makes pandas conversational, allowing you to ask questions of your data in natural language.
### Queries
For example, you can ask PandasAI to find all the rows in a DataFrame where the value of a column is greater than 5, and it will return a DataFrame containing only those rows:
```python
import pandas as pd
from pandasai import SmartDataframe
from pandasai.llm import OpenAI

# Sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

# Instantiate an LLM
llm = OpenAI(api_token="YOUR_API_TOKEN")

# Wrap the DataFrame so it can be queried in natural language
df = SmartDataframe(df, config={"llm": llm})
df.chat('Which are the 5 happiest countries?')
```
The above code will return the following:
```
6 Canada
7 Australia
1 United Kingdom
3 Germany
0 United States
Name: country, dtype: object
```
Of course, you can also ask PandasAI to perform more complex queries. For example, you can ask PandasAI to find the sum of the GDPs of the 2 unhappiest countries:
```python
df.chat('What is the sum of the GDPs of the 2 unhappiest countries?')
```
The above code will return the following:
```
19012600725504
```
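As a cross-check, both answers above can be reproduced with plain pandas. This is only a sketch for comparison, not the code PandasAI generates (the LLM may take a different route); the variable names `happiest` and `unhappiest_gdp` are illustrative.

```python
import pandas as pd

# Same sample data as in the Queries example
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],
})

# Five happiest countries: rows with the five largest happiness_index values
happiest = df.nlargest(5, "happiness_index")["country"]

# Sum of the GDPs of the two countries with the smallest happiness_index
unhappiest_gdp = df.nsmallest(2, "happiness_index")["gdp"].sum()

print(happiest.tolist())
print(unhappiest_gdp)
```

Comparing a natural-language answer against a direct computation like this is a quick way to sanity-check a response.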
### Charts
You can also ask PandasAI to draw a graph:
```python
df.chat(
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)
```

You can save any charts generated by PandasAI by setting the `save_charts` parameter to `True` in the `PandasAI` constructor. For example, `PandasAI(llm, save_charts=True)`. Charts are saved in `./pandasai/exports/charts`.
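For orientation, a chart request like the one in this section corresponds roughly to the following hand-written matplotlib code. This is a sketch under assumptions, not PandasAI's actual output: the generated code varies between runs, and the colormap, labels, and figure size here are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Country and GDP columns from the sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
})

# One distinct color per bar, taken from the 10-color "tab10" palette
cmap = plt.get_cmap("tab10")
colors = [cmap(i) for i in range(len(df))]

fig, ax = plt.subplots(figsize=(10, 5))
bars = ax.bar(df["country"], df["gdp"], color=colors)
ax.set_xlabel("Country")
ax.set_ylabel("GDP")
ax.set_title("GDP by country")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
```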
### Multiple DataFrames
You can also pass multiple DataFrames to PandasAI and ask questions that relate them.
```python
import pandas as pd
from pandasai import SmartDatalake
from pandasai.llm import OpenAI

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}
salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)

llm = OpenAI()
dl = SmartDatalake([employees_df, salaries_df], config={"llm": llm})
dl.chat("Who gets paid the most?")
```
The above code will return the following:
```
Oh, Olivia gets paid the most.
```
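Answering this question requires relating the two DataFrames through their shared `EmployeeID` column. A plain-pandas sketch of that join follows, for comparison only; PandasAI may generate different code, and the names `merged` and `top` are illustrative.

```python
import pandas as pd

employees_df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
})
salaries_df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
})

# Join the two tables on their shared key
merged = employees_df.merge(salaries_df, on="EmployeeID")

# Select the row with the highest salary
top = merged.loc[merged["Salary"].idxmax()]
print(top["Name"], top["Salary"])
```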
You can find more examples in the [examples](examples) directory.
### ⚡️ Shortcuts
PandasAI also provides a number of shortcuts (beta) that make it easier to ask questions of your data. For example, you can ask PandasAI to `clean_data`, `impute_missing_values`, `generate_features`, `plot_histogram`, and many more.
```python
# Clean data
df.clean_data()
# Impute missing values
df.impute_missing_values()
# Generate features
df.generate_features()
# Plot histogram
df.plot_histogram(column="gdp")
```
Learn more about the shortcuts [here](https://pandas-ai.readthedocs.io/en/latest/shortcuts/).
## 🔒 Privacy & Security
To generate the Python code to run, PandasAI takes the DataFrame head, randomizes it (random generation for sensitive data, shuffling for non-sensitive data), and sends only that randomized head to the LLM.
To further protect your privacy, you can instantiate PandasAI with `enforce_privacy = True`, which sends only the column names, not the head, to the LLM.
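
The idea of randomizing the head can be illustrated with a small sketch. This is not PandasAI's implementation: the function `randomized_head`, its `sensitive_columns` argument, and the choice of random lowercase strings are all hypothetical, used only to show random generation for sensitive columns and shuffling for the rest.

```python
import random
import string

import pandas as pd


def randomized_head(df, sensitive_columns, n=5, seed=None):
    """Hypothetical helper: return a scrambled copy of df.head(n)."""
    rng = random.Random(seed)
    head = df.head(n).copy()
    for column in head.columns:
        values = head[column].tolist()
        if column in sensitive_columns:
            # Sensitive data: replace each value with a random string of equal length
            values = [
                "".join(rng.choice(string.ascii_lowercase) for _ in range(len(str(v))))
                for v in values
            ]
        else:
            # Non-sensitive data: shuffle so values no longer line up with their rows
            rng.shuffle(values)
        head[column] = values
    return head


people = pd.DataFrame({
    "email": ["a@x.io", "bb@y.org", "c@z.net"],
    "age": [31, 45, 27],
})
sample = randomized_head(people, sensitive_columns={"email"}, seed=0)
print(sample)
```

The scrambled head keeps the column names and the general shape of the values, which is what a model needs to write code, while the original sensitive values never leave the machine.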
## 🤝 Contributing
Contributions are welcome! Please check out the todos below, and feel free to open a pull request.
For more information, please see the [contributing guidelines](CONTRIBUTING.md).
After installing the virtual environment, please remember to install `pre-commit` to be compliant with our standards:
```bash
pre-commit install
```
## Contributors
[Contributors graph](https://github.com/gventuri/pandas-ai/graphs/contributors)
## 📜 License
PandasAI is licensed under the MIT License. See the LICENSE file for more details.
## Acknowledgements
- This project is based on the [pandas](https://github.com/pandas-dev/pandas) library by independent contributors, but it's in no way affiliated with the pandas project.
- This project is meant to be used as a tool for data exploration and analysis, and it's not meant to be used for production purposes. Please use it responsibly.
