# pandas-appender
[![Build Status](https://dev.azure.com/lindahl0577/pandas-appender/_apis/build/status/wumpus.pandas-appender?branchName=main)](https://dev.azure.com/lindahl0577/pandas-appender/_build/latest?definitionId=2&branchName=main) [![Coverage](https://coveralls.io/repos/github/wumpus/pandas-appender/badge.svg?branch=main)](https://coveralls.io/github/wumpus/pandas-appender?branch=main) [![Apache License 2.0](https://img.shields.io/github/license/wumpus/pandas-appender.svg)](LICENSE)

Have you ever wanted to append a bunch of rows to a Pandas DataFrame? It turns out that
appending row by row is extremely inefficient for a large DataFrame; you're supposed to
build multiple DataFrames and `pd.concat` them instead.

So... helper function? Pandas doesn't seem to have one. Roll your own?

OK then. Here's that helper function. It can append around 1 million
very small rows per CPU-second, and has a modest additional memory
usage of around 5 megabytes, which grows dynamically with the number of
rows appended.
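
For reference, the chunk-and-concat approach mentioned above can be sketched in plain pandas. This is only an illustration of the general technique, not pandas-appender's actual implementation; the function name `append_in_chunks` and the `chunk_size` parameter are made up for this example.

```python
import pandas as pd


def append_in_chunks(rows, chunk_size=10_000):
    """Build a DataFrame from an iterable of dict rows.

    Hypothetical helper: rows are buffered in a list, turned into a
    DataFrame once per chunk, and all chunks are concatenated once at the end.
    """
    chunks = []  # finished DataFrames, one per full buffer
    buffer = []  # rows waiting to become the next chunk
    for row in rows:
        buffer.append(row)
        if len(buffer) >= chunk_size:
            chunks.append(pd.DataFrame(buffer))
            buffer = []
    if buffer:  # flush the final partial chunk
        chunks.append(pd.DataFrame(buffer))
    if not chunks:
        return pd.DataFrame()
    return pd.concat(chunks, ignore_index=True)


df = append_in_chunks({'i': i} for i in range(25_000))
```

The single `pd.concat` at the end avoids copying the whole frame every time a row is added, which is what makes repeated row-by-row appends slow.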
## Install
`pip install pandas-appender`
## Usage
```
from pandas_appender import DF_Appender

dfa = DF_Appender(ignore_index=True)  # note that ignore_index moves to the init
for i in range(1_000_000):
    dfa = dfa.append({'i': i})

df = dfa.finalize()
```
## Type hints and category detection
Using narrower types and categories can often dramatically reduce the size of a
DataFrame. There are two ways to do this in pandas-appender. One is to
append to an existing dataframe:
```
dfa = DF_Appender(df, ignore_index=True)
```
and the second is to pass in a `dtypes=` argument:
```
dfa = DF_Appender(ignore_index=True, dtypes=another_dataframe.dtypes)
```
pandas-appender also offers a way to infer which columns would be smaller
if they were categories. This code will either analyze an existing dataframe
that you're appending to:
```
dfa = DF_Appender(df, ignore_index=True, infer_categories=True)
```
or it will analyze the first chunk of appended lines:
```
dfa = DF_Appender(ignore_index=True, infer_categories=True)
```
These inferred categories will override existing types or a `dtypes=` argument.
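
To see why narrower types and categories matter, here is a small sketch using plain pandas only (it does not call pandas-appender). The column names and sizes are invented for the example, and the actual savings depend on your data.

```python
import pandas as pd

# A frame with a default int64 column and a string column
# that has only four distinct values.
df = pd.DataFrame({
    'i': range(100_000),
    'color': ['red', 'green', 'blue', 'yellow'] * 25_000,
})
before = df.memory_usage(deep=True).sum()

# Narrow the integer column and store the repeated strings as a category.
small = df.astype({'i': 'int32', 'color': 'category'})
after = small.memory_usage(deep=True).sum()

print(f'before: {before} bytes, after: {after} bytes')
```

A column with few distinct values relative to its length is the typical case where a category is smaller than the original column.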
## Incompatibilities with pandas.DataFrame.append()
### pandas.DataFrame.append is idempotent, DF_Appender is not
* Pandas: `df_new = df.append() # df is not changed`
* DF_Appender: `dfa_new = dfa.append() # modifies dfa, and dfa_new == dfa`
### pandas.DataFrame.append will promote types, while DF_Appender is strict
* Pandas: append `0.1` to an integer column, and the column will be promoted to float
* DF_Appender: when initialized with `dtypes=` or an existing DataFrame, appending
`0.1` to an integer column causes `0.1` to be cast to an integer, i.e. `0`.
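
The difference between promoting and casting can be illustrated with plain pandas. This sketch uses `pd.concat` in place of `DataFrame.append` (which was removed in pandas 2.0) to show promotion, and an explicit `astype` to mimic the strict cast described above; it does not call pandas-appender itself.

```python
import pandas as pd

ints = pd.DataFrame({'x': [1, 2]})   # integer column
extra = pd.DataFrame({'x': [0.1]})   # float column

# Promotion: combining int and float data yields a float column.
promoted = pd.concat([ints, extra], ignore_index=True)
print(promoted['x'].dtype)

# Strict: cast the new rows to the existing dtypes first, so 0.1 becomes 0.
strict = pd.concat(
    [ints, extra.astype(ints.dtypes.to_dict())],
    ignore_index=True,
)
print(strict['x'].tolist())
```

If silently truncating `0.1` to `0` would be a problem for your data, check the values before appending or choose a float dtype up front.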