pandasql3-0.7.3.tar.gz resource - CSDN Library

Points required: 1 · 71 views · Uploaded 2024-03-11 16:20:11 · 27 KB · GZ

22 files in total

txt: 6

py: 6

csv: 3

pandasql3-0.7.3.tar.gz (22 files)

pandasql3-0.7.3/
  pandasql/
    __init__.py  438B
    data/
      births_by_month.csv  11KB
      meat.csv  62KB
      births.csv  1KB
    tests/
      __init__.py  0B
      test_utils.py  1KB
      test_pandasql.py  7KB
    sqldf.py  6KB
  setup.py  606B
  LICENSE.txt  1KB
  README.rst  3KB
  PKG-INFO  4KB
  CHANGES.txt  892B
  pandasql3.egg-info/
    SOURCES.txt  449B
    top_level.txt  9B
    PKG-INFO  4KB
    requires.txt  24B
    dependency_links.txt  1B
  MANIFEST.in  64B
  setup.cfg  42B
  README.md  2KB
  AUTHORS.md  121B

pandasql3
========

`pandasql3` allows you to query `pandas` DataFrames using SQL syntax. It works similarly to `sqldf` in R. `pandasql3` seeks to provide a more familiar way of manipulating and cleaning data for people new to Python or `pandas`.

#### Installation

```
$ pip install -U pandasql3
```

#### Basics

The main function used in pandasql3 is `sqldf`. `sqldf` accepts 2 parameters:

- a SQL query string
- a set of session/environment variables (`locals()` or `globals()`)

Specifying `locals()` or `globals()` on every call can get tedious. You can define a short helper function to avoid this.

    from pandasql3 import sqldf
    pysqldf = lambda q: sqldf(q, globals())

Note that the archive listed above ships its code in a `pandasql` directory, so if `import pandasql3` fails, try `from pandasql import sqldf` instead.

#### Querying

`pandasql3` uses [SQLite syntax](http://www.sqlite.org/lang.html). Any `pandas` DataFrames in the supplied environment will be automatically detected by `pandasql3`. You can query them as you would any regular SQL table.

```
$ python
>>> from pandasql3 import sqldf, load_meat, load_births
>>> pysqldf = lambda q: sqldf(q, globals())
>>> meat = load_meat()
>>> births = load_births()
>>> print(pysqldf("SELECT * FROM meat LIMIT 10;").head())
                  date  beef  veal  pork  lamb_and_mutton broilers other_chicken turkey
0  1944-01-01 00:00:00   751    85  1280               89     None          None   None
1  1944-02-01 00:00:00   713    77  1169               72     None          None   None
2  1944-03-01 00:00:00   741    90  1128               75     None          None   None
3  1944-04-01 00:00:00   650    89   978               66     None          None   None
4  1944-05-01 00:00:00   681   106  1029               78     None          None   None
```

Joins and aggregations are also supported:

```
>>> q = """SELECT m.date, m.beef, b.births
...        FROM meat m
...        INNER JOIN births b
...            ON m.date = b.date;"""
>>> joined = pysqldf(q)
>>> print(joined.tail())
                    date    beef  births
403  2012-07-01 00:00:00  2200.8  368450
404  2012-08-01 00:00:00  2367.5  359554
405  2012-09-01 00:00:00  2016.0  361922
406  2012-10-01 00:00:00  2343.7  347625
407  2012-11-01 00:00:00  2206.6  320195

>>> q = "SELECT strftime('%Y', date) AS year, SUM(beef) AS beef_total FROM meat GROUP BY year;"
>>> print(pysqldf(q).head())
   year  beef_total
0  1944        8801
1  1945        9936
2  1946        9010
3  1947       10096
4  1948        8766
```
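
Because queries are written in SQLite syntax, it can help to try one directly against SQLite before passing it to `sqldf`. The following is a minimal sketch that uses only Python's standard `sqlite3` module and the five 1944 rows from the sample output above. The table definition (`date TEXT, beef INTEGER`) is an assumption made for illustration and says nothing about how `pandasql3` itself stores a DataFrame.

```python
import sqlite3

# Date and beef columns of the five 1944 rows shown in the sample output.
rows = [
    ("1944-01-01 00:00:00", 751),
    ("1944-02-01 00:00:00", 713),
    ("1944-03-01 00:00:00", 741),
    ("1944-04-01 00:00:00", 650),
    ("1944-05-01 00:00:00", 681),
]

# Assumed schema for illustration only: an in-memory table named "meat".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meat (date TEXT, beef INTEGER)")
conn.executemany("INSERT INTO meat VALUES (?, ?)", rows)

# Same aggregation as in the README: SQLite's strftime extracts the year
# from the timestamp text, and GROUP BY may refer to the column alias.
query = """
SELECT strftime('%Y', date) AS year, SUM(beef) AS beef_total
FROM meat
GROUP BY year;
"""
result = conn.execute(query).fetchall()
conn.close()

print(result)
```

Only five months of 1944 are loaded here, so the total for that year is a partial sum and will not match the full-year figure in the README output.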
