# impyla
Python client for distributed query engines that implement the HiveServer2
protocol (e.g., Impala, Hive).

For higher-level Impala functionality, including a pandas-like interface over
distributed data sets, see the [Ibis project][ibis].
### Features
* HiveServer2 compliant; works with Impala and Hive, including nested data
* Fully [DB API 2.0 (PEP 249)][pep249]-compliant Python client (similar to
sqlite or MySQL clients) supporting Python 2.6+ and Python 3.3+.
* Works with Kerberos, LDAP, SSL
* [SQLAlchemy][sqlalchemy] connector
* Converter to [pandas][pandas] `DataFrame`, allowing easy integration into the
Python data stack (including [scikit-learn][sklearn] and
[matplotlib][matplotlib]); but see the [Ibis project][ibis] for a richer
experience
### Dependencies
Required:

* Python 2.6+ or 3.3+
* `six`, `bit_array`
* `thrift` (on Python 2.x) or `thriftpy` (on Python 3.x)

For Hive and/or Kerberos support:

* `thrift_sasl`
* `python-sasl` (Python 3.x support requires the
  [cloudera/python-sasl@cython][python-sasl-cython] branch)

Optional:

* `pandas` for conversion to `DataFrame` objects; but see the [Ibis project][ibis] instead
* `sqlalchemy` for the SQLAlchemy engine
* `pytest` for running tests; `unittest2` for testing on Python 2.6
### Installation
Install the latest release (`0.13.1`) with `pip`:
```bash
pip install impyla
```
For the latest (dev) version, install directly from the repo:
```bash
pip install git+https://github.com/cloudera/impyla.git
```
or clone the repo:
```bash
git clone https://github.com/cloudera/impyla.git
cd impyla
python setup.py install
```
#### Running the tests
impyla uses the [pytest][pytest] toolchain, and depends on the following
environment variables:
```bash
export IMPYLA_TEST_HOST=your.impalad.com
export IMPYLA_TEST_PORT=21050
export IMPYLA_TEST_AUTH_MECH=NOSASL
```
To run the full set of tests, run:
```bash
cd path/to/impyla
py.test --connect impyla
```
Leave out the `--connect` option to skip tests for DB API compliance.
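The environment variables above can also be read from Python, for instance for a quick connectivity check before running the suite. The following is a minimal sketch, not part of impyla: the helper names `connection_kwargs_from_env` and `check_connection` are hypothetical, and the fallback values for the port and authentication mechanism are assumptions copied from the example values above.

```python
import os


def connection_kwargs_from_env(environ=None):
    """Build keyword arguments for impala.dbapi.connect from IMPYLA_TEST_*."""
    environ = os.environ if environ is None else environ
    return {
        'host': environ['IMPYLA_TEST_HOST'],
        # Environment values are strings; the port must be an integer.
        # The fallbacks below are assumptions, mirroring the README example.
        'port': int(environ.get('IMPYLA_TEST_PORT', 21050)),
        'auth_mechanism': environ.get('IMPYLA_TEST_AUTH_MECH', 'NOSASL'),
    }


def check_connection(environ=None):
    """Open a connection and run a trivial query (needs a reachable server)."""
    # Imported lazily so the helper above works without impyla installed.
    from impala.dbapi import connect
    conn = connect(**connection_kwargs_from_env(environ))
    try:
        cursor = conn.cursor()
        cursor.execute('SELECT 1')
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling `check_connection()` with the variables exported should return a single row if the server is reachable with those settings.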
### Usage
Impyla implements the [Python DB API v2.0 (PEP 249)][pep249] database interface
(refer to it for API details):
```python
from impala.dbapi import connect
conn = connect(host='my.host.com', port=21050)
cursor = conn.cursor()
cursor.execute('SELECT * FROM mytable LIMIT 100')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
```
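Since `cursor.description` follows PEP 249, its first field per column is the column name, which makes it easy to pair names with values. The sketch below illustrates this with a hypothetical helper, `rows_as_dicts`; it uses the standard library's `sqlite3` as a stand-in so it runs without a cluster, on the assumption that an impyla cursor can be passed in the same way.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Pair each fetched row with the column names from cursor.description."""
    # PEP 249: description is a sequence of 7-item sequences; item 0 is the name.
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# Stand-in database so the sketch runs without an Impala or Hive server.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE mytable (id INTEGER, name TEXT)')
cursor.executemany('INSERT INTO mytable VALUES (?, ?)', [(1, 'a'), (2, 'b')])
cursor.execute('SELECT * FROM mytable ORDER BY id LIMIT 100')
records = rows_as_dicts(cursor)
print(records)
```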
The `Cursor` object also exposes the iterator interface, which is buffered
(controlled by `cursor.arraysize`):
```python
cursor.execute('SELECT * FROM mytable LIMIT 100')
for row in cursor:
    process(row)
```
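To make the idea of buffered iteration concrete, here is a generic PEP 249 sketch that fetches rows in batches of `cursor.arraysize` and yields them one at a time. It is not impyla's internal implementation; `iter_rows` is a hypothetical helper, and `sqlite3` again stands in for a real server.

```python
import sqlite3


def iter_rows(cursor, batch_size=None):
    """Yield rows one at a time while fetching them from the cursor in batches."""
    size = batch_size or cursor.arraysize
    while True:
        batch = cursor.fetchmany(size)
        if not batch:
            # An empty batch means the result set is exhausted.
            break
        for row in batch:
            yield row


# Stand-in database so the sketch runs without an Impala or Hive server.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE mytable (n INTEGER)')
cursor.executemany('INSERT INTO mytable VALUES (?)', [(i,) for i in range(5)])
cursor.execute('SELECT n FROM mytable ORDER BY n')
rows = list(iter_rows(cursor, batch_size=2))
```

A larger batch size means fewer round trips per row at the cost of more memory held on the client.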
You can also get back a pandas `DataFrame` object:
```python
from impala.util import as_pandas
df = as_pandas(cursor)
# carry df through scikit-learn, for example
```
[pep249]: http://legacy.python.org/dev/peps/pep-0249/
[pandas]: http://pandas.pydata.org/
[sklearn]: http://scikit-learn.org/
[matplotlib]: http://matplotlib.org/
[madlib]: http://madlib.net/
[madlibport]: https://github.com/bitfort/madlibport
[numba]: http://numba.pydata.org/
[llvm]: http://llvm.org/
[pytest]: http://pytest.org/latest/
[sqlalchemy]: http://www.sqlalchemy.org/
[ibis]: http://www.ibis-project.org/
[python-sasl-cython]: https://github.com/laserson/python-sasl/tree/cython/sasl