Python wrapper for TreeTagger
=============================
.. note::

   TreeTagger is a language independent part-of-speech tagger.

:author: Laurent Pointal <laurent.pointal@limsi.fr> <laurent.pointal@laposte.net>
:organization: CNRS - LIMSI
:copyright: CNRS - 2004-2018
:license: GNU-GPL Version 3 or greater
:version: 2.2.5

- `Module documentation <http://treetaggerwrapper.readthedocs.org/>`_
  (on Read The Docs).
- `Subversion repository & bug tracking <https://sourcesup.renater.fr/scm/viewvc.php?root=ttpw>`_
  (on the French academic SourceSup site).
- `Developer page <https://perso.limsi.fr/pointal/dev:treetaggerwrapper>`_

What is it?
-----------
This module wraps Helmut Schmid's language independent statistical
part-of-speech tagger in a Python class. It lets you tag several texts
one after the other while keeping the connection to the tagger process
open, which speeds up processing. It also removes the dependency on
external Perl scripts for chunking.

Because taggers are objects, you can run several of them simultaneously,
possibly with different languages.
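
The following is a minimal usage sketch, not taken from the original text. It assumes the ``TreeTagger`` class, its ``TAGLANG`` option, the ``tag_text`` method and the ``make_tags`` helper as described in the module documentation. The ``tag_in_languages`` function name is hypothetical, and the sketch only works once TreeTagger and the matching parameter files are installed.

```python
def tag_in_languages(texts_by_lang):
    """Tag one text per language, using one tagger object per language.

    ``texts_by_lang`` maps a language code (for example "en" or "fr")
    to the text to tag. Returns a dict mapping each language code to
    the parsed tags for its text.
    """
    # Imported lazily so this sketch can be loaded even when the
    # module or TreeTagger itself is not installed yet.
    import treetaggerwrapper

    # One tagger object per language; each one is independent.
    taggers = {
        lang: treetaggerwrapper.TreeTagger(TAGLANG=lang)
        for lang in texts_by_lang
    }

    results = {}
    for lang, text in texts_by_lang.items():
        raw_tags = taggers[lang].tag_text(text)
        # make_tags turns the raw tagger output lines into tag objects.
        results[lang] = treetaggerwrapper.make_tags(raw_tags)
    return results
```

A call such as ``tag_in_languages({"en": "This is a short text.", "fr": "Voici un petit texte."})`` would then return the tags for both texts, each produced by its own tagger object.
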
Chunking is supported for:

- English
- French
- German
- Spanish

Tagging is supported for every language that TreeTagger supports, but for
the other languages you have to do the chunking on your own and, if
necessary, specify the parameter files via options.

This version has been reworked to run with both Python 2 and Python 3
(thanks to ``six``), and has been overhauled throughout, with bugs fixed.
Installation
------------
Unless someone has built a package for your OS distribution, the simplest
procedure is to use ``pip`` to install the module::

    pip install treetaggerwrapper

If you do not have admin rights to install things on your computer, you can
create a virtualenv and run pip inside it, or you can do a local user
installation::

    pip install --user treetaggerwrapper

You may need to use ``pip3`` to target your Python 3 installation.

You also need to install TreeTagger…
TreeTagger
----------
TreeTagger itself is freely available for research, education and evaluation.
See `TreeTagger page`_.
.. _TreeTagger page: http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html

TreeTagger provides a script-based installation procedure: download the needed
files, including the installation script, into the directory where you want to
install TreeTagger, then run the script to unzip the files and install them in
the right directories under the right names.

For *Windows* users, there is a downloadable Windows binary, but no install
script. You have to download the TreeTagger parameter files (since TreeTagger
moved to UTF-8 they are the same on Linux and Windows), unzip them and install
them in the right place (``lib/``) with the right names. You can find these
file names in the ``g_langsupport`` global dictionary of
``treetaggerwrapper.py``, under the keys ``tagparfile`` and ``abbrevfile``.
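
As a rough illustration of the check described above, the sketch below lists which expected files are missing from a TreeTagger ``lib/`` directory. The ``missing_parameter_files`` helper is hypothetical. The layout it assumes for a language entry (a mapping with ``tagparfile`` and ``abbrevfile`` keys) is inferred from the description above and has not been verified against the module source.

```python
import os


def missing_parameter_files(tagdir, lang_entry):
    """Return the expected file names that are absent from ``tagdir/lib``.

    ``lang_entry`` is assumed to be a mapping with optional
    ``tagparfile`` and ``abbrevfile`` keys, in the way one entry of
    ``g_langsupport`` is described.
    """
    libdir = os.path.join(tagdir, "lib")
    missing = []
    for key in ("tagparfile", "abbrevfile"):
        filename = lang_entry.get(key)
        # An entry may not define every key; skip empty values.
        if filename and not os.path.isfile(os.path.join(libdir, filename)):
            missing.append(filename)
    return missing
```

You could pass it your installation directory together with one entry of ``g_langsupport``. How that dictionary is keyed (for example by language code) is an assumption you should check in ``treetaggerwrapper.py``.
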

If you install TreeTagger in a common location, ``treetaggerwrapper`` normally
detects it automatically. But if you install it in a special place or under a
special name, you will have to provide this installation directory to the
module (see ``TAGDIR`` in the doc).
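
When autodetection does not apply, one way to provide the location is the ``TAGDIR`` option of the ``TreeTagger`` class. The sketch below is only an illustration: the ``resolve_tagdir`` and ``build_tagger`` helpers are hypothetical, and the fallback to a ``TAGDIR`` environment variable is implemented by the helper itself rather than describing the module's internals.

```python
import os


def resolve_tagdir(explicit=None, environ=None):
    """Pick the TreeTagger directory: explicit argument, else $TAGDIR.

    Returns None when neither is set, so the caller can fall back to
    the autodetection done by treetaggerwrapper.
    """
    environ = os.environ if environ is None else environ
    tagdir = explicit or environ.get("TAGDIR")
    if not tagdir:
        return None
    tagdir = os.path.abspath(os.path.expanduser(tagdir))
    if not os.path.isdir(tagdir):
        raise FileNotFoundError("TreeTagger directory not found: " + tagdir)
    return tagdir


def build_tagger(lang="en", tagdir=None):
    """Create a tagger, passing TAGDIR only when a directory was resolved."""
    import treetaggerwrapper  # lazy import: requires the module to be installed

    resolved = resolve_tagdir(tagdir)
    if resolved is None:
        return treetaggerwrapper.TreeTagger(TAGLANG=lang)
    return treetaggerwrapper.TreeTagger(TAGLANG=lang, TAGDIR=resolved)
```

Validating the directory before creating the tagger gives an early, explicit error for a mistyped path.
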