Introduction to Python for Econometrics, Statistics and Data Analysis
Kevin Sheppard
University of Oxford
Tuesday 5th August, 2014
©2012, 2013, 2014 Kevin Sheppard
Changes since the Second Edition
Version 2.2.1 (August 2014)
• Fixed typos reported by a reader – thanks to Ilya Sorvachev.
Version 2.2 (July 2014)
• Code verified against Anaconda 2.0.1.
• Added diagnostic tools and a simple method to use external code in the Cython section.
• Updated the Numba section to reflect recent changes.
• Fixed some typos in the chapter on Performance and Optimization.
• Added examples of joblib and IPython’s cluster to the chapter on running code in parallel.
Version 2.1 (February 2014)
• New chapter introducing object-oriented programming as a method to provide structure and
organization to related code.
• Added seaborn to the recommended package list, and included it by default in the graphics
chapter.
• Based on experience teaching Python to economics students, the recommended installation has
been simplified by removing the suggestion to use virtual environments. The discussion of virtual
environments has been moved to the appendix.
• Rewrote parts of the pandas chapter.
• Code verified against Anaconda 1.9.1.
Version 2.02 (November 2013)
• Changed the Anaconda install to use both create and install, which shows how to install additional
packages.
• Fixed some missing packages in the direct install.
• Changed the configuration of IPython to reflect best practices.
• Added subsection covering IPython profiles.
Version 2.01 (October 2013)
• Updated Anaconda to 1.8 and added some additional packages to the installation for Spyder.
• Small section about Spyder as a good starting IDE.
Notes to the 2nd Edition
This edition includes the following changes from the first edition (March 2012):
• The preferred installation method is now Continuum Analytics’ Anaconda. Anaconda is a complete
scientific stack and is available for all major platforms.
• New chapter on pandas. pandas provides a simple but powerful tool to manage data and perform
basic analysis. It also greatly simplifies importing and exporting data.
• New chapter on advanced selection of elements from an array.
• Numba provides just-in-time compilation for numeric Python code, which often produces large
performance gains when pure NumPy solutions are not available (e.g. looping code).
• Dictionary, set and tuple comprehensions
• Fixed numerous typos.
• All code has been verified working against Anaconda 1.7.0.