# Pandas Bokeh
**Pandas Bokeh** provides a [Bokeh](https://bokeh.pydata.org/en/latest/) plotting backend for [Pandas](https://pandas.pydata.org/) and [GeoPandas](http://geopandas.org/), similar to the already existing [Visualization](https://pandas.pydata.org/pandas-docs/stable/visualization.html) feature of Pandas. Importing the library adds a complementary plotting method ***plot_bokeh()*** on **DataFrames** and **Series** (and also on **GeoDataFrames**).
With **Pandas Bokeh**, creating stunning, interactive, HTML-based visualizations is as easy as calling:
```python
df.plot_bokeh()
```
For more information, have a look at the [Examples](#Examples) below or at the notebooks in the [GitHub Repository](https://github.com/PatrikHlobil/Pandas-Bokeh/tree/master/Documentation) of this project.
![Startimage](Documentation/Images/Startimage.gif)
<br>
## Installation
You can install **Pandas Bokeh** from [PyPI](TODO) via *pip*:

    pip install pandas-bokeh

**Pandas Bokeh** is supported on Python 2.7, as well as Python 3.6 and above.
<br>
## How To Use
<p id="Basics"> </p>
The **Pandas-Bokeh** library should be imported after **Pandas**. After the import, one should define the plotting output, which can be:
* **pandas_bokeh.output_notebook()**: Embeds the plots in the cell outputs of the notebook. Ideal when working in Jupyter Notebooks.
* **pandas_bokeh.output_file(filename)**: Exports the plot to the provided filename as an HTML file.
For more details about the plotting outputs, see the reference [here](#Layouts) or the [Bokeh documentation](https://bokeh.pydata.org/en/latest/docs/user_guide/quickstart.html#getting-started).
### Notebook output (see also [bokeh.io.output_notebook](https://bokeh.pydata.org/en/latest/docs/reference/io.html#bokeh.io.output_notebook))
```python
import pandas as pd
import pandas_bokeh
pandas_bokeh.output_notebook()
```
<p id="output_file"> </p>
### File output to "Interactive Plot.html" (see also [bokeh.io.output_file](https://bokeh.pydata.org/en/latest/docs/reference/io.html#bokeh.io.output_file))
```python
import pandas as pd
import pandas_bokeh
pandas_bokeh.output_file("Interactive Plot.html")
```
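If the same script is run both inside and outside Jupyter, the two output modes above can be selected automatically. The following is only a sketch: `in_notebook()` and `configure_output()` are hypothetical helpers that are not part of **Pandas Bokeh**, and the check on the IPython shell class name is a common heuristic rather than a guaranteed test.

```python
def in_notebook():
    """Heuristically detect a Jupyter kernel (hypothetical helper)."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a notebook.
        return False
    shell = get_ipython()
    # Jupyter kernels use ZMQInteractiveShell; a plain terminal does not.
    return shell is not None and shell.__class__.__name__ == "ZMQInteractiveShell"


def configure_output(filename="Interactive Plot.html"):
    """Pick notebook output inside Jupyter, file output otherwise."""
    import pandas_bokeh

    if in_notebook():
        pandas_bokeh.output_notebook()
        return "notebook"
    pandas_bokeh.output_file(filename)
    return "file"
```

Calling `configure_output()` once, right after the imports, would then replace the explicit `output_notebook()` or `output_file()` call.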
<br>
<p id="Examples"></p>
## Lineplot
### Basic Lineplot
This simple **lineplot** already contains various interactive elements:
* a pannable and zoomable plot (zoom in the plot area or on an axis)
* a legend whose entries can be clicked to hide and show the individual lines
* a Hovertool for the plotted lines
**Note**: If the **x** parameter is not specified, the index is used for the x-values of the plot.
```python
import numpy as np
np.random.seed(42)
df = pd.DataFrame({"Google": np.random.randn(1000)+0.2,
                   "Apple": np.random.randn(1000)+0.17},
                  index=pd.date_range('1/1/2000', periods=1000))
df = df.cumsum()
df = df + 50
df.plot_bokeh(kind="line")
```
![ApplevsGoogle_1](Documentation/Images/ApplevsGoogle_1.gif)
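As noted above, the index supplies the x-values when **x** is not given. To plot against a regular column instead, the index can be turned into a column first. This is a minimal sketch: `make_prices()` and `with_index_as_column()` are hypothetical helpers, the column name `"Date"` is an arbitrary choice, and a local `RandomState` is used instead of the global seed, so the numbers differ from the example above.

```python
import numpy as np
import pandas as pd


def make_prices(periods=1000, seed=42):
    """Build a random-walk price table similar to the example above."""
    rng = np.random.RandomState(seed)
    df = pd.DataFrame(
        {"Google": rng.randn(periods) + 0.2,
         "Apple": rng.randn(periods) + 0.17},
        index=pd.date_range("1/1/2000", periods=periods),
    )
    return df.cumsum() + 50


def with_index_as_column(df, name="Date"):
    """Move the index into a regular column called `name`."""
    return df.rename_axis(name).reset_index()


def lineplot_with_explicit_x(df, name="Date"):
    """Plot all remaining columns against the column `name`."""
    import pandas_bokeh  # noqa: F401  (adds DataFrame.plot_bokeh)

    return with_index_as_column(df, name).plot_bokeh(kind="line", x=name)
```

`lineplot_with_explicit_x(make_prices())` is expected to draw the same kind of plot as the index-based call, with the x-values read from the `"Date"` column.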
#### Advanced Lineplot
There are various optional parameters to tune the plots, for example:
* **kind**: Which kind of plot should be produced. Currently supported are: *"line", "point", "scatter", "bar"* and *"histogram"*. More kinds, such as horizontal barplots, boxplots and pie charts, are planned.
* **figsize**: Choose width & height of the plot
* **title**: Sets title of the plot
* **xlim**/**ylim**: Set the visible range of the plot for the x- and y-axis (also works for a *datetime x-axis*)
* **xlabel**/**ylabel**: Set x- and y-labels
* **logx**/**logy**: Set log-scale on x-/y-axis
* **xticks**/**yticks**: Explicitly set the ticks on the axes
* **color**: Defines a single color for a plot.
* **colormap**: Defines the colors to plot. Can be either a list of colors or the name of a [Bokeh color palette](https://bokeh.pydata.org/en/latest/docs/reference/palettes.html)
* **hovertool**: If True, a Hovertool is shown; if False, no Hovertool is drawn.
* **\*\*kwargs**: Optional keyword arguments of [bokeh.plotting.figure.line](https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.line)
Try them out to get a feeling for the effects. Let us consider now:
```python
df.plot_bokeh(
    kind="line",
    figsize=(800, 450),
    title="Apple vs Google",
    xlabel="Date",
    ylabel="Stock price [$]",
    yticks=[0, 100, 200, 300, 400],
    ylim=(0, 400),
    colormap=["red", "blue"])
```
![ApplevsGoogle_2](Documentation/Images/ApplevsGoogle_2.png)
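When several plots should share the same look, the options from the list above can be kept in one dictionary and overridden per plot. The sketch below assumes nothing beyond the documented parameters; `COMMON_STYLE`, `merge_options()` and `styled_lineplot()` are hypothetical names introduced for illustration.

```python
# Shared options; every key is a parameter from the list above.
COMMON_STYLE = {
    "figsize": (800, 450),
    "xlabel": "Date",
    "ylabel": "Stock price [$]",
    "colormap": ["red", "blue"],
}


def merge_options(**overrides):
    """Return a copy of COMMON_STYLE with per-plot overrides applied."""
    options = dict(COMMON_STYLE)
    options.update(overrides)
    return options


def styled_lineplot(df, **overrides):
    """Draw a lineplot using the shared style plus any overrides."""
    import pandas_bokeh  # noqa: F401  (adds DataFrame.plot_bokeh)

    return df.plot_bokeh(kind="line", **merge_options(**overrides))
```

For example, `styled_lineplot(df, title="Apple vs Google", logy=True, hovertool=False)` would reuse the shared size, labels and colors while switching to a logarithmic y-axis without a Hovertool.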
#### Lineplot with data points
For **lineplots**, as for many other plot-kinds, there are some special keyword arguments that only work for this plotting type. For lineplots, these are:
* **plot_data_points**: Also plot the data points on the lines
* **plot_data_points_size**: Determines the size of the data points
* **marker**: Defines the point type *(Default: "circle")*. Possible values are: 'circle', 'square', 'triangle', 'asterisk', 'circle_x', 'square_x', 'inverted_triangle', 'x', 'circle_cross', 'square_cross', 'diamond', 'cross'
* **\*\*kwargs**: Optional keyword arguments of [bokeh.plotting.figure.line](https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.line)
Let us use this information to have another version of the same plot:
```python
df.plot_bokeh(
    kind="line",
    figsize=(800, 450),
    title="Apple vs Google",
    xlabel="Date",
    ylabel="Stock price [$]",
    yticks=[0, 100, 200, 300, 400],
    ylim=(100, 200),
    xlim=("2001-01-01", "2001-02-01"),
    colormap=["red", "blue"],
    plot_data_points=True,
    plot_data_points_size=10,
    marker="asterisk",
    toolbar_location="right"
)
```
![ApplevsGoogle_3](Documentation/Images/ApplevsGoogle_3.png)
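When zooming into a date window with **xlim**, it helps to know which values actually occur in that window before choosing **ylim**. The following sketch uses plain Pandas label slicing on a datetime index; `value_range_in_window()` is a hypothetical helper, not part of **Pandas Bokeh**.

```python
import pandas as pd


def value_range_in_window(df, start, end):
    """Return (min, max) over all columns for rows between start and end.

    Label slicing on a DatetimeIndex includes both endpoints.
    """
    window = df.loc[start:end]
    return float(window.min().min()), float(window.max().max())
```

With the DataFrame from the example, `value_range_in_window(df, "2001-01-01", "2001-02-01")` gives the smallest and largest price in the displayed month, which can then be rounded outwards to pick a suitable **ylim**.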
<br>
## Pointplot
If you just wish to draw the data points for curves, the **pointplot** option is the right choice. It also accepts the **kwargs** of [bokeh.plotting.figure.scatter](https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.scatter) like *marker* or *size*:
```python
import numpy as np
x = np.arange(-3, 3, 0.1)
y2 = x**2
y3 = x**3
df = pd.DataFrame({"x": x, "Parabola": y2, "Cube": y3})
df.plot_bokeh(
    kind="point",
    x="x",
    xticks=range(-3, 4),
    size=5,
    colormap=["#009933", "#ff3399"],
    title="Pointplot (Parabola vs. Cube)",
    marker="x")
```
![Pointplot](Documentation/Images/Pointplot.gif)
<br>
## Scatterplot
A basic **scatterplot** can be created using the *kind="scatter"* option. For **scatterplots**, the **x** and **y** parameters have to be specified and the following optional keyword arguments are allowed:
* **category**: Determines the category column to use for coloring the scatter points
* **\*\*kwargs**: Optional keyword arguments of [bokeh.plotting.figure.scatter](https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.scatter)
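The **category** argument expects a column whose values label each row. The sketch below builds such a column from integer codes on a small hand-made table; the data, the column names and both helper functions are made up for illustration only.

```python
import pandas as pd


def make_labelled_points():
    """Create a tiny table with a text category column."""
    df = pd.DataFrame({
        "length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
        "width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "label_id": [0, 0, 1, 1, 2, 2],
    })
    names = ["small", "medium", "large"]
    # Translate the integer codes into readable category names.
    df["size_class"] = df["label_id"].map(dict(zip(range(3), names)))
    return df


def scatter_by_category(df):
    """Color the scatter points by the values of the category column."""
    import pandas_bokeh  # noqa: F401  (adds DataFrame.plot_bokeh)

    return df.plot_bokeh(kind="scatter", x="length", y="width",
                         category="size_class")
```

`scatter_by_category(make_labelled_points())` should then draw one color per distinct value of `size_class`.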
Note that the **pandas.DataFrame.plot_bokeh()** method returns a Bokeh figure by default, which can be embedded in Dashboard layouts with other figures and **Bokeh** objects (for more details about (sub)plot layouts and embedding the resulting Bokeh plots as HTML, click [here](#Layouts)).
In the example below, we use the built-in *grid layout* support of **Pandas Bokeh** to display both the DataFrame (embedded in a *Div*) and the resulting **scatterplot**:
```python
#Load Iris Dataset from Scikit Learn:
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(iris["data"])
df.columns = iris["feature_names"]
df["species"] = iris["target"]
df["species"] = df["species"].map(dict(zip(range(3), iris["target_names"])))
df = df.sample(frac=1)
#Create Div with DataFrame:
from bokeh.models import Div
div_df = Div(text=df.head(10).to_html(index=False),
             width=550)
#Create Scatte