User Guide of T-Drive Data
Version 1.0
Updated on August 16, 2011
1 Data Description
This dataset contains the GPS trajectories of 10,357 taxis within Beijing, recorded from Feb. 2 to
Feb. 8, 2008. The dataset holds about 15 million points in total, and the total distance of the
trajectories reaches 9 million kilometers. Fig. 1 plots the distributions of the time interval and the
distance interval between two consecutive points. The average sampling interval is about 177 seconds,
with an average distance of about 623 meters between consecutive points. Each file in this dataset is
named after a taxi ID and contains the trajectories of that one taxi. Fig. 2 visualizes the density
distribution of the GPS points in this dataset.
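The interval statistics described above can be reproduced with a short script. The following is a minimal sketch, not the procedure used to produce the figures: it assumes each trajectory has already been loaded as a time-ordered list of (timestamp, longitude, latitude) tuples, and it measures distance with the haversine great-circle formula. The function names, the input layout, and the Earth radius constant are all assumptions made for illustration.

```python
import math
from datetime import datetime

# Mean Earth radius in meters (an assumption of this sketch).
EARTH_RADIUS_M = 6_371_000


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def consecutive_intervals(points):
    """Return a (seconds, meters) pair for each two consecutive points.

    `points` is a time-ordered list of (datetime, lon, lat) tuples;
    this layout is hypothetical, not the dataset's file format.
    """
    pairs = []
    for (t1, lon1, lat1), (t2, lon2, lat2) in zip(points, points[1:]):
        seconds = (t2 - t1).total_seconds()
        meters = haversine_m(lon1, lat1, lon2, lat2)
        pairs.append((seconds, meters))
    return pairs


def mean_intervals(points):
    """Average (seconds, meters) over consecutive pairs, or None if fewer than two points."""
    pairs = consecutive_intervals(points)
    if not pairs:
        return None
    n = len(pairs)
    return (sum(s for s, _ in pairs) / n, sum(d for _, d in pairs) / n)


if __name__ == "__main__":
    # Made-up sample points near central Beijing, for demonstration only.
    sample = [
        (datetime(2008, 2, 2, 13, 30, 0), 116.40, 39.90),
        (datetime(2008, 2, 2, 13, 32, 57), 116.40, 39.91),
        (datetime(2008, 2, 2, 13, 36, 0), 116.41, 39.91),
    ]
    print(mean_intervals(sample))
```

Computing the statistics per taxi and then pooling the pairs avoids counting the jump between the last point of one taxi and the first point of the next as an interval.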
[Figure 1 plots: (a) time interval, x-axis in minutes, y-axis proportion; (b) distance interval, x-axis in meters, y-axis proportion]
Figure 1: Histograms of time interval and distance between two consecutive points
[Figure 2 panels: (a) Data overview in Beijing; (b) Within the 5th Ring Road of Beijing]
Figure 2: Distribution of GPS points, where the color indicates the density of the points