
ANN Programming Manual
David M. Mount
∗
Department of Computer Science and
Institute for Advanced Computer Studies
University of Maryland, College Park, Maryland.
ANN, Version 1.1.1
Copyright, 2006 David M. Mount
∗
Email: mount@cs.umd.edu. The support of the National Science Foundation under grants CCR–9712379 and
CCR-0098151 is gratefully acknowledged.

Contents
1 Introduction 1
1.1 What is ANN? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Downloading and Using ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Compiling ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3.1 Compiling on Unix/Linux Systems . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3.2 Compiling on Microsoft Windows Systems . . . . . . . . . . . . . . . . . . . . 2
1.3.3 Precompiled Files for Microsoft Windows . . . . . . . . . . . . . . . . . . . . 2
1.4 Compiling Applications that Use the Library . . . . . . . . . . . . . . . . . . . . . . 3
2 The ANN Library 4
2.1 Using ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 Coordinates, Points, Distances, and Indices . . . . . . . . . . . . . . . . . . . 5
2.1.2 Nearest Neighbor Search Structure . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Point Utility Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1.4 A Sample Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.5 Compiling Sample Applications that use ANN . . . . . . . . . . . . . . . . . . 11
2.2 Configuring ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 Norms and Distances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.2 More on Points, Cordinates, and Distances . . . . . . . . . . . . . . . . . . . 14
2.2.3 Self Matches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Modifying the Search Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.1 ANN Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.2 Internal Tree Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.3 Splitting Rules for kd-trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.4 Shrinking Rules for bd-trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4 Search Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.1 Standard Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4.2 Priority Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4.3 Early Termination Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.5 Search Tree Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.5.1 Printing the Search Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.5.2 Saving and Restoring the Search Structure on a File . . . . . . . . . . . . . . 23
2.5.3 Gathering Statistics on Tree Structure . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Compile-Time Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3 Utility Programs 26
3.1 ann
test: A Test and Evaluation Program . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.2 Options Affecting Tree Structure: . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.1.3 Options Affecting Data and Query Point Generation . . . . . . . . . . . . . . 28
3.1.4 Options Affecting Nearest Neighbor Searching . . . . . . . . . . . . . . . . . . 30
3.1.5 Options Affecting the Amount of Output . . . . . . . . . . . . . . . . . . . . 30
3.1.6 Options Affecting the Validation . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 ann2fig: Visualization Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
i

1 Introduction
1.1 What is ANN?
ANN is a library written in the C++ programming language to support both exact and approximate
nearest neighbor searching in spaces of various dimensions. It was implemented by David M.
Mount of the University of Maryland and Sunil Arya of the Hong Kong University of Science
and Technology. ANN (pronounced like the name “Ann”) stands for the Approximate Nearest
Neighbor library. ANN is also a testbed containing programs and procedures for generating data
sets, collecting and analyzing statistics on the performance of nearest neighbor algorithms and data
structures, and visualizing the geometric structure of these data structures.
In the nearest neighbor problem a set P of data points in d-dimensional space is given. These
points are preprocessed into a data structure, so that given any query point q, the nearest (or
generally k nearest) points of P to q can be reported efficiently. ANN is designed for data sets that
are small enough that the search structure can be stored in main memory (in contrast to approaches
from databases that assume that the data resides in secondary storage). Points are assumed to
be represented as coordinate vectors of reals (or integers). The distance between two points can
be defined in many ways. ANN assumes that distances are measured using any class of distance
functions called Minkowski metrics. These include the well known Euclidean distance, Manhattan
distance, and max distance.
Answering nearest neighbor queries efficiently, especially in higher dimensions, seems to be very
difficult problem. It is always possible to answer any query by a simple brute-force process of
computing the distances between the query point and each of the data points, but this may be too
slow for many applications that require that a large number of queries be answered on the same
data set. Instead the approach is to preprocess a set of data points into a data structure from which
nearest neighbor queries are then answered. There are a number of data structures that have been
proposed for solving this problem. See, for example, [AMN
+
98, Ben90, Cla97, Kle97, PS85, Spr91],
for more information on this problem.
One difficulty with exact nearest neighbor searching is that for virtually all methods other
than brute-force search, the running time or space grows exponentially as a function of dimension.
Consequently these methods are often not significantly better than brute-force search, except in
fairly small dimensions. However, it has been shown by Arya and Mount [AM93b] and Arya, et
al. [AMN
+
98] that if the user is willing to tolerate a small amount of error in the search (returning
a point that may not be the nearest neighbor, but is not significantly further away from the query
point than the true nearest neighbor) then it is possible to achieve significant improvements in run-
ning time. ANN is a system for answering nearest neighbor queries both exactly and approximately.
This manual describes how to download and install ANN, how to use the library, how to change
its configuration for different distance functions and data representations, and finally how to use
its utility programs for testing and visualizing the data structure.
1.2 Downloading and Using ANN
The current version of ANN is version 1.1.1. The ANN source code and documentation is available
from the ANN web page:
http://www.cs.umd.edu/~mount/ANN
The unbundled software consists of the following major files and directories.
ReadMe.txt: General description of the library.
1

Copyright.txt: Copyright information.
License.txt: Conditions of use for the library.
include: Include files for compiling programs that use the library.
src: The source files for the library.
sample: A small sample program, which illustrates how to input a point set, build a search structure
for the set, and answer nearest neighbor queries. (See Section 2.1.4.)
test: A program that provides a simple script input for building search trees and comparing their
performance on point sets that are either read from an input file or generated according to
some probability distribution. (See Section 3.1.)
ann2fig: A program that generates a visual representation (in fig format) of the tree’s spatial
decomposition. (See Section 3.2.)
lib: Where the compiled library is stored.
bin: Where the compiled executables are stored.
doc: This documentation.
MS Win32: Solution and project files for compilation under Microsoft Visual Studio .NET.
1.3 Compiling ANN
ANN requires an ANSI standard C++ compiler. It has been compiled successfully on Unix and
Linux systems including Sun Workstations running SunOS 5.X (Solaris), Linux Red Hat 2.X, and
on DEC Alphas running Digital Unix v4.X, on SGIs running IRIX 6.X. Makefiles for all these
systems. It has also been compiled under Microsoft Windows XP (under Visual Studio .NET).
1.3.1 Compiling on Unix/Linux Systems
After downloading the sources, change to the ANN root directory (from which directories such
as bin, doc, and include branch off) and enter the command “make”. This will provide a list of
platforms and options under which to compile ANN. If you do not see your particular configuration
on this list, then you might try modifying the file Make-config for your particular system. The
authors welcome any additions to the list of supported platforms.
There are some additional compilation options that can be enabled or disabled. These are
described in Section 2.6. To recompile the library, enter the command “make realclean” to delete
the library and utility programs, and then reenter the make command for compilation.
1.3.2 Compiling on Microsoft Windows Systems
To compile under Visual C++ running within Microsoft Visual Studio .NET, go to the directory
MS
Win32. Open the solution file Ann.sln, select the “Release” configuration, and select “Build →
Build Solution.” After compilation, the file ANN.lib is stored in the directory MS Win32\dll\Release.
The file ANN.dll file and executables of the utility programs are stored in the directory bin (relative
to the ANN root directory). These two files, along with the files in the directory include/ANN are
needed for compiling applications that use ANN.
1.3.3 Precompiled Files for Microsoft Windows
For Microsoft Windows users that do not need to modify the software, a bundle containing all
the files you need to use ANN has been provided. This contains the include files in include/ANN
2

along with ANN.lib and ANN.dll. It can be downloaded directly from http://www.cs.umd.edu/
~mount/ANN. In order to use these with an application it is necessary to copy each of these files to
appropriate directories for the compiler and linker to locate them, and then to set the appropriate
compiler, linker, and path settings so the system can locate all these files. You should consult your
local system documentation for this information. An example of how to do this for the sample
program is given in Section 2.1.5.
1.4 Compiling Applications that Use the Library
In order to use ANN in a C++ program, the program must include the header file for ANN, which
contains the declarations for the ANN objects and procedures. This is done with the following
statement in the C++ source code.
#include <ANN/ANN.h>
This assumes that the ANN include directory is already on the compiler’s search path for include
files. On most Unix/Linux-based C++ compilers, this is done by setting the “-I” (capital letter
“i”) option on the compilation command line to point to the include directory in the ANN base
directory.
Then the program is linked with the ANN library. On Unix/Linux-based systems this is done
with the “-l” (lower-case letter “`”) option, assuming that the library search path has been set to
include the ANN library directory. The library search path can be set with the “-L” option. For
example, on my Unix system, the following command line could be used to compile the program
my
program.cpp using the ANN library and the GNU C++ compiler. Let ann denote the path to
root ANN directory.
g++ my_program.cpp -Iann/include -Lann/lib -lANN
Some additional information on compiling applications that use ANN on Microsoft Windows
systems can be found in Section 2.1.5.
3