FLANN - Fast Library for Approximate Nearest
Neighbors
User Manual
Marius Muja, mariusm@cs.ubc.ca
David Lowe, lowe@cs.ubc.ca
November 26, 2009
1 Introduction
We can define the nearest neighbor search (NSS) problem in the following way:
given a set of points P = p
1
, p
2
, . . . , p
n
in a metric space X, these points must
be preprocessed in such a way that given a new query point q ∈ X, finding the
point in P that is nearest to q can be done quickly.
The problem of nearest neighbor search is one of major importance in a
variety of applications such as image recognition, data compression, pattern
recognition and classification, machine learning, document retrieval systems,
statistics and data analysis. However, solving this problem in high dimensional
spaces seems to be a very difficult task and there is no algorithm that performs
significantly better than the standard brute-force search. This has lead to an
increasing interest in a class of algorithms that perform approximate nearest
neighbor searches, which have proven to be a good-enough approximation in
most practical applications and in most cases, orders of magnitude faster that
the algorithms performing the exact searches.
FLANN (Fast Library for Approximate Nearest Neighbors) is a library for
performing fast approximate nearest neighbor searches. FLANN is written in
the C++ programming language. FLANN can be easily used in many contexts
through the C, MATLAB and Python bindings provided with the library.
1.1 Quick Start
This section contains small examples of how to use the FLANN library from
different programming languages (C++, C, MATLAB and Python).
• C++
// file flann_example.cpp
#include <flann/flann.hpp>
#include <flann/io/hdf5.h>
#include <stdio.h>
int main(int argc, char** argv)
{
int nn = 3;
flann::Matrix<float> dataset;
flann::Matrix<float> query;
flann::load_from_file(dataset, "dataset.hdf5","dataset");
flann::load_from_file(query, "dataset.hdf5","query");
flann::Matrix<int> indices(new int[query.rows*nn], query.rows, nn);
flann::Matrix<float> dists(new float[query.rows*nn], query.rows, nn);
// construct an randomized kd-tree index using 4 kd-trees
flann::Index<flann::L2<float> > index(dataset, flann::KDTreeIndexParams(4));
index.buildIndex();
// do a knn search, using 128 checks
index.knnSearch(query, indices, dists, nn, flann::SearchParams(128));
flann::save_to_file(indices,"result.hdf5","result");
1
dataset.free();
query.free();
indices.free();
dists.free();
return 0;
}
• C
/* file flann_example.c */
#include "flann.h"
#include <stdio.h>
#include <assert.h>
/* Function that reads a dataset */
float* read_points(char* filename, int *rows, int *cols);
int main(int argc, char** argv)
{
int rows,cols;
int t_rows, t_cols;
float speedup;
/* read dataset points from file dataset.dat */
float* dataset = read_points("dataset.dat", &rows, &cols);
float* testset = read_points("testset.dat", &t_rows, &t_cols);
/* points in dataset and testset should have the same dimensionality */
assert(cols==t_cols);
/* number of nearest neighbors to search */
int nn = 3;
/* allocate memory for the nearest-neighbors indices */
int* result = (int*) malloc(t_rows*nn*sizeof(int));
/* allocate memory for the distances */
float* dists = (float*) malloc(t_rows*nn*sizeof(float));
/* index parameters are stored here */
struct FLANNParameters p = DEFAULT_FLANN_PARAMETERS;
p.algorithm = AUTOTUNED; /* or KDTREE, KMEANS, ... /*
p.target_precision = 0.9; /* want 90% target precision */
/* compute the 3 nearest-neighbors of each point in the testset */
flann_find_nearest_neighbors(dataset, rows, cols, testset, t_rows,
result, dists, nn, &p);
...
free(dataset);
free(testset);
free(result);
free(dists);
return 0;
}
• MATLAB
% create random dataset and test set
dataset = single(rand(128,10000));
testset = single(rand(128,1000));
% define index and search parameters
2
params.algorithm = ’kdtree’;
params.trees = 8;
params.checks = 64;
% perform the nearest-neighbor search
[result, dists] = flann_search(dataset,testset,5,params);
• Python
from pyflann import *
from numpy import *
from numpy.random import *
dataset = rand(10000, 128)
testset = rand(1000, 128)
flann = FLANN()
result,dists = flann.nn(dataset,testset,5,algorithm="kmeans",
branching=32, iterations=7, checks=16);
2 Downloading and compiling FLANN
FLANN can be downloaded from the following address:
http://www.cs.ubc.ca/∼mariusm/flann
After downloading and unpacking, the following files and directories should
be present:
• bin: directory various for scripts and binary files
• doc: directory containg this documentation
• examples: directory containg examples of using FLANN
• src: directory containg the source files
• test: directory containg unit tests for FLANN
To compile the flann library the CMake
1
build system is required. Below is
an example of how the FLANN library can be compiled on Linux (replace x.y
with the corresponding version number).
$ cd flann-x.y-src
$ BUILD_TYPE=release INSTALL_PREFIX=<some directory> make install
3 Using FLANN
3.1 Using FLANN from C++
The core of the FLANN library is written in C++. To make use of the full
power and flexibility of the templated code one should use the C++ bindings
1
http://www.cmake.org/
3
if possible. To use the C++ bindings, the library header file flann.hpp must
be included and the library libflann cpp.so (for linking dynamically) or the
libflann cpp s.a (for linking statically) must be linked in. An example of the
compile command that must be used will look something like this:
g++ flann_example.cpp -I $FLANN_ROOT/include -L $FLANN_ROOT/lib -o flann_example_cpp
-lflann_cpp
where $FLANN ROOT is the library main directory.
The following sections describe the public C++ API.
3.1.1 flann::Index
The FLANN nearest neighbor index class. This class is used to abstract different
types of nearest neighbor search indexes.
namespace flann
{
template<typename Distance>
class Index
{
typedef typename Distance::ElementType ElementType;
typedef typename Distance::ResultType DistanceType;
public:
Index(const Matrix<ElementType>& features, const IndexParams& params);
~Index();
void buildIndex();
void knnSearch(const Matrix<ElementType>& queries,
Matrix<int>& indices,
Matrix<DistanceType>& dists,
int knn,
const SearchParams& params);
int radiusSearch(const Matrix<ElementType>& query,
Matrix<int>& indices,
Matrix<DistanceType>& dists,
float radius,
const SearchParams& params);
void save(std::string filename);
int veclen() const;
int size() const;
const IndexParams* getIndexParameters();
};
}
flann::Index::Index Constructs a nearest neighbor search index for a given
dataset.
4