simkd - simple kd-tree construction and query
---------------------------------------------
The simkd application constructs a kd-tree from the contents of a data
file (e.g., example-data.csv) of k-dimensional vectors, and then performs
nearest neighbor searches within the kd-tree using query points from the
query file (e.g., example-queries.csv). The search can be either for K
nearest neighbors, or for all neighbors within some range (radius) of the
query point. (Annoying note: the k's in kd-tree and k-nearest neighbor are
not the same.) The dimensions of all data points and query points must be
the same, or the program will die horribly and yell at you.

The output of the program is a file (output.csv by default) with
lines of the form:

    query_record,rank,data_record

where:

    query_record: the record number of the current query point in
                  the query file (starting from 0)
    data_record:  the record number of a data point in the data file
                  that is "close to" the current query point (from 0)
    rank:         the "closeness rank" of the data_record point to
                  the query_record point: 0 is closest, 1 is next
                  closest, and so on. (If taskname is knn and k is 1
                  [the defaults], then rank will necessarily be 0 in
                  each output line.)

For example, this line in the output file,

    3,1,321

means that, for the 4th query point, the 2nd closest data point is the
322nd point in the data file. (*Please note* that, if your data
file contains a header line and/or blank lines, the 322nd point
will *not* be described by the 322nd line in the data file. Ditto
for the query file.)
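
As a sketch of how the output lines can be read back, the following
Python snippet (not part of simkd; the function name parse_output and
the sample values other than "3,1,321" are made up for illustration)
groups the neighbors of each query record and orders them by rank:

```python
import csv
import io
from collections import defaultdict

def parse_output(text):
    """Group output lines by query record, ordering neighbors by rank."""
    by_query = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row:  # tolerate empty lines
            continue
        query_record, rank, data_record = (int(field) for field in row)
        by_query[query_record].append((rank, data_record))
    # Sort by rank so that index 0 is the closest data point.
    return {q: [d for _, d in sorted(pairs)] for q, pairs in by_query.items()}

# Illustrative output text; only the line "3,1,321" comes from the README.
sample = "3,0,17\n3,1,321\n4,0,5\n"
neighbors = parse_output(sample)
second_closest = neighbors[3][1]  # data record ranked 1 for query record 3
print(second_closest)
```

All record numbers are 0-based, so neighbors[3] refers to the 4th query
point and the value 321 to the 322nd data point.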
Command line arguments:
-----------------------
Required:

  data <filename>     example: example-data.csv
                      This file contains a sequence of vectors
                      in comma-separated value format
                      representing the points from which the
                      kd-tree will be built, and against which
                      nearest neighbor queries will be performed.
                      Can be 2-dimensional, as the example file,
                      or greater-than-2-dimensional, e.g.:

                          w, x, y, z
                          0.50000,0.00000,0.10000,0.12500
                          2.00000,2.00000,2.50000,0.00000
                          1.12500,1.00000,0.00000,3.50000
                          ...

                      Optionally, *the first line only* can
                      contain field names, for your own
                      convenience (e.g., "w, x, y, z"). Blank
                      lines in the file are ignored.

  query <filename>    example: example-queries.csv
                      This file contains a sequence of vectors
                      in comma-separated value format
                      representing the points for which we want
                      to locate nearest neighbors from the data
                      file. (The dimension of the query
                      vector(s) must match the dimension
                      of the data vectors.)
                      Optionally, *the first line only* can
                      contain field names, for your own
                      convenience (e.g., "w, x, y, z"). Blank
                      lines in the file are ignored.

*Note* that it is fine to use the *same* file as both the data file
and the query file in order, for example, to find the 5 nearest
neighbors to each point within a single data set.
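
The file rules above (optional field names on the first line, blank
lines ignored, matching dimensions) can be sketched in Python as
follows. This is not simkd's parser: the function names are
hypothetical, and treating the first non-blank line as the possible
header is an assumption.

```python
import io

def load_points(stream):
    """Read CSV vectors; skip blank lines and an optional header line."""
    points = []
    first = True
    for line in stream:
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        fields = [f.strip() for f in line.split(",")]
        if first:
            first = False
            try:
                points.append([float(f) for f in fields])
            except ValueError:
                pass  # the first non-blank line held field names
            continue
        points.append([float(f) for f in fields])
    return points

def check_same_dimension(data, queries):
    """Raise ValueError unless every vector has the same length."""
    dims = {len(p) for p in data} | {len(p) for p in queries}
    if len(dims) > 1:
        raise ValueError(f"mixed dimensions: {sorted(dims)}")

text = "w, x, y, z\n0.50000,0.00000,0.10000,0.12500\n\n2.00000,2.00000,2.50000,0.00000\n"
data = load_points(io.StringIO(text))
check_same_dimension(data, data)  # same file used as data and query
```

Here record 1 is the vector on the 4th line of the text, which is why
record numbers and line numbers can differ.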

Optional:

  out <filename>      default: output.csv
                      Output file name. *See above for content*

  taskname <string>   default: knn
                      Task to perform, one of:
                        knn         - find K nearest neighbors
                        rangesearch - find points in range
                      (and eventually: rangecount -
                      count points in range)

  k <int>             default: 1
                      Only relevant if taskname == knn.
                      The number of nearest neighbors to
                      find for each query point.

  range <double>      default: 10
                      Only relevant if taskname != knn.
                      The range (or radius) around each
                      query point in which to search for
                      nearest neighbors. The range is in
                      the same units as those used for
                      the data and query points. A very
                      narrow range may find no data
                      points "near"--within range of--
                      a given query point; a very wide
                      range may find that *all* data
                      points are "near" a given query
                      point.

  rmin <int>          default: 50
                      The maximum number of points to
                      store in a leaf node of the
                      kd-tree: higher values may
                      increase speed of kd-tree
                      construction at cost of slower
                      search, and vice versa. Must
                      have rmin >= 1.

  method <string>     default: singletree
                      Can only be singletree right now.
                      Refers to the algorithm for
                      doing the task. (Later we hope
                      to include dualtree.)

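
To make the knn and rangesearch tasks concrete, here is a brute-force
reference in Python. It is not simkd's implementation (simkd searches
a kd-tree); it assumes Euclidean distance and treats the range as
inclusive, neither of which is stated above. The function names and
the sample points are illustrative.

```python
import math

def knn(data, queries, k=1):
    """Return (query_record, rank, data_record) for the k closest points."""
    rows = []
    for qi, q in enumerate(queries):
        order = sorted(range(len(data)), key=lambda di: math.dist(q, data[di]))
        for rank, di in enumerate(order[:k]):
            rows.append((qi, rank, di))
    return rows

def rangesearch(data, queries, radius=10.0):
    """Return (query_record, rank, data_record) for points within radius."""
    rows = []
    for qi, q in enumerate(queries):
        hits = sorted((math.dist(q, p), di) for di, p in enumerate(data))
        within = [di for dist, di in hits if dist <= radius]
        for rank, di in enumerate(within):
            rows.append((qi, rank, di))
    return rows

data = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
queries = [[0.9, 0.0]]
two_nearest = knn(data, queries, k=2)
narrow = rangesearch(data, queries, radius=0.05)   # finds nothing
wide = rangesearch(data, queries, radius=100.0)    # finds every data point
```

With these points, a narrow range returns no rows for the query and a
wide range returns one row per data point, as described for the range
argument.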
Example Command Lines:
----------------------
$ ./simkd data example-data.csv query example-queries.csv
Constructs a kd-tree with the four 2-dimensional vectors from file
"example-data.csv", then performs the four 1-nearest neighbor
queries for the points from file "example-queries.csv". The output
will be in file "output.csv". (An ASCII-art diagram of the data and
query points for this example is in file "example.txt".)
$ ./simkd data your-d.csv query your-q.csv k 5 rmin 20 out your-out.csv
Constructs a kd-tree with the M k-dimensional vectors from
"your-d.csv", then performs the N 5-nearest neighbor queries for
the points from "your-q.csv". The output is "your-out.csv".
Since rmin is 20, there will be at most 20 data points per leaf of
the kd-tree, meaning that kd-tree construction will be "slower"
but queries will be "faster" than for the default rmin of 50.
(Depending on the size of the data set, the size of the query set,
and the amount of real memory on your machine, memory management may
dominate array traversals or vice versa: your mileage may vary.)
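
The effect of rmin on tree shape can be sketched in Python. This is
only an illustration under stated assumptions, not simkd's
construction code: the splitting rule (median of the widest dimension)
and the function names are assumptions, and the random points stand in
for a real data file.

```python
import random

def build(points, rmin=50):
    """Return a nested-dict kd-tree whose leaves hold at most rmin points."""
    if rmin < 1:
        raise ValueError("rmin must be >= 1")
    if len(points) <= rmin:
        return {"leaf": points}
    dims = range(len(points[0]))
    # Assumed rule: split on the dimension with the widest spread.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # both halves are non-empty because len > rmin >= 1
    return {
        "axis": axis,
        "split": ordered[mid][axis],
        "left": build(ordered[:mid], rmin),
        "right": build(ordered[mid:], rmin),
    }

def leaves(node):
    """Collect the point lists stored in the leaves of the tree."""
    if "leaf" in node:
        return [node["leaf"]]
    return leaves(node["left"]) + leaves(node["right"])

random.seed(0)
points = [[random.random() for _ in range(4)] for _ in range(1000)]
leaves_50 = leaves(build(points, rmin=50))
leaves_20 = leaves(build(points, rmin=20))
print(len(leaves_50), len(leaves_20))
```

A smaller rmin forces more splits during construction and leaves
fewer points to scan in each leaf at query time, which is the
trade-off described above.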
$ ./simkd data ex2-d.csv query ex2-q.csv taskname rangesearch range 500
Constructs a kd-tree from the one thousand 4-dimensional