BIRCH algorithm C source code (CSDN Library resource)

88 files in total

c: 36 files

h: 30 files

c~: 8 files

Rating: 4 stars (higher than 85% of resources) · Points required: 9 · 46 views · Uploaded 2012-05-08 07:34:02 · 781KB RAR

Birch.rar (88 files)

Birch/Birch/Birch/

cfentry.C~ 13KB

kmeans.C 874B

asciiToBinary.cpp 1KB

rand.h 4KB

samples.C 4KB

phase1.C~ 3KB

vector.h~ 3KB

global.h~ 2KB

timeutil.C 5KB

clarans.C 8KB

asciiToBinary.C 1KB

vec_utils.C 3KB

cluster_fractal.C 685B

asciiToBinary.cpp~ 403B

vector.h 3KB

status.C 26KB

phase1.h~ 734B

grid.h 931B

samples.h 1KB

path.C 6KB

asciiToBinary.o 3KB

metric 1KB

parameter.h 2KB

kmeans.h 738B

test_components.C 5KB

main.C~ 4KB

util.h 1KB

summary.C 2KB

box_fractal.C 1KB

phase1.C 3KB

path.h 2KB

fix_recyqueue.h 1KB

ascii2Bin 260KB

components.C 6KB

cftree.C 26KB

summary.C~ 1KB

cfentry.C 13KB

buffer.h 1KB

cfentry.h 4KB

rectangle.C 6KB

rectangle.h 3KB

cftree.h~ 11KB

phase1.h 734B

hierarchy.h 2KB

fix_recyqueue.C 2KB

rand.C 16KB

vector.C~ 4KB

phase4.h 733B

parameter.C 3KB

phase4.C 12KB

timeutil.h 4KB

vector.C 4KB

clarans.h 991B

cutil.C 4KB

asciiToBinary.C~ 1KB

recyqueue.h 1KB

test_contree.C 4KB

lloyd.h 733B

components.h 2KB

cftree.C~ 26KB

lloyd.C 872B

global.h 2KB

grid.C 1KB

point_kernel.C 3KB

vec_utils.cpp~ 864B

contree.h 2KB

density.C 12KB

status.h 4KB

cutil.h 1KB

Makefile 5KB

phase3.h 1004B

box_fractal.h 741B

summary 295KB

vec_utils.C~ 3KB

phase3.C 4KB

recyqueue.C 3KB

main.C 5KB

hierarchy.C 13KB

cftree.h 11KB

cluster_fractal.h 734B

.depend 8KB

util.C 1KB

buffer.C 2KB

contree.C 7KB

density.h 3KB

phase2.h 734B

phase2.C 2KB

Birch算法.ppt 698KB

2003-5-19

BIRCH: An Efficient Data Clustering Method for Very Large Databases

Presenter: 左子叶

Problem:

- Large datasets
- I/O cost

BIRCH

- A single scan of the data
- Handles noise
- Evaluation of time and space efficiency
- BIRCH vs. CLARANS performance comparison

Introduction

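Clustering: identify sparse and dense regions of the data; discover the overall distribution patterns of the dataset; visualization

Two kinds of data: metric and non-metric

Given the desired number of clusters k, a dataset of N points, and a distance-based measurement function, find a partition of the dataset that minimizes the function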
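To make the problem statement concrete, here is a minimal Python sketch of one possible measurement function. The slide does not say which function is meant, so the choice of the sum of squared Euclidean distances to each cluster centroid is an assumption, and the names `centroid`, `sq_dist`, and `partition_cost` are hypothetical. None of this is taken from the Birch.rar sources.

```python
from typing import List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_cost(points: List[Point], labels: List[int], k: int) -> float:
    """Assumed measurement function: sum of squared distances from each
    point to the centroid of the cluster it is assigned to."""
    total = 0.0
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            continue  # an empty cluster contributes nothing
        mu = centroid(members)
        total += sum(sq_dist(p, mu) for p in members)
    return total


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = partition_cost(points, [0, 0, 1, 1], k=2)  # groups nearby points
bad = partition_cost(points, [0, 1, 0, 1], k=2)   # splits nearby points
```

Under this assumed function, the partition that groups nearby points has the lower cost, which is what "minimizing the function" is meant to reward.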

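Related work (1)

The dataset is too large to fit in memory

Probability-based approaches:

- Assume that the distributions of the individual attributes are independent of each other
- Updating and storing clusters is expensive: the cost depends on both the number of attributes and the number of values per attribute
- The tree is not balanced: performance depends on the input data

Distance-based approaches

- Assume that all data points are given in advance and can be scanned repeatedly; treat all points as equally important
- Global measurement: every decision scans all clusters or all data points

Exhaustive enumeration

Iterative optimization:

- Start from an initial partition and try every move of a point that could reduce the measurement function
- Finds only a local optimum; sensitive to the initial partition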
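The iterative optimization idea above can be sketched as follows. This is a simplified illustration rather than any particular published algorithm: it assumes the sum of squared distances to cluster centroids as the measurement function, considers only single-point moves (no swaps), and accepts any move that strictly lowers the cost. The names `sse` and `iterative_optimization` are hypothetical and not taken from the Birch.rar sources.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def sse(points: List[Point], labels: List[int], k: int) -> float:
    """Assumed measurement function: sum of squared distances to centroids."""
    total = 0.0
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            continue
        dim = len(members[0])
        mu = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(sum((x - m) ** 2 for x, m in zip(p, mu)) for p in members)
    return total


def iterative_optimization(
    points: List[Point], initial_labels: List[int], k: int
) -> Tuple[List[int], float]:
    """Repeatedly move single points between clusters while the cost drops."""
    labels = list(initial_labels)
    best = sse(points, labels, k)
    improved = True
    while improved:  # stop once a full pass finds no improving move
        improved = False
        for i in range(len(points)):
            for c in range(k):
                if c == labels[i]:
                    continue
                old = labels[i]
                labels[i] = c  # tentatively move point i to cluster c
                cost = sse(points, labels, k)
                if cost < best - 1e-12:
                    best = cost
                    improved = True
                else:
                    labels[i] = old  # undo a non-improving move
    return labels, best


points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels, cost = iterative_optimization(points, [0, 1, 0, 1], k=2)
```

Each pass recomputes the full cost for every candidate move, which shows why such methods become expensive on large datasets. The result is only guaranteed to be a local optimum with respect to single-point moves, so a different initial partition can end in a different answer.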

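Related work (2)

Hierarchical clustering: O(N^2)

CLARANS

- Search over a graph
- Node: a K-partition, represented by K medoids; two nodes are neighbors if they differ in exactly one medoid
- For the current node, randomly check up to maxneighbor neighbors
- If a better neighbor is found, move to that node and continue
- Otherwise, record the current node as a local optimum and restart from a newly selected node
- Stop once numlocal local optima have been found, and return the best one

Similar to the iterative optimization approach

- Trading quality for time: R*-tree sampling; focusing on relevant data points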
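The CLARANS search described on this slide can be sketched as below. It is a minimal illustration under stated assumptions, not the `clarans.C` code from the archive: a node is a list of k medoid indices, a neighbor replaces one medoid with a non-medoid point, and the cost is the sum of Euclidean distances from each point to its nearest medoid. The parameter names `numlocal` and `maxneighbor` come from the slide. The function names and the `seed` argument are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def medoid_cost(points: List[Point], medoids: List[int]) -> float:
    """Assumed cost: each point contributes its distance to the nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def clarans(
    points: List[Point], k: int, numlocal: int, maxneighbor: int, seed: int = 0
) -> Tuple[List[int], float]:
    """Randomized search over medoid sets; requires k < len(points)."""
    rng = random.Random(seed)
    n = len(points)
    best_node: List[int] = []
    best_cost = float("inf")
    for _ in range(numlocal):  # one pass per local optimum to collect
        current = rng.sample(range(n), k)  # random starting node
        current_cost = medoid_cost(points, current)
        tried = 0
        while tried < maxneighbor:
            # Random neighbor: swap one medoid for one non-medoid point.
            pos = rng.randrange(k)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current[:pos] + [candidate] + current[pos + 1:]
            neighbor_cost = medoid_cost(points, neighbor)
            if neighbor_cost < current_cost:
                current, current_cost = neighbor, neighbor_cost
                tried = 0  # moved to a better node, restart the count
            else:
                tried += 1
        # No better neighbor among maxneighbor tries: treat as local optimum.
        if current_cost < best_cost:
            best_node, best_cost = current, current_cost
    return sorted(best_node), best_cost


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
medoids, cost = clarans(points, k=2, numlocal=3, maxneighbor=200)
```

Because only a random sample of neighbors is checked, a node is declared a local optimum without examining all of its neighbors. Larger values of `maxneighbor` and `numlocal` make the search more thorough at the price of more cost evaluations, each of which scans every data point.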

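Resource comments

tyreal_c
2013-12-19
Very complicated, I can't follow it~~

YONGHU33
2019-06-03
Such complicated code

ydp2tlh
2013-09-07
Very complicated, I can't follow it~~

weixin_39008027
2017-12-03
It's passable

C寒晨C
2014-04-28
Really complicated. I'd be better off switching to Java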

lmm4141

Followers: 0
Resources: 2
