Density-Sensitive Spectral Clustering Based on Natural Neighbor

Density-Sensitive Spectral Clustering Based on Natural Neighbor. Lei Dajiang, Wang Mingda. The key of a spectral clustering algorithm lies in how it measures the proximity between data points. This paper uses a fast natural nearest neighbor method to adaptively select the number of neighbors and thereby describe the proximity relationships between data points.
中国科技论文在线 http://www.paper.edu.cn

Definition (Neighbor's Upper Bound). A natural neighborhood can be described by its neighbor's upper bound Sup_k, defined as

Sup_k = max_r { r | (p ≠ q) ∧ (p ∈ NN_r(q)) }    (1)

where r is the number of neighbors, which changes dynamically during the search, and NN_r(q) is the neighbor set of q when r neighbors have been searched.

Definition (Stable Searching State). The natural neighborhood is defined on a graph G = (V, E), where V represents the vertex set and E represents the edge set. Suppose v_i, v_j ∈ V and v_i ≠ v_j. The saturation state of the graph can be described as

(∀ v_i) ( |D_r(v_i)| ≥ 1 )    (2)

where D_r(v_i) represents the set of vertices adjacent to v_i. When v_i is the most isolated vertex,

|D_r(v_i)| = 1    (3)

Formula (3) indicates that even the most isolated vertex has a nearest neighbor; the graph reaching this saturated state is the condition for the natural neighbor search to stop.

In density-sensitive spectral clustering algorithms, the correlation between data points is mostly based on the traditional Euclidean distance. Researchers then began to describe the similarity between data points with Gaussian kernel functions. Although a Gaussian kernel can describe the relationship between data points better, the correct choice of its kernel parameter is the key to the success of the algorithm, and a description based on a Gaussian kernel is not suitable for multi-scale problems [9]. In 2007, Wang designed a density-sensitive similarity measure, a data-dependent measure of similarity, and introduced it into spectral clustering to propose a density-sensitive spectral clustering method [11]. This method enlarges the distance between data points in different high-density areas while reducing the distance between data points in the same high-density area.
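The stopping rule in formulas (2)–(3) can be illustrated with a small sketch (not the authors' implementation): grow the neighbor count r one step at a time, and stop once every point has been named as some other point's r-nearest neighbor. The use of a k-d tree here is an assumption made for efficiency.

```python
import numpy as np
from scipy.spatial import cKDTree

def natural_neighbor_search(X):
    """Round-by-round natural neighbor search: grow r until every point
    appears in some other point's r-nearest-neighbor list (saturation)."""
    n = len(X)
    tree = cKDTree(X)
    counts = np.zeros(n, dtype=int)    # how many points have named i a neighbor
    neighbors = [set() for _ in range(n)]
    r = 1
    while True:
        # query r + 1 points because the nearest neighbor of a point is itself
        _, idx = tree.query(X, k=r + 1)
        for i in range(n):
            j = idx[i, r]              # the r-th nearest neighbor of i
            counts[j] += 1
            neighbors[i].add(j)
        if np.all(counts >= 1):        # saturation: even the most isolated
            return r, neighbors        # vertex has been named a neighbor
        r += 1
```

The returned r plays the role of Sup_k: the neighborhood size at which the search reached the stable state.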
For density-sensitive distance, Lu proposed a density-sensitive hierarchical clustering algorithm [10], and Sun proposed a semi-supervised clustering algorithm with extended constraints [12]. In the density-sensitive spectral clustering algorithm, a segment length with adjustable density is defined, and the distance between data points is redefined according to this adjustable length; a similarity matrix W is then built from the new distance.

Definition (Density-Adjustable Segment Length). Given two points x and y, let dist(x, y) be the Euclidean distance between them, and let ρ be a scaling factor whose value must satisfy ρ > 1. The density-adjustable segment length is defined as

L(x, y) = ρ^dist(x, y) − 1    (4)

The Euclidean segment length between data points is thus converted into a density-adjustable segment length, which stretches the Euclidean distance between data points.

Definition (Density-Sensitive Distance). A density-sensitive distance measure can be calculated from the density-adjustable segment lengths between data points. The data points V = {x_i}_{i=1}^n are taken as the vertices of a graph G = (V, E). The density-sensitive distance between two data points is the sum of the density-adjustable segment lengths along the shortest path between them:

D(x, y) = min_{P ∈ P_{x,y}} Σ_{k=1}^{l−1} L(p_k, p_{k+1})    (5)

where P_{x,y} denotes the set of paths P = {p_1, ..., p_l} connecting x and y.

The new density-sensitive distance between data points is used to establish a similarity matrix W. Applying W to classical spectral clustering yields the density-sensitive spectral clustering algorithm. The distance measure finds the shortest path on the same manifold, so the algorithm can be applied to a variety of manifolds and better satisfies global consistency. According to this construction, the key of density-sensitive spectral clustering lies in the definition of a new distance measure and a new measure of segment length.
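Under formulas (4)–(5), the density-sensitive distance is an all-pairs shortest-path computation over stretched edge weights. A minimal sketch follows; the dense Floyd–Warshall call and the complete graph over all point pairs are implementation assumptions, not details from the paper:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def density_sensitive_distances(X, rho=2.0):
    """All-pairs density-sensitive distance: edge weights rho^dist - 1
    (formula 4), then shortest paths over the graph (formula 5)."""
    diff = X[:, None, :] - X[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(-1))
    L = rho ** euclid - 1.0            # density-adjustable segment length
    np.fill_diagonal(L, 0.0)           # zero diagonal: no self-edges
    return shortest_path(L, method='FW', directed=False)
```

On a chain of points 0, 1, 2 spaced one unit apart with ρ = 2, the direct segment from 0 to 2 costs 2^2 − 1 = 3, while the two short hops cost 1 + 1 = 2, so the shortest path follows the chain, which is how the measure keeps paths on the manifold.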
According to its definition, the density-adjustable segment length is a composite power function: it transforms the traditional Euclidean distance through an exponential function, so the distance between data points grows exponentially as the Euclidean distance increases. The distance between data points is then given by the density-sensitive distance. If two data points belong to the same class, all the data points on the shortest path between them belong to the same class, i.e. the global consistency principle of clustering is satisfied. When the relative positions of the data points do not change, the classification produced by the clustering algorithm should not change either, i.e. the mutual relationships between the data points remain unchanged. However, the density-sensitive distance metric may change as the distances between data points increase. Fig. 1 shows that when the distances between data points change linearly, the shortest path between A and B can change even though the relative positions of the data points do not.

Fig. 1 Two paths between A and B

In Fig. 1, the upper path from A to B is

T1 = D_AB = L(A, m1) + L(m1, m2) + L(m2, B) = ρ^d1 + ρ^d2 + ρ^d3 − 3    (6)

and the other path is

T2 = D_AB = L(A, n1) + L(n1, n2) + L(n2, B) = ρ^d4 + ρ^d5 + ρ^d6 − 3    (7)

Assume that T1 < T2 before a linear transformation. There is no guarantee that T1 remains less than T2 when d1, d2, d3, d4, d5, d6 are linearly expanded: the shortest path between the data points can change, and the division of the data points changes with it. Moreover, according to the definition of the density-adjustable segment length, the scaling factor ρ must be chosen from empirical knowledge. It can be seen from Fig. 3 that the selection of the scaling factor also affects the clustering result.
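The flip described around formulas (6)–(7) can be checked numerically; the segment lengths below are hypothetical values chosen only to exhibit the effect:

```python
# Segment lengths along two candidate paths between A and B
# (hypothetical values, not taken from the paper's figure).
rho = 2.0
path1 = [4, 4, 4]          # d1, d2, d3
path2 = [5, 1, 1]          # d4, d5, d6

def path_length(ds, scale=1.0):
    # density-adjustable length of a path: sum of rho^(scale * d) - 1
    return sum(rho ** (scale * d) - 1 for d in ds)

before = path_length(path1) > path_length(path2)   # path2 is shorter at scale 1
after = path_length(path1, 2.0) > path_length(path2, 2.0)
```

With these values, path 2 is shorter before scaling (33 vs. 45), but after doubling every distance path 1 becomes the shorter one (765 vs. 1029): the largest single segment dominates the exponential sum as distances grow, so the shortest path switches.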
Different scaling factors on the classic three-circle artificial data set produce different clusterings.

Fig. 2 The Original Data Set
Fig. 3 The Effect of Scaling Factor on Data Division

From the above analysis, density-sensitive spectral clustering has the following problems:
1. The value of the scaling factor affects the performance of the algorithm.
2. The density-adjustable segment length destroys the correlation between data points when their relative positions change linearly; using density-adjustable segment lengths therefore affects the accuracy of density-sensitive spectral clustering.

To solve these problems, we must consider the relationship between the clustering result and the distances between data points. In the literature [11], if two points belong to the same cluster, they can be connected by a path through the same dense area; if they do not belong to the same cluster, the paths connecting them pass through different sparse areas. Therefore, the shortest path between data points should not change with the distances between them; the scaling factor and the distances between data points are the key to solving this problem. From the idea of the natural neighbor method, when the most isolated point obtains its nearest neighbor, the graph is just in the saturated state. According to this theory, we propose a fast search method that brings the neighborhoods of the data to the saturated state quickly. The fast search method based on the natural neighbor method (FNN method) directly obtains the furthest neighbors of the data set and then looks up their nearest neighbors to find the critical situation of the nearest-neighbor search, i.e. to obtain Sup_k.
When we know the ranks of all the furthest neighbors in the nearest-neighbor lists, we can calculate the maximum rank from the neighbor lists. In other words, when all the most remote data points can be found among the neighbors of other points, the neighborhoods are saturated; this is the criterion for stopping the neighbor search.

Definition (Criterion of Stopping Searching Neighbor). The fast search method needs to return Sup_k. The nearest neighbor x_j of the furthest neighbor of data point x_i is obtained by Farnn(x_i), and the method findRank(x_i, x_j) then calculates the rank of x_j in the neighbor list of x_i:

Sup_k = max{ r | r = findRank(x_i, x_j) ∧ (x_j ∈ Farnn(x_i)) }    (8)

The FNN method calculates λ (i.e. Sup_k) of the data set. The time complexity of this method is O(λ·n). In practice, on high-dimensional data λ is more than 20 but less than 30, so λ is always much smaller than n·log n, and the FNN method is faster than the natural neighbor method. In Algorithm 1, the function findFN(X) calculates the furthest neighbors of the data set, the function findKNN(x_i, 1) calculates the nearest neighbor of a furthest neighbor, and the function findRank(x_i, x_j) calculates the rank of x_j in the neighbor list of x_i.

Algorithm 1: FNN
1: r = 1, FN_NN = ∅, Rank = ∅, NaN_lambda = ∅
2: for all x_k ∈ X
3:   for all x_i ∈ findFN(X)
4:     FN_NN = findKNN(x_i, 1)
5:     for all x_j ∈ FN_NN
6:       Rank = findRank(x_i, x_j)
7:     end for
8:   end for
9:   λ = Max(Rank)
10:  NaN_lambda = NaN_lambda ∪ [x_k, knn_λ(x_k)]
11: Return: NaN_lambda

Algorithm 1 first finds the furthest neighbors of the data set. Second, step 4 gets the nearest neighbor of each furthest neighbor. Third, steps 5–8 get the rank of each furthest neighbor with respect to its nearest neighbor. Finally, the maximum of the ranks and NaN_lambda are our goal. From the relationships of the data points stored in NaN_lambda, we obtain the adjacency matrix A, which describes the nearest-neighbor relationships.
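A sketch of the Sup_k computation in formula (8), assuming brute-force distance ranks; the paper's findFN, findKNN and findRank correspond here to the row-wise argmax, the k-d tree query, and the argsort lookup:

```python
import numpy as np
from scipy.spatial import cKDTree

def fnn_supk(X):
    """FNN sketch: for each furthest neighbor, find that point's nearest
    neighbor and the rank linking them; Sup_k is the maximum such rank."""
    n = len(X)
    tree = cKDTree(X)
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(-1))
    far = D.argmax(axis=1)                 # furthest neighbor of each point
    sup_k = 0
    for f in np.unique(far):
        _, nn = tree.query(X[f], k=2)      # nearest neighbor of the furthest
        j = nn[1]                          # skip the point itself
        # rank of f in the sorted neighbor list of j (self sits at rank 0)
        rank = int(np.argsort(D[j]).tolist().index(f))
        sup_k = max(sup_k, rank)
    return sup_k
```

The rank found for the most remote point is exactly the neighborhood size at which it would first be captured by an incremental search.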
If x_i is the natural neighbor of x_j, A_ij is set to 1; otherwise A_ij is set to 0.

The transformation of the distance between data points can be identified by the sparseness of the neighborhood. It is common to measure the degree of sparseness around a data point using the distance between the data point and its k nearest neighbors. Therefore, we describe the sparseness of a data point through the distances between the point and its local information points.

Definition (Scaling Factor Based on Local Information). In a data set, the local information of data point i is the distance from data point i to its local information points. In formula (9), S_i represents the data point, S_no represents a local information point corresponding to data point i, and d(S_i, S_no) represents the Euclidean distance between data point i and its local information point:

LI_i = (1/λ) Σ_{S_no ∈ NaN(i)} d(S_i, S_no)    (9)

According to the concept of local information, we can set a new scaling factor based on local information. This scaling factor needs to satisfy the properties of a scaling factor. Following the research on scaling operators in density-sensitive spectral clustering, we design a scaling factor based on local information:

ρ_xy = e^(1 / min(LI_x, LI_y))    (10)

Definition (Density-Adjustable Segment Length Based on Local Information). With the new scaling factor we can define a new segment length, where dist(x, y) represents the Euclidean distance between the data points x and y:

L(x, y) = ρ_xy^dist(x, y) − 1    (11)

The newly designed density-adjustable segment length based on local information satisfies the requirement that the segment lengths, and hence the shortest paths, do not change when the distances between data points increase linearly. Therefore, adjusting the segment length based on local information eliminates the influence of Euclidean distance expansion.

Definition (Similarity Measure Based on Local Information).
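Reading LI_i as the mean distance from a point to its natural neighbors and taking the scaling factor of formula (10), the scale invariance claimed above can be checked directly. This is a sketch of a hypothesized form, and the neighbor sets below are assumed rather than computed:

```python
import numpy as np

def local_info(X, nbrs):
    """LI_i: mean Euclidean distance from point i to its natural neighbors."""
    return np.array([np.mean([np.linalg.norm(X[i] - X[j]) for j in nbrs[i]])
                     for i in range(len(X))])

def segment_length(X, nbrs, i, j):
    """Density-adjustable length based on local information (hypothesized
    form: exp(dist / min(LI_i, LI_j)) - 1)."""
    li = local_info(X, nbrs)
    dist = np.linalg.norm(X[i] - X[j])
    return np.exp(dist / min(li[i], li[j])) - 1.0

X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
nbrs = {0: [1], 1: [0, 2], 2: [1]}        # assumed natural-neighbor sets
a = segment_length(X, nbrs, 0, 2)
b = segment_length(5.0 * X, nbrs, 0, 2)   # linear expansion of all points
```

Because both dist(x, y) and the local information scale by the same factor under a linear expansion, their ratio, and hence the segment length, is unchanged: a equals b up to floating point.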
We can further define the distance measure between data points based on the density-adjustable segment length with local information. It can be seen from the definition of the density-sensitive distance that it describes the similarity between data points, so we can calculate the similarity matrix from it. To ensure the integrity of the matrix, the diagonal of the similarity matrix W constructed from the density-sensitive similarity measure is 0, i.e. W_ii = 0:

W_ij = 1 / ( min_{P ∈ P_{ij}} Σ_{k=1}^{l−1} L(p_k, p_{k+1}) + 1 )    (12)

Fig. 4 shows the accuracy of the algorithm when the data values are expanded by different multiples. By magnifying the three-circle data by different factors and applying the new adjustable segment length in the spectral clustering algorithm, the Euclidean distances between data points are expanded.

Fig. 4 The Accuracy of Algorithm

It can be seen from Fig. 4 that after expanding the three-circle data by several multiples, the algorithm still divides the three-circle data correctly. Density-sensitive spectral clustering based on local information adapts well to changes of the distances between data points, so we can design a density-sensitive spectral clustering method based on local information. In Algorithm 2, Algorithm 1 is first used to construct the neighborhoods of the data, stored in the adjacency matrix A. Second, a similarity matrix W is constructed from the density-sensitive similarity measure according to the adjacency matrix A and formulas (11)–(12). In steps 4–5 we calculate the normalized Laplacian matrix. We then calculate the eigenvectors corresponding to the first k largest eigenvalues, stored in V, and normalize the row vectors of V: V_ij = V_ij / (Σ_j V_ij^2)^(1/2).
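The spectral steps just described can be sketched as follows; the function takes a precomputed matrix of density-adjustable segment lengths as input, and kmeans2 stands in for the final k-means step:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.cluster.vq import kmeans2

def spectral_steps(L, k):
    """Spectral pipeline on a precomputed matrix L of density-adjustable
    segment lengths (zero diagonal)."""
    D_sens = shortest_path(L, directed=False)         # formula (5)
    W = 1.0 / (D_sens + 1.0)                          # formula (12)
    np.fill_diagonal(W, 0.0)                          # W_ii = 0
    d = W.sum(axis=1)                                 # degree D_ii
    P = W / np.sqrt(np.outer(d, d))                   # D^{-1/2} W D^{-1/2}
    vals, vecs = np.linalg.eigh(P)                    # eigh sorts ascending
    V = vecs[:, -k:]                                  # k largest eigenvalues
    V = V / np.linalg.norm(V, axis=1, keepdims=True)  # row normalization
    _, labels = kmeans2(V, k, minit='++', seed=0)
    return labels
```

On two well-separated groups, the similarity across groups collapses toward zero after the exponential stretch, so the top eigenvectors separate the blocks and k-means recovers the partition.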
Last, we use the k-means method to obtain the clustering results.

Algorithm 2: NDSSC
1: Input: X = {x_i}_{i=1}^n and k; Output: Y
2: A = FNN(X), A ∈ R^{n×n}
3: W_ij = similarMatrix(X) if A_ij = 1, W ∈ R^{n×n}
4: D_ii = Σ_{j=1}^n W_ij
5: P = D^{-1/2} W D^{-1/2}
6: V = eigs(P, k), V = [v1, v2, ..., vk] ∈ R^{n×k}
7: V = normalize(V)
8: Y = kmeans(V, k)
9: Return: Y

We separately test the clustering effect of our method on artificial data sets and UCI data sets, and compare it with the original k-means algorithm and the density-sensitive spectral clustering algorithm. On the UCI data sets, we use the ratio of correctly classified samples to evaluate the accuracy of each algorithm.

Some "challenging" clustering problems are given in the literature [6]; we select a small part of that data. It can be seen from Fig. 5 that the k-means algorithm cannot divide the data sets correctly; the density-sensitive spectral clustering (DSSC) algorithm divides only the second group correctly when ρ = 1.5, and cannot divide the third group correctly when ρ = 2; but density-sensitive spectral clustering based on natural neighbor (NDSSC) divides all the data sets correctly. Therefore, the density-sensitive spectral clustering algorithm has shortcomings when it considers only the scaling factor. We can use the natural neighbor method to describe neighborhoods self-adaptively and provide local information for the density-sensitive spectral clustering algorithm.

(a) Original data  (b) k-means  (c) Density-sensitive spectral clustering (ρ = 1.5)  (d) Density-sensitive spectral clustering (ρ = 2)  (e) Density-sensitive spectral clustering based on natural neighbor
Fig. 5 Effect Comparison of Clustering

To further test the performance of the density-sensitive spectral clustering algorithm based on natural neighbor, we design experiments on a series of data sets (the Iris, Ionosphere, Glass and Wine data sets) from the international UCI repository; Table 1 presents the properties of each data set.
Tab. 1 The Properties of Each UCI Data Set

During the experiment, the scaling factor parameter of the density-sensitive spectral clustering algorithm was set to 1.5, and every algorithm was repeated 50 times. The result of an algorithm involves a renumbering problem: the actual labels of the samples may not accord with the new cluster numbers produced by our classification. To solve this problem, we establish a mapping matrix; by selecting the mapping that maximizes the number of matched cluster members, we obtain the total number of correctly classified samples and then the correct rate. Table 2 shows that the density-sensitive spectral clustering method based on natural neighbor performs well on the Ionosphere, Iris, Seed and Sonar data sets. The k-means algorithm has a slight advantage over our method on the Wine data set. Among the compared algorithms, the density-sensitive spectral clustering algorithm is weaker than the other two, which shows that an inappropriate value of the scaling parameter affects the clustering effect. Our method overcomes the selection problem of the scaling parameter and performs better on the data sets.

Tab. 2 Comparison of Correct Rate

This paper analyzes the problem of parameter selection in density-sensitive spectral clustering. The parameter selection based on the natural neighbor method adopts an adaptive scheme to solve the aforementioned problem. We propose a density-sensitive spectral clustering algorithm based on natural neighbor. Our algorithm achieves better performance on the "challenging" data sets. The experiments show that our proposed method can overcome the problem that the exponential transformation of distance may affect the clustering performance when the distances between data points are relatively large.
What is more, how to describe the neighbor relationship between data points accurately is still an open question. Future work may design the neighbor graph via convex optimization.

This paper is supported by the following foundations or programs: the Natural Science Foundation of Chongqing, China (cstc2014jcyjA40049) and the Doctoral Scientific Research Foundation of Chongqing University of Posts and Telecommunications (A201301).

[1] J. Buhmann. Data clustering and learning[A]. The Handbook of Brain Theory and Neural Networks[C]. 1995: 1059-1070.
[2] J. MacQueen. Some Methods for Classification and Analysis of Multivariate Observations[A]. Berkeley Symposium on Mathematical Statistics and Probability[C]. 1967: 281-297.
[3] S. L. Lauritzen. The EM Algorithm for Graphical Association Models with Missing Data[J]. Computational Statistics & Data Analysis, 1995, vol. 19: 191-201.
[4] J. Shi, J. Malik. Normalized cuts and image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, vol. 22: 888-905.
[5] R. Kannan, S. Vempala, A. Vetta. On Clusterings: Good, Bad and Spectral[J]. Symposium on Foundations of Computer Science, 2000: 367.
[6] A. Y. Ng, M. I. Jordan, Y. Weiss. On spectral clustering: analysis and an algorithm[A]. International Conference on Neural Information Processing Systems: Natural and Synthetic[C]. 2001: 849-856.
[7] M. Meila, J. Shi. Learning Segmentation by Random Walks[J]. Advances in Neural Information Processing Systems, 2001: 873-879.
[8] L. Zelnik-Manor, P. Perona. Self-tuning spectral clustering[J]. Advances in Neural Information Processing Systems, 2004, vol. 17: 1601-1608.
[9] Q. S. Zhu, J. Feng, J. L. Huang. Natural neighbor: A self-adaptive neighborhood method without parameter K[J]. Pattern Recognition Letters, 2016, vol. 80: 30-36.
[10] P. L. Lu, Z. Wang. Density-Sensitive Hierarchical Clustering Algorithm[J]. Computer Engineering and Applications, 2014.
[11] L. Wang, L. F. Bo, L. C. Jiao. Density-Sensitive Spectral Clustering[J].
Acta Electronica Sinica, 2007, vol. 35: 1577-1581.
[12] J. Sun, Y. G. Yi, Y. H. Lv, H. T. Cai, J. Z. Wang. A Semi-Supervised Sparsity Discriminant Analysis Algorithm for Feature Extraction[J]. Advanced Materials Research, 2012, vol. 546-547: 670-674.
[13] B. Schölkopf, J. Weston, O. Bousquet, D. Zhou. Learning with Local and Global Consistency[J]. 2004.
