Extreme Learning Machine resource share: Semisupervisedspectralhashingforfastsimilaritysear_2013_Neurocomputin.pdf

I am just getting started with ELM and have collected some related material. Since I found nothing on this topic on the forum, I am sharing what I have gathered so far. I hope it helps others as well as myself — let's learn and work hard together!
C. Yao et al. / Neurocomputing 101 (2013) 52–58

…tasks, it is possible to obtain a few data pairs in which points are known to be either neighbors or non-neighbors. Therefore, several alternative approaches, Semi-Supervised Hashing (SSH) [31] and PCA Hashing (PCAH) [32], have been proposed recently that make use of semi-supervision to yield better performance.

SSH and PCAH use the hash function defined as

    y = \operatorname{sign}(W^{\top} x + t),

where y ∈ H^m, W is a d × m projection matrix each column of which is a projection vector w_k satisfying \|w_k\|_2 = 1, and t is an m × 1 thresholding vector. Typically, the thresholding vector t is set to (1/n) \sum_i W^{\top} x_i. Without loss of generality, every dimension of the training data is normalized to zero mean, so that the threshold t = 0 can be omitted during training. To learn a W such that similar pairs (x_i, x_j) ∈ P have a small distance in Hamming space and dissimilar pairs (x_i, x_j) ∈ N have a large distance in Hamming space, an empirical loss is defined as follows:

    L = \alpha\, E[\, d_{Hm}(y_i, y_j) \mid P \,] - E[\, d_{Hm}(y_i, y_j) \mid N \,],        (5)

In practice, the expectations can be replaced with the means over a set of positive pairs and negative pairs in the training dataset. α is a positive parameter balancing the trade-off between false positives and false negatives (a higher α corresponds to a lower false-negative rate). SSH and PCAH defined the Hamming distance d_{Hm}(·) as

    d_{Hm}(y_i, y_j) = \| y_i \oplus y_j \|,        (6)

which is computed by counting the total number of nonzero bits in the XOR result between y_i and y_j. Thus, they finally solve a maximization problem obtained by replacing the expectations with the averages and merging the constants:

    \max_W \; \operatorname{tr}(W^{\top} X S X^{\top} W) + \eta\, \operatorname{tr}(W^{\top} X X^{\top} W),        (7)

where S is a pairwise label matrix whose elements S_{ij} are calculated by

    S_{ij} = 1 if (x_i, x_j) ∈ P;  -1 if (x_i, x_j) ∈ N;  0 otherwise.

Different from PCAH, SSH relaxes the orthogonality constraints by adding a penalty to the objective function (Eq. (7)) to reduce the error incurred when converting the real-valued solution to the binary one.
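The sign-based hash function and the XOR-based Hamming distance above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the authors' code: the sizes and the random projection matrix `W` are made up, and `hamming`/`hamming_packed` are our own helper names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all sizes are illustrative): n points in d dimensions, m-bit codes.
n, d, m = 6, 8, 16
X = rng.standard_normal((n, d))
X -= X.mean(axis=0)              # zero-mean each dimension, so the threshold t = 0
W = rng.standard_normal((d, m))  # stand-in for a learned projection matrix

# y = sign(W^T x): map each point to an m-bit code with entries in {-1, +1}.
Y = np.sign(X @ W).astype(int)

# Hamming distance = number of differing bits; with {-1, +1} codes this is the
# number of positions where the signs disagree (equivalently, XOR on {0, 1} bits).
def hamming(yi, yj):
    return int(np.sum(yi != yj))

# Packing the bits into bytes lets XOR + popcount do the same count very fast.
bits = (Y > 0).astype(np.uint8)
packed = np.packbits(bits, axis=1)
def hamming_packed(i, j):
    return int(np.unpackbits(packed[i] ^ packed[j]).sum())

assert hamming(Y[0], Y[1]) == hamming_packed(0, 1)
```

The packed XOR/popcount route is why Hamming ranking over binary codes is so fast in practice.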
3. Approach

In this section, we introduce the proposed Semi-Supervised Spectral Hashing (S3H) in detail. Different from previous works on semi-supervised hashing [31,32], we use the square of the Euclidean distance, \|y_i - y_j\|^2, to measure the Hamming distance (Eq. (6)) in the loss function (Eq. (5)), as in spectral hashing [20]. The empirical loss for hashing according to Eq. (5) then becomes

    L = \alpha\, E[\, \|y_i - y_j\|^2 \mid P \,] - E[\, \|y_i - y_j\|^2 \mid N \,].        (9)

Actually, the square of the Euclidean distance leads to the same result as Eq. (6) for binary-valued data. However, using the square of the Euclidean distance leads to a more general Laplacian-matrix-based [33] solution after the relaxation obtained by removing the binary constraints, as shown in the next section.

3.1. Regularization

It is known that labeled data is expensive, while Eq. (9) measures the empirical loss only. It is therefore prone to overfitting, especially when the labeled set is small compared to the entire dataset. To achieve better generalization, one needs to impose regularization by utilizing both labeled and unlabeled data to maximize the information provided by each bit [36].

Motivated by Wang et al. [31,32], we would like to maximize the variance of each dimension over all points y ∈ Y. However, it is hard to directly add the same regularizer to our objective function as in [31,32]. Instead, we add a regularizer that maximizes the summation of the squared Euclidean distances between data points y ∈ Y. We can easily show that this is equivalent to the regularizer in [31,32].
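This equivalence is easy to sanity-check numerically. The following is our own illustrative check (random stand-in embeddings, variance with 1/n normalization): the sum of squared distances over all ordered pairs equals 2n² times the summed per-dimension variances.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 8                      # illustrative sizes
Y = rng.standard_normal((n, m))   # stand-in for the (relaxed) embeddings y_i

# Left side: sum over dimensions of the variance (1/n normalization).
var_sum = Y.var(axis=0).sum()

# Right side: sum of squared Euclidean distances over all ordered pairs (i, j).
D2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
pair_sum = D2.sum()

# The two quantities are proportional, with constant 2 * n^2.
assert np.isclose(pair_sum, 2 * n * n * var_sum)
```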
Lemma 1. The summation of the variances of each dimension over all points y ∈ Y is proportional to the summation of the squared Euclidean distances between data points y ∈ Y, where the expectations are estimated by averages.

Proof. The summation of the variances of each dimension over all points y ∈ Y can be represented as

    \sum_k D(y_{\cdot k}) = \sum_k \left( E(y_{\cdot k}^2) - E^2(y_{\cdot k}) \right) = \sum_k \left( \frac{1}{n} \sum_i y_{ik}^2 - \left( \frac{1}{n} \sum_i y_{ik} \right)^2 \right),

where D(·) denotes the variance and the expectations are replaced with averages. The summation of the squared Euclidean distances between data points y ∈ Y can be represented as

    \sum_{i,j} \|y_i - y_j\|^2 = \sum_{i,j} \sum_k (y_{ik} - y_{jk})^2 = 2n \sum_k \left( \sum_i y_{ik}^2 - \frac{1}{n} \left( \sum_i y_{ik} \right)^2 \right) = 2n^2 \sum_k D(y_{\cdot k}). □

Therefore, we can add the regularizer (1/n^2) \sum_{(x_i, x_j) \in X} \|y_i - y_j\|^2, the summation of the pairwise squared Euclidean distances over all data points in the training set, to maximize the variance of each dimension over all points y ∈ Y. Replacing the expectations with averages, the final loss function is

    L = \frac{\alpha}{n_p} \sum_{(x_i, x_j) \in P} \|y_i - y_j\|^2 - \frac{1}{n_n} \sum_{(x_i, x_j) \in N} \|y_i - y_j\|^2 - \frac{\beta}{n^2} \sum_{(x_i, x_j) \in X} \|y_i - y_j\|^2,        (10)

where n_p and n_n represent the sizes of the sets P and N, and α, β are two positive tuning parameters.

3.2. Relaxing the objective function

Direct minimization of the loss function Eq. (10) is difficult, since the terms y involve a non-differentiable sign(·) non-linearity. Typically, we can remove the sign(·) to derive the embedding. After this relaxation, the objective (the negative of the relaxed Eq. (10)) can be written as maximizing

    J(W) = \frac{1}{n_n} \sum_{(x_i, x_j) \in N} \|W^{\top} x_i - W^{\top} x_j\|^2 + \frac{\beta}{n^2} \sum_{(x_i, x_j) \in X} \|W^{\top} x_i - W^{\top} x_j\|^2 - \frac{\alpha}{n_p} \sum_{(x_i, x_j) \in P} \|W^{\top} x_i - W^{\top} x_j\|^2.        (11)

Eq. (11) can be written in a more concise matrix form,

    J(W) = \frac{1}{2} \sum_{i,j} \|W^{\top} x_i - W^{\top} x_j\|^2 \, S_{ij},        (12)

where

    S_{ij} = -\alpha/n_p if (x_i, x_j) ∈ P;  1/n_n if (x_i, x_j) ∈ N;  \beta/n^2 otherwise.

Here the matrix S is similar to that in SH, although SH used an RBF kernel to measure the similarity between points. According to Eq. (12) we have

    J(W) = \operatorname{tr}(W^{\top} X D X^{\top} W) - \operatorname{tr}(W^{\top} X S X^{\top} W) = \operatorname{tr}(W^{\top} X L X^{\top} W),        (13)

where D is a diagonal matrix whose entries are the column (or row) sums of S, i.e. D_{ii} = \sum_j S_{ij}, and L = D - S is called the Laplacian matrix [33] in spectral graph theory. For this reason, we name the presented method semi-supervised spectral hashing.

Imposing orthogonality constraints on the projection directions, coupled with the unit-norm assumption, leads to the constraint W^{\top} W = I. Learning the optimal projections W then becomes a typical eigenproblem, which can be solved by an eigenvalue decomposition of the matrix A = X L X^{\top}:

    \max_{W^{\top} W = I} J(W) = \operatorname{tr}(W^{\top} A W), \qquad W = [e_1, \ldots, e_m],        (14)

where \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m are the top m eigenvalues of X L X^{\top} and e_k, k = 1, \ldots, m, are their corresponding eigenvectors.

3.3. Relaxing the orthogonality constraints

The orthogonality constraints on the projection directions were imposed in the previous section in order to derive the hash bits. However, these constraints sometimes incur more errors, especially in the higher bits, when converting the real-valued solution to the binary one; this phenomenon will be shown empirically in Section 4. It occurs because the orthogonality constraints force one to progressively pick directions of very low variance, substantially reducing the quality of the higher bits and hence of the whole embedding [31]. In practice, we have to trade off the novelty of each newly picked direction against its variance. Therefore, motivated by Wang et al. [31], we similarly convert the orthogonality constraints into a penalty term added to the objective function, rewriting it as

    J(W) = \operatorname{tr}(W^{\top} A W) - \frac{\theta}{2} \|W^{\top} W - I\|^2,        (15)

where θ is a positive parameter that modulates the orthogonality constraints. However, Eq. (15) is not easier to solve globally than the previous problem. Setting the derivative of the objective with respect to W to zero gives

    \frac{\partial J}{\partial W} = 2AW - 2\theta\,(W W^{\top} W - W) = 0 \;\Rightarrow\; \left( I + \frac{1}{\theta} A \right) W = W W^{\top} W.

Obviously, the above equation admits an unlimited number of solutions. However, if M = I + (1/\theta) A is positive definite, we can obtain a simple solution from the condition

    W W^{\top} W = \left( I + \frac{1}{\theta} A \right) W.        (16)

Since A is symmetric but not necessarily positive definite, M is symmetric too. We can easily find that M is positive definite if the coefficient θ satisfies certain conditions.

Proposition 1. The matrix M is positive definite if \theta > \max(0, -\lambda_{\min}), where \lambda_{\min} is the smallest eigenvalue of A.

Proof. Since A is symmetric, it can be represented as

    A = U \operatorname{diag}(\lambda_1, \ldots, \lambda_d)\, U^{\top},        (17)

where all \lambda_i are real. Let \lambda_{\min} = \min(\lambda_1, \ldots, \lambda_d). Then M can be rewritten as

    M = I + \frac{1}{\theta}\, U \operatorname{diag}(\lambda_1, \ldots, \lambda_d)\, U^{\top} = U \operatorname{diag}\!\left( 1 + \frac{\lambda_1}{\theta}, \ldots, 1 + \frac{\lambda_d}{\theta} \right) U^{\top}.

Since θ > 0, M has all eigenvalues positive if \lambda_{\min}/\theta + 1 > 0, i.e. if \theta > \max(0, -\lambda_{\min}). □

If M is positive definite, it can be decomposed as M = C C^{\top} using a Cholesky decomposition. It can easily be verified that Eq. (16) is satisfied when W = CU for an orthogonal U. To obtain a d × m matrix, we use a truncated matrix formed by selecting the first m columns of CU as a meaningful approximation, which leads to the final solution W = C U_m, where U_m contains the eigenvectors corresponding to the top m eigenvalues of A.
4. Experiments

In this section, we perform several experiments to study the effectiveness of the proposed Semi-Supervised Spectral Hashing (S3H).

4.1. Datasets

USPS: The USPS is a handwritten digit database. We use a popular subset containing 9298 16 × 16 handwritten digit images in total; each sample is associated with a label from 0 to 9. We randomly partition the dataset into two parts: a training set with 8298 samples and a test set with 1000 samples. For the semi-supervised case, the label information of 1000 samples from the training set is used and the others are treated as unlabeled; for the unsupervised case, all the training samples are treated as unlabeled.

ISOLET: The ISOLET is a spoken letter recognition database. The dataset was generated as follows: one hundred and fifty subjects spoke the name of each letter of the alphabet twice. Hence, the dataset has 26 categories and 7797 samples, and the dimension of every sample is 617. Similarly, we randomly partition the dataset into two parts: a training set with 6797 samples and a test set with 1000 samples. For the semi-supervised case, the label information of 1000 samples from the training set is used and the others are treated as unlabeled; for the unsupervised case, all the training samples are treated as unlabeled.

SIFT1M: The SIFT1M dataset is made of one million SIFT descriptors [37] extracted from random images. Each descriptor is a 128-dimensional histogram of gradient orientations. We employ one million samples for training and an additional 10 K as test points. Following the evaluation criterion in [20,31,32], the Euclidean distance is used to determine the nearest neighbors, and a returned point is considered a true neighbor if it lies within the top two percent of closest points to a query. For the semi-supervised methods, we randomly select 8 K points from the training set and use the same criterion to determine the positive pairs; for the negative pairs, a point is treated as belonging to a different class if it lies within the top two percent of farthest points.

For all three datasets, we test each compared method 10 times by randomly splitting the samples, and report the average results.

4.2. Evaluation metric

Hamming ranking is employed to evaluate the hashing performance. For a given dataset, we issue a query with each point in the test set, and all points in the training set are ranked according to their Hamming distances from the query. Due to the binary representation, Hamming ranking is essentially fast in practice.

Two performance metrics are used to evaluate Hamming ranking: the mean of average precision (MAP) and the precision-recall curve. Average precision is defined as

    \text{Average precision} = \frac{\sum_{i=1}^{n} r_i \times \text{precision@} i}{\sum_{i=1}^{n} r_i},        (18)

where r_i is set to one if the i-th point in the rank list has the same label as the query, and n is the size of the entire dataset. MAP is the mean of the average precision over all queries in the test set, which approximates the area under the precision-recall curve [38]. In our experiments, MAP is evaluated from 16 bits to 128 bits for all methods.

Precision and recall are computed as

    \text{Precision} = \frac{\#\{\text{true neighbors among the scanned points}\}}{r_s},        (19)

    \text{Recall} = \frac{\#\{\text{true neighbors among the scanned points}\}}{\#\{\text{true neighbors in total}\}},        (20)

where r_s is the number of scanned points in the rank list. The precision-recall curve is evaluated at 48 bits for each approach, and r_s is progressively increased to n.

4.3. Compared methods

To demonstrate the performance of our proposed Semi-Supervised Spectral Hashing method, we compare it against the state-of-the-art approaches mentioned in Section 2. To make the comparisons fair, we list the experimental settings for each method as follows:

LSH: Locality Sensitive Hashing proposed in [29]; projections are randomly drawn from a standard Gaussian.
Binarized-LSI: Semantic Hashing proposed in [19].
SH: Spectral Hashing proposed in [20].
SSH: Semi-Supervised Hashing proposed in [31].
PCAH: PCA Hashing proposed in [32].
S3H1: Our proposed S3H with orthogonality constraints; the parameters α, β are set to 1 and 10, respectively.
S3H2: Our proposed S3H without orthogonality constraints; the parameters α, β, θ are set to 1, 10 and 1, respectively.

All the parameters are carefully tuned by cross-validation, and the data points are normalized to unit norm.
4.4. Results

Fig. 1(a), (c) and (e) plots the MAP results for all the compared methods on the three datasets. We find that performance improves with the number of bits used for most methods. However, it can be clearly seen that S3H1, with orthogonality constraints, has superior performance at 16 bits on ISOLET and leads most methods on the remaining two datasets. Its performance drops significantly as the bit length grows, since the variance drops significantly under the orthogonality constraints. PCAH also suffers from this drawback, since the orthogonality constraints force it to progressively pick directions of very low variance, substantially reducing the quality of the higher bits, as discussed in Section 3.3. The two semi-supervised methods without orthogonality constraints, SSH and S3H2, outperform the other algorithms on all three datasets: they exploit the information of both labeled and unlabeled data while trading off the novelty of each newly picked direction against its variance. Moreover, our proposed S3H2 performs much better than SSH because of its more general Laplacian-matrix-based solution after relaxing the binary constraints.

Fig. 1(b), (d) and (f) shows the precision-recall curves for 48 bits on the three datasets. Similar to the MAP results, it can be clearly seen that our proposed method S3H2 is superior to the other compared methods. The higher precision and recall of S3H2 indicate the advantage of the novel semi-supervised hashing algorithm.

5. Conclusion and future work

In this paper, a semi-supervised spectral hashing method is proposed to take advantage of the information of both labeled and unlabeled data. We use the square of the Euclidean distance to measure the Hamming distance, which leads to a more general Laplacian-matrix-based solution after the relaxation obtained by removing the binary constraints. We also relax the orthogonality constraints to reduce the error incurred when converting the real-valued solution to the binary one. The experimental evaluations on
three benchmark datasets show the superior performance of the proposed method over the state-of-the-art approaches. In the future, we will extend the proposed approach to the unsupervised case and apply it to some interesting applications in multimedia information retrieval.

Fig. 1. Results on the three datasets. (a), (c) and (e): MAP results for different numbers of bits on USPS, ISOLET and SIFT1M. (b), (d) and (f): precision-recall curves for 48 bits on USPS, ISOLET and SIFT1M.

Acknowledgments

The authors appreciate the reviewers for their extensive and informative comments for the improvement of this manuscript.

References

[1] M. Henzinger, Finding near-duplicate web pages: a large-scale evaluation of algorithms, in: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '06, ACM, New York, NY, USA, 2006, pp. 284-291.
[2] …documents, in: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '07, ACM, New York, NY, USA, 2007, pp. 825-826.
[3] Y. Koren, Factorization meets the neighborhood: a multifaceted collaborative filtering model, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '08, ACM, New York, NY, USA, 2008, pp. 426-434.
[4] S. Pandey, A. Broder, F. Chierichetti, V. Josifovski, R. Kumar, S. Vassilvitskii, Nearest-neighbor caching for content-match applications, in: Proceedings of the 18th International Conference on World Wide Web, WWW '09, ACM, New York, NY, USA, 2009, pp. 441-450.
[5] M.S. Lew, N. Sebe, C. Djeraba, R. Jain, Content-based multimedia information retrieval: state of the art and challenges, ACM Trans. Multimedia Comput. Commun. Appl. 2 (2006) 1-19.
[6] A. Torralba, R. Fergus, W.T. Freeman, 80 million tiny images: a large data set for nonparametric object and scene recognition, IEEE Trans. Pattern Anal. Mach. Intell. 30 (2008) 1958-1970.
[7] H. Jégou, M. Douze, C. Schmid, Packing bag-of-features, in: IEEE 12th International Conference on Computer Vision, 2009, pp. 2357-2364.
[8] A.M. Bronstein, M.M. Bronstein, L.J. Guibas, M. Ovsjanikov, Shape Google: geometric words and expressions for invariant shape retrieval, ACM Trans. Graph. 30 (2011) 1:1-1:20.
[9] J.H. Friedman, J.L. Bentley, R.A. Finkel, An algorithm for finding best matches in logarithmic expected time, ACM Trans. Math. Softw. 3 (1977) 209-226.
[10] S. Arya, D.M. Mount, N.S. Netanyahu, R. Silverman, A.Y. Wu, An optimal algorithm for approximate nearest neighbor searching in fixed dimensions, J. ACM 45 (1998) 891-923.
[11] P. Ciaccia, M. Patella, P. Zezula, M-tree: an efficient access method for similarity search in metric spaces, in: Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB), pp. 426-435.
[12] A. Beygelzimer, S. Kakade, J. Langford, Cover trees for nearest neighbor, in: Proceedings of the 23rd International Conference on Machine Learning, ICML '06, ACM, New York, NY, USA, 2006, pp. 97-104.
[13] J.K. Uhlmann, Satisfying general proximity/similarity queries with metric trees, Inf. Process. Lett. 40 (1991) 175-179.
[14] M.S. Charikar, Similarity estimation techniques from rounding algorithms, in: Proceedings of the 34th Annual ACM Symposium on Theory of Computing, STOC '02, ACM, New York, NY, USA, 2002, pp. 380-388.
[15] P. Indyk, R. Motwani, Approximate nearest neighbors: towards removing the curse of dimensionality, in: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, STOC '98, ACM, New York, NY, USA, 1998, pp. 604-613.
[16] M. Datar, N. Immorlica, P. Indyk, V.S. Mirrokni, Locality-sensitive hashing scheme based on p-stable distributions, in: Proceedings of the 20th Annual Symposium on Computational Geometry, SCG '04, ACM, New York, NY, USA, 2004, pp. 253-262.
[17] P. Jain, B. Kulis, K. Grauman, Fast image search for learned metrics, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008.
[18] K. Grauman, T. Darrell, Pyramid match hashing: sub-linear time indexing over partial correspondences, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007.
[19] R. Salakhutdinov, G. Hinton, Semantic hashing, Int. J. Approx. Reason. 50 (2009) 969-978 (Special Section on Graphical Models and Information Retrieval).
[20] Y. Weiss, A.B. Torralba, R. Fergus, Spectral hashing, in: Advances in Neural Information Processing Systems (NIPS), vol. 21, Vancouver, Canada, pp. 1753-1760.
[21] A. Torralba, R. Fergus, Y. Weiss, Small codes and large image databases for recognition, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2008, pp. 1-8.
[22] B. Kulis, T. Darrell, Learning to hash with binary reconstructive embeddings, in: Advances in Neural Information Processing Systems (NIPS), pp. 1042-1050.
[23] J. He, W. Liu, S.F. Chang, Scalable similarity search with optimized kernel hashing, in: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '10, ACM, New York, NY, USA, 2010, pp. 1129-1138.
[24] D. Zhang, J. Wang, D. Cai, J. Lu, Self-taught hashing for fast similarity search, in: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '10, ACM, New York, NY, USA, 2010, pp. 18-25.
[25] D. Knuth, The Art of Computer Programming, vol. 3, Addison-Wesley, 1997.
[26] P. Wegner, A technique for counting ones in a binary computer, Commun. ACM 3 (1960) 322.
[27] B. Stein, Principles of hash-based text retrieval, in: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '07, ACM, New York, NY, USA, 2007, pp. 527-534.
[28] T. Mitchell, Machine Learning, McGraw-Hill, 1997 (international edition).
[29] A. Gionis, P. Indyk, R. Motwani, Similarity search in high dimensions via hashing, in: Proceedings of the 25th International Conference on Very Large Data Bases, VLDB '99, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1999, pp. 518-529.
[30] C. Silpa-Anan, R. Hartley, Optimised KD-trees for fast image descriptor matching, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008.
[31] J. Wang, S. Kumar, S.F. Chang, Semi-supervised hashing for scalable image retrieval, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, pp. 3424-3431.
[32] J. Wang, S. Kumar, S.F. Chang, Sequential projection learning for hashing with compact codes, in: Proceedings of the 27th International Conference on Machine Learning (ICML), Haifa, Israel.
[33] R. Merris, Laplacian matrices of graphs: a survey, Linear Algebra Appl. (1994) 143-176.
[34] M.W. Berry, S.T. Dumais, G.W. O'Brien, Using linear algebra for intelligent information retrieval, SIAM Rev. 37 (4) (1995) 573-595.
[35] S. Deerwester, S.T. Dumais, G.W. Furnas, T.K. Landauer, R. Harshman, Indexing by latent semantic analysis, J. Am. Soc. Inf. Sci. 41 (6) (1990) 391-407.
[36] S. Baluja, M. Covell, Learning to hash: forgiving hash functions and applications, Data Min. Knowl. Discov. 17 (2008) 402-430.
[37] D.G. Lowe, Object recognition from local scale-invariant features, in: IEEE International Conference on Computer Vision, vol. 2, 1999, p. 1150.
[38] A. Turpin, F. Scholer, User performance versus precision measures for simple search tasks, in: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '06, ACM, New York, NY, USA, 2006, pp. 11-18.

Biographies

Chengwei Yao received the master degree in Computer Science from Zhejiang University, China, in 2000. He is currently a candidate for a PhD degree in Computer Science at Zhejiang University. His research interests include data mining and software engineering.

Jiajun Bu received the BS and PhD degrees in Computer Science from Zhejiang University, China, in 1995 and 2000, respectively. He is a professor in the College of Computer Science, Zhejiang University. His research interests include embedded systems, data mining, information retrieval and mobile databases.

Chenxia Wu is currently a Master candidate in the College of Computer Science at Zhejiang University. He received his BS degree in Computer Science from Southeast University, China. His research interests include machine learning, computer vision and multimedia information retrieval.

…College of Computer Science, Zhejiang University. His research interests include DBMS, data mining, CSCW and information retrieval.

Points/C-coins required: 5 | Uploaded: 2019-08-12 | Size: 311 KB
