manifold learning


Quantitative Analysis of Nonlinear Embedding (Zhang et al., IEEE Transactions on Neural Networks, vol. 22, no. 12, December 2011)
Definition 1: The MF is the volumetric stretching degree of transforming a differential element dz in R^d to dx in R^n, where z is the low-dimensional counterpart of the original data point x. Assuming that the transformation function f is differentiable at a low-dimensional sample z, since

    dx = (\nabla_z f) dz,    (3)

the MF at point x is [30]

    MF(x) = \sqrt{\det((\nabla_z f)^\top \nabla_z f)}.    (4)

Here we give a simple example to illustrate the computation. Assume that a sample x in R^3 is mapped into the low-dimensional space and that its counterpart is z in R^2. If z = (z^{(1)}, z^{(2)}), then the derivatives of x with respect to the two coordinates are

    v_{1x} = \partial x / \partial z^{(1)},    (5)
    v_{2x} = \partial x / \partial z^{(2)}.    (6)

As illustrated in Fig. 1, the area of the parallelogram spanned by these two vectors is A_x = ||v_{1x}|| ||v_{2x}|| \sin\theta, where \theta is the angle between v_{1x} and v_{2x}. Therefore,

    A_x^2 = ||v_{1x}||^2 ||v_{2x}||^2 \sin^2\theta = ||v_{1x}||^2 ||v_{2x}||^2 - \langle v_{1x}, v_{2x} \rangle^2.    (9)

Since in the low-dimensional space the changes along the direction of z^{(1)} and that of z^{(2)} are orthogonal, we have A_z = ||v_{1z}|| ||v_{2z}||. Therefore,

    MF^2 = A_x^2 / A_z^2 = ( ||v_{1x}||^2 ||v_{2x}||^2 - \langle v_{1x}, v_{2x} \rangle^2 ) / ( ||v_{1z}||^2 ||v_{2z}||^2 ).    (10)

[Fig. 2. Example of the magnification factor.]

Considering the Jacobi matrix

    J = ( \partial x_a / \partial z^{(b)} ),    (11)

a matrix with n rows and two columns, (10) can be rewritten as

    MF = \sqrt{\det(J^\top J)} = \sqrt{\det((\nabla_z f)^\top \nabla_z f)}.    (12)

It is not difficult to see that the MF is the square root of the determinant of the first fundamental form in differential geometry. Assuming z in R^2, it is also easy to prove that the area of the manifold is

    \int \sqrt{\det((\nabla_z f)^\top \nabla_z f)} \, dz.    (13)

It is worth noting that although in this example the MF is calculated for a 2-D case, the computation of (12) is independent of the dimension of the embedding space.

III. PROPOSED EVALUATION CRITERIA

In this section, we propose four evaluation criteria to measure the embedding qualities of the embedding algorithms:
1) ALSTD and ALED: measuring the global smoothness (GS);
2) ALCD: measuring the co-directional consistence (CD);
3) GSCD: measuring the GS and CD simultaneously.

A. ALSTD and ALED

To estimate the global smoothness, a crucial issue is that the low-dimensional representations obtained by different nonlinear embedding algorithms have different scaling levels, which makes a unified evaluation difficult. We thus propose an affine-invariant criterion, the ALSTD:

    ALSTD = (1/N) \sum_{i=1}^{N} STD_i,    (1)

where

    STD_i = \sqrt{ (1/k_i) \sum_{x_j \in N(x_i)} \big( \log MF(x_j) - (1/k_i) \sum_{x_j \in N(x_i)} \log MF(x_j) \big)^2 }.    (2)

Here k_i represents the number of neighboring samples in the neighborhood N(x_i) of the original high-dimensional sample x_i, N is the number of samples, and the MF measures the degree of stretching or compression of each local region with respect to its original high-dimensional space.

Proposition 1: The criterion ALSTD is affine-invariant.

Proof: Let z' = Dz be an affine transformation of z, where D is a non-singular matrix. By the chain rule, the Jacobians with respect to z and z' differ by the constant factor D. Let MF_z(x) denote the MF parameterized with z and MF_{z'}(x) the MF parameterized with z'. Then, by the definition in (12), MF_z(x) = |\det D| \, MF_{z'}(x). By the definition of ALSTD, considering one term in (2),

    \sum_{x_j \in N(x_i)} ( \log MF_z(x_j) - \overline{\log MF_z} )^2 = \sum_{x_j \in N(x_i)} ( \log MF_{z'}(x_j) - \overline{\log MF_{z'}} )^2,    (14)

where the overline means the average over N(x_i). The equality holds because

    \overline{\log MF_z} = (1/k_i) \sum_{x_j \in N(x_i)} \log MF_z(x_j) = \log|\det D| + (1/k_i) \sum_{x_j \in N(x_i)} \log MF_{z'}(x_j) = \log|\det D| + \overline{\log MF_{z'}},    (15)

so the constant \log|\det D| cancels out.
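As an illustration of (1), (2), and (12), the computation reduces to a determinant (equivalently, a product of singular values) per point, followed by a per-neighborhood standard deviation of log MF. The minimal NumPy sketch below assumes the Jacobians and the neighbor index sets have already been obtained; the function names and data layout are ours, not the authors' released MATLAB code.

```python
import numpy as np

def magnification_factor(J):
    """MF at one point, eq. (12): sqrt(det(J^T J)) for an n x d Jacobian J.
    Equivalently, the product of the singular values of J."""
    return float(np.sqrt(np.linalg.det(J.T @ J)))

def alstd(log_mf, neighborhoods):
    """ALSTD, eqs. (1)-(2): average over samples of the standard deviation
    of log MF inside each neighborhood N(x_i).

    log_mf        : (N,) array of log MF(x_i) values
    neighborhoods : list of index arrays, neighborhoods[i] = indices of N(x_i)
    """
    local_std = [np.std(log_mf[idx]) for idx in neighborhoods]  # population std, as in (2)
    return float(np.mean(local_std))
```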
Consequently, we can use the ALSTD criterion to compare the embedding qualities of different nonlinear embedding algorithms without suffering from the influence of affine transformations. Similarly, we can define an alternative affine-invariant evaluation criterion, the ALED:

    ALED = (1/N) \sum_{i=1}^{N} \big( \max_{x_j \in N(x_i)} \log MF(x_j) - \min_{x_j \in N(x_i)} \log MF(x_j) \big).

The affine invariance of the ALED can be proved on the same principle as that of the ALSTD. Intuitively, the two proposed criteria measure the global smoothness, since the MF characterizes the first-order derivative along the manifold. When the ALSTD or the ALED is small, it means that the learned manifold is smooth in a statistical sense. Different from our previous work [18], the proposed criteria are not sensitive to affine transformations and represent the degree of global smoothness. Consequently, they can be used as a unified evaluation framework for nonlinear embedding algorithms.

B. ALCD and GSCD

A main disadvantage of the ALSTD and the ALED is that they cannot identify which direction dominates the change of global smoothness. Assuming that the MF is four, the corresponding stretching along the two principal directions of the original 2-D space can be 1 x 4 or 2 x 2. Therefore, we define a new criterion to compute the local co-directional consistence of the embedding algorithms as follows:

    CD_i = \max_{||w||=1} (1/N_i) \sum_{j} \langle w, v_j \rangle^2 \in [0, 1],    (17)

where N_i is the number of data points in the neighborhood of point z_i, v_j is one of the PSDs at point z_j, and the vector w has the minimum angle with all the vectors v_j, j = 1, ..., N_i, inside the neighborhood N(z_i).

To compute \nabla_z f in (3), we employ a similar strategy to our previous work [18]. First, we use the radial basis function (RBF), which is proved to be a universal approximator [32], to learn an inverse mapping function from the low-dimensional data to their original counterparts. Specifically, let the inverse mapping function f: R^d -> R^n be

    f(z) = A \kappa(z),  \kappa(z) = ( k(z, z_1), ..., k(z, z_N) )^\top,    (16)

where k(z, z') = \exp( -||z - z'||^2 / (2\sigma^2) ), and n and d denote the dimensions of the original space and the reduced space, respectively. Unlike our previous work, where the parameter \sigma is predefined [18], we tune it with cross-validation. Because the low-dimensional counterpart Z of the data matrix X can be attained using some nonlinear embedding algorithm, the Gram matrix is K = \kappa(Z), and the inverse mapping matrix is A = X K^+, where '+' denotes the Moore-Penrose inverse. Since the RBF network is employed to fit the function f, we have

    \nabla_z f = A \nabla_z \kappa(z) = -(1/\sigma^2) A \big( k(z, z_1)(z - z_1), ..., k(z, z_N)(z - z_N) \big)^\top,

which can also be written compactly in terms of \mathrm{diag}(\kappa(z)) and a vector of ones e.

PSDs: when a sample z is inversely mapped into the high-dimensional space, its PSDs are the dominant eigenvectors of the matrix (\nabla_z f)^\top (\nabla_z f). Actually, if we let \nabla_z f = U \Lambda V^\top, with \Lambda = \mathrm{diag}(\lambda_1, ..., \lambda_d), be the singular value decomposition of the Jacobi matrix \nabla_z f, then the PSDs at z are the column vectors of V, and the MF is the product of the singular values, \prod_i \lambda_i [31]. In this way, it is easy to compute the PSDs and the MF.
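The inverse-mapping construction in (16) and the extraction of the PSDs and the MF from the Jacobian could be sketched as follows. The variable names, the dense pseudo-inverse, and the column-wise data layout are assumptions on our part; the authors' implementation may differ.

```python
import numpy as np

def fit_rbf_inverse_map(X, Z, sigma):
    """Fit f(z) = A kappa(z) with kappa_i(z) = exp(-||z - z_i||^2 / (2 sigma^2)).

    X : (n, N) high-dimensional samples (columns), Z : (d, N) embeddings (columns).
    Returns the coefficient matrix A = X K^+ (Moore-Penrose pseudo-inverse)."""
    d2 = np.sum((Z[:, :, None] - Z[:, None, :]) ** 2, axis=0)    # (N, N) pairwise squared distances
    K = np.exp(-d2 / (2.0 * sigma ** 2))                         # Gram matrix K = kappa(Z)
    return X @ np.linalg.pinv(K)

def jacobian_psd_mf(A, Z, z, sigma):
    """Jacobian of f at z, plus PSDs (right singular vectors) and MF (product of singular values)."""
    diff = z[:, None] - Z                                        # (d, N), columns z - z_i
    k = np.exp(-np.sum(diff ** 2, axis=0) / (2.0 * sigma ** 2))  # kernel values k(z, z_i)
    dK = -(1.0 / sigma ** 2) * (diff * k)                        # column i is -k(z, z_i)(z - z_i)/sigma^2
    J = A @ dK.T                                                 # (n, d) Jacobian of f at z
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    return J, Vt.T, float(np.prod(s))                            # PSDs = columns of V, MF = prod of singular values
```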
Intuitively, CD_i measures the degree of co-directional consistence between the PSDs at each point and those at its neighboring points, and it can be solved via

    (1/N_i) \sum_{j} \langle w, v_j \rangle^2 = w^\top W W^\top w / N_i,    (18)

where w is the maximal eigenvector of W W^\top and W = (v_1, ..., v_{N_i}). Since the optimal vector w is unknown, and which PSD of each point has the minimal angle with w cannot be determined in advance, we employ an iterative strategy to solve for w. Specifically, we randomly select a vector w and identify, for each point, the PSD v_j that has the minimal angle with w. Then we utilize the first eigenvector of W W^\top to update w. After several iterations, we obtain a suboptimal vector w with fast convergence. The reason for using the local co-directional consistence is that for some topological structures, such as a torus, it is unreasonable to estimate a co-directional consistence over the whole space. To make a trade-off between the global and the local co-directional consistence, we define a global evaluation criterion based on CD_i, the ALCD:

    ALCD = (1/N) \sum_{i=1}^{N} CD_i.    (19)

It is easy to see that the ALCD measures the average co-directional consistence of the PSDs in the whole space.

Since we want to measure both the GS and the CD simultaneously, we also design a simple yet effective criterion, the GSCD:

    GSCD = (1/N) \sum_{i=1}^{N} ( STD_i / CD_i ).    (20)

One possible advantage of the GSCD is that it can help us partially alleviate the out-of-sample problem: when the GSCD is low, an out-of-sample point can be mapped to its practical location with high probability, up to scale, rotation, and translation factors. For a better understanding of the proposed criteria, pseudo-code for the GSCD is given in Algorithm 1. A MATLAB implementation of the proposed criteria is available from http://www.iipl.fudan.edu.cn/zhangjp/sourcecode/GSCD.rar.

Algorithm 1: get_GSCD
Input: high-dimensional samples X, the MFs, and the PSDs of X
Output: the GSCD value
1: for each sample x_i do
2:   generate the signature of x_i (its MF and PSDs) from z_i
3: end for
4: for i = 1 to N do
5:   find the k-nearest neighbors of x_i: N(x_i) = {x_j}
6:   compute the standard deviation of the logarithm of {MF_j}: lstd_i
7:   (Algorithm 2) compute the co-directional consistence of {PSD_j}: lcd_i
8: end for
9: compute GSCD = Average({lstd_i / lcd_i})

Algorithm 2: get_lcd
Input: a set of PSDs {PSD_j}, j = 1, ..., N_i
Output: the co-directional consistence lcd
1: initialize w with a randomly selected unit vector
2: while 1 do
3:   w_old = w
4:   identify v_j as the vector of PSD_j that has the minimal angle with w
5:   utilize the maximal eigenvector of W W^T to update w, where W = (v_1, ..., v_{N_i})
6:   if 1 - w^T w_old < 10^{-7} then
7:     break
8:   end if
9: end while
10: compute lcd = w^T W W^T w / N_i
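A hedged sketch of Algorithms 1 and 2 in NumPy follows: each neighbor contributes the PSD closest in angle to the current w, the top eigenvector of W W^T updates w, and the GSCD is the average of the per-point ratios lstd_i / lcd_i. The sign handling of the PSDs and the iteration cap are our additions.

```python
import numpy as np

def local_cd(psd_list, tol=1e-7, max_iter=100):
    """Algorithm 2: co-directional consistence lcd of one neighborhood.

    psd_list : list of (d, d) arrays; psd_list[j] holds the PSDs (columns) of the j-th neighbor."""
    d = psd_list[0].shape[0]
    w = np.random.randn(d)
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        w_old = w
        V = []
        for P in psd_list:
            j = np.argmax(np.abs(P.T @ w))          # PSD with minimal angle to w (largest |cosine|)
            v = P[:, j]
            V.append(v if v @ w >= 0 else -v)       # PSDs are defined up to sign; orient toward w
        W = np.stack(V, axis=1)                     # (d, N_i)
        eigvals, eigvecs = np.linalg.eigh(W @ W.T)  # symmetric; eigenvalues ascending
        w = eigvecs[:, -1]                          # maximal eigenvector updates w
        if 1.0 - abs(w @ w_old) < tol:
            break
    return float(w @ W @ W.T @ w / W.shape[1])      # lcd = w^T W W^T w / N_i, eq. (18)

def gscd(lstd, lcd):
    """Algorithm 1 / eq. (20): GSCD = average of lstd_i / lcd_i over all samples."""
    return float(np.mean(np.asarray(lstd) / np.asarray(lcd)))
```

Because the PSDs are only defined up to sign, the sketch flips each selected vector to have a non-negative inner product with w before stacking; without this step the eigenvector update can oscillate.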
Fig. 3. Visualization results based on PSDs. From left to right, from top to bottom: the original S-curve data, the embedding results, and the corresponding PSDs obtained by different nonlinear embedding techniques. The title of each subplot denotes the name of the nonlinear embedding algorithm and its GSCD result.

Fig. 4. Visualization results based on PSDs. From left to right, from top to bottom: the original 3-D Swiss-roll data, the embedding results, and the corresponding PSDs obtained by different nonlinear embedding techniques.

Fig. 5. Visualization results based on PSDs. From left to right, from top to bottom: the original punctured-sphere dataset, the embedding results, and the corresponding PSDs obtained by different nonlinear embedding techniques. The title of each subplot indicates the abbreviation of the nonlinear embedding algorithm and the GSCD result.

Fig. 6. Visualization results based on PSDs. From left to right, from top to bottom: the original twin-peaks dataset, the embedding results, and the corresponding PSDs obtained by different nonlinear embedding techniques. The title of each subplot indicates the abbreviation of the nonlinear embedding algorithm and the GSCD result.

Fig. 7. Visualization results of the sculpture-face dataset based on PSDs. From left to right, from top to bottom: the embedding results and the corresponding PSDs obtained by different nonlinear embedding techniques. The title of each subplot indicates the abbreviation of the nonlinear embedding algorithm and the GSCD result.

IV. EXPERIMENTS

In the experiments, we evaluate 11 published nonlinear embedding algorithms on five datasets. The 11 algorithms are:
1) five global methods, i.e., Isomap [9], diffusion maps [21], MVU [10] and its landmark version (L-MVU) [33], and Sammon mapping [34];
2) six local methods, i.e., HLLE [19], LTSA [20], t-SNE [15], LLE [12], LE [11], and CFM [35].

We employ these algorithms to reduce the original data to 2-D spaces. Note that for the practical dataset (the sculpture faces) we tune the parameter k within the interval [4, 24] and report the best ALCD and GSCD results; the other parameters are set to the default values in Table I.

TABLE I
Parameter settings. Here k and k_i (i = 1, 2) denote different neighbor factors, Perp denotes the perplexity of the conditional probability distribution, sigma is the variance of the Gaussian, and t denotes the number of iterations.

  Algorithms                                  Parameters
  LLE [12], LE [11], MVU [10], CFM [35]       k
  Isomap [9], HLLE [19], LTSA [20]            k = 8
  Sammon mapping [34]                         None
  t-SNE [15]                                  Perp
  Diffusion maps [21]                         sigma, t
  Landmark MVU [33]                           k1 = 3, k2 = 11

TABLE II
Five benchmark datasets. Here 'Dim' means the dimension of the original data and 'Num' means the number of samples.

  Data set            Dim     Num
  s-curve             3       500
  swiss roll          3       -
  punctured sphere    3       900
  twin-peaks          3       800
  sculpture           4096    698

The five benchmark datasets shown in Table II include four synthetic datasets (s-curve, swiss roll, punctured sphere, and twin peaks), which are generated by the Mani MATLAB demo [36], and the sculpture-face images, whose intrinsic dimension is close to two. It is also worth pointing out that for the practical dataset we employ PCA to reduce the data into a 40-D subspace. Furthermore, the parameter sigma of the RBF network is selected by cross-validation within the interval [10^{-8}, 10^{9}].
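The cross-validation protocol for sigma is not detailed in this excerpt, so the following is only an illustrative hold-out scheme that scores each candidate sigma by the reconstruction error of the inverse map on held-out samples; the split ratio and the error measure are assumptions.

```python
import numpy as np

def select_sigma(X, Z, candidates, val_fraction=0.2, seed=0):
    """Pick the RBF width sigma minimizing hold-out reconstruction error ||x - A kappa(z)||^2.

    X : (n, N) high-dimensional data (columns), Z : (d, N) low-dimensional embeddings (columns)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[1])
    n_val = max(1, int(val_fraction * len(idx)))
    val, tr = idx[:n_val], idx[n_val:]
    best_sigma, best_err = None, np.inf
    for sigma in candidates:
        d2_tr = np.sum((Z[:, tr][:, :, None] - Z[:, tr][:, None, :]) ** 2, axis=0)
        A = X[:, tr] @ np.linalg.pinv(np.exp(-d2_tr / (2 * sigma ** 2)))   # A = X K^+, as in (16)
        d2_val = np.sum((Z[:, tr][:, :, None] - Z[:, val][:, None, :]) ** 2, axis=0)
        K_val = np.exp(-d2_val / (2 * sigma ** 2))                          # kernel between train and held-out points
        err = np.mean(np.sum((X[:, val] - A @ K_val) ** 2, axis=0))         # reconstruction error on held-out points
        if err < best_err:
            best_sigma, best_err = sigma, err
    return best_sigma
```

A call such as select_sigma(X, Z, np.logspace(-8, 9, 18)) would cover the interval quoted above on a logarithmic grid.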
A. Visualization and Quantitative Evaluation

We show the visualization results based on PSDs in Figs. 3-7. In the figures, the longer line at each sample represents the first PSD, and the second line is orthogonal to the first PSD. Furthermore, each nonlinear embedding algorithm and the corresponding GSCD result are shown in the title of each subplot. Note that for the real dataset the results of HLLE and diffusion maps are not reported, due to problems such as running out of memory.

TABLE III
ALCD (upper part) and GSCD (lower part) results for the five benchmark datasets and the 11 nonlinear embedding techniques.

ALCD
  Dataset        LLE    Isomap  LE     LTSA     HLLE    MVU      Sammon  t-SNE  CFM    DFM    LMVU
  S-curve        0.977  0.869   0.919  0.98     0.997   0.999    0.975   0.889  0.922  0.966  0.977
  Swiss          0.991  0.938   0.985  1.000    1.000   1.000    0.876   0.883  0.996  0.91   0.936
  Punc-sphere    0.968  0.971   0.952  0.973    0.969   0.922    0.974   0.939  0.955  0.926  0.908
  Twin peaks     0.954  0.911   0.939  0.964    0.973   0.873    0.87    0.88   0.946  0.866  0.99
  Sculpture      0.898  0.95    0.956  0.981    N/A     -        0.883   0.892  0.965  N/A    0.923

GSCD
  Dataset        LLE    Isomap  LE     LTSA     HLLE    MVU      Sammon  t-SNE  CFM    DFM    LMVU
  S-curve        0.116  0.188   0.732  0.017    0.024   0.0329   0.243   0.341  0.797  0.218  0.373
  Swiss          0.129  0.34    0.431  0.00837  0.0108  0.00261  0.693   0.328  0.574  0.512  0.173
  Punc-sphere    0.341  0.146   0.398  0.1      0.537   0.338    0.0307  0.307  0.412  0.335  0.499
  Twin peaks     0.318  0.296   0.63   0.119    0.111   0.467    0.446   0.343  0.629  0.531  0.578
  Sculpture      0.467  0.311   0.422  0.35     N/A     0.292    0.415   0.435  0.342  N/A    0.475

Fig. 8. With a significance level of 5%, grouping of the algorithms according to the paired t-test on the ALCD (top) and the GSCD (bottom). The embedding algorithms within a group are statistically indistinguishable.

Since the intrinsic dimensions of the s-curve and swiss-roll datasets are definitely known to be two, these datasets help us understand the embedding performances and the corresponding quantitative results of the nonlinear embedding algorithms. As shown in Fig. 3, LTSA, HLLE, and MVU perform very well in terms of the GSCD. By contrast, LE, CFM, and LMVU produce distinct non-uniform stretching or compression. In Fig. 4, MVU, LTSA, and HLLE are still ranked in the first three. It is to be noted that although, in a visual sense, LMVU seems better than LLE at unravelling the data, its GSCD is worse than that of LLE. As for the punctured sphere and twin peaks, the intrinsic dimensions are not definitely considered to be two, so there may be no absolute standard to tell whether an embedding result is good or not; instead, we can compare the algorithms on these two datasets in terms of the GSCD. For the embedding results of the real sculpture-face dataset in Fig. 7, MVU and Isomap defeat all other algorithms, as they flatten the data evenly into a space very close to a square in shape. In contrast, serious squeezing and compression appear in the results of LLE and LMVU. In summary, we can conclude from the figures that: 1) most of the embedding algorithms show a good co-directional consistence, and 2) the smaller the deviation of the logarithmic MFs, the smaller the GSCD and the higher the global smoothness.

We also report all the quantitative ALCD and GSCD results in Table III. It can be seen from the table that: 1) Isomap pays less attention to the preservation of local structure, so its co-directional consistence is relatively low except for several ideal datasets; a similar phenomenon can be observed for diffusion maps and Sammon mapping, since they try to preserve different global distances; 2) MVU regards the preservation of local structures as constraint terms and utilizes the optimization function to preserve the global structures; consequently, such a trade-off between the preservation of global and local structures leads to a better GSCD; when the landmark strategy is employed, however, the performance of LMVU degenerates, since the landmark technique impairs the preservation of the local topological structure; 3) HLLE focuses on second-order smoothness and thus has a better GSCD, but its performance is sensitive to the dataset; 4) since LTSA employs an alignment strategy to align the local subspaces, it achieves better co-directional consistence and global smoothness on most datasets; and 5) t-SNE pays more attention to alleviating the crowding problem, which means that data of intrinsically high dimension will be crowded when projected into a lower-dimensional space; consequently, the GS and the CD are less considered.

Moreover, we perform the paired t-test on the ALCD and GSCD results, as shown in Table IV and Fig. 8. With a significance level of 5%, we find that: 1) in ALCD, the five mentioned global methods have similar performance to each other; meanwhile, the local method LLE performs similarly to the other four local methods, including LE, LTSA, HLLE, and CFM, plus MVU. Although categorized as a global method, MVU regards the preservation of local structures as constraint terms and utilizes the optimization function to preserve the global structures; consequently, MVU plays the role of a bridge between the local and the global methods in ALCD; and 2) in GSCD, such differences between the global and the local methods are decreased, which means that the two strategies may consider different trade-offs between the GS and the CD. Therefore, the proposed unified evaluation criteria provide a novel geometrical insight into these embedding algorithms. The ALSTD and ALED give similar results, as shown in Table V.
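The grouping in Fig. 8 relies on paired t-tests over the per-dataset ALCD and GSCD values. A minimal sketch with SciPy is given below; the data layout is assumed, and dataset entries marked N/A would have to be dropped from both vectors before testing.

```python
import numpy as np
from scipy.stats import ttest_rel

def pairwise_ttest(scores, alpha=0.05):
    """scores: dict {algorithm: array of per-dataset criterion values, same dataset order}.
    Returns p-values for all algorithm pairs and whether they differ at level alpha."""
    algs = list(scores)
    result = {}
    for i, a in enumerate(algs):
        for b in algs[i + 1:]:
            _, p = ttest_rel(scores[a], scores[b])   # paired t-test over the shared datasets
            result[(a, b)] = (p, p < alpha)          # True means significantly different
    return result
```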
TABLE IV
p-values of the paired t-tests for the ALCD (upper part) and the GSCD (lower part) over the 11 published nonlinear embedding techniques. A p-value below 0.05 indicates a rejection of the null hypothesis at the 5% significance level, i.e., a significant difference between the two algorithms.

ALCD
  Algorithm  LLE    Isomap  LE     LTSA   HLLE   MVU    Sammon  t-SNE  CFM    DFM    LMVU
  LLE        1.000  0.152   0.744  0.304  0.183  0.933  0.191   0.016  0.969  0.172  0.250
  Isomap     0.152  1.000   0.176  0.023  0.015  0.293  0.864   0.266  0.124  0.891  0.584
  LE         0.744  0.176   1.000  0.092  0.043  0.889  0.234   0.008  0.732  0.210  0.290
  LTSA       0.304  0.023   0.092  1.000  0.602  0.421  0.058   0.000  0.196  0.052  0.017
  HLLE       0.183  0.015   0.043  0.602  1.000  0.310  0.043   0.000  0.098  0.039  0.009
  MVU        0.933  0.293   0.889  0.421  0.310  1.000  0.289   0.080  0.950  0.279  0.450
  Sammon     0.191  0.864   0.234  0.058  0.043  0.289  1.000   0.502  0.179  0.966  0.553
  t-SNE      0.016  0.266   0.008  0.000  0.000  0.080  0.502   1.000  0.006  0.425  0.056
  CFM        0.969  0.124   0.732  0.196  0.098  0.950  0.179   0.006  1.000  0.158  0.193
  DFM        0.172  0.891   0.210  0.052  0.039  0.279  0.966   0.425  0.158  1.000  0.543
  LMVU       0.250  0.584   0.290  0.017  0.009  0.450  0.553   0.056  0.193  0.543  1.000

GSCD
  Algorithm  LLE    Isomap  LE     LTSA   HLLE   MVU    Sammon  t-SNE  CFM    DFM    LMVU
  LLE        1.000  0.822   0.030  0.127  0.498  0.683  0.503   0.328  0.031  0.256  0.171
  Isomap     0.822  1.000   0.012  0.102  0.550  0.772  0.392   0.070  0.018  0.155  0.084
  LE         0.030  0.012   1.000  0.002  0.058  0.032  0.265   0.060  0.795  0.260  0.318
  LTSA       0.127  0.102   0.002  1.000  0.725  0.357  0.097   0.017  0.003  0.026  0.012
  HLLE       0.498  0.550   0.058  0.725  1.000  0.729  0.282   0.243  0.047  0.177  0.143
  MVU        0.683  0.772   0.032  0.357  0.729  1.000  0.359   0.244  0.028  0.184  0.131
  Sammon     0.503  0.392   0.265  0.097  0.282  0.359  1.000   0.902  0.215  0.809  0.692
  t-SNE      0.328  0.070   0.060  0.017  0.243  0.244  0.902   1.000  0.067  0.574  0.392
  CFM        0.031  0.018   0.795  0.003  0.047  0.028  0.215   0.067  1.000  0.210  0.254
  DFM        0.256  0.155   0.260  0.026  0.177  0.184  0.809   0.574  0.210  1.000  0.846
  LMVU       0.171  0.084   0.318  0.012  0.143  0.131  0.692   0.392  0.254  0.846  1.000

TABLE V
ALSTD and ALED results for the five benchmark datasets and the 11 published nonlinear embedding techniques.

ALSTD
  Dataset        LLE    Isomap  LE     LTSA     HLLE    MVU      Sammon  t-SNE  CFM    DFM    LMVU
  S-curve        0.12   -       0.671  0.0169   0.024   0.0329   0.232   0.302  0.734  0.209  0.362
  Swiss          0.126  0.316   0.422  0.00837  0.0108  0.00261  0.604   0.289  0.57   0.465  0.164
  Punc-sphere    0.327  0.14    0.375  0.0956   0.52    0.311    0.029   0.289  0.39   0.309  0.447
  Twin peaks     0.301  0.267   0.589  0.114    0.108   0.407    0.387   0.301  0.593  0.46   0.527
  Sculpture      0.416  0.282   0.406  0.334    N/A     0.284    0.365   0.387  0.33   N/A    -

ALED
  Dataset        LLE    Isomap  LE     LTSA     HLLE    MVU      Sammon  t-SNE  CFM    DFM    LMVU
  S-curve        0.409  0.659   2.53   0.0661   0.0922  0.125    0.813   1.26   2.68   0.549  1.37
  Swiss          0.466  1.2     1.52   0.0366   0.0497  0.00946  2.3     1.18   2.11   1.77   0.595
  Punc-sphere    1.12   0.462   1.24   0.315    1.75    1.02     0.1     1.28   1.02   1.72   1.09
  Twin peaks     -      -       -      -        -       -        -       -      -      -      -
  Sculpture      1.4    0.934   1.32   1.09     N/A     0.913    1.19    1.28   1.07   N/A    1.43

B. Out-of-Samples, Controlled Manifold Topology, and Computational Cost

The embedding of a new data point is a difficult task for many nonlinear embedding algorithms.
Although [37] proposed a unified framework to deal with the out-of-sample issue, such a process pays little attention to the topological structure of the datasets. To address the issue, we propose a topology-related way, i.e., embedding the new data points into a low-dimensional space in which the embedding algorithm has a lower GSCD. To verify the rationality of our strategy, we select the Swiss-roll and the S-curve as two benchmark datasets, each of which is generated by a known intrinsic function. We divide 800 samples into 720 training samples and 80 test samples without overlap. The training samples are used to attain their low-dimensional representations and the corresponding GSCD values produced by the different embedding algorithms. Then we employ the weighting strategy of LLE to compute the weights of each test sample with respect to the training samples. The low-dimensional representation of each test sample is calculated by multiplying the weights with the low-dimensional counterparts Z of the training samples X. We then employ an affine transformation [29] to align Z to the ground truth, followed by computing the mean-squared errors in between. The reported results are the average of 20 repetitions. From Table VI and Fig. 9, we can see that when the GSCD is small, the corresponding MSE, which does not consider the smoothness, is also low with high probability. It means that the proposed approach is effective.
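The out-of-sample step described above can be sketched as follows: each test point is reconstructed from its k nearest training points with LLE-style weights, and the same weights are applied to the training embeddings. The regularization constant and the neighbor count are assumptions; the subsequent affine alignment and MSE computation are omitted.

```python
import numpy as np

def lle_out_of_sample(X_train, Z_train, x_new, k=8, reg=1e-3):
    """Embed one new sample x_new using LLE reconstruction weights.

    X_train : (n, N) training data, Z_train : (d, N) their embeddings, x_new : (n,)."""
    dists = np.linalg.norm(X_train - x_new[:, None], axis=0)
    nbrs = np.argsort(dists)[:k]                       # k nearest training samples
    D = X_train[:, nbrs] - x_new[:, None]              # (n, k) neighbors centered on x_new
    G = D.T @ D                                        # local Gram matrix
    G += reg * np.trace(G) * np.eye(k)                 # regularize for numerical stability
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()                                       # reconstruction weights sum to one
    return Z_train[:, nbrs] @ w                        # (d,) low-dimensional representation
```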
Note that since we only employ affine 1996 IEEE TRANSACTIONS ON NEURAL NETWORKS. VOL 22. NO. 12 DECEMBER 2011 0 0.5 0.5 0.5 0.5 1.5 0.5 iiii GSCD GSCD ……:MSE NPR NPR 5 5 Fig 9. GSCD with the corresponding mean squared errors and neighboring preservation rate of ll published nonlinear embedding algorithms on Swiss-roll daTaset (lefl) and S-curve dataset (right) Isomap: 0.114 LLE:0.262 40 20 40 ×104 MVU:0.339 HLLE:0.344 Diffusion: 0.351 0.05 0.0 ×10 0.05 0.05 the abbreviation of the algorithm and the GSCD. Color reflects the value of the mF, which is consistent with the volumetric stretching degiec wo Fig. 10. (a)Nine representative images in the generated dataset.(b)-(f)Results of five nonlinear embedding algorithms. The title of each subplot indicates transformation to align out-of-samples, diffusion map and an extrinsic dimensionality of 3136. Because the noise is Sammon map cannot be aligned well and thus have very randomly distributed on the background, the distance between high MSEs in the swiss-roll dataset since their attained low- neighboring pictures should be somewhat uniform. Therefore dimensional representations are of higher curvature. Compared the ideal reduced-dimensionality result can be expected to be with GSCD, the neighboring preservation rate (NPR)[16] very close to a square in shape is not consistent with the corresponding Mse according to As is shown in Fig. 10, it's not difficult to find that, with Table VI and Fig. 9. A possible reason is that the NPr will the lowest GSCD, the result from Isomap is the most linearly change under affine transformation separable and closest to the pre-designed topology. Although Furthermore, as is introduced in [29], we perform a design LLE produces some shape stretching, it does obviously better experiment involving a controlled manifold topology where in the preservation of smoothness compared to MvU. In a manifold is generated by translating a picture over a the embedding result from HLLE, we can see that great background with random noise as shown in Fig. 10(a). The compression or volumetric distortion appears on the corner manifold is sampled with 729 images, each encoded as a and the marginal points. Compared to this, Mvu does a vector of 3 136 pixel intensity values. As horizontal position slightly better especially on the corners. The performance of and vertical position are used to generate the dataset, the diffusion maps is poor both in the preservation of smoothness data can be interpreted as sampling from a manifold with and unfolding the marginal points. As a result, it leads to the an intrinsic dimensionality of two resided in a space with highest GSCd value for the algorithm
