Matlab Toolbox for Dimensionality Reduction (v0.7.1b)
=====================================================
Information
-------------------------
Author: Laurens van der Maaten
Affiliation: University of California, San Diego / Delft University of Technology
Contact: lvdmaaten@gmail.com
Release date: June 25, 2010
Version: 0.7.1b
Installation
-------------------------
Copy the drtoolbox/ folder into the $MATLAB_DIR/toolbox directory (where $MATLAB_DIR indicates your Matlab installation directory). Start Matlab and select 'Set path...' from the File menu. Click the 'Add with subfolders...' button, select the folder $MATLAB_DIR/toolbox/drtoolbox in the file dialog, and press Open. Subsequently, press the Save button in order to save your changes to the Matlab search path. The toolbox is now installed.
Some of the functions in the toolbox use MEX-files. Precompiled versions of these MEX-files are distributed with this release, but the compiled version for your platform might be missing. In order to compile all MEX-files, type cd([matlabroot '/toolbox/drtoolbox']) in your Matlab prompt, and execute the function MEXALL.
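The same installation can also be scripted from the Matlab prompt instead of using the 'Set path...' dialog. The following is a minimal sketch that assumes the toolbox was copied to $MATLAB_DIR/toolbox/drtoolbox as described above. The variable names toolbox_dir and old_dir are only illustrative, and savepath may need write permission to the Matlab installation directory.

```matlab
% Location of the toolbox (assumption: copied into the Matlab toolbox directory)
toolbox_dir = fullfile(matlabroot, 'toolbox', 'drtoolbox');

% Add the toolbox folder and all of its subfolders to the search path
addpath(genpath(toolbox_dir));

% Store the modified search path for future sessions
savepath;

% Compile the MEX-files for the current platform, then return to the old folder
old_dir = cd(toolbox_dir);
mexall;
cd(old_dir);
```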
Features
-------------------------
This Matlab toolbox implements 32 techniques for dimensionality reduction. These techniques are all available through the COMPUTE_MAPPING function or through the GUI. The following techniques are available:
- Principal Component Analysis ('PCA')
- Linear Discriminant Analysis ('LDA')
- Multidimensional scaling ('MDS')
- Probabilistic PCA ('ProbPCA')
- Factor analysis ('FactorAnalysis')
- Sammon mapping ('Sammon')
- Isomap ('Isomap')
- Landmark Isomap ('LandmarkIsomap')
- Locally Linear Embedding ('LLE')
- Laplacian Eigenmaps ('Laplacian')
- Hessian LLE ('HessianLLE')
- Local Tangent Space Alignment ('LTSA')
- Diffusion maps ('DiffusionMaps')
- Kernel PCA ('KernelPCA')
- Generalized Discriminant Analysis ('KernelLDA')
- Stochastic Neighbor Embedding ('SNE')
- Symmetric Stochastic Neighbor Embedding ('SymSNE')
- t-Distributed Stochastic Neighbor Embedding ('tSNE')
- Neighborhood Preserving Embedding ('NPE')
- Locality Preserving Projection ('LPP')
- Stochastic Proximity Embedding ('SPE')
- Linear Local Tangent Space Alignment ('LLTSA')
- Conformal Eigenmaps ('CCA', implemented as an extension of LLE)
- Maximum Variance Unfolding ('MVU', implemented as an extension of LLE)
- Landmark Maximum Variance Unfolding ('LandmarkMVU')
- Fast Maximum Variance Unfolding ('FastMVU')
- Locally Linear Coordination ('LLC')
- Manifold charting ('ManifoldChart')
- Coordinated Factor Analysis ('CFA')
- Gaussian Process Latent Variable Model ('GPLVM')
- Autoencoders using stack-of-RBMs pretraining ('AutoEncoderRBM')
- Autoencoders using evolutionary optimization ('AutoEncoderEA')
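The quoted names in the list above are the strings passed as the second argument of COMPUTE_MAPPING. As a small sketch (the dataset, the three techniques, and the target dimensionality are arbitrary choices, and the variable names are only illustrative), several techniques can be tried on the same data by looping over their names:

```matlab
% Toy dataset: 1000 points sampled from a Swiss roll
X = generate_data('swiss', 1000);

% Technique names exactly as they appear in the list above
techniques = {'PCA', 'MDS', 'Sammon'};
no_dims = 2;

for i = 1:numel(techniques)
    % Each technique is run with its default parameters
    mappedX = compute_mapping(X, techniques{i}, no_dims);
    fprintf('%s: %d points mapped to %d dimensions\n', ...
        techniques{i}, size(mappedX, 1), size(mappedX, 2));
end
```

Most techniques accept additional parameters after no_dims; HELP COMPUTE_MAPPING lists them per technique.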
Furthermore, the toolbox contains 6 techniques for intrinsic dimensionality estimation. These techniques are available through the function INTRINSIC_DIM. The following techniques are available:
- Eigenvalue-based estimation ('EigValue')
- Maximum Likelihood Estimator ('MLE')
- Estimator based on correlation dimension ('CorrDim')
- Estimator based on nearest neighbor evaluation ('NearNb')
- Estimator based on packing numbers ('PackingNumbers')
- Estimator based on geodesic minimum spanning tree ('GMST')
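As with the reduction techniques, the quoted names above are the strings passed to INTRINSIC_DIM. The sketch below runs all six estimators on one toy dataset so that their estimates can be compared. The dataset and variable names are illustrative choices, and the estimates may differ noticeably between estimators and between runs.

```matlab
% Toy dataset: 1000 points sampled from a Swiss roll
X = generate_data('swiss', 1000);

% Estimator names exactly as they appear in the list above
estimators = {'EigValue', 'MLE', 'CorrDim', 'NearNb', 'PackingNumbers', 'GMST'};

for i = 1:numel(estimators)
    % Each call returns one estimate of the intrinsic dimensionality
    d = intrinsic_dim(X, estimators{i});
    fprintf('%-15s %.2f\n', estimators{i}, d);
end
```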
In addition to these techniques, the toolbox contains functions for prewhitening of data (the function PREWHITEN), exact and estimated out-of-sample extension (the functions OUT_OF_SAMPLE and OUT_OF_SAMPLE_EST), and a function that generates toy datasets (the function GENERATE_DATA).
The graphical user interface of the toolbox is accessible through the DRGUI function.
Usage
-------------------------
Basically, you only need one function: mappedX = compute_mapping(X, technique, no_dims);
Try executing the following code:
    % Generate a toy dataset and plot it
    [X, labels] = generate_data('helix', 2000);
    figure, scatter3(X(:,1), X(:,2), X(:,3), 5, labels); title('Original dataset'), drawnow
    % Estimate the intrinsic dimensionality of the dataset
    no_dims = round(intrinsic_dim(X, 'MLE'));
    disp(['MLE estimate of intrinsic dimensionality: ' num2str(no_dims)]);
    % Run Laplacian Eigenmaps (k = 7 neighbors) and plot the result
    mappedX = compute_mapping(X, 'Laplacian', no_dims, 7);
    figure, scatter(mappedX(:,1), mappedX(:,2), 5, labels); title('Result of dimensionality reduction'), drawnow
It will create a helix dataset, estimate the intrinsic dimensionality of the dataset, run Laplacian Eigenmaps on the dataset, and plot the results. All functions in the toolbox work both on data matrices and on PRTools datasets (http://prtools.org). For more information on the options for dimensionality reduction, type HELP COMPUTE_MAPPING in your Matlab prompt. Information on the intrinsic dimensionality estimators can be obtained by typing HELP INTRINSIC_DIM.
Other useful functions are GENERATE_DATA, OUT_OF_SAMPLE, and OUT_OF_SAMPLE_EST. The GENERATE_DATA function provides you with a number of artificial datasets to test the techniques. The OUT_OF_SAMPLE function allows for out-of-sample extension for the techniques PCA, LDA, LPP, NPE, LLTSA, Kernel PCA, and autoencoders. The OUT_OF_SAMPLE_EST function allows you to perform an out-of-sample extension using an estimation technique that is generally applicable.
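The two out-of-sample functions can be sketched as follows. This assumes that COMPUTE_MAPPING returns the learned mapping as a second output, that OUT_OF_SAMPLE takes the new points and that mapping, and that OUT_OF_SAMPLE_EST takes the new points, the original training data, and its low-dimensional representation. Please check the HELP text of each function for the exact signatures in your version. The train/test split and the variable names are illustrative.

```matlab
% Toy data, split into a training part and a held-out part
X = generate_data('swiss', 1200);
perm = randperm(size(X, 1));
train_X = X(perm(1:1000), :);
test_X  = X(perm(1001:end), :);

% Learn the mapping on the training data; the second output stores the mapping
[mapped_train, mapping] = compute_mapping(train_X, 'PCA', 2);

% Exact out-of-sample extension (only for the techniques listed above)
mapped_test = out_of_sample(test_X, mapping);

% Estimated out-of-sample extension (generally applicable)
mapped_test_est = out_of_sample_est(test_X, train_X, mapped_train);
```

The exact extension is preferable where it is supported. The estimated variant is the fallback for techniques that do not learn an explicit mapping.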
Many of the available functions are also available through the GUI, which can be executed by running the function DRGUI.
Pitfalls
-------------------------
When you run certain code, you might receive an error that a certain file is missing. This is because some parts of the code use MEX-functions. I provide a number of precompiled versions of these MEX-functions in the toolbox. However, the MEX-file for your platform might be missing. To fix this, type at your Matlab prompt:
    mexall
Now you have compiled versions of the MEX-files as well. This fix also solves slow execution of the shortest path computations in Isomap.
If you encounter an error concerning CSDP while running the FastMVU algorithm, the binary of CSDP for your platform is missing. If so, please obtain a binary distribution of CSDP from https://projects.coin-or.org/Csdp/ and place it in the drtoolbox/techniques directory. Make sure it has the right name for your platform (csdp.exe for Windows, csdpmac for Mac OS X (PowerPC), csdpmaci for Mac OS X (Intel), and csdplinux for Linux).
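To check whether the expected CSDP binary is present before running FastMVU, something along these lines may help. It is only a sketch: the binary names come from the paragraph above, but locating the toolbox through the folder of compute_mapping.m is an assumption, as are the variable names. Platforms newer than this release (for example 64-bit or non-Intel Macs) simply fall into the Intel branch here.

```matlab
% Expected CSDP binary name per platform (names taken from the text above)
if ispc
    csdp_name = 'csdp.exe';
elseif ismac
    if strcmp(computer, 'MAC')    % PowerPC Mac
        csdp_name = 'csdpmac';
    else                          % Intel Mac
        csdp_name = 'csdpmaci';
    end
else
    csdp_name = 'csdplinux';
end

% Assumption: compute_mapping.m sits in the toolbox root folder
toolbox_dir = fileparts(which('compute_mapping'));
csdp_path = fullfile(toolbox_dir, 'techniques', csdp_name);

if exist(csdp_path, 'file')
    disp(['Found CSDP binary: ' csdp_path]);
else
    disp(['Missing CSDP binary: ' csdp_path]);
end
```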
Many methods for dimensionality reduction perform spectral analyses of sparse matrices. You might think that eigenanalysis is a well-studied problem that can easily be solved. However, eigenanalysis of large matrices turns out to be tedious. The toolbox allows you to use two different methods for eigenanalysis:
- The original Matlab functions (based on Arnoldi methods)
- The JDQR functions (based on Jacobi-Davidson methods)
For problems up to 10,000 datapoints, we recommend using the 'Matlab' setting. For larger problems, switching to 'JDQR' is often worth trying.
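As a hedged illustration of switching between the two settings: the sketch below assumes that the eigenanalysis implementation is passed as the last optional argument, after the technique-specific parameters (for Laplacian Eigenmaps, the number of neighbors k and the kernel bandwidth sigma). The exact argument order for each technique is documented in HELP COMPUTE_MAPPING and should be checked there. The parameter values are illustrative.

```matlab
% Toy dataset: 2000 points sampled from a Swiss roll
X = generate_data('swiss', 2000);

% Technique-specific parameters for Laplacian Eigenmaps (illustrative values)
k = 12;
sigma = 1;

% Eigenanalysis with the built-in Matlab functions
tic;
mapped_matlab = compute_mapping(X, 'Laplacian', 2, k, sigma, 'Matlab');
fprintf('Matlab setting: %.2f seconds\n', toc);

% Eigenanalysis with the JDQR functions
tic;
mapped_jdqr = compute_mapping(X, 'Laplacian', 2, k, sigma, 'JDQR');
fprintf('JDQR setting:   %.2f seconds\n', toc);
```

Timing both settings on a subset of your own data is a simple way to decide which one to use for the full problem.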
Papers
-------------------------
For more information on the implemented techniques and for a theoretical and empirical comparison, please have a look at the following paper:
- L.J.P. van der Maaten, E.O. Postma, and H.J. van den Herik. Dimensionality Reduction: A Comparative Review. Tilburg University Technical Report, TiCC-TR 2009-005, 2009.
Version history
-------------------------
Version 0.7.1b:
- Small bugfixes.
Version 0.7b:
- Many small bugfixes and speed improvements.
- Added out-of-sample extension for manifold charting.
- Added first version of graphical user interface for the toolbox. The GUI was developed by Maxim Vedenev with the help of Susanth Vemulapalli and Maarten Huybrecht. I made some changes in the initial version of the GUI code.
- Added implementation of Gaussian Process Latent Variable Model (GPLVM).
- Removed Simple PCA as probabilistic PCA is more appropriate.
Version 0.6b:
- Resolved bug in LLE that was introduced with v0.6b.
- Added implementation of t-SNE.
- Resolved small bug in data generation function.
- Improved RBM implementation in au