## NMF.jl
A Julia package for non-negative matrix factorization (NMF).
[![CI](https://github.com/JuliaStats/NMF.jl/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/JuliaStats/NMF.jl/actions/workflows/ci.yml)
[![CodeCov](https://codecov.io/github/JuliaStats/NMF.jl/badge.svg?branch=master)](https://codecov.io/github/JuliaStats/NMF.jl?branch=master)
---------------------------
## Development Status
**Note:** Nonnegative Matrix Factorization is an area of active research. New algorithms are proposed every year. Contributions are very welcome.
#### Done
- Lee & Seung's Multiplicative Update (for both MSE & Divergence objectives)
- (Naive) Projected Alternating Least Squares
- ALS Projected Gradient Methods
- Coordinate Descent Methods
- Random Initialization
- NNDSVD Initialization
- Sparse NMF
- Separable NMF
#### To do
- Probabilistic NMF
## Overview
*Non-negative Matrix Factorization (NMF)* generally refers to techniques that factorize a non-negative matrix ``X`` into the product of two lower-rank non-negative matrices ``W`` and ``H``, such that ``WH`` optimally approximates ``X`` in some sense. Such techniques are widely used in text mining, image analysis, and recommendation systems.
This package provides two sets of tools, respectively for *initialization* and *optimization*. A typical NMF procedure consists of two steps: (1) use an initialization function to initialize ``W`` and ``H``; and (2) use an optimization algorithm to pursue the optimal solution.
Most types and functions (except the high-level function ``nnmf``) in this package are not exported. Users are encouraged to use them with the prefix ``NMF.``. This allows us to use shorter names within the package and makes the code more explicit and clear on the user side.
## High-Level Interface
The package provides a high-level function ``nnmf`` that runs the entire procedure (initialization + optimization):
**nnmf**(X, k, ...)
This function factorizes the input matrix ``X`` into the product of two non-negative matrices ``W`` and ``H``.
In general, it returns a result instance of type ``NMF.Result``, which is defined as
```julia
struct Result
W::Matrix{Float64} # W matrix
H::Matrix{Float64} # H matrix
niters::Int # number of elapsed iterations
converged::Bool # whether the optimization procedure converges
objvalue::Float64 # objective value of the last step
end
```
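For example, a minimal run might look like the following sketch (``X`` here is just a random non-negative matrix for illustration):
```julia
using NMF

X = rand(8, 6)   # a hypothetical non-negative 8×6 matrix
r = nnmf(X, 3)   # factorize with rank k = 3

r.W              # 8×3 non-negative factor
r.H              # 3×6 non-negative factor
r.converged      # whether the procedure converged
```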
The function supports the following keyword arguments:
- ``init``: A symbol that indicates the initialization method (default = ``:nndsvdar``).
This argument accepts the following values:
- ``:random``: matrices filled with uniformly random values
- ``:nndsvd``: standard version of NNDSVD
- ``:nndsvda``: NNDSVDa variant
- ``:nndsvdar``: NNDSVDar variant
- ``:spa``: Successive Projection Algorithm
- ``:custom``: use custom matrices ``W0`` and ``H0``
With the ``nndsvd`` variants, you can optionally supply a precomputed SVD via ``initdata``, e.g. ``initdata=svd(X)`` (see the example after this list). If not supplied, the factorization is computed using a randomized SVD (``rsvd`` from [RandomizedLinAlg](https://github.com/JuliaLinearAlgebra/RandomizedLinAlg.jl)).
- ``alg``: A symbol that indicates the factorization algorithm (default = ``:greedycd``).
This argument accepts the following values:
- ``:multmse``: Multiplicative update (using MSE as objective)
- ``:multdiv``: Multiplicative update (using divergence as objective)
- ``:projals``: (Naive) Projected Alternating Least Squares
- ``:alspgrad``: Alternating Least Squares using Projected Gradient Descent
- ``:cd``: Coordinate Descent solver that uses Fast Hierarchical Alternating Least Squares (implementation similar to scikit-learn)
- ``:greedycd``: Greedy Coordinate Descent
- ``:spa``: Successive Projection Algorithm
- ``maxiter``: Maximum number of iterations (default = ``100``).
- ``tol``: Tolerance of changes upon convergence (default = ``1.0e-6``).
- ``replicates``: Number of times to perform factorization (default = ``1``).
- ``W0``: Option for custom initialization (default = ``nothing``).
- ``H0``: Option for custom initialization (default = ``nothing``).
**Note:** ``W0`` and ``H0`` may be overwritten during the optimization. To avoid this, pass in copies instead.
- ``update_H``: Option for specifying whether to update H (default = ``true``).
- ``verbose``: Whether to show procedural information (default = ``false``).
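To illustrate the keyword arguments, here is a hedged sketch (``svd`` comes from the standard library ``LinearAlgebra``; the matrices are hypothetical):
```julia
using NMF, LinearAlgebra

X = rand(8, 6)

# choose the algorithm and initialization explicitly
r = nnmf(X, 3; alg=:alspgrad, init=:nndsvd, maxiter=200, tol=1.0e-5)

# supply a precomputed SVD to the nndsvd-family initializers
r = nnmf(X, 3; init=:nndsvd, initdata=svd(X))

# custom initialization; pass copies so W0 and H0 are not overwritten
W0, H0 = rand(8, 3), rand(3, 6)
r = nnmf(X, 3; init=:custom, W0=copy(W0), H0=copy(H0))
```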
## Initialization
- **NMF.randinit**(X, k[; zeroh=false, normalize=false])
Initialize ``W`` and ``H`` given the input matrix ``X`` and the rank ``k``. This function returns a pair ``(W, H)``.
Suppose the size of ``X`` is ``(p, n)``, then the size of ``W`` and ``H`` are respectively ``(p, k)`` and ``(k, n)``.
Usage:
```julia
W, H = NMF.randinit(X, 3)
```
For some algorithms (*e.g.* ALS), only ``W`` needs to be initialized. For such cases, one may set the keyword argument ``zeroh`` to ``true``; then in the output, ``H`` will simply be a zero matrix of size ``(k, n)``.
Another keyword argument is ``normalize``. If ``normalize`` is set to ``true``, the columns of ``W`` will be normalized such that each column sums to one.
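For instance, a minimal sketch using both keyword arguments:
```julia
# initialize only W (H comes back as a zero matrix), with columns of W summing to one
W, H = NMF.randinit(X, 3; zeroh=true, normalize=true)
```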
- **NMF.nndsvd**(X, k[; zeroh=false, variant=:std])
Use the *Non-Negative Double Singular Value Decomposition (NNDSVD)* algorithm to initialize ``W`` and ``H``.
Reference: C. Boutsidis, and E. Gallopoulos. SVD based initialization: A head start for nonnegative matrix factorization. Pattern Recognition, 2007.
Usage:
```julia
W, H = NMF.nndsvd(X, k)
```
This function has two keyword arguments:
- ``zeroh``: if set to ``true``, ``H`` is returned as a zero matrix; if ``false`` (the default), ``H`` is initialized as well.
- ``variant``: the variant of the algorithm. Default is ``:std``, the standard version, which tends to generate a rather sparse ``W``. The other values, ``:a`` and ``:ar``, correspond to the variants *NNDSVDa* and *NNDSVDar* respectively. In particular, ``:ar`` is recommended for dense NMF.
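For instance, to use the NNDSVDar variant (a minimal sketch):
```julia
# NNDSVDar initialization, recommended for dense NMF
W, H = NMF.nndsvd(X, 3; variant=:ar)
```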
- **NMF.spa**(X, k)
Use the *Successive Projection Algorithm (SPA)* to initialize ``W`` and ``H``.
Reference: N. Gillis and S. A. Vavasis, Fast and robust recursive algorithms for separable nonnegative matrix factorization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 4, pp. 698-714, 2013.
Usage:
```julia
W, H = NMF.spa(X, k)
```
## Factorization Algorithms
This package provides multiple factorization algorithms. Each algorithm corresponds to a type. One can create an algorithm *instance* by choosing a type and specifying the options, and run the algorithm using ``NMF.solve!``:
#### The NMF.solve! Function
**NMF.solve!**(alg, X, W, H)
Use the algorithm ``alg`` to factorize ``X`` into ``W`` and ``H``.
Here, ``W`` and ``H`` must be pre-allocated matrices (respectively of size ``(p, k)`` and ``(k, n)``). ``W`` and ``H`` must be appropriately initialized before this function is invoked. For some algorithms, both ``W`` and ``H`` must be initialized (*e.g.* multiplicative updating); while for others, only ``W`` needs to be initialized (*e.g.* ALS).
The matrices ``W`` and ``H`` are updated in place.
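For example, a typical init-then-solve workflow might look like this sketch (the ``MultUpdate`` constructor is described in the next subsection; keyword-style construction is assumed):
```julia
# initialize W and H, then refine them in place
W, H = NMF.randinit(X, 3)
r = NMF.solve!(NMF.MultUpdate(obj=:mse, maxiter=100), X, W, H)
```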
#### Algorithms
- **Multiplicative Updating**
Reference: Daniel D. Lee and H. Sebastian Seung. Algorithms for Non-negative Matrix Factorization. Advances in NIPS, 2001.
This algorithm has two different kinds of objectives: minimizing mean-squared-error (``:mse``) and minimizing divergence (``:div``). Both ``W`` and ``H`` need to be initialized.
```julia
MultUpdate(obj::Symbol=:mse, # objective, either :mse or :div
maxiter::Integer=100, # maximum number of iterations
verbose::Bool=false, # whether to show procedural information
tol::Real=1.0e-6, # tolerance of changes on W and H upon convergence
           update_H::Bool=true)  # whether to update H (if false, only W is updated)
```
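A usage sketch, assuming ``X``, ``W`` and ``H`` are prepared as above:
```julia
# multiplicative updates minimizing the divergence objective
NMF.solve!(NMF.MultUpdate(obj=:div, maxiter=200, verbose=true), X, W, H)
```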