However, suggesting the right items is a highly nontrivial task: (1) There are many items to choose from. (2) Customers are willing to consider only a small number of recommendations (typically on the order of ten). Collaborative filtering addresses this problem by learning the suggestion function for a user from ratings provided by this and other users on items.
The known data can be thought of as a sparse $n \times m$ matrix $Y$ of rating/purchase information, where $n$ denotes the number of users and $m$ the number of items. In this context, $Y_{ij}$ indicates the rating of item $j$ given by user $i$. Typically, the rating is given on a five star scale and thus $Y \in \{0,\ldots,5\}^{n \times m}$, where the value 0 indicates that a user did not rate an item. In this sense, 0 is special since it does not indicate that a user dislikes an item but rather that data is missing.
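For concreteness, a minimal sketch of such a matrix in Python; the toy ratings below are invented purely for illustration:

import numpy as np

# Rows are users, columns are items; 0 marks a missing rating,
# not a rating of zero.
Y = np.array([
    [5, 0, 3, 0],
    [0, 4, 0, 1],
    [2, 0, 0, 5],
])
observed = Y > 0          # mask of the known (user, item) pairs
n, m = Y.shape            # here n = 3 users, m = 4 items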
Related work A common approach to collaborative filtering is to fit a factor model to the data, for example by extracting a feature vector for each user and item in the data set such that the inner product of these features minimizes an explicit or implicit loss functional (e.g. Hofmann 2004, following a probabilistic approach). The underlying idea behind these methods is that both user preferences and item properties can be modeled by a number of factors.
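In code, such a factor-model prediction is simply an inner product; the two-dimensional factors below are hypothetical:

import numpy as np

u_i = np.array([0.9, -0.2])   # latent factors of user i
m_j = np.array([1.1, 0.4])    # latent factors of item j
r_hat = float(u_i @ m_j)      # predicted affinity of user i for item j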
The basic idea of matrix factorization approaches is to fit the original matrix $Y$ with a low rank approximation $F$. More specifically, the goal is to find an approximation that minimizes the sum of the squared distances between the known entries in $Y$ and their predictions in $F$. One way of doing so is to compute a Singular Value Decomposition of $Y$ and to use only a small number of the singular vectors obtained by this procedure. In the information retrieval community this numerical operation is commonly referred to as Latent Semantic Indexing.
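A minimal sketch of this truncated SVD, continuing with the toy matrix Y defined above (note that it naively treats the missing entries as zeros, which is precisely the issue raised next):

import numpy as np

def low_rank_svd(Y, d):
    """Rank-d approximation of Y via a truncated SVD (as in LSI)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U[:, :d] @ np.diag(s[:d]) @ Vt[:d, :]

F = low_rank_svd(Y.astype(float), d=2)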
Note, however, that this method does not do justice to the way $Y$ was formed. An entry $Y_{ij} = 0$ indicates that we did not observe a (user, object) pair; it does not indicate that user $i$ disliked object $j$. In Srebro and Jaakkola (2003), an alternative approach is suggested which forms the basis of the method described in this paper. We aim to find two matrices $U \in \mathbb{R}^{n \times d}$ and $M \in \mathbb{R}^{d \times m}$ such that $F = UM$, with the goal of approximating the observed entries in $Y$ rather than approximating all entries at the same time.
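A sketch of this objective under a squared loss restricted to the observed entries; the function name and the random initialization of the factors are illustrative choices:

import numpy as np

def observed_squared_loss(U, M, Y, observed):
    """Sum of squared errors over the observed entries of Y only."""
    F = U @ M
    return np.sum((F[observed] - Y[observed]) ** 2)

rng = np.random.default_rng(0)
d = 2
U = rng.normal(size=(Y.shape[0], d))   # user factors, n x d
M = rng.normal(size=(d, Y.shape[1]))   # item factors, d x m
print(observed_squared_loss(U, M, Y, observed))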
In general, finding a globally optimal solution of the low rank approximation problem is unrealistic: in particular, the approach proposed by Srebro and Jaakkola (2003) for computing a weighted factorization, which is relevant in collaborative filtering settings, requires semidefinite programming, which is feasible only for hundreds, or at most thousands, of terms.
Departing from the goal of minimizing the rank, Maximum Margin Matrix Factorization (MMMF) aims at minimizing the Frobenius norms of $U$ and $M$, resulting in a set of problems that are convex when taken in isolation and thus tractable by current optimization techniques. It was shown in Srebro et al. (2005) and Srebro and Shraibman (2005) that optimizing the Frobenius norm is a good proxy for optimizing the rank in its application to model complexity control. Similar ideas based on matrix factorization have also been proposed in Salakhutdinov and Mnih (2008) and Takács et al. (2007).
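A minimal sketch of the resulting regularized objective, building on the loss above; the regularization weight lam is an assumed hyperparameter, and this squared-loss form is only one instance of the MMMF family:

def mmmf_objective(U, M, Y, observed, lam=0.1):
    """Observed squared error plus Frobenius-norm penalties on U and M."""
    reg = lam * (np.sum(U ** 2) + np.sum(M ** 2))
    return observed_squared_loss(U, M, Y, observed) + reg

Note that with $M$ held fixed this objective is convex in $U$ (and vice versa), which is what makes alternating optimization over the two factors tractable.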
Recently, Weimer et al. (2008) proposed to extend the general MMMF framework in order to minimize structured (ranking) losses instead of the sum of squared errors on the known ratings. The key observation is that collaborative filtering is often at the heart of recommender systems, for which only the ranking of unrated items in terms of the user's preferences matters. To enable effective optimization of the structured ranking loss, a novel optimization technique (Smola et al. 2008) was used to minimize the loss in terms of the Normalized Discounted Cumulative Gain (NDCG).
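For reference, a minimal sketch of NDCG at truncation level k; the exponential gain and logarithmic discount used here are a common convention and may differ in detail from the exact definition used by Weimer et al.:

import numpy as np

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain of the first k ranked items."""
    rel = np.asarray(relevances, dtype=float)[:k]
    return np.sum((2.0 ** rel - 1.0) / np.log2(np.arange(2, rel.size + 2)))

def ndcg_at_k(relevances, k=10):
    """DCG normalized by that of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Ratings of a user's items, listed in predicted rank order:
print(ndcg_at_k([5, 3, 4, 1, 2], k=5))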