L. Lü et al. / Physics Reports 519 (2012) 1-49

most popular items, or suggesting other products by the same producer, to complicated data-mining techniques. People soon realized that there is no unique best recommendation method. Rather, depending on the context and the density of the available data, different methods adapted to the particular application are most likely to succeed. Hence there is no panacea; the best one can do is to understand the underlying premises and recommender mechanisms, and then tackle the diverse application problems arising in real life. This is also reflected in this review, where we do not try to highlight any ultimate approach to recommendation. Instead, we review the basic ideas, methods and tools, with particular emphasis on physics-rooted approaches.

The motivation for writing this review is multifold. Firstly, while extensive reviews of recommender systems by computer scientists already exist [13-15], the view of physicists differs in making more use of the complex-networks approach and in adapting various classical physics processes (such as diffusion) for information filtering. We thus believe that this review, with its structure and emphasis on respective topics, can provide a novel point of view. Secondly, the past decade has seen a growing interest of physicists in recommender systems, and we hope that this review can be a useful source for them by describing the state of the art in language more familiar to the physics community. Finally, the interdisciplinary approach presented here might provide new insights and solutions for open problems and challenges in the active field of information filtering.

This review is organized as follows. To better motivate the problem, in Section 2 we begin with a discussion of real applications of recommender systems.
Next, in Section 3 we introduce basic concepts (complex networks, recommender systems, and metrics for their evaluation) that form a basis for all subsequent exposition. Then we proceed to the description of recommendation algorithms, where traditional approaches (similarity-based methods in Section 4 and dimensionality-reduction techniques in Section 5) are followed by network-based approaches which have their origin in the random-walk process well known to all physicists (Section 6). Methods based on external information, such as social relationships (Section 7) and keywords or time stamps (Section 8), are also included. We conclude with a brief evaluation of methods' performance in Section 9 and a discussion of the outlook of the field in Section 10.

2. Real applications of recommender systems

Thanks to the ever-decreasing costs of data storage and processing, recommender systems have gradually spread to most areas of our lives. Sellers carefully watch our purchases to recommend other goods and enhance their sales, social web sites analyze our contacts to help us connect with new friends and get hooked on the site, and online radio stations remember skipped songs to serve us better in the future (see more examples in Table 1). In general, whenever there are plenty of diverse products and customers are not alike, personalized recommendation may help to deliver the right content to the right person. This is particularly the case for those Internet-based companies that try to make use of the so-called long tail [16] of goods which are rarely purchased yet, due to their multitude, can yield considerable profits (they are sometimes referred to as "worst-sellers").
For example, on Amazon, between 20% and 40% of sales is due to products that do not belong to the shop's 100,000 most sold products [17]. A recommender system may hence have a significant impact on a company's revenues: for example, 60% of DVDs rented by Netflix are selected based on personalized recommendations. As discussed in [18], recommender systems not only help decide which products to offer to an individual customer, they also increase cross-sell by suggesting additional products to the customers and improve consumer loyalty, because consumers tend to return to the sites that best serve their needs (see [19] for an empirical analysis of the impact of recommendations and consumer feedback on sales at Amazon.com). Since no recommendation method serves all customers best, major sites are usually equipped with several distinct recommendation techniques, ranging from simple popularity-based recommendations to sophisticated techniques, many of which we shall encounter in the following sections. Further, new companies emerge (see, for example, string.com) which aim at collecting all sorts of user behavior (ranging from pages visited on the web and music listened to on a personal player to "liking" or purchasing items) and using it to provide personalized recommendations of different goods or services.

2.1. Netflix prize

In October 2006, the online DVD rental company Netflix released a data set containing approximately 100 million anonymous movie ratings and challenged researchers and practitioners to develop recommender systems that could beat the accuracy of the company's recommendation system, Cinematch [20]. Although the released data set represented only a small fraction of the company's rating data, thanks to its size and quality it fast became a standard in the data mining and machine learning community.
The data set contained ratings on the integer scale from 1 to 5, accompanied by dates. For each movie, the title and year of release were provided; no information about users was given. Submitted predictions were evaluated by their root mean squared error (RMSE) on a qualifying data set containing 2,817,131 unknown ratings. Out of 20,000 registered teams, 2,000 teams submitted at least one answer set. On 21 September 2009, the grand prize of $1,000,000 was awarded to a team that outperformed Cinematch's accuracy by 10%. At the time when the contest was closed, there were two teams that achieved the same precision; the prize was awarded to the team that submitted their results 20 min earlier than the other one. (See [11] for a popular account of how the participants struggled with the challenge.)

As presented by Jon Sanders (Recommendation Systems Engineering, Netflix) during the talk "Research Challenges in Recommenders" at the 3rd ACM Conference on Recommender Systems (2009).

Table 1
Popular sites using recommender systems. Besides, there are also some companies devoting themselves to recommendation techniques, such as Baifendian, Baynote, ChoiceStream, Goodrec, and others.

Site            What is recommended
Amazon          Books/other products
Facebook        Friends
WeFollow        Friends
MovieLens       Movies
Nanocrowd       Movies
Findory         News
Digg            News
Zite            News
Beehive         DVDs
CDNOW           CDs/DVDs
eHarmony        Dates
Chemistry.com   Dates
Perfectmatch    Dates
CareerBuilder   Jobs
Monster         Jobs
Mufin           Music
StumbleUpon     Web sites

There are several lessons that we have learned from this competition [21]. Firstly, the company gained publicity and a superior recommendation system that is supposed to improve user satisfaction. Secondly, ensemble methods showed their potential for improving the accuracy of predictions. Thirdly, we saw that accuracy improvements become increasingly demanding when RMSE drops below a certain level.
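To make the evaluation criterion concrete, here is a minimal sketch of how an RMSE of the kind used to score contest submissions is computed; the rating values below are invented for illustration.

```python
import math

def rmse(true_ratings, predicted_ratings):
    """Root mean squared error between paired lists of ratings."""
    n = len(true_ratings)
    squared_errors = ((t - p) ** 2 for t, p in zip(true_ratings, predicted_ratings))
    return math.sqrt(sum(squared_errors) / n)

# Invented probe ratings (1-5 stars) and a hypothetical model's predictions.
true = [5, 3, 4, 1]
predicted = [4.5, 3.2, 4.1, 2.0]
print(round(rmse(true, predicted), 3))  # 0.57
```

Because the errors are squared before averaging, a single badly predicted rating raises this number much more than several slightly off ones, which is part of why the last fractions of a percent of improvement were so hard to obtain.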
Finally, despite the company's effort, the anonymity of its users was not sufficiently ensured [25]. As a result, Netflix was sued by one of its users and decided to cancel a planned second competition.

2.2. Major challenges

Researchers in the field of recommender systems face several challenges which pose a danger to the use and performance of their algorithms. Here we mention only the major ones.

1. Data sparsity. Since the pool of available items is often exceedingly large (major online bookstores offer several million books, for example), the overlap between two users is often very small or zero. Further, even when the average number of evaluations per user/item is high, the evaluations are distributed among the users/items very unevenly (usually they follow a power-law distribution [26] or a Weibull distribution [27]), and hence the majority of users/items may have expressed/received only a few ratings. An effective recommender algorithm must therefore take data sparsity into account [28].

2. Scalability. While the data is mostly sparse, for major sites it includes millions of users and items. It is therefore essential to consider the computational cost and to search for recommender algorithms that are either computationally inexpensive or easy to parallelize (or both). Another possible solution is based on using incremental versions of the algorithms where, as the data grows, recommendations are not recomputed globally (using the whole data) but incrementally (by slightly adjusting previous recommendations according to the newly arrived data) [29,30]. This incremental approach is similar to perturbation techniques that are widely used in physics and mathematics [31].

3. Cold start. When new users enter the system, there is usually insufficient information to produce recommendations for them.
The usual solutions of this problem are based on using hybrid recommender techniques (see Section 8.4) combining content and collaborative data [32,33]; sometimes they are accompanied by asking users for some basic information (such as their age, location and preferred genres). Another way is to identify individual users across different web services. For example, Baifendian developed a technique that could track individual users' activities on several e-commerce sites, so that for a cold-start user on site A, recommendations can be made according to her records on sites B, C, D, etc. [34].

The ensemble methods deal with the selection and organization of many individual algorithms to achieve better prediction accuracy. In fact, the winning team, called BellKor's Pragmatic Chaos, was a combined team of BellKor [22], Pragmatic Theory [23] and BigChaos [24] (of course, it was not a simple combination but a sophisticated design), and each of them consists of many individual algorithms. For example, the Pragmatic Theory solution considered 453 individual algorithms.

4. Diversity vs. accuracy. When the task is to recommend items which are likely to be appreciated by a particular user, it is usually most effective to recommend popular and highly rated items. Such recommendation, however, has very little value for the users, because popular objects are easy to find (often they are even hard to avoid) without a recommender system. A good list of recommended items should hence also contain less obvious items that are unlikely to be reached by the users themselves [35]. Approaches to this problem include direct enhancement of the recommendation list's diversity [36-38] and the use of hybrid recommendation methods [39].

5. Vulnerability to attacks.
Due to their importance in e-commerce applications, recommender systems are likely targets of malicious attacks trying to unjustly promote or inhibit some items [40]. There is a wide scale of tools preventing this kind of behavior, ranging from blocking the malicious evaluations from entering the system to sophisticated attack-resistant recommendation techniques [41]. However, this is not an easy task, since the strategies of attackers become more and more advanced as prevention tools develop. As an example, Burke et al. [42] introduced eight attacking strategies, which are further divided into four classes: basic attack, low-knowledge attack, nuke attack and informed attack.

6. The value of time. While real users have interests with widely diverse time scales (for example, short-term interests related to a planned trip and long-term interests related to the place of living or political preferences), most recommendation algorithms neglect the time stamps of evaluations. It is an ongoing line of research whether and how the value of old opinions should decay with time, and what the typical temporal patterns in user evaluations and item relevance are [43,44].

7. Evaluation of recommendations. While we have plenty of distinct metrics (see Section 3.4), how to choose the ones best corresponding to the given situation and task is still an open question. Comparisons of different recommender algorithms are also problematic because different algorithms may simply solve different tasks. Finally, the overall user experience with a given recommendation system, which includes the user's satisfaction with the recommendations and the user's trust in the system, is difficult to measure in "offline" evaluation. Empirical user studies thus still represent a welcome source of feedback on recommender systems.

8. User interface.
It has been shown that to facilitate users' acceptance of recommendations, the recommendations need to be transparent [45,46]: users appreciate it when it is clear why a particular item has been recommended to them. Another issue is that since the list of potentially interesting items may be very long, it needs to be presented in a simple way, and it should be easy to navigate through it to browse different recommendations, which are often obtained by distinct approaches.

Besides the above long-standing challenges, many novel issues have appeared recently. Thanks to the development of methodology in related branches of science, especially the new tools of network analysis, scientists have started to consider the effects of network structure on recommendation and how to make use of known structural features to improve recommendation. For example, Huang et al. [47] analyzed consumer-product networks and proposed an improved recommendation algorithm preferring edges that enhance the local clustering property, and Sahebi et al. [48] designed an improved algorithm making use of the community structure. Progress and propagation of new techniques also bring new challenges. For example, GPS-equipped mobile phones have become mainstream and Internet access is ubiquitous, hence location-based recommendation is now feasible and increasingly significant. Accurate recommendation asks for both high predictability of human movements [49,50] and a quantitative way to define similarities between locations and people [51,52]. Lastly, intelligent recommender systems should take into account the different behavioral patterns of different people. For example, new users tend to visit very popular items and select similar items, while old users usually have more specific interests [53,54], and users behave much differently between low-risk (e.g., collecting bookmarks, downloading music, etc.) and high-risk (e.g., buying a computer, renting a house, etc.) activities [55,56].

3.
Definitions of subjects and problems

In this chapter we briefly review basic concepts that are useful in the study of recommender systems.

3.1. Networks

Network analysis is a versatile tool for uncovering the organization principles of many complex systems [5-9]. A network is a set of elements (called nodes or vertices) with connections (called edges or links) between them. Many social, biological, technological and information systems can be described as networks, with nodes representing individuals or organizations and edges capturing their interactions. The study of networks, referred to as graph theory in the mathematical literature, has a long history that begins with the classical Königsberg bridge problem solved by Euler in the 18th century [57]. Mathematically speaking, a network G is an ordered pair of disjoint sets (V, E), where V is the set of nodes and the set of edges, E, is a subset of V x V [58]. In an undirected network, an edge joining nodes x and y is denoted by x <-> y, and x <-> y and y <-> x mean exactly the same edge. In a directed network, edges are ordered pairs of nodes and an edge from x to y is denoted by x -> y.

Websites like Foursquare, Gowalla, Google Latitude, Facebook, Jiepang, and others already provide location-based services and show that many people want to share their location information and get location-based recommendations.

Edges x -> y and y -> x are distinct and may be present simultaneously. Unless stated otherwise, we assume that a network contains neither a self-loop (an "edge" joining a node to itself) nor a multi-edge (several "edges" joining the same pair of nodes). In a multinetwork, both loops and multi-edges are allowed. In an undirected network G(V, E), two nodes x and y are said to be adjacent to each other if x <-> y ∈ E.
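These conventions can be illustrated with a minimal sketch (the edge lists are invented for illustration): undirected edges are unordered node pairs, directed edges are ordered pairs.

```python
# Undirected network: x <-> y and y <-> x are the same (unordered) edge.
undirected_edges = {frozenset(pair) for pair in [(1, 2), (2, 3), (3, 1), (3, 4)]}

def adjacent(x, y):
    """True iff nodes x and y are joined by an undirected edge."""
    return frozenset((x, y)) in undirected_edges

# Directed network: x -> y and y -> x are distinct ordered pairs
# and may be present simultaneously.
directed_edges = {(1, 2), (2, 1), (2, 3)}

print(adjacent(1, 2), adjacent(2, 1))                      # True True
print((2, 3) in directed_edges, (3, 2) in directed_edges)  # True False
```

Storing undirected edges as unordered pairs makes the symmetry x <-> y = y <-> x automatic, while the ordered-pair representation keeps x -> y and y -> x independent, exactly as the definitions require.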
The set of nodes adjacent to a node x, the neighborhood of x, is denoted by Γx. The degree of node x is defined as kx = |Γx|. The degree distribution, P(k), is defined as the probability that a randomly selected node has degree k. In a regular network, every node has the same degree k0 and thus P(k) = δ(k, k0). In the classical Erdős-Rényi random network [59], where each pair of nodes is connected by an edge with a given probability p, the degree distribution follows a binomial form [60]

P(k) = C(N-1, k) p^k (1-p)^(N-1-k),    (1)

where N = |V| is the number of nodes in the network. This distribution has a characteristic scale represented by the average degree <k> = p(N-1). At the end of the last century, researchers turned to the investigation of large-scale real networks, where it turned out that their degree distributions often span several orders of magnitude and approximately follow a power-law form

P(k) ~ k^(-γ),    (2)

with γ being a positive exponent usually lying between 2 and 3 [5]. Such networks are called scale-free networks, as they lack a characteristic scale of degree and the power-law function P(k) is scale-invariant [61]. Note that the detection of power-law distributions in empirical data requires solid statistical tools [62,63]. For a directed network, the out-degree of a node x, denoted by kx^out, is the number of edges starting at x, and the in-degree kx^in is the number of edges ending at x. The in- and out-degree distributions of a directed network in general differ from each other. Generally speaking, a network is said to be assortative if its high-degree nodes tend to connect with high-degree nodes and its low-degree nodes tend to connect with low-degree nodes (it is said to be disassortative in the opposite situation). This degree-degree correlation can be characterized by the average degree of the nearest neighbors [64,65] or a variant of the Pearson coefficient called the assortativity coefficient [66,67]. The assortativity coefficient r lies in the range -1 <= r <= 1.
If r > 0, the network is assortative; if r < 0, the network is disassortative. Note that this coefficient is sensitive to degree heterogeneity. For example, r will be negative in a network with a very heterogeneous degree distribution (e.g., the Internet) regardless of the network's connecting patterns [68].

The number of edges in a path connecting two nodes is called the length of the path, and the distance between two nodes is defined as the length of the shortest path that connects them. The diameter of a network is the maximal distance among all node pairs, and the average distance is the mean distance averaged over all node pairs,

d = 2 / (N(N-1)) Σ_{x<y} d_xy,    (3)

where d_xy is the distance between x and y. Many real networks display a so-called small-world phenomenon: their average distance does not grow faster than the logarithm of the network size [70,71].

The importance of triadic clustering in social interaction systems has been recognized for more than 100 years [72]. In social network analysis [73], this kind of clustering is called transitivity, defined as three times the ratio of the total number of triangles in a network to the total number of connected node triples. In 1998, Watts and Strogatz [71] proposed a similar index to quantify triadic clustering, called the clustering coefficient. For a given node x, this coefficient is defined as the ratio of the number of existing edges between x's neighbors to the number of neighbor pairs,

Cx = ex / (kx(kx-1)/2),    (4)

where ex denotes the number of edges between the kx neighbors of node x (this definition is meaningful only if kx > 1). The network clustering coefficient is defined as the average of Cx over all x with kx > 1. It is also possible to define the clustering coefficient as the ratio of 3 x the number of triangles in the network to the number of connected triples of vertices, which is sometimes referred to as the "fraction of transitive triples" [7]. Note that the two definitions can give substantially different results. Fig.
1 illustrates the above definitions for a simple undirected network. For more information about network measurements, readers are encouraged to consult an extensive review article [74] on the characterization of networks.

When no path exists between two nodes, we say that their distance is infinite, which makes the average distance automatically infinite too. This problem can be avoided either by excluding such node pairs from averaging or by using the harmonic mean [7,69].

Fig. 1. A simple undirected network with 6 nodes and 7 edges. The node degrees are k1 = 1, k2 = k5 = 3 and k3 = k4 = k6 = 2, corresponding to the distribution P(1) = 1/6, P(2) = 1/2 and P(3) = 1/3. The diameter and average distance of this network are dmax = 3 and d = 1.6, respectively. The clustering coefficients are C2 = 1/3, C3 = C4 = 0, C5 = 1/3 and C6 = 1, and the average clustering coefficient is C = 0.3.

3.2. Bipartite networks and hypergraphs

A network G(V, E) is a bipartite network if there exists a partition (V1, V2) such that V1 ∪ V2 = V, V1 ∩ V2 = ∅, and every edge connects a node of V1 and a node of V2. Many real systems are naturally modeled as bipartite networks: the metabolic network [75] consists of chemical substances and chemical reactions, the collaboration network [76] consists of acts and actors, the Internet telephone network consists of personal computers and phone numbers [77], etc. We focus on a particular class of bipartite networks, called web-based user-object networks [53], which represent interactions between users and objects in online service sites, such as collections of bookmarks in delicious.com and purchases of books in amazon.com. As we shall see later, these networks describe the fundamental structure of recommender systems. Web-based user-object networks are specific in their gradual evolution, where both nodes and links are added gradually.
By contrast, this cannot happen in, for example, act-actor networks (e.g., one cannot add authors to a scientific paper after its publication). Most web-based user-object networks share some structural properties. Their object-degree distributions obey a power-law-like form P(k) ~ k^(-γ), with γ ≈ 1.6 for the Internet Movie Database (IMDb) [78], γ ≈ 1.8 for the music-sharing site audioscrobbler.com [79], γ ≈ 2.3 for the e-commerce site amazon.com [53], and γ ≈ 2.5 for the bookmark-sharing site delicious.com [53]. The form of the user-degree distribution usually lies between an exponential and a power law [53] and can be well fitted by the Weibull distribution [27] (also called the stretched exponential distribution in the literature [80]),

P(k) ∝ k^(μ-1) exp[-(k/k0)^μ],    (5)

where k0 is a constant and μ is the stretching exponent. Connections between users and objects exhibit a disassortative mixing pattern [53,78].

A straightforward extension of the definition of a bipartite network is the so-called multipartite network. For an r-partite network G(V, E), there is an r-partition V1, V2, ..., Vr such that V = V1 ∪ V2 ∪ ... ∪ Vr, Vi ∩ Vj = ∅ whenever i ≠ j, and no edge joins two nodes in the same set Vi for all 1 <= i <= r. The tripartite network representation has found its application in collaborative tagging systems (also called folksonomies in the literature) [81-84], where users assign tags to online resources, such as photographs in flickr.com, references in CiteULike.com and bookmarks in delicious.com. Note that some information is lost in the tripartite representation. For example, given an edge connecting a resource and a tag, we do not know which user (or users) contributed to this edge. To resolve this, a hypergraph [85] can be used to give an exact representation of the full structure of a collaborative tagging system. In a hypergraph H(V, E), the hyperedge set E is a subset of the power set of V, that is, the set of all subsets of V. A hyperedge can therefore connect multiple nodes.
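As a minimal sketch of this definition (using the three hyperedges X, Y, Z of the toy hypergraph in Fig. 2), a hyperedge is simply a set of nodes, and the equivalent bipartite network connects each node to every hyperedge that contains it:

```python
# The three hyperedges of the toy hypergraph in Fig. 2.
hyperedges = {"X": {1, 2, 4}, "Y": {4, 5, 6}, "Z": {2, 3, 5, 6}}

def hyperdegree(node):
    """Number of hyperedges containing the node."""
    return sum(node in members for members in hyperedges.values())

# Equivalent bipartite network: node x is linked to hyperedge e iff x is in e.
bipartite_edges = {(x, name) for name, members in hyperedges.items() for x in members}

print(hyperdegree(4))               # 2 (node 4 lies in X and Y)
print((4, "X") in bipartite_edges)  # True
print((1, "Z") in bipartite_edges)  # False
```

The set comprehension building bipartite_edges is exactly the one-to-one correspondence between hypergraphs and bipartite networks discussed below: no information is lost in either direction.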
Analogously to ordinary networks, the node degree in a hypergraph is defined as the number of hyperedges adjacent to a node, and the distance between two nodes is defined as the minimal number of hyperedges connecting these nodes. The clustering coefficient [82,86] and community structure [86,87] can also be defined and quantified following the definitions in ordinary networks. Notice that there is a one-to-one correspondence between a hypergraph and a bipartite network. Given a hypergraph H(V, E), the corresponding bipartite network has node set V ∪ E, where x ∈ V is connected with Y ∈ E if and only if x ∈ Y (see Fig. 2 for an illustration).

The hypergraph representation has already found applications in ferromagnetic dynamics [88,89], population stratification [90], cellular networks [91], academic team formation [92], and many other areas. Here we are more concerned with the hypergraph representation of collaborative tagging systems [86,93,94], where each hyperedge joins three nodes (represented by a triangle, as in Fig. 3): user u, resource r and tag t, indicating that u has given t to r. A resource can be collected by many users and given several tags by a user, and a tag can be associated with many resources, resulting in small-world hypergraphs [86,94] (Fig. 3 shows both the basic unit and an extensive description).

Fig. 2. An illustration of the one-to-one correspondence between a hypergraph (a) and a bipartite network (b). There are three hyperedges, X = {1, 2, 4}, Y = {4, 5, 6} and Z = {2, 3, 5, 6}.

Fig. 3. A hypergraph illustration of collaborative tagging networks. (Left) A triangle-like hyperedge [93] which contains three types of vertices, depicted by one blue circle, one green rectangle and one brown triangle, which respectively represent a user, a resource and a tag. (Right) A descriptive hypergraph consisting of two users, four resources and three tags.
Take user U2 and resource R1, for example; the measurements are as follows. (i) U2 has participated in six hyperedges, which means its hyperdegree is 6. (ii) U2 is directly connected to three resources and three tags. As defined by Eq. (6), it can have at most 3 x 3 = 9 hyperedges, and thus its clustering coefficient equals 6/9 ≈ 0.667, where 6 is its hyperdegree. Comparatively, as defined by Eq. (7), its hyperedge density is Dh(U2) = (12 - 6)/(12 - 4) = 0.75. (iii) The shortest path from U2 to R1 is U2-T1-R1, which indicates that the distance between U2 and R1 is 2. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Moreover, hypergraphs for collaborative tagging systems have been shown to be highly clustered, with heavy-tailed degree distributions and community structure [86,94]. A model for evolving hypergraphs can be found in [94].

Generally, to evaluate a hypergraph from the perspective of complexity science, the following quantities can be applied (Fig. 3 gives a detailed description of these quantities):

(i) hyperdegree: the degree of a node in a hypergraph can be naturally defined as the number of hyperedges adjacent to it;

(ii) hyperdegree distribution: defined as the proportion that each hyperdegree occupies, where the hyperdegree is defined as the number of hyperedges that a regular node participates in;

(iii) clustering coefficient: defined as the proportion of the actual number of hyperedges to all the possible hyperedges that a regular node could have [82]. For example, the clustering coefficient for a user, Cu, is defined as

Cu = ku / (Ru Tu),    (6)

where ku is the hyperdegree of user u, Ru is the number of resources that u collects and Tu is the number of tags that u possesses. The above definition measures the fraction of possible pairs present in the neighborhood of u.
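The U2 numbers above can be checked with a one-line implementation of Eq. (6); this is a sketch, with the counts taken from the Fig. 3 example.

```python
from fractions import Fraction

def user_clustering(hyperdegree, n_resources, n_tags):
    """Eq. (6): C_u = k_u / (R_u * T_u), the fraction of possible
    resource-tag pairs that actually occur in u's hyperedges."""
    return Fraction(hyperdegree, n_resources * n_tags)

# User U2 of Fig. 3: hyperdegree 6, linked to 3 resources and 3 tags.
c_u2 = user_clustering(6, 3, 3)
print(c_u2, float(c_u2))  # 2/3 0.6666666666666666
```

Using Fraction keeps the coefficient exact; the maximal value 1 is reached when every resource-tag pair of u appears in some hyperedge.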
A larger Cu indicates that u's resources have more similar topics, which might show that u concentrates on personalized or special topics, while a smaller Cu might suggest that s/he has more diverse interests. Similar definitions can also be used to measure the clustering coefficients of resources and tags.

An alternative metric, named hyperedge density, was proposed by Zlatić et al. [86]. Taking a user node u again as an example, they define the coordination number of u as z(u) = Ru + Tu.

The degrees of users, resources and tags are usually investigated separately. For flickr.com and CiteULike.com, the user and tag degree distributions are power-law-like, while the resource degree distributions are much narrower, because in flickr.com a photograph is only collected by a single user and in CiteULike.com a reference is rarely collected by many users [86]. By contrast, in delicious.com, a popular bookmark can be collected by thousands of users and thus the resource degree distribution is of a power-law kind [94].

Fig. 4. Illustration of a recommender system consisting of five users and four books. The basic information contained by every recommender system is the relations between users and objects, which can be represented by a bipartite graph. This illustration also exhibits some additional information frequently exploited in the design of recommendation algorithms, including user profiles, object attributes and object content.

Given k(u), the maximal coordination number is zmax(u) = 2k(u), while the minimal coordination number is zmin(u) = 2n for n(n-1) < k(u) <= n^2 and zmin(u) = 2n + 1 for n^2 < k(u) <= n(n+1), with n some integer.
Obviously, a local tree structure leads to the maximal coordination number, while maximal overlap corresponds to the minimal coordination number. Therefore, they define the hyperedge density as [86]

Dh(u) = (zmax(u) - z(u)) / (zmax(u) - zmin(u)),    0 <= Dh(u) <= 1.    (7)

The definition of hyperedge density for resources and tags is similar. Empirical analysis indicates a high clustering behavior under both metrics [82,86]. The study of hypergraphs for collaborative tagging networks is only beginning to unfold, and how to properly quantify the clustering behavior, the correlations and similarities between nodes, and the community structure is still an open problem.

(iv) average distance: defined as the average shortest path length between two random nodes in the whole network.

3.3. Recommender systems

A recommender system uses the input data to predict potential further likes and interests of its users. Users' past evaluations are typically an important part of the input data. Let m be the number of users and let n be the number of all objects that can be evaluated and recommended. Note that object is simply a generic term which can represent books, movies, or any other kind of consumed content. To stay in line with standard terminology, we sometimes use item, which has the same meaning. To make the notation clearer, we restrict ourselves to Latin indices i and j when enumerating users and to Greek indices α and β when enumerating objects. The evaluation (rating) of object α by user i is denoted riα. This evaluation is often numerical on an integer rating scale (think of Amazon's five stars); in this case we speak of explicit ratings. Note that the common case of binary ratings (like/dislike or good/bad) also belongs to this category. When objects are only collected (as in bookmark-sharing systems) or simply consumed (as in an online newspaper or magazine without a rating system), or when "like" is the only possible expression (as on Facebook), we are left with unary ratings.
In this case, riα = 1 represents a collected/consumed/liked object and riα = 0 represents a non-existing evaluation (see Fig. 4). Inferring users' confidence levels in their ratings is not a trivial task, especially from binary or unary ratings. Auxiliary information about users' behavior may be helpful; for example, users' confidence levels can be estimated from their watching time of television shows, and with the help of this information the quality of recommendation can be improved [95]. Even if we have explicit ratings, it does not mean we know how and why people vote with these ratings: do they have standards for numerical ratings, or do they just use ratings to express an ordering? Recent evidence [96] to some extent supports the latter ansatz.

The goal of a recommender system is to deliver lists of personalized "recommended" objects to its users. To this end, evaluations can be predicted or, alternatively, recommendation scores can be assigned to objects yet unknown to a given user. Objects with the highest predicted ratings or the highest recommendation scores then constitute the recommendation list that is presented to the target user. There is an extensive set of performance metrics that can be used to evaluate the resulting recommendation lists (see Section 3.4). The usual classification of recommender systems is as follows [15]:

1. Content-based recommendations: Recommended objects are those with content similar to the content of previously preferred objects of a target user. We present them in Section 4.2.3.

Table 2
Recommendation process in a nutshell: to estimate the potential favorable opinion of Carol about Casablanca, one can use the similarity of her ratings with those of Alice. Alternatively, one can note that the ratings of Titanic and Casablanca follow a similar pattern, suggesting that people who liked the former might also like the latter.

                        Alice   Bob   Carol
Titanic                 5       5
2001: A Space Odyssey   2
Casablanca

2.
2. Collaborative recommendations: Recommended objects are selected on the basis of past evaluations of a large group of users. An example is given in Table 2. They can be divided into:
(a) Memory-based collaborative filtering: Recommended objects are those that were preferred by users who share similar preferences with the target user, or those that are similar to the other objects preferred by the target user. We present them in Sections 4 (standard similarity-based methods) and 7 (methods employing social filtering).
(b) Model-based collaborative filtering: Recommended objects are selected on the basis of models that are trained to identify patterns in the input data. We present them in Sections 5 (dimensionality reduction methods) and 6 (diffusion-based methods).
3. Hybrid approaches: These methods combine collaborative methods with content-based methods or with different variants of other collaborative methods. We present them in Section 8.4.

3.4. Evaluation metrics for recommendation

Given a target user i, a recommender system will sort all of i's uncollected objects and recommend the top-ranked ones. To evaluate recommendation algorithms, the data is usually divided into two parts: the training set $E^T$ and the probe set $E^P$. The training set is treated as known information, while no information from the probe set is allowed to be used for recommendation. In this section we briefly review basic metrics that are used to measure the quality of recommendations. How to choose a particular metric (or metrics) to evaluate recommendation performance depends on the goals that the system is supposed to fulfill. Of course, the ultimate evaluation of any recommender system is given by the judgment of its users.

3.4.1. Accuracy metrics

Rating accuracy metrics. The main purpose of recommender systems is to predict users' future likes and interests.
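The training/probe split used for evaluation can be sketched in a few lines; the synthetic rating triples and the 10% probe fraction are assumptions made for illustration:

```python
import random

# Hypothetical list of observed (user, object, rating) triples.
ratings = [(i, a, (i + a) % 5 + 1) for i in range(4) for a in range(6)]

# Randomly split into a training set E^T and a probe set E^P (here 10% probe).
# Only E^T may be used by the recommender; E^P stays hidden for evaluation.
random.seed(42)
shuffled = ratings[:]
random.shuffle(shuffled)
n_probe = len(shuffled) // 10
probe, train = shuffled[:n_probe], shuffled[n_probe:]
print(len(train), len(probe))  # -> 22 2
```

In practice the split is often repeated several times and the evaluation metrics averaged over the runs.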
A multitude of metrics exists to measure various aspects of recommendation performance. Two notable metrics, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE), are used to measure the closeness of predicted ratings to the true ratings. If $r_{i\alpha}$ is the true rating of object $\alpha$ by user i, $\tilde r_{i\alpha}$ is the predicted rating, and $E^P$ is the set of hidden user-object ratings, MAE and RMSE are defined as

$$\mathrm{MAE} = \frac{1}{|E^P|} \sum_{(i,\alpha)\in E^P} \left| r_{i\alpha} - \tilde r_{i\alpha} \right|, \tag{8}$$

$$\mathrm{RMSE} = \left[ \frac{1}{|E^P|} \sum_{(i,\alpha)\in E^P} \left( r_{i\alpha} - \tilde r_{i\alpha} \right)^2 \right]^{1/2}. \tag{9}$$

Lower MAE and RMSE correspond to higher prediction accuracy. Since RMSE squares the error before summing it, it penalizes large errors more heavily. As these metrics treat all ratings equally, no matter what their positions in the recommendation list are, they are not optimal for some common tasks such as finding a small number of objects that are likely to be appreciated by a given user ("finding good objects"). Yet due to their simplicity, RMSE and MAE are widely used in the evaluation of recommender systems.

Rating and ranking correlations. Another way to evaluate the prediction accuracy is to calculate the correlation between the predicted and the true ratings. There are three well-known correlation measures, namely the Pearson product-moment correlation [97], the Spearman correlation [98], and Kendall's tau [99]. The Pearson correlation measures the extent to which a linear relationship is present between the two sets of ratings. It is defined as

$$\mathrm{PCC} = \frac{\sum_\alpha \left( r_\alpha - \bar r \right) \left( \tilde r_\alpha - \bar{\tilde r} \right)}{\sqrt{\sum_\alpha \left( r_\alpha - \bar r \right)^2} \sqrt{\sum_\alpha \left( \tilde r_\alpha - \bar{\tilde r} \right)^2}}, \tag{10}$$

where $r_\alpha$ and $\tilde r_\alpha$ are the true and predicted ratings, respectively. The Spearman correlation coefficient $\rho$ is defined in the same manner as the Pearson correlation, except that $r_\alpha$ and $\tilde r_\alpha$ are replaced by the ranks of the respective objects. Similarly
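Eqs. (8)–(10) are straightforward to compute directly; the five true/predicted rating pairs below are invented for illustration:

```python
import math

# Hidden probe ratings r and predictions r_tilde for |E^P| = 5 user-object pairs.
r       = [5, 3, 4, 2, 1]
r_tilde = [4, 3, 5, 2, 2]

n = len(r)
mae  = sum(abs(a - b) for a, b in zip(r, r_tilde)) / n                # Eq. (8)
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(r, r_tilde)) / n)   # Eq. (9)

# Pearson product-moment correlation, Eq. (10).
mr, mp = sum(r) / n, sum(r_tilde) / n
num = sum((a - mr) * (b - mp) for a, b in zip(r, r_tilde))
den = math.sqrt(sum((a - mr) ** 2 for a in r)
                * sum((b - mp) ** 2 for b in r_tilde))
pcc = num / den
print(round(mae, 2), round(rmse, 2), round(pcc, 2))  # -> 0.6 0.77 0.85
```

Replacing the raw ratings with their ranks before computing `pcc` yields the Spearman coefficient $\rho$ mentioned above.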
