recommender systems.pdf

最权威的推荐系统综述，看懂了就可以对推荐系统有很深刻的了解，十分有用
L. Lu et aL. /Physics Reports 519 (2012)149 most popular items or suggesting other products by the same producer to complicated data mining techniques. People soon realized that there is no unique best recommendation method. Rather, depending on the context and density of the available data, different methods adapting to particular applications are most likely to succeed. Hence there is no panacea, and the best one can do is to understand the underlying premises and recommender mechanisms, then one can tackle many diverse application problems from the real life examples. This is also reflected in this review where we do not try to highlight any ultimate approach to recommendation. Instead we review the basic ideas, methods and tools with particular emphasis on physicsrooted approaches The motivation for writing this review is multifold Firstly, while extensive reviews of recommender systems by computer scientists already exists [1315, the view of physicists is different from that of computer scientists by using more the complex networks approach and adapting various classical physics processes(such as diffusion) for information filtering We thus believe that this review with its structure and emphasis on respective topics can provide a novel point of view Secondly, the past decade has already seen a growing interest of physicists in recommender systems and we hope that this review can be a useful source for them by describing the state of the art in language which is more familiar to the physics community Finally, the interdisciplinary approach presented here might provide new insights and solutions for open problems and challenges in the active field of information filtering This review is organized as follows. To better motivate the problem, In Section 2 we begin with a discussion of real applications of recommender systems. Next, in Section 3 we introduce basic concepts such as complex networks recommender systems, and metrics for their evaluation that form a basis for all subsequent exposition. Then we proceed to description of recommendation algorithms where traditional approaches(similaritybased methods in Section 4 and dimensionality reduction techniques in Section 5) are followed by networkbased approaches which have their origin in the random walk process well known to all physicists (in Section 6). Methods based on external information, such as social relationships (in Section 7), keywords or time stamps (in Section 8), are also included We conclude with a brief evaluation of methods' performance in Section 9 and a discussion on the outlook of the field in Section 10 2. Real applications of recommender systems Thanks to the everdecreasing costs of data storage and processing, recommender systems gradually spread to most areas of our lives. Sellers carefully watch our purchases to recommend us other goods and enhance their sales, social web sites analyze our contacts to help us connect with new friends and get hooked with the site, and online radio stations remember skipped songs to serve us better in the future(see more examples in Table 1). In general, whenever there is plenty of diverse products and customers are not alike personalized recommendation may help to deliver the right content to the right person this is particularly the case for those Internetbased companies that try to make use of the socalled longtail [16 of goods which are rarely purchased yet due to their multitude they can yield considerable profits(sometimes they are referred to as"worstsellers"). For example on Amazon, between 20% and 40% of sales is due to products that do not belong to the shop's 100,000 most sold products [17 a recommender system may hence have significant impact on a company's revenues: for example, 60% of DVDs rented by Netflix are selected based on personalized recommendations As discussed in[18, recommender systems not only help decide which products to offer to an individual customer, they also increase crosssell by suggesting additional products to the customers and improve consumer loyalty because consumers tend to return to the sites that best serve their needs(see [19 for an empirical analysis of the impact of recommendations and consumer feedback on sales at Amazon. com) Since no recommendation method serves best all customers, major sites are usually equipped with several distinct recommendation techniques ranging from simple popularitybased recommendations to sophisticated techniques many of which we shall encounter in the following sections Further, new companies emerge(see for example, string. com) which aim at collecting all sorts of user behavior (ranging from pages visited on the web and music listened on a personal player to"liking "or purchasing items )and using it to provide personalized recommendations of different goods or services 2.1. Netflix prize anon October 2006, the online dvd rental company Netflix released a data set containing approximately 100 million onymous movie ratings and challenged researchers and practitioners to develop recommender systems that could beat the accuracy of the company's recommendation system, Cinematch [20]. Although the released data set represented only a mall fraction of the company's rating data, thanks to its size and quality it fast became a standard in the data mining and machine learning community. The data set contained ratings in the integer scale from 1 to 5 which were accompanied by dates For each movie, title and year of release were provided No information about users was given. Submitted predictions were evaluated by their root mean squared error( rmse on a qualifying data set containing over 2817, 131 unknown ratings Out of 20,000 registered teams, 2000 teams submitted at least one answer set. On 21 September 2009 the grand prize of $1000,000 was awarded to a team that overperformed the cinematch's accuracy by 10%. At the time when the contest As presented by jon Sanders(recommendation Systems Engineering, Netflix during the talk"Research Challenges in Recommenders"at the 3rd aCm Conference on Recommender Systems(2009) L. Li et aL./ Physics Reports 519 (2012)149 Table 1 Popular sites using recommender systems. Besides, there are also some companies devoting themselves to recommendation techniquessuchasBaifendian(www.baifendian.com),Baynote (www.baynote.com),Choicestream(www.cHoicestream.com), Goodrec(www.goodrec.com),andothers Site what is recommended amaze Books/ other products Facebook Friends Wefollow Friends Movielens Movies Nanocrowd Movies Findory News Digg News Zite Beehive DVDS CDNOW CDS/DVDs charme Dates Chemist Dates True.com Date Perfectmatch Dates Careerbuilder Jobs Monster Jobs Mufin StumbleUp Web site was closed, there were two teams that achieved the same precision. the prize was awarded to the team that submitted heir results 20 min earlier than the other one. (See [11] for a popular account on how the participants struggled with the thalle There are several lessons that we have learned in this competition [21]. Firstly, the company gained publicity and a superior recommendation system that is supposed to improve user satisfaction. Secondly, ensemble methods showed their potential of improving accuracy of the predictions. Thirdly, we saw that accuracy improvements are increasingly demanding when rmse drops below a certain level. Finally, despite the company's effort anonymity of its users was not sufficiently ensured [25. As a result, Netflix was sued by one of its users and decided to cancel a planned second competition 2. 2. Major challenges Researchers in the field of recommender systems face several challenges which pose danger for the use and performance of their algorithms. Here we mention only the major ones 1. Data sparsity. Since the pool of available items is often exceedingly large(major online bookstores offer several millions of books, for example), overlap between two users is often very small or none. Further, even when the average number of evaluations per user/item are high they are distributed among the users items very unevenly(usually they follow a powerlaw distribution[26] or a Weibull distribution [27 ])and hence majority of users items may have expressed received only a few ratings. Hence, an effective recommender algorithm must take the data sparsity into account 28 2. Scalability. While the data is mostly sparse for major sites it includes millions of users and items. It is therefore essential to consider the computational cost issues and search for recommender algorithms that are either little demanding or easy to parallelize (or both ). Another possible solution is based on using incremental versions of the algorithms where as the data grows, recommendations are not recomputed globally (using the whole data but incrementally (by slightly adjusting previous recommendations according to the newly arrived data)[29, 30]. This incremental approach is similar to perturbation techniques that are widely used in physics and mathematics 31 3. Cold start. When new users enter the system, there is usually insufficient information to produce recommendation for them. The usual solutions of this problem are based on using hybrid recommender techniques(see Section 8.4)combining ontent and collaborative data 32, 33] and sometimes they are accompanied by asking for some base information(such as age, location and preferred genres from the users. Another way is to identify individual users in different web services 2 The ensemble methods deal with the selection and organization of many individual algorithms to achieve better prediction accuracy. In fact, the winning team, called BellKor's Pragmatic Chaos, was a combined team of BellKor [22], Pragmatic Theory [23 and Big Chaos [24(of course, it was not a simple combination but a sophisticated design), and each of them consists of many individual algorithms. For example, the Pragmatic Theory solution considered 453 individual algorithms L. Lu et aL. Physics Reports 519 (2012)149 For example, Baifendian developed a technique that could track individual users' activities in several ecommerce sites, so that for a coldstart user in site A, we could make recommendation according to her records in sites b, C, D, etc [34] 4. Diversity vs. accuracy. When the task is to recommend items which are likely to be appreciated by a particular user, it is usually most effective to recommend popular and highly rated items. Such recommendation, however has very little value for the users because popular objects are easy to find (often they are even hard to avoid) without a recommendeR system a good list of recommended items hence should contain also less obvious items that are unlikely to be reached by the users themselves 35]. Approaches to this problem include direct enhancement of the recommendation list's diversity [3638 and the use of hybrid recommendation methods [39] 5. Vulnerability to attacks. Due to their importance in ecommerce applications, recommender systems are likely targets of malicious attacks trying to unjustly promote or inhibit some items[40 There is a wide scale of tools preventing this kind of behavior, ranging from blocking the malicious evaluations from entering the system to sophisticated resistant recommendation techniques 41. However, this is not a easy task since the strategies of attackers also get more and more advanced as the developing of preventing tools. As an example, Burke et al. [42] introduced eight attacking strategies, which are further divided into four classes: basic attack, lowacknowledge attack, nuke attack and informed attack 6. The value of time While real users have interests with widely diverse time scales( for example, short term interests related to a planned trip and long term interests related to the place of living or political preferences ), most recommendation algorithms neglect the time stamps of evaluations. It is an ongoing line of research whether and how value of old opinions should decay with time and what are the typical temporary patterns in user evaluations and item relevance[43, 44 7. Evaluation of recommendations. While we have plenty of distinct metrics(see Section 3.4), how to choose the ones best corresponding to the given situation and task is still an open question. Comparisons of different recommender algorithms are also problematic because different algorithms may simply solve different tasks. Finally, the overall user experience with a given recommendation systemwhich includes user's satisfaction with the recommendations and user's trust in the systemis difficult to measure in"offline"evaluation Empirical user studies thus still represent a welcome source of feedback on recommender systems. 8. User interface. It has been shown that to facilitate users' acceptance of recommendations the recommendations need to be transparent [45, 46]: users appreciate when it is clear why a particular item has been recommended to them another issue is that since the list of potentially interesting items may be very long it needs to be presented in a simple way and it should be easy to navigate through it to browse different recommendations which are often obtained by distinct approaches. Besides the above longstanding challenges, many novel issues appear recently. Thanks to the development of methodology in related branches of science, especially the new tools in network analysis, scientists started to consider the effects of network structure on recommendation and how to make use of known structural features to improve recommendation. For example, Huang et al. [47 analyzed the consumerproduct networks and proposed an improved recommendation algorithm preferring edges that enhance the local clustering property and sahebi et al.48 designed an improved algorithm making use of the community structure. Progress and propagation of new techniques also bring new challenges. For example, the gPS equipped mobile phones have become mainstream and the Internet access is ubiquitous, hence the locationbased recommendation is now feasible and increasingly significant. Accurate recommendation asks for both the high predictability of human movements 49, 50 and quantitative way to define similarities between locations and people [51,52. Lastly, intelligent recommender systems should take into account the different behavioral patterns of different people. For example, new users tend to visit very popular items and select similar items, while old users usually have more specific interests 53, 54, and users behave much differently between lowrisk(e.g, collecting bookmarks downloading music, etc. and highrisk(e. g buying a computer, renting a house, etc activities 55, 56 3. Definitions of subjects and problems We briefly review in this chapter basic concepts that are useful in the study of recommender systems 3. 1. Networks Network analysis is a versatile tool in uncovering the organization principles of many complex systems [59]. A network f elements(called nodes or vertices with connections(called edges or links between them Many social, biological technological and information systems can be described as networks with nodes representing individuals or organizations and edges capturing their interactions. The study of networks, referred to as graph theory in mathematical literature has a long history that begins with the classical Konigsberg bridge problem solved by euler in 18th century [57]. Mathematically speaking, a network G is an ordered pair of disjoint sets (v, e)where v is the set of nodes and the set of edges, e, is a subset of V xV[58. In an undirected network, an edge joining nodes x and y is denoted by x <>y, and x<> y and y<>x mean exactly the same edge. In a directed network, edges are ordered pairs of nodes and an edge from x to y is denoted by x>y Websites like Foursquare, Gowalla, Google Latitude, Facebook, Jiapang, and others already provide locationbased services and show that many people want to share their location information and get locationbased recommendations 6 L. Li et aL./ Physics Reports 519 (2012)149 Edgesx>yandy>x are distinct and may be present simultaneously. Unless stated otherwise, we assume that a network does not contain a selfloop an"edge"joining a node to itselfor a multiedge(several"edges"joining the same pair of nodes) In a multinetwork both loops and multiedges are allowed In an undirected network G(V, E), two nodes x and y are said to be adjacent to each other if x<>y EE. The set of nodes adjacent to a node x, the neighborhood of x, is denoted by Ix Degree of node x is defined as kx=x The degree distribution, P(k), is defined as the probability that a randomly selected node is of degree k In a regular network, every node has the same degree ko and thus P(k)= Sk. ko. In the classical ErdosRenyi random network [59] where each pair of nodes is connected by an edge with a given probability p, the degree distribution follows a binomial form [60] N P() k p(1p) There n= v is the number of nodes in the network. This distribution has a characterized scale represented by the average degreek p(N1). At the end of the last century, researchers turned to investigation of largescale real networks where it turned out that their degree distributions often span several orders of magnitude and approximately follow a powerlaw form P(k) k 2 with y being a positive exponent usually lying between 2 and 3 [5]. Such networks are called scalefree networks as they lack a characteristic scale of degree and the powerlaw function P(k) is scaleinvariant [61. Note that detection of powerlay distributions in empirical data requires solid statistical tools [62,63]. For a directed network, the outdegree of a node x denoted by kou, is the number of edges starting at x, and the indegree kin is the number of edges ending at x. The inand outdegree distribution of a directed network in general differ from each other. Generally speaking, a network is said to be assortative if its highdegree nodes tend to connect with highdegree nodes and the lowdegree nodes tend to connect with lowdegree nodes(it is said to be disassortative if the situation is opposite) This degreedegree correlation can be characterized by the average degree of the nearest neighbors [64,65 or a variant of Pearson coefficient called assortativity coefficient [66,67]. The assortativity coefficient r lies in the range 1<r< 1. If r>0 the network is assortative; if r <0, the network is disassortative. Note that this coefficient is sensitive to degree heterogeneity. For example, r will be negative in a network with very heterogeneous degree distribution(e.g. the Internet) regardless to the network's connecting patterns 68 The number of edges in a path connecting two nodes is called length of the path, and distance between two nodes is defined as the length of the shortest path that connects them. the diameter of a network is the maximal distance among all node pairs and the average distance is the mean distance averaged over all node pairs as N(N1) ∑ (3 where dxy is the distance between x and y. Many real networks display a socalled smallworld phenomenon: their average distance does not grow faster than the logarithm of the network size [70, 7 The importance of triadic clustering in social interaction systems has been realized for more than 100 years [72]. In social network analysis [73, this kind of clustering is called transitivity, defined as three times the ratio of the total number of triangles in a network to the total number of connected node triples. In 1998, Watts and Strogatz [71 proposed a similar index to quantify the triadic clustering, called clustering coefficient. For a given node x, this coefficient is defined as the ratio of the number of existing edges between x's neighbors to the number of neighbor pairs, 号kx(kx1) where ex denotes the number of edges between kx neighbors of node x(this definition is meaningful only if kx>1). The network clustering coefficient is defined as the average of Cx over all x with kx> 1. It is also possible to define the clustering coefficient as the ratio of 3 x number of triangles in the network to the number of connected triples of vertices, which is sometimes referred to as"fraction of transitive triples"[7. note that the two definitions can give substantially different results Fig. 1 illustrates the above definitions for a simple undirected network. For more information about network measurements, readers are encouraged to refer an extensive review article [74] on characterization of networks. When no path exists between two nodes, we say that their distance is infinite which makes the average distance automatically infinite too This problem can be avoided either by excluding such node pairs from averaging or by using the harmonic mean [ 7, 69 L. Lu et aL. Physics Reports 519 (2012)149 5 Fig 1. A simple undirected network with 6 nodes and 7 edges. The node degrees are k1=1,k= ks =3 and k3=k4=k6= 2, corresponding to the distribution P(1)=1/6, P(2)=1/2 and P(3)= 1/3. The diameter and average distance of this network are dmax =3 and d= 1.6, respectively. The clustering coefficients are C2=6.C3=C4=0, C5=3 and c6= 1, and the average clustering coefficient is C=0.3 3.2. Bipartite networks and hypergraphs A network G(V, E)is a bipartite network if there exists a partition (V1, v2)such that V1UV2=V,Vinv2=0, and every edge connects a node of vi and a node of V2. Many real systems are naturally modeled as bipartite networks: the metabolic network [75 consists of chemical substances and chemical reactions, the collaboration network [76 consists of acts and actors, the Internet telephone network consists of personal computers and phone numbers [77], etc. We focus on a particular class of bipartite networks, called webbased userobject networks 53], which represent interactions between users and objects in online service sites, such as collections of bookmarks in delicious. com and purchases of books in amazon. com as we shall see later, these networks describe the fundamental structure of recommender systems. Webbased userobject networks are specific by their gradual evolution where both nodes and links are added gradually. By contrast, this cannot happen in, for example, actactor networks (e.g, one cannot add authors to a scientific paper after its publication) Most webbased userobject networks share some structural properties. Their objectdegree distributions obey a power lawlike form P(k)k, with y 1.6 for the Internet Movie database(IMdb)[78],y a 1. 8 for the musicSharing site audioscrobbler. com [79], y N 2.3 for the ecommerce site amazon. com [53], and y N 2.5 for the bookmarksharing site delicious. com [53]. The form of the userdegree distribution is usually between an exponential and a power law[ 53 and can be well fitted by the Weibull distributiondistribution [27(also called the stretched exponential distribution in the literature[80D) P(k)kl exp[(/ko)" where ko is a constant and u is the stretching exponent Connections between users and objects exhibit a disassortative mixing pattern 53, 78 A straightforward extension of the definition of bipartite network is the Socalled multipartite network. For an rpartite network g(V,E, there is an rpartition V,V2,…, Vr such that v=V1Uv2∪∴Uv,V∩v= whenever i≠j, and no edge joins two nodes in the same set Vi for all 1<i< r. The tripartite network representation has found its application in collaborative tagging systems(also called folksonomies in the literature)18184, where users assign tags to online resources, such as photographs in flickr. com, references in CiteULike. com and bookmarks in delicious. com Note that some information is lost in the tripartite representation. For example, given an edge connecting a resource and a tag, we do not know which user (or users)contributed to this edge to resolve this, hypergraph 85 can be used to give an exact representation of the full structure of a collaborative tagging system. In a hypergraph H(v, e), the hyperedge set e is a subset of the power set of v, that is the set of all subsets of V. link e can therefore connect multiple nodes. analogously to ordinary networks, node degree in a hypergraph is defined as the number of hyperedges adjacent to a node and the distance between two nodes is defined as the minimal number of hyperedges connecting these nodes The clustering coefficient 182, 86 and community structure [86, 87 can also be defined and quantified following the definitions in ordinary networks Notice that there is a onetoone correspondence between a hypergraph and a bipartite network. Given a hypergraph H(V, E), the corresponding bipartite network G(V, E) contains two node sets, as V=V U E, and x E v is connected with Y E e if and only if x E Y(see Fig. 2 for an illustration) Hypergraph representation has already found applications in ferromagnetic dynamics[88, 89, population stratifica tion [90], cellular networks [91, academic team formation [92, and many other areas. Here we are concerned more about the hypergraph representation of collaborative tagging systems [ 86, 93, 94] where each hyperedge joins three nodes (repre sented by a trianglelike in Fig 3), user u, resource r and tag t, indicating that u has given t to r. a resource can be collected by many users and given several tags by a user, and a tag can be associated with many resources, resulting in smallworld hypergraphs[86, 94(Fig 3 shows both the basic unit and extensive description ). Moreover, hypergraphs for collaborative 8 L. Li et aL./ Physics Reports 519 (2012)149 a Z 86800 Fig. 2. An illustration of the onetoone correspondence between a hypergraph(a)and a bipartite network (b). There are three hyperedges, X=(1, 2, 4) Y={4,5,6}andZ={2,3,5,6} R U R Resource RI Fig. 3. A hypergraph illustration of collaborative tagging networks (left )a trianglelike hyperedge 93 which contains three types of vertices, depicted by one blue circle, one green rectangle and one brown triangle which respectively represent a user, a resource and a tag (right) A descriptive hypergraph consists of two users, four resources and three tags. Take user U2 and resource R, for example, the measurements are denoted as: (i> has participated in six hyperedges, which means its hyperdegree is 6: (i1)U2 has directly connected to three resources and three tags. As defined by Eq. (6), it suggests it possibly has 3 x 3=9 hyperedges in maximal. Thus its clustering coefficient equals 6/9 a 0.667, where 6 is its hyperdegree: Comparatively, as defined by Eq (7), its clustering coefficient is Dh(U2)=120=0.75;(iii) the shortest path from U2 to Ri is U2T1Rl, which indicates the distance between U2 and ri is 2. For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article. tagging systems have been shown to be highly clustered, with heavytailed degree distribution and of community struc ture[86, 94 A model for evolving hypergraphs can be found in [94] Generally, to evaluate a hypergraph from the perspective of complexity science, the following quantities(Fig 3 gives a detailed description of these quantities)can be applied (i)hyperdegree: The degree of a node in a hypergraph can be naturally defined as the number of hyperedges adjacent to it (ii) hyperdegree distribution: defined as the proportion that each hyperdegree occupies, where hyperdegree is defined as the number of hyperedges that a regular node participates in (iii) clustering coefficients: defined as the proportion of real number of hyperedges to all the possible number of hyperedges that a regular node could have [82]. e. g, the clustering coefficient for a user, cu, is defined as ku RuT There ku is the hyperdegree of user u, Ru is the number of resources that u collects and tu is the number of tags that u possesses. The above definition measures the fraction of possible pairs present in the neighborhood of u. a larger CL indicates that u has more similar topic of resources, which might also show that u has more concentrated on personalized or special topics, while smaller Cu might suggest that s/he has more diverse interests. Similar definitions can also be defined for measuring the clustering coefficient of resources and tags An alternative metric, named hyperedge density, is proposed by Zlatic et al. [86. Taking a user node u again as an example, they define the coordination number of u as z(u)= Ru t tu Given k(u, the maximal coordination number is 5 The degrees of users resources and tags are usually investigated separately For flickr. com and CiteulikE.cOm, the user and tag degree distributions are powerlawlike, while the resource degree distributions are much narrower because in flickr. com, a photograph is only collected by a single user and in CiteULike. com a reference is rarely collected by many users 86 By contrast, in delicious. com, a popular bookmark can be collected by thousands of users and thus the resource degree distribution is of a powerlaw kind 94] L. Lu et aL. Physics Reports 519 (2012)149 9 I like it Category t Year Attribute Nationali lt IS OK 1t Harry is a very spe Profiles I dislike it Users Fig 4. Illustration of a recommender system consisted of five users and four books. The basic information contained by every recommender system is the relations between users and objects that can be represented by a bipartite graph. this illustration also exhibits some additional information frequently exploited in the design of recommendation algorithms, including user profiles, object attributes and object content Zmar(u)=2k(u), while the minimal coordination number is zmin(u)=2n for n(n1)< k(u)<n2 and Zmin (u)=2n+1 for n2< k(u< n(n+ 1), with n some integer. Obviously a local tree structure leads to maximal coordination number while the maximum overlap corresponds to the minimal coordination number Therefore, they define the hyperedge density as[86] zmax(uz(u) Dh(u= 0≤Dn(u)≤1. Zmax(u)Zmin(u) The definition of hyperedge density for resources and tags is similar. Empirical analysis indicates a high clustering behavior under both metrics [82, 86]. The study of hypergraph for the collaborative tagging networks has just been unfolding, and how to properly quantify the clustering behavior, the correlations and similarities between nodes, and the community structure is still an open problem (iv)average distance: defined as the average shortest path length between two random nodes in the whole network 3.3. Recommender systems a recommender system uses the input data to predict potential further likes and interests of its users. Users' past evaluations are typically an important part of the input data. let m be the number of users and let n be the number of all objects that can be evaluated and recommended note that object is simply as a generic term which can represent book movies, or any other kind of consumed content To stay in line with standard terminology, we sometimes use item which has the same meaning. To make the notation more clear, we restrict to Latin indices i and j when enumerating the users and to Greek indices a and B when enumerating the objects. Evaluation rating of object a by user i is denoted as ria. This evaluation is often numerical in an integer rating scale (think of Amazon's five stars)in this case we speak of explicit ratings. Note that the common case of binary ratings(like/dislike or good/bad ) also belongs to this category When objects are only collected (as in bookmark sharing systems or simply consumed (as in online newspaper or magazine without rating systems)or when"like"is the only possible expression (as on Facebook), we are left with unary ratings. In this case, rio 1 represents a collected consumed liked object and ria =0 represents a nonexisting evaluation(see Fig. 4). Inferring users'confidence levels of ratings is not a trivial task, especially from the binary or unary ratings. Accessorial information about users'behavior may be helpful, for example, the users' confidence levels can be estimated by their watching time of television shows and with the help of this information, the quality of recommendation can be improved 95 even if we have explicit ratings, it does not mean we know how and why people vote with these ratingsdo they have standards of numerical ratings or they just use ratings to present orders? Recent evidence [96 to some extent supports the latter ansat The goal of a recommender system is to deliver lists of personalized"recommended"objects to its users. To this end, evaluations can be predicted or, alternatively, recommendation scores can be assigned to objects yet unknown to a given user Objects with the highest predicted ratings or the highest recommendation scores then constitute the recommendation list that is presented to the target user There is an extensive set of performance metrics that can be used to evaluate the resulting recommendation lists(see Section 3. 4). The usual classifications of recommender systems is as follows [15] 1. Contentbased recommendations: Recommended objects are those with content similar to the content of previously preferred objects of a target user. We present them in Section 4.2.3 L. Li et aL./ Physics Reports 519 (2012)149 Table 2 Recommendation process in a nutshell: to estimate the potential favorable opinion of Carol about Casablanca, one can use the similarity of her with those of Alice Alternatively, one can note that ratings of Titanic and Casablanca follow a similar pattern, suggesting that people who liked the former might also like the latter Alice Bob Carol Titanic 5 5 2001: A Space Odyssey 2 Casablanca 2. Collaborative recommendations: Recommended objects are selected on the basis of past evaluations of a large group of users. An example is given in Table 2. They can be divided into (a) Memorybased collaborative filtering: Recommended objects are those that were preferred by users who share similar preferences as the target user, or, those that are similar to the other objects preferred by the target user We present them in Sections 4(Standard similaritybased methods and 7(methods employing social filtering) (b) Modelbased collaborative filtering: Recommended objects are selected on models that are trained to identify patterns in the input data. We present them in Sections 5(dimensionality reduction methods)and 6(diffusionbased methods) 3. Hybrid approaches: These methods combine collaborative with contentbased methods or with different variants of other collaborative methods We present them in Section 8.4 3. 4. Evaluation metrics for recommendation Given a target user i, a recommender system will sort all i's uncollected objects and recommend the topranked objects To evaluate recommendation algorithms, the data is usually divided into two parts The training set e and the probe set p The training set is treated as known information, while no information from the probe set is allowed to be used for recommendation In this section we briefly review basic metrics that are used to measure the quality of recommendations How to choose a particular metric (or metrics) to evaluate recommendation performance depends on the goals that the system is supposed to fulfill. Of course, the ultimate evaluation of any recommender system is given by the judgment of its users 3.4.1. Accuracy metrics Rating accuracy metrics. The main purpose of recommender systems is to predict users' future likes and interests. A multitude of metrics exist to measure various aspects of recommendation performance two notable metrics, Mean absolute Error(mae)and root mean squared error (rmse), are used to measure the closeness of predicted ratings to the true ratings If rig is the true rating on object a by user i, ria is the predicted rating and er is the set of hidden userobject ratings, MAE and rmse are defined as 1 MAE= EPI ∑ (8) (i,a)∈EP 1/2 RMSE E∑ (9) (ia)∈EP Lower Mae and rmse correspond to higher prediction accuracy Since rmse squares the error before summing it, it tends to penalize large errors more heavily. As these metrics treat all ratings equally no matter what their positions are in the recommendation list, they are not optimal for some common tasks such as finding a small number of objects that are likely to be appreciated by a given user (Finding good objects Yet due to their simplicity rmse and mae are widely used in the evaluation of recommender systems. Rating and ranking correlations. Another way to evaluate the prediction accuracy is to calculate the correlation between the predicted and the true ratings. There are three wellknown correlation measures, namely the pearson productmoment correlation [97, the Spearman [98 correlation and Kendall's Tau [99]. The Pearson correlation measures the extent to which a linear relationship is present between the two sets of ratings. It is defined as ∑(a)(raf) PCC= (10) ∑Ga)2/∑(nF)2 where ro and ra are the true and predicted ratings, respectively. The spearman correlation coefficient p is defined in the same manner as the Pearson correlation, except that ra and ra are replaced by the ranks of the respective objects. Similarly
 774KB
Recommender Systems.pdf
20190611A ContentBoosted Collaborative Filtering Neural Network for Cross Domain Recommender Systems
recommender systems.pdf下载_course
20200808最权威的推荐系统综述，看懂了就可以对推荐系统有很深刻的了解，十分有用 相关下载链接：//download.csdn.net/download/xuanfeiye/8886381?utm_source=
 12.2MB
Practical Recommender Systems.pdf
20190203Practical Recommender Systems.pdf推荐系统推荐系统的书
 9.84MB
2018Graph Convolutional Neural Networks for WebScale Recommender Systems.pdf
201906112018Graph Convolutional Neural Networks for WebScale Recommender Systems
 4.45MB
2017Deep Matrix Factorization Models for Recommender Systems.pdf
201906112017Deep Matrix Factorization Models for Recommender Systems
 5.77MB
2016Collaborative Denoising AutoEncoders for TopN Recommender Systems.pdf
20190611Most realworld recommender services measure their performance based on the topN results shown to t
 2.85MB
Matrix factorization techniques for recommender systems
20130714Matrix factorization techniques for recommender systems 数据采集文档
 3.11MB
Recommender Systems An Introduction.pdf
20180503推荐系统方面的重要书籍, 浅显易懂, 条理清晰. 英文文字版本, 非扫描. 353页.
 3.11MB
Group Recommender Systems_An IntroductionSpringer(2018).pdf
20180311Group Recommender Systems_An IntroductionSpringer(2018).pdf 推荐系统
 8.1MB
Recommender Systems the textbook.pdf
20191221For a grad level audience, there is a new book by Charu Agarwal that is perhaps the most comprehensi
 31.39MB
推荐系统的循序进阶读物
201303211. 中文综述(了解概念入门篇) a) 个性化推荐系统的研究进展 b) 个性化推荐系统评价方法综述 2. 英文综述(了解概念进阶篇) a) 2004ACMToisEvaluating colla
 16.68MB
Practical Recommender Systems (Python)  2019.pdf
20190523关于推荐系统从算法到工程，架构的一本综合书，详细介绍了如何开发一个使用的推荐系统，2019年最新版。 英文介绍： Practical Recommender Systems explains how
 10.56MB
深度学习必读论文集
20190406深度学习领域的经典论文，主要由(hiton, lecun, bengio)三巨头完成，包括： 奠基： DeepLearning_By_Three_Giant.pdf Learning Represen
 4.81MB
Statistical Methods for Recommender Systems
20171205Designing algorithms to recommend items such as news articles and movies to users is a challenging t
 680KB
论文研究一种基于多正则化参数的矩阵分解推荐算法.pdf
20190911基于梯度下降矩阵分解模型的协同过滤推荐算法需要利用正则化技术对问题加以约束。损失函数中的正则化参数能够提高模型的预测精度，防止训练过拟合，并可以在二者间调节，使二者平衡。提出了一种多正则化参数的方法，
 21.62MB
Recommender Systems Handbook
20171106Recommender Systems Handbook.pdf 完整版 22.7M 推荐系统手册
 953KB
对话推荐系统综述论文（A Survey on Conversational Recommender Systems）【35页pdf】.pdf
20200403推荐系统是人工智能在实践中最明显的成功案例之一。通常，这些系统的主要任务是为用户指出感兴趣的潜在主题，例如电子商务网站。因此，它们不仅可以在信息超载的情况下帮助用户，还可以对服务提供商的业务做出重大贡
 14.37MB
Practical Recommender Systems; Kim Falk;January 2019
20190124在线推荐系统可以帮助用户找到电影，工作，餐馆甚至浪漫！将统计数据，人口统计数据和查询术语结合起来，可以获得令人满意的结果。学习以正确的方式构建推荐系统：它可以成就或破坏您的应用程序！ Practica
 5.92MB
Recommender Systems  The Textbook
20181117Recommender System：The Textbook 高清pdf 推荐系统教科书
 12.86MB
Practical Recommender Systems
20190121Recommender systems are practically a necessity for keeping a site's content current, useful, and in
 1.7MB
A review on deep learning for recommender systems_challenges and remedies.pdf
20200329深度学习应用在推荐系统的综述，很nice。深度学习应用在推荐系统的综述，很nice。深度学习应用在推荐系统的综述，很nice。
 1.79MB
Recommender Systems An Introduction
20111105Recommender Systems An Introduction PDF
 5.47MB
《Building Machine Learning Systems with Python》PDF
20180831《Building Machine Learning Systems with Python》
 3.99MB
MachineLearningParadigmsApplicationsinRecommenderSystems.pdf 英文原版
20190822Machine Learning Paradigms Applications in Recommender Systems
 911KB
2017Neural Collaborative Filtering.pdf
20190611In recent years, deep neural networks have yielded immense success on speech recognition, computer v
 865KB
论文研究基于位置社交网络的个性化兴趣点推荐.pdf
20190722兴趣点（pointofinterest，POI）推荐是基于位置的社交网络（locationbased social networks，LBSN）中一项重要的服务。针对目前推荐算法存在的噪声数据影
 381KB
Wide&DeepACM DLRS@RecSys 2016.pdf
20200403Google 于2016年提出的Wide & Deep模型的原论文《 Wide & Deep Learning for Recommender Systems》，将线性模型wide与非线性模型DNN相

学院
【数据分析实战训练营】Hive详解
【数据分析实战训练营】Hive详解

学院
【数据分析随到随学】机器学习模型及应用
【数据分析随到随学】机器学习模型及应用

下载
模拟数字时钟响应式网页模板
模拟数字时钟响应式网页模板

学院
前端性能优化
前端性能优化

下载
彩色数据管理面板响应式网页模板
彩色数据管理面板响应式网页模板

下载
寻找美丽的地方旅行服务响应式网页模板
寻找美丽的地方旅行服务响应式网页模板

博客
【C语言例题11】求阶乘问题。
【C语言例题11】求阶乘问题。

下载
智能音频播放器组件网页模板
智能音频播放器组件网页模板

博客
【计算机网络系列】物理层核心基础
【计算机网络系列】物理层核心基础

学院
【数据分析随到随学】Spark理论及实战
【数据分析随到随学】Spark理论及实战

学院
Java无损导出及转换word文档
Java无损导出及转换word文档

下载
产品参数价格列表响应式网页模板
产品参数价格列表响应式网页模板

下载
应用商店APP团队响应式网页模板
应用商店APP团队响应式网页模板

学院
python办公自动化技巧
python办公自动化技巧

下载
滑板运动介绍销售响应式网页模板
滑板运动介绍销售响应式网页模板

博客
npm ERR! yorkie@1.0.3 install: `node bin/install.js
npm ERR! yorkie@1.0.3 install: `node bin/install.js

下载
捕获瞬间摄影展示网页模板
捕获瞬间摄影展示网页模板

下载
城市天气预报列表组件响应式网页模板
城市天气预报列表组件响应式网页模板

学院
第1章 Java入门基础及环境搭建【java编程进阶】
第1章 Java入门基础及环境搭建【java编程进阶】

学院
易语言开发通达信ＤLL公式接口
易语言开发通达信ＤLL公式接口

学院
算法导论二（排序和顺序统计量）——编程大牛的必经之路
算法导论二（排序和顺序统计量）——编程大牛的必经之路

博客
小白的java递归
小白的java递归

下载
APP简洁说明响应式网页模板
APP简洁说明响应式网页模板

博客
中国矿业大学（北京）第二届ACM程序设计公开赛（决赛）
中国矿业大学（北京）第二届ACM程序设计公开赛（决赛）

博客
模电 Day8
模电 Day8

博客
关于JVM的内存空间
关于JVM的内存空间

学院
RabbitMQ消息中间件实战(附讲义和源码)
RabbitMQ消息中间件实战(附讲义和源码)

学院
Unity游戏开发之数字华容道
Unity游戏开发之数字华容道

博客
集合 &迭代器Iterator
集合 &迭代器Iterator

学院
【数据分析随到随学】Python语法强化与数据处理
【数据分析随到随学】Python语法强化与数据处理