Deep Neural Networks for YouTube Recommendations
Paul Covington, Jay Adams, Emre Sargin
Google
Mountain View, CA
{pcovington, jka, msargin}@google.com
ABSTRACT
YouTube represents one of the largest scale and most sophis-
ticated industrial recommendation systems in existence. In
this paper, we describe the system at a high level and fo-
cus on the dramatic performance improvements brought by
deep learning. The paper is split according to the classic
two-stage information retrieval dichotomy: first, we detail a
deep candidate generation model and then describe a sepa-
rate deep ranking model. We also provide practical lessons
and insights derived from designing, iterating and maintain-
ing a massive recommendation system with enormous user-
facing impact.
Keywords
recommender system; deep learning; scalability
1. INTRODUCTION
YouTube is the world’s largest platform for creating, shar-
ing and discovering video content. YouTube recommenda-
tions are responsible for helping more than a billion users
discover personalized content from an ever-growing corpus
of videos. In this paper we will focus on the immense im-
pact deep learning has recently had on the YouTube video
recommendations system. Figure 1 illustrates the recom-
mendations on the YouTube mobile app home.
Recommending YouTube videos is extremely challenging
from three major perspectives:
• Scale: Many existing recommendation algorithms proven
to work well on small problems fail to operate on our
scale. Highly specialized distributed learning algorithms
and efficient serving systems are essential for handling
YouTube’s massive user base and corpus.
• Freshness: YouTube has a very dynamic corpus with
many hours of video uploaded per second. The
recommendation system should be responsive enough
to model newly uploaded content as well as the lat-
est actions taken by the user. Balancing new content
with well-established videos can be understood from
an exploration/exploitation perspective.

[Figure 1: Recommendations displayed on YouTube
mobile app home.]
• Noise: Historical user behavior on YouTube is inher-
ently difficult to predict due to sparsity and a vari-
ety of unobservable external factors. We rarely ob-
tain the ground truth of user satisfaction and instead
model noisy implicit feedback signals. Furthermore,
metadata associated with content is poorly structured
without a well defined ontology. Our algorithms need
to be robust to these particular characteristics of our
training data.
In conjunction with other product areas across Google,
YouTube has undergone a fundamental paradigm shift to-
wards using deep learning as a general-purpose solution for
nearly all learning problems. Our system is built on Google
Brain [4] which was recently open sourced as TensorFlow [1].
TensorFlow provides a flexible framework for experimenting
with various deep neural network architectures using large-
scale distributed training. Our models learn approximately
one billion parameters and are trained on hundreds of bil-
lions of examples.
In contrast to the vast amount of research in matrix factoriza-
tion methods [19], there is relatively little work using deep
neural networks for recommendation systems. Neural net-
works are used for recommending news in [17], citations in
[8] and review ratings in [20]. Collaborative filtering is for-
mulated as a deep neural network in [22] and autoencoders
in [18]. Elkahky et al. used deep learning for cross domain
user modeling [5]. In a content-based setting, Burges et al.
used deep neural networks for music recommendation [21].
The paper is organized as follows: A brief system overview
is presented in Section 2. Section 3 describes the candidate
generation model in more detail, including how it is trained
and used to serve recommendations. Experimental results
will show how the model benefits from deep layers of hidden
units and additional heterogeneous signals. Section 4 details
the ranking model, including how classic logistic regression
is modified to train a model predicting expected watch time
(rather than click probability). Experimental results will
show that hidden layer depth is helpful as well in this situa-
tion. Finally, Section 5 presents our conclusions and lessons
learned.
2. SYSTEM OVERVIEW
The overall structure of our recommendation system is il-
lustrated in Figure 2. The system comprises two neural
networks: one for candidate generation and one for ranking.
The candidate generation network takes events from the
user’s YouTube activity history as input and retrieves a
small subset (hundreds) of videos from a large corpus. These
candidates are intended to be generally relevant to the user
with high precision. The candidate generation network only
provides broad personalization via collaborative filtering.
The similarity between users is expressed in terms of coarse
features such as IDs of video watches, search query tokens
and demographics.
Presenting a few “best” recommendations in a list requires
a fine-level representation to distinguish relative importance
among candidates with high recall. The ranking network
accomplishes this task by assigning a score to each video
according to a desired objective function using a rich set of
features describing the video and user. The highest scoring
videos are presented to the user, ranked by their score.
The two-stage approach to recommendation allows us to
make recommendations from a very large corpus (millions)
of videos while still being certain that the small number of
videos appearing on the device are personalized and engag-
ing for the user. Furthermore, this design enables blending
candidates generated by other sources, such as those de-
scribed in an earlier work [3].
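To make the two-stage funnel concrete, the following Python sketch shows only the control flow implied by the description above; the `candidate_generator`, `ranker`, and `extra_candidate_sources` objects and their method names are hypothetical placeholders, not the production interfaces.

```python
# Minimal sketch of the two-stage funnel (hypothetical interfaces; the real
# system is a pair of deep neural networks served at scale, not these objects).

def recommend(user_history, context, corpus, candidate_generator, ranker,
              extra_candidate_sources=(), n_candidates=500, n_results=25):
    # Stage 1: winnow millions of videos down to a few hundred broadly
    # relevant candidates using coarse collaborative-filtering features.
    candidates = candidate_generator.retrieve(
        user_history, context, corpus, k=n_candidates)

    # Candidates produced by other sources (e.g. earlier systems) can be blended in.
    for source in extra_candidate_sources:
        candidates.extend(source.retrieve(user_history, context))

    # Stage 2: score each candidate with a rich set of video/user features
    # and keep only the highest-scoring few, ordered by score.
    scored = [(ranker.score(video, user_history, context), video)
              for video in set(candidates)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video for _, video in scored[:n_results]]
```

Only the ranking stage sees the rich per-video features, which is what keeps scoring tractable: it runs over a few hundred candidates rather than the full corpus of millions.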
During development, we make extensive use of offline met-
rics (precision, recall, ranking loss, etc.) to guide iterative
improvements to our system. However for the final deter-
mination of the effectiveness of an algorithm or model, we
rely on A/B testing via live experiments. In a live experi-
ment, we can measure subtle changes in click-through rate,
watch time, and many other metrics that measure user en-
gagement. This is important because live A/B results are
not always correlated with offline experiments.
3. CANDIDATE GENERATION
During candidate generation, the enormous YouTube cor-
pus is winnowed down to hundreds of videos that may be
relevant to the user.

[Figure 2: Recommendation system architecture
demonstrating the “funnel” where candidate videos
are retrieved and ranked before presenting only a
few to the user.]

The predecessor to the recommender
described here was a matrix factorization approach trained
under rank loss [23]. Early iterations of our neural network
model mimicked this factorization behavior with shallow
networks that only embedded the user’s previous watches.
From this perspective, our approach can be viewed as a non-
linear generalization of factorization techniques.
3.1 Recommendation as Classification
We pose recommendation as extreme multiclass classification,
where the prediction problem becomes accurately classifying a
specific video watch $w_t$ at time $t$ among millions of videos
$i$ (classes) from a corpus $V$ based on a user $U$ and
context $C$,

$$P(w_t = i \mid U, C) = \frac{e^{v_i u}}{\sum_{j \in V} e^{v_j u}}$$

where $u \in \mathbb{R}^N$ represents a high-dimensional "embedding" of
the user, context pair and the $v_j \in \mathbb{R}^N$ represent embeddings
of each candidate video. In this setting, an embedding is
simply a mapping of sparse entities (individual videos, users
etc.) into a dense vector in $\mathbb{R}^N$. The task of the deep neural
network is to learn user embeddings $u$ as a function of the
user's history and context that are useful for discriminating
among videos with a softmax classifier.
Although explicit feedback mechanisms exist on YouTube
(thumbs up/down, in-product surveys, etc.) we use the im-
plicit feedback [16] of watches to train the model, where a
user completing a video is a positive example. This choice is
based on the orders of magnitude more implicit user history
available, allowing us to produce recommendations deep in
the tail where explicit feedback is extremely sparse.
Efficient Extreme Multiclass
To efficiently train such a model with millions of classes, we
rely on a technique to sample negative classes from the back-
ground distribution (“candidate sampling”) and then correct
for this sampling via importance weighting [10]. For each ex-
ample the cross-entropy loss is minimized for the true label
and the sampled negative classes. In practice several thou-
sand negatives are sampled, corresponding to more than 100
times speedup over traditional softmax. A popular alterna-
tive approach is hierarchical softmax [15], but we weren’t
able to achieve comparable accuracy.
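The candidate-sampling recipe can be sketched with TensorFlow's built-in sampled softmax, which draws negative classes from a background distribution (log-uniform by default) and applies an importance-weighting correction internally; the corpus size and embedding width below are scaled-down placeholders, and `NUM_SAMPLED` reflects the "several thousand" negatives mentioned above.

```python
import tensorflow as tf

NUM_VIDEOS = 100_000    # placeholder corpus size (millions in production)
EMBED_DIM = 64          # placeholder embedding dimension N
NUM_SAMPLED = 4096      # "several thousand" sampled negative classes

# Output-side video embeddings v_j and per-class biases for the softmax.
video_embeddings = tf.Variable(tf.random.normal([NUM_VIDEOS, EMBED_DIM]))
video_biases = tf.Variable(tf.zeros([NUM_VIDEOS]))

def sampled_softmax_loss(user_embeddings, watched_video_ids):
    """Cross-entropy for the true label plus sampled negative classes.

    user_embeddings:   [batch, EMBED_DIM] network outputs u
    watched_video_ids: [batch] integer id of the watched video w_t
    """
    labels = tf.expand_dims(watched_video_ids, axis=-1)   # shape [batch, 1]
    per_example = tf.nn.sampled_softmax_loss(
        weights=video_embeddings,
        biases=video_biases,
        labels=labels,
        inputs=user_embeddings,
        num_sampled=NUM_SAMPLED,
        num_classes=NUM_VIDEOS,
    )
    return tf.reduce_mean(per_example)

# Toy batch of two user embeddings and their positive labels.
u = tf.random.normal([2, EMBED_DIM])
y = tf.constant([123, 456], dtype=tf.int64)
print(sampled_softmax_loss(u, y))
```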
[The remaining 7 pages of the paper are not included in this preview.]