arXiv:1801.02294v3 [stat.ML] 21 May 2018
Learning Tree-based Deep Model for Recommender Systems
Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, Kun Gai
Alibaba Group
{zhuhan.zh,yushi.lx,pengye.zpy,guozheng.lgz,jay.hj,lihan.lh,jingshi.gk}@alibaba-inc.com
ABSTRACT
Model-based methods for recommender systems have been studied extensively in recent years. In systems with a large corpus, however, the cost of using the learnt model to predict preferences over all user-item pairs is tremendous, which makes full-corpus retrieval extremely difficult. To overcome this calculation barrier, models such as matrix factorization resort to the inner-product form (i.e., modeling user-item preference as the inner product of user and item latent factors) and rely on indexes for efficient approximate k-nearest-neighbor search. However, it remains challenging to incorporate more expressive interaction forms between user and item features, e.g., interactions through deep neural networks, because of the calculation cost.
In this paper, we focus on introducing arbitrary advanced models to recommender systems with a large corpus. We propose a novel tree-based method that provides logarithmic complexity w.r.t. corpus size even with more expressive models such as deep neural networks. Our main idea is to predict user interests from coarse to fine by traversing tree nodes in a top-down fashion and making a decision for each user-node pair. We also show that the tree structure can be jointly learnt towards better compatibility with users' interest distribution, which facilitates both training and prediction. Experimental evaluations on two large-scale real-world datasets show that the proposed method significantly outperforms traditional methods. Online A/B test results on the Taobao display advertising platform also demonstrate the effectiveness of the proposed method in production environments.
CCS CONCEPTS
• Computing methodologies → Classification and regression trees; Neural networks; • Information systems → Recommender systems;
KEYWORDS
Tree-based Learning, Recommender Systems, Implicit Feedback
1 INTRODUCTION
Recommendation has been widely used by various kinds of content providers. Personalized recommendation, based on the intuition that users' interests can be inferred from their historical behaviors or from other users with similar preferences, has been proven effective at YouTube [7] and Amazon [22].
Designing a recommendation model that predicts the best candidate set from the entire corpus for each user raises many challenges. In systems with an enormous corpus, some well-performing recommendation algorithms cannot predict over the entire corpus, because their prediction complexity is linear in the corpus size, which is unacceptable. Deploying such a large-scale recommender system requires that the amount of calculation per user be limited. Besides accuracy, the novelty of recommended items also matters for user experience: results that only contain items homogeneous with the user's historical behaviors are not desirable.
To reduce the amount of calculation and handle an enormous corpus, memory-based collaborative filtering methods are widely deployed in industry [22]. As a representative member of the collaborative filtering family, item-based collaborative filtering [31] can recommend from a very large corpus with relatively little computation, by pre-calculating similarities between item pairs and using the user's historical behaviors as triggers to recall the most similar items. However, this restricts the scope of the candidate set: only items similar to the triggers, rather than all items, can ultimately be recommended. This prevents the recommender system from jumping out of historical behavior to explore potential user interests, which limits the accuracy of the recalled results, and in practice the recommendation novelty is also criticized. Another way to reduce calculation is coarse-grained recommendation. For example, the system recommends a small number of item categories for a user, picks out all corresponding items, and ranks them in a following stage. However, for a large corpus the calculation problem is still not solved: if the number of categories is large, category recommendation itself meets the calculation barrier; if it is small, some categories will inevitably include too many items, making the subsequent ranking impracticable. Besides, the categories used are usually not designed for the recommendation problem, which can seriously harm recommendation accuracy.
In the recommender-system literature, model-based methods are an active topic. Models such as matrix factorization (MF) [19, 30] decompose pairwise user-item preferences (e.g., ratings) into user and item factors, and then recommend to each user its most preferred items. Factorization machines (FM) [28] further propose a unified model that can mimic different factorization models with any kind of input data. In real-world scenarios with no explicit preference but only implicit user feedback (e.g., user behaviors like clicks or purchases), Bayesian personalized ranking [29] formulates the preference as triplets with partial order and applies it to MF models. In industry, YouTube uses a deep neural network [7] to learn both user and item embeddings, where the two kinds of embeddings are generated from their corresponding features separately. In all the above methods, the preference of a user-item pair can be formulated as the inner product of the user's and item's vector representations. The prediction stage is thus equivalent to retrieving the user vector's nearest neighbors in inner-product space. For this vector search problem, indices such as hashing or quantization [18] for approximate k-nearest-neighbor (kNN) search can ensure efficient retrieval.
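For concreteness, the inner-product paradigm described above can be written as follows (the notation here is ours, for illustration only, and is not the formulation introduced later in this paper):
\[
\hat{y}(u, i) = \mathbf{p}_u^{\top} \mathbf{q}_i, \qquad
\mathcal{B}_u = \operatorname*{arg\,top\text{-}k}_{i \in \mathcal{C}} \; \mathbf{p}_u^{\top} \mathbf{q}_i,
\]
where \mathbf{p}_u and \mathbf{q}_i are the learnt user and item latent factors, \mathcal{C} is the corpus and \mathcal{B}_u is the recalled top-k candidate set. Because the score is a pure inner product, \mathcal{B}_u can be found with an approximate kNN index instead of scanning all of \mathcal{C}; once the interaction is replaced by a general network f(u, i), this shortcut no longer applies.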
However, the inner-product interaction between the user's and item's vector representations severely limits the model's capability. There exist many other, more expressive interaction forms; for example, cross-product features between the user's historical behaviors and candidate items are widely used in click-through rate prediction [5]. Recent work [13] proposes a neural collaborative filtering method, where a neural network instead of an inner product models the interaction between the user's and item's vector representations, and its experimental results show that a multi-layer feed-forward neural network performs better than the fixed inner-product form. Deep interest network [34] points out that user interests are diverse, and an attention-like network structure can generate different user vectors for different candidate items. Beyond these works, other methods such as product neural networks [27] have also proven the effectiveness of advanced neural architectures. However, since these models cannot be reduced to an inner product between user and item vectors, efficient approximate kNN search cannot be applied, so they cannot be used to recall candidates in large-scale recommender systems. How to overcome the calculation barrier and make arbitrary advanced neural networks feasible in large-scale recommendation is an open problem.
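To illustrate why such models break the kNN-retrieval paradigm, the following is a minimal sketch (with made-up dimensions and random weights; it is not the architecture of any cited model) of a network that scores a user-item pair jointly:

    import numpy as np

    def mlp_pair_score(user_vec, item_vec, weights):
        # Concatenate user and item representations and pass them through
        # a small feed-forward network. Because the hidden layers mix the
        # two inputs, the score cannot be rewritten as <p_u, q_i>, so
        # approximate kNN indexes cannot be used for retrieval.
        x = np.concatenate([user_vec, item_vec])
        for w, b in weights[:-1]:
            x = np.maximum(w @ x + b, 0.0)        # ReLU hidden layers
        w_out, b_out = weights[-1]
        return (w_out @ x + b_out).item()         # scalar preference score

    # Hypothetical usage: without an index, recalling candidates means one
    # forward pass per item, i.e. cost linear in the corpus size.
    rng = np.random.default_rng(0)
    d, h = 8, 16
    weights = [(rng.normal(size=(h, 2 * d)), np.zeros(h)),
               (rng.normal(size=(1, h)), np.zeros(1))]
    user = rng.normal(size=d)
    scores = [mlp_pair_score(user, rng.normal(size=d), weights) for _ in range(5)]

Since the score depends on the user and item jointly, it cannot be pre-computed into a single item vector to be indexed, which is exactly the calculation barrier discussed above.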
To address the above challenges, we propose a novel tree-based deep recommendation model (TDM) in this paper. Trees and tree-based methods have been studied in multiclass classification [1–3, 6, 15, 26, 32], where a tree is usually used to partition the sample or label space to reduce calculation cost. However, researchers have seldom used a tree structure as a retrieval index in the context of recommender systems. In fact, hierarchical structure of information exists ubiquitously in many domains. In the e-commerce scenario, for example, iPhone is a fine-grained item while smartphone is the coarse-grained concept to which iPhone belongs. The proposed TDM method leverages this information hierarchy and turns the recommendation problem into a series of hierarchical classification problems. By solving the problem from easy to difficult, TDM can improve both accuracy and efficiency. The main contributions of our paper are summarized as follows:
• To the best of our knowledge, TDM is the first method that makes arbitrary advanced models feasible for generating recommendations from a large corpus. Benefiting from hierarchical tree search, TDM achieves a logarithmic amount of calculation w.r.t. corpus size when making predictions.
• TDM can find novel yet effective recommendations more precisely, because the entire corpus is explored and more expressive deep models can help discover potential interests.
• Besides enabling more advanced models, TDM also promotes recommendation accuracy through hierarchical search, which divides a large problem into smaller ones and solves them successively from easy to difficult.
• As a kind of index, the tree structure can also be learnt towards an optimal hierarchy of items and concepts for more effective retrieval, which in turn facilitates model training. We employ a tree learning method that allows joint training of the neural network and the tree structure.
• We conduct extensive experiments on two large-scale real-world datasets, which show that TDM significantly outperforms existing methods.
It is worth mentioning that a tree-based approach has also been used in language modeling, namely hierarchical softmax [24], but it differs from the proposed TDM in both motivation and formulation. In the next-word prediction problem, conventional softmax has to calculate the normalization term to obtain any single word's probability, which is very time-consuming. Hierarchical softmax uses a tree structure, and the next word's probability is converted into the product of node probabilities along the tree path. This formulation reduces the computational complexity of a single word's probability to logarithmic magnitude w.r.t. the corpus size. In the recommendation problem, however, the goal is to search the entire corpus for the most preferred items, which is a retrieval problem. In a hierarchical softmax tree, the optimality of parent nodes does not guarantee that the optimal low-level nodes lie among their descendants, so all items still have to be traversed to find the optimal one; it is therefore unsuitable for such a retrieval problem. To address the retrieval problem, we propose a max-heap-like tree formulation and introduce deep neural networks to model the tree, which forms an efficient method for large-scale recommendation. The following sections show its difference in formulation and its superiority in performance. In addition, hierarchical softmax adopts a single-hidden-layer network for a specific natural language processing problem, while the proposed TDM method can engage any neural network structure.
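To make the retrieval side concrete before the formal treatment, the following is a minimal sketch of the top-down, coarse-to-fine tree retrieval idea, where score(user, node) stands in for the learnt user-node preference model; the actual TDM formulation, training objective and beam size are given in the later sections, so this is an illustration only:

    class Node:
        # A tree node: internal nodes represent coarse-grained concepts,
        # leaf nodes correspond to items in the corpus.
        def __init__(self, children=None, item_id=None):
            self.children = children or []
            self.item_id = item_id

    def retrieve_candidates(user, root, beam_size, score):
        # Layer-wise beam search: at each level, score the expanded children
        # with the user-node model and keep only the beam_size best, so the
        # number of scored nodes grows with the tree depth rather than with
        # the corpus size.
        frontier, leaves = [root], []
        while frontier:
            children = [c for node in frontier for c in node.children]
            if not children:
                break
            best = sorted(children, key=lambda n: score(user, n),
                          reverse=True)[:beam_size]
            frontier = [n for n in best if n.children]
            leaves.extend(n for n in best if not n.children)
        return leaves

With a roughly balanced tree, each level expands at most beam_size nodes, so the number of model evaluations is proportional to the tree depth, i.e. logarithmic in the corpus size.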
The proposed tree-based model is a universal solution for all kinds of online content providers. The remainder of this paper is organized as follows. Section 2 introduces the system architecture of Taobao display advertising to show where the proposed method fits. Section 3 gives a detailed introduction and formalization of the proposed tree-based deep model, and Section 4 describes how the tree-based model serves online. Experimental results on a large-scale benchmark dataset and the Taobao advertising dataset are reported in Section 5. Finally, Section 6 concludes our work.
2 SYSTEM ARCHITECTURE
In this section, we introduce the architecture of the Taobao display advertising recommender system, as shown in Figure 1. After receiving a page view request from a user, the system uses user features, context features and item features as input to generate a relatively much smaller set (usually hundreds) of candidate items from the entire corpus (hundreds of millions) in the matching server. The tree-based recommendation model works in this stage and shrinks the size of the candidate set by several orders of magnitude.
With hundreds of candidate items, the real-time prediction server uses more expressive but also more time-consuming models [11, 34] to predict indicators such as click-through rate or conversion rate. After ranking by strategy [17, 35], several items are finally impressed to the user.
As mentioned above, the proposed recommendation model aims to construct a candidate set with hundreds of items. This stage is essential and also difficult: whether the user is interested in the generated candidates gives an upper bound on the impression quality, and how to draw candidates from the entire corpus while balancing efficiency and effectiveness is a hard problem.
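As a rough picture of this two-stage flow, here is a hypothetical request handler that reuses the retrieve_candidates sketch from Section 1; the function names, beam size and result count are illustrative and not the actual Taobao configuration:

    def serve_page_view(user, context, corpus_tree, match_score, rank_score,
                        beam_size=400, top_n=10):
        # Matching server: tree-based retrieval shrinks the corpus (hundreds
        # of millions of items) down to a few hundred candidates.
        candidates = retrieve_candidates(user, corpus_tree, beam_size, match_score)
        # Real-time prediction server: a heavier model (e.g. CTR prediction)
        # scores only the recalled candidates, followed by the ranking strategy.
        ranked = sorted(candidates,
                        key=lambda item: rank_score(user, context, item),
                        reverse=True)
        return ranked[:top_n]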