arXiv:1801.02294v3 [stat.ML] 21 May 2018
Learning Tree-based Deep Model for Recommender Systems
Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, Kun Gai
Alibaba Group
{zhuhan.zh,yushi.lx,pengye.zpy,guozheng.lgz,jay.hj,lihan.lh,jingshi.gk}@alibaba-inc.com
ABSTRACT
Model-based methods for recommender systems have been studied extensively in recent years. In systems with a large corpus, however, the cost of using the learnt model to predict preferences over all user-item pairs is tremendous, which makes full-corpus retrieval extremely difficult. To overcome this calculation barrier, models such as matrix factorization resort to the inner-product form (i.e., modeling user-item preference as the inner product of user and item latent factors) and rely on indexes for efficient approximate k-nearest-neighbor search. However, it remains challenging to incorporate more expressive interaction forms between user and item features, e.g., interactions through deep neural networks, because of the calculation cost.
In this paper, we focus on introducing arbitrary advanced models to recommender systems with a large corpus. We propose a novel tree-based method that provides logarithmic complexity w.r.t. corpus size even with more expressive models such as deep neural networks. Our main idea is to predict user interests from coarse to fine by traversing tree nodes in a top-down fashion and making a decision for each user-node pair. We also show that the tree structure can be jointly learnt towards better compatibility with users' interest distribution, which facilitates both training and prediction. Experimental evaluations on two large-scale real-world datasets show that the proposed method significantly outperforms traditional methods. Online A/B test results on the Taobao display advertising platform also demonstrate the effectiveness of the proposed method in production environments.
CCS CONCEPTS
• Computing methodologies → Classification and regression trees; Neural networks; • Information systems → Recommender systems;
KEYWORDS
Tree-based Learning, Recommender Systems, Implicit Feedback
1 INTRODUCTION
Recommendation has been widely used by various kinds of content providers. Personalized recommendation, based on the intuition that users' interests can be inferred from their historical behaviors or from other users with similar preferences, has been proven effective at YouTube [7] and Amazon [22].
Designing a recommendation model that predicts the best candidate set from the entire corpus for each user raises many challenges. In systems with an enormous corpus, some well-performing recommendation algorithms cannot predict over the entire corpus, because their prediction complexity is linear in the corpus size, which is unacceptable. Deploying such a large-scale recommender system requires that the amount of calculation per user be limited. Besides accuracy, the novelty of recommended items also matters for user experience: results that only contain items homogeneous with the user's historical behaviors are not desirable.
To reduce the amount of calculation and handle an enormous corpus, memory-based collaborative filtering methods are widely deployed in industry [22]. As a representative member of the collaborative filtering family, item-based collaborative filtering [31] can recommend from a very large corpus with relatively little computation, by pre-calculating similarities between item pairs and using the user's historical behaviors as triggers to recall the most similar items. However, this restricts the scope of the candidate set: only items similar to the triggers, rather than all items, can ultimately be recommended. This prevents the recommender system from jumping out of historical behavior to explore potential user interests, which limits the accuracy of the recalled results, and in practice the recommendation novelty is also criticized. Another way to reduce calculation is coarse-grained recommendation. For example, the system recommends a small number of item categories for a user, picks out all corresponding items, and ranks them in a following stage. However, for a large corpus the calculation problem is still not solved: if the number of categories is large, category recommendation itself meets the calculation barrier; if it is small, some categories will inevitably include too many items, making the subsequent ranking impracticable. Besides, the categories used are usually not designed for the recommendation problem, which can seriously harm recommendation accuracy.
In the recommender-system literature, model-based methods are an active topic. Models such as matrix factorization (MF) [19, 30] decompose pairwise user-item preferences (e.g., ratings) into user and item factors, and then recommend to each user its most preferred items. Factorization machines (FM) [28] further propose a unified model that can mimic different factorization models with any kind of input data. In real-world scenarios with no explicit preference but only implicit user feedback (e.g., user behaviors like clicks or purchases), Bayesian personalized ranking [29] formulates the preference as triplets with partial order and applies it to MF models. In industry, YouTube uses a deep neural network [7] to learn both user and item embeddings, where the two kinds of embeddings are generated from their corresponding features separately. In all the above methods, the preference of a user-item pair can be formulated as the inner product of the user's and item's vector representations. The prediction stage is thus equivalent to retrieving the user vector's nearest neighbors in inner-product space. For this vector search problem, indices such as hashing or quantization [18] for approximate k-nearest-neighbor (kNN) search can ensure efficient retrieval.
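For concreteness, the inner-product paradigm described above can be written as follows (the notation here is ours, for illustration only, and is not the formulation introduced later in this paper):
\[
\hat{y}(u, i) = \mathbf{p}_u^{\top} \mathbf{q}_i, \qquad
\mathcal{B}_u = \operatorname*{arg\,top\text{-}k}_{i \in \mathcal{C}} \; \mathbf{p}_u^{\top} \mathbf{q}_i,
\]
where \mathbf{p}_u and \mathbf{q}_i are the learnt user and item latent factors, \mathcal{C} is the corpus and \mathcal{B}_u is the recalled top-k candidate set. Because the score is a pure inner product, \mathcal{B}_u can be found with an approximate kNN index instead of scanning all of \mathcal{C}; once the interaction is replaced by a general network f(u, i), this shortcut no longer applies.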
However, the inner-product interaction between the user's and item's vector representations severely limits the model's capability. There exist many other, more expressive interaction forms; for example, cross-product features between the user's historical behaviors and candidate items are widely used in click-through rate prediction [5]. Recent work [13] proposes a neural collaborative filtering method, where a neural network instead of an inner product models the interaction between the user's and item's vector representations, and its experimental results show that a multi-layer feed-forward neural network performs better than the fixed inner-product form. Deep interest network [34] points out that user interests are diverse, and an attention-like network structure can generate different user vectors for different candidate items. Beyond these works, other methods such as product neural networks [27] have also proven the effectiveness of advanced neural architectures. However, since these models cannot be reduced to an inner product between user and item vectors, efficient approximate kNN search cannot be applied, so they cannot be used to recall candidates in large-scale recommender systems. How to overcome the calculation barrier and make arbitrary advanced neural networks feasible in large-scale recommendation is an open problem.
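To illustrate why such models break the kNN-retrieval paradigm, the following is a minimal sketch (with made-up dimensions and random weights; it is not the architecture of any cited model) of a network that scores a user-item pair jointly:

    import numpy as np

    def mlp_pair_score(user_vec, item_vec, weights):
        # Concatenate user and item representations and pass them through
        # a small feed-forward network. Because the hidden layers mix the
        # two inputs, the score cannot be rewritten as <p_u, q_i>, so
        # approximate kNN indexes cannot be used for retrieval.
        x = np.concatenate([user_vec, item_vec])
        for w, b in weights[:-1]:
            x = np.maximum(w @ x + b, 0.0)        # ReLU hidden layers
        w_out, b_out = weights[-1]
        return (w_out @ x + b_out).item()         # scalar preference score

    # Hypothetical usage: without an index, recalling candidates means one
    # forward pass per item, i.e. cost linear in the corpus size.
    rng = np.random.default_rng(0)
    d, h = 8, 16
    weights = [(rng.normal(size=(h, 2 * d)), np.zeros(h)),
               (rng.normal(size=(1, h)), np.zeros(1))]
    user = rng.normal(size=d)
    scores = [mlp_pair_score(user, rng.normal(size=d), weights) for _ in range(5)]

Since the score depends on the user and item jointly, it cannot be pre-computed into a single item vector to be indexed, which is exactly the calculation barrier discussed above.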
To address the above challenges, we propose a novel tree-based deep recommendation model (TDM) in this paper. Trees and tree-based methods have been studied in multiclass classification [1–3, 6, 15, 26, 32], where a tree is usually used to partition the sample or label space to reduce calculation cost. However, researchers have seldom used a tree structure as a retrieval index in the context of recommender systems. In fact, hierarchical structure of information exists ubiquitously in many domains. In the e-commerce scenario, for example, iPhone is a fine-grained item while smartphone is the coarse-grained concept to which iPhone belongs. The proposed TDM method leverages this information hierarchy and turns the recommendation problem into a series of hierarchical classification problems. By solving the problem from easy to difficult, TDM can improve both accuracy and efficiency. The main contributions of our paper are summarized as follows:
• To the best of our knowledge, TDM is the first method that makes arbitrary advanced models feasible for generating recommendations from a large corpus. Benefiting from hierarchical tree search, TDM achieves a logarithmic amount of calculation w.r.t. corpus size when making predictions.
• TDM can find novel yet effective recommendations more precisely, because the entire corpus is explored and more expressive deep models can help discover potential interests.
• Besides enabling more advanced models, TDM also promotes recommendation accuracy through hierarchical search, which divides a large problem into smaller ones and solves them successively from easy to difficult.
• As a kind of index, the tree structure can also be learnt towards an optimal hierarchy of items and concepts for more effective retrieval, which in turn facilitates model training. We employ a tree learning method that allows joint training of the neural network and the tree structure.
• We conduct extensive experiments on two large-scale real-world datasets, which show that TDM significantly outperforms existing methods.
It is worth mentioning that a tree-based approach has also been used in language modeling, namely hierarchical softmax [24], but it differs from the proposed TDM in both motivation and formulation. In the next-word prediction problem, conventional softmax has to calculate the normalization term to obtain any single word's probability, which is very time-consuming. Hierarchical softmax uses a tree structure, and the next word's probability is converted into the product of node probabilities along the tree path. This formulation reduces the computational complexity of a single word's probability to logarithmic magnitude w.r.t. the corpus size. In the recommendation problem, however, the goal is to search the entire corpus for the most preferred items, which is a retrieval problem. In a hierarchical softmax tree, the optimality of parent nodes does not guarantee that the optimal low-level nodes lie among their descendants, so all items still have to be traversed to find the optimal one; it is therefore unsuitable for such a retrieval problem. To address the retrieval problem, we propose a max-heap-like tree formulation and introduce deep neural networks to model the tree, which forms an efficient method for large-scale recommendation. The following sections show its difference in formulation and its superiority in performance. In addition, hierarchical softmax adopts a single-hidden-layer network for a specific natural language processing problem, while the proposed TDM method can engage any neural network structure.
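To make the retrieval side concrete before the formal treatment, the following is a minimal sketch of the top-down, coarse-to-fine tree retrieval idea, where score(user, node) stands in for the learnt user-node preference model; the actual TDM formulation, training objective and beam size are given in the later sections, so this is an illustration only:

    class Node:
        # A tree node: internal nodes represent coarse-grained concepts,
        # leaf nodes correspond to items in the corpus.
        def __init__(self, children=None, item_id=None):
            self.children = children or []
            self.item_id = item_id

    def retrieve_candidates(user, root, beam_size, score):
        # Layer-wise beam search: at each level, score the expanded children
        # with the user-node model and keep only the beam_size best, so the
        # number of scored nodes grows with the tree depth rather than with
        # the corpus size.
        frontier, leaves = [root], []
        while frontier:
            children = [c for node in frontier for c in node.children]
            if not children:
                break
            best = sorted(children, key=lambda n: score(user, n),
                          reverse=True)[:beam_size]
            frontier = [n for n in best if n.children]
            leaves.extend(n for n in best if not n.children)
        return leaves

With a roughly balanced tree, each level expands at most beam_size nodes, so the number of model evaluations is proportional to the tree depth, i.e. logarithmic in the corpus size.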
The proposed tree-based model is a universal solution for all kinds of online content providers. The remainder of this paper is organized as follows. Section 2 introduces the system architecture of Taobao display advertising to show where the proposed method fits. Section 3 gives a detailed introduction and formalization of the proposed tree-based deep model, and Section 4 describes how the tree-based model serves online. Experimental results on a large-scale benchmark dataset and the Taobao advertising dataset are reported in Section 5. Finally, Section 6 concludes our work.
2 SYSTEM ARCHITECTURE
In this section, we introduce the architecture of the Taobao display advertising recommender system, as shown in Figure 1. After receiving a page view request from a user, the system uses user features, context features and item features as input to generate a relatively much smaller set (usually hundreds) of candidate items from the entire corpus (hundreds of millions) in the matching server. The tree-based recommendation model works in this stage and shrinks the size of the candidate set by several orders of magnitude.
With hundreds of candidate items, the real-time prediction server uses more expressive but also more time-consuming models [11, 34] to predict indicators such as click-through rate or conversion rate. After ranking by strategy [17, 35], several items are finally impressed to the user.
As mentioned above, the proposed recommendation model aims to construct a candidate set with hundreds of items. This stage is essential and also difficult: whether the user is interested in the generated candidates gives an upper bound on the impression quality, and how to draw candidates from the entire corpus while balancing efficiency and effectiveness is a hard problem.
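As a rough picture of this two-stage flow, here is a hypothetical request handler that reuses the retrieve_candidates sketch from Section 1; the function names, beam size and result count are illustrative and not the actual Taobao configuration:

    def serve_page_view(user, context, corpus_tree, match_score, rank_score,
                        beam_size=400, top_n=10):
        # Matching server: tree-based retrieval shrinks the corpus (hundreds
        # of millions of items) down to a few hundred candidates.
        candidates = retrieve_candidates(user, corpus_tree, beam_size, match_score)
        # Real-time prediction server: a heavier model (e.g. CTR prediction)
        # scores only the recalled candidates, followed by the ranking strategy.
        ranked = sorted(candidates,
                        key=lambda item: rank_score(user, context, item),
                        reverse=True)
        return ranked[:top_n]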