Improving Context and Category Matching for Entity Search
Yueguo Chen†, Lexi Gao‡, Shuming Shi§, Xiaoyong Du‡, Ji-Rong Wen‡
†Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), MOE, China
‡School of Information, Renmin University of China
§Microsoft Research Asia, China
†‡{chenyueguo, gaolexi, duyong, jrwen}@ruc.edu.cn, §shumings@microsoft.com
Abstract
Entity search retrieves a ranked list of named entities of
target types for a given query. In this paper, we propose an
approach to entity search that formalizes both context
matching and category matching. In addition, we propose a
result re-ranking strategy that can be easily adapted to
achieve a hybrid of two context matching strategies.
Experiments on the INEX 2009 entity ranking task show that
the proposed approach significantly improves entity search
performance (xinfAP from 0.27 to 0.39) over existing
solutions.
Introduction
Entity search has recently attracted much attention (Balog,
Serdyukov, and de Vries 2011; Demartini, Iofciu, and de
Vries 2009). In contrast to general web search, whose goal
is to retrieve a list of relevant documents, the goal of
entity search is to generate a short list of relevant entity
names. Compared to general web search, entity search
provides users with more succinct answers. It has a wide
range of applications, such as question answering (Raghavan,
Allan, and McCallum 2004), knowledge services (Weikum 2009),
and web content analysis (Demartini et al. 2010).
The entity search problem has several variant definitions
(Cheng, Yan, and Chang 2007; Demartini, Iofciu, and de Vries
2009; Balog, Serdyukov, and de Vries 2011) that differ in
both inputs and outputs. The widely accepted input is a list
of query words plus one or more desired entity types. In
this paper, the problem is defined as follows: given a list
of keywords or a natural language question, where the types
of the target entities are explicitly specified, return a
ranked list of relevant entity names of the target types. We
consider this problem for a web-scale entity search
application in which a large number of entities (> 10^6) and
their types (> 10^5) form the domain of entity search
results.
Copyright © 2014, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
Early solutions to entity search mainly take a voting
strategy (Demartini, Iofciu, and de Vries 2009; Balog et al.
2009b; Santos, Macdonald, and Ounis 2010). Given a query,
they first retrieve the top relevant documents from a
corpus. Then, entities embedded in the top relevant
documents are extracted and ranked mainly by their frequency
of occurrence within the retrieved documents. The voting
approach (Macdonald and Ounis 2006) is applied to entity
search in (Santos, Macdonald, and Ounis 2010). In the expert
search domain, a number of generative language models have
been proposed, such as the candidate model and the document
model (Balog, Azzopardi, and de Rijke 2006), as well as
their hybrid (Balog and de Rijke 2008; Serdyukov and
Hiemstra 2008). However, these models rank entities based
only on their contexts. They are therefore context matching
solutions, which are inadequate when applied to entity
search because they ignore entity types.
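The voting strategy described above can be illustrated with a minimal Python sketch. It is not the implementation of any cited system: the function name `vote_rank`, the term-overlap relevance score, and the substring-based entity matching are all assumptions made for illustration, standing in for a real retrieval model and a real entity extractor.

```python
from collections import Counter


def vote_rank(query_terms, documents, entity_names, top_k=2):
    """Rank entities by how often they occur in the top-k retrieved documents.

    Hypothetical helper: relevance scoring and entity extraction are toy
    stand-ins, not the methods used in the cited work.
    """
    terms = [t.lower() for t in query_terms]

    # Step 1: retrieve the top-k documents (toy relevance = query term count).
    def relevance(doc):
        words = doc.lower().split()
        return sum(words.count(t) for t in terms)

    top_docs = sorted(documents, key=relevance, reverse=True)[:top_k]

    # Step 2: extract entities from the retrieved documents
    # (naive case-insensitive substring counting replaces a real extractor).
    votes = Counter()
    for doc in top_docs:
        text = doc.lower()
        for name in entity_names:
            votes[name] += text.count(name.lower())

    # Step 3: rank entities by their frequency within the retrieved documents.
    return [(name, n) for name, n in votes.most_common() if n > 0]


docs = [
    "Paris is the capital of France . Paris hosts the Louvre",
    "Berlin is the capital of Germany",
    "Bananas are yellow",
]
ranking = vote_rank(["capital"], docs, ["Paris", "Berlin", "Rome"], top_k=2)
```

Note that nothing in this sketch looks at entity types, which is the limitation of pure context matching that the paragraph above points out.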
The importance of category matching has been verified by
many entity search solutions (Balog, Bron, and de Rijke
2010; Kaptein and Kamps 2013), where language models are
typically applied to evaluate the category matching between
entities and queries. Some recent studies apply a linear
combination of a term-based (context matching) model and a
category-based (category matching) model (Balog, Bron, and
de Rijke 2011; Raviv, Carmel, and Kurland 2012). However,
such a hybrid may not be effective enough because of the
intrinsic differences between the two models. Their reported
results show that the achieved precision is still not good
enough (Balog et al. 2009a; Raviv, Carmel, and Kurland
2012) when fairly compared with an alternative approach
(Ramanathan et al. 2009) from the INEX 2009 entity ranking
task.
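A linear combination of a term-based score and a category-based score, as mentioned above, might look like the following Python sketch. The exact formulations in the cited studies are not given here, so the interpolation weight `lam`, the function names, and the example scores are assumptions for illustration only.

```python
def linear_hybrid_score(term_score, category_score, lam=0.5):
    """Interpolate a context (term-based) score and a category-based score.

    `lam` is a hypothetical mixing weight in [0, 1]; lam=1 uses only the
    term-based score, lam=0 uses only the category-based score.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * term_score + (1.0 - lam) * category_score


def rank_entities(scores, lam=0.5):
    """Sort entities by hybrid score, highest first.

    `scores` maps an entity name to a (term_score, category_score) pair.
    """
    return sorted(
        scores,
        key=lambda e: linear_hybrid_score(*scores[e], lam=lam),
        reverse=True,
    )


# Made-up scores: entity A matches the context well, entity B matches the category well.
example_scores = {"A": (0.9, 0.1), "B": (0.4, 0.8)}
balanced = rank_entities(example_scores, lam=0.5)
context_only = rank_entities(example_scores, lam=1.0)
```

In this form the two scores are simply added with a fixed weight, so the final ranking depends heavily on how `lam` is chosen and on whether the two scores are on comparable scales.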
In this paper, based on generative language modeling
techniques, we propose a formal model of entity search that
formalizes both context matching and category matching and
associates them more effectively. We also propose a result
re-ranking strategy that can be easily adapted to achieve a
hybrid of two context matching strategies. Extensive
experimental results on the INEX 2009 entity ranking task
demonstrate that the proposed model achieves much better
empirical performance than existing solutions, improving
xinfAP from 0.27 to 0.39.
Our main contributions in this paper are threefold: 1) we
apply and extend existing context matching models, and
effectively hybridize them using a result re-ranking
technique; 2) we propose a novel approach to category
matching that differs from existing language-model-based
solutions; 3) we propose an entity model that effectively
combines the proposed context matching and category matching
models without relying on a linear combination.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence