
Improving Context and Category Matching for Entity Search

Yueguo Chen†, Lexi Gao‡, Shuming Shi, Xiaoyong Du‡, Ji-Rong Wen‡

†Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), MOE, China

‡School of Information, Renmin University of China

Microsoft Research Asia, China

†‡{chenyueguo, gaolexi, duyong, jrwen}@ruc.edu.cn, shumings@microsoft.com

Abstract

Entity search retrieves a ranked list of named entities of the target types for a given query. In this paper, we propose an approach to entity search that formalizes both context matching and category matching. In addition, we propose a result re-ranking strategy that can be easily adapted to achieve a hybrid of two context matching strategies. Experiments on the INEX 2009 entity ranking task show that the proposed approach achieves a significant improvement in entity search performance (xinfAP from 0.27 to 0.39) over existing solutions.

Introduction

Entity search has recently attracted much attention (Balog, Serdyukov, and de Vries 2011; Demartini, Iofciu, and de Vries 2009). In contrast to general web search, whose goal is to retrieve a list of relevant documents, the goal of entity search is to generate a short list of relevant entity names. Compared to general web search, entity search provides users with more succinct answers. It has a wide range of applications such as question answering (Raghavan, Allan, and McCallum 2004), knowledge services (Weikum 2009), and web content analysis (Demartini et al. 2010).

There are several variant definitions (Cheng, Yan, and Chang 2007; Demartini, Iofciu, and de Vries 2009; Balog, Serdyukov, and de Vries 2011) of the entity search problem in terms of both inputs and outputs. The widely accepted input is a list of query words plus one or more desired entity types. In this paper, the problem is defined as follows: given a list of keywords or a natural language question, where the types of the target entities are explicitly specified, return a ranked list of relevant entity names of the target types. We consider this problem for a web-scale entity search application that has a large number of entities (> 10 ) and their types (> 10 ) as the domain of entity search results.

Early solutions to entity search mainly adopt a voting strategy (Demartini, Iofciu, and de Vries 2009; Balog et al. 2009b; Santos, Macdonald, and Ounis 2010). Given a query, they first retrieve the top relevant documents from a corpus. Then, entities embedded in the top relevant documents are extracted and ranked mainly by their frequency of occurrence within the retrieved documents. The voting approach (Macdonald and Ounis 2006) is applied to entity search in (Santos, Macdonald, and Ounis 2010). In the expert search domain, a number of generative language models have been proposed, such as the candidate model and the document model (Balog, Azzopardi, and de Rijke 2006), as well as their hybrids (Balog and de Rijke 2008; Serdyukov and Hiemstra 2008). However, these models rank entities based solely on their contexts. They are therefore context matching solutions, which are inadequate when applied to entity search because they ignore entity types.

© 2014, Association for the Advancement of Artificial Intelligence
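The voting strategy of the early solutions can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the cited systems: the function name `rank_entities_by_votes`, the naive term-overlap document scorer standing in for a real retrieval model, the toy gazetteer used for entity extraction, and the choice to give each top document one vote per distinct entity.

```python
from collections import Counter


def rank_entities_by_votes(query, documents, extract_entities, top_k_docs=10):
    """Rank entities by how many of the top retrieved documents mention them."""
    query_terms = set(query.lower().split())

    # Step 1: score documents by naive term overlap (a stand-in for a real
    # retrieval model) and keep the top_k_docs documents with a non-zero score.
    def doc_score(text):
        return sum(1 for tok in text.lower().split() if tok in query_terms)

    scored = [(doc_score(d), i) for i, d in enumerate(documents)]
    ordered = sorted(scored, key=lambda p: (-p[0], p[1]))
    top_docs = [documents[i] for s, i in ordered if s > 0][:top_k_docs]

    # Step 2: each top document casts one vote per distinct entity it mentions.
    votes = Counter()
    for doc in top_docs:
        votes.update(set(extract_entities(doc)))

    # Step 3: rank by vote count; ties are broken alphabetically.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy gazetteer-based extractor (hypothetical; a real system would use NER
# or entity-linked pages).
KNOWN_ENTITIES = ["Airbus", "Boeing", "Embraer"]


def extract_entities(text):
    return [e for e in KNOWN_ENTITIES if e in text]


docs = [
    "Boeing and Airbus build passenger aircraft",
    "Airbus delivered a new passenger aircraft",
    "Embraer builds regional jets",
    "A recipe for apple pie",
]
ranking = rank_entities_by_votes("passenger aircraft makers", docs, extract_entities)
print(ranking)  # [('Airbus', 2), ('Boeing', 1)]
```

Note that nothing in this sketch checks whether a returned entity has the requested type, which is the limitation of pure context matching pointed out above.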

The importance of category matching has been verified by many entity search solutions (Balog, Bron, and de Rijke 2010; Kaptein and Kamps 2013), where language models are typically applied to evaluate the category match between entities and queries. Some recent studies apply a linear combination of a term-based (context matching) model and a category-based (category matching) model (Balog, Bron, and de Rijke 2011; Raviv, Carmel, and Kurland 2012). However, such a hybrid may not be effective enough because of the intrinsic differences between the two models. Their reported results show that the achieved precision is still not good enough (Balog et al. 2009a; Raviv, Carmel, and Kurland 2012) when fairly compared with an alternative (Ramanathan et al. 2009) from the INEX 2009 entity ranking task.
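A linear combination of a term-based score and a category-based score can be sketched as follows. This is a generic interpolation under stated assumptions, not the exact formulation of the cited studies: the function name `linear_combination_rank`, the mixing weight `lam`, the dictionary inputs, and the default score of 0.0 for an entity missing from one model are all hypothetical.

```python
def linear_combination_rank(context_scores, category_scores, lam=0.5):
    """Rank entities by lam * context score + (1 - lam) * category score."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")

    # An entity missing from one model is assumed to score 0.0 under that model.
    entities = set(context_scores) | set(category_scores)
    combined = {
        e: lam * context_scores.get(e, 0.0)
        + (1.0 - lam) * category_scores.get(e, 0.0)
        for e in entities
    }
    # Highest combined score first; ties are broken alphabetically.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


context = {"A": 0.8, "B": 0.4}
category = {"A": 0.2, "B": 1.0, "C": 0.6}
order = [e for e, _ in linear_combination_rank(context, category, lam=0.5)]
print(order)  # ['B', 'A', 'C']
```

With an additive mix like this, a high context score can offset a poor category score, so an entity of the wrong type may still rank well depending on `lam`.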

In this paper, based on generative language modeling techniques, we propose a formal model of entity search that formalizes both context matching and category matching and associates them more effectively. We also propose a result re-ranking strategy that can be easily adapted to achieve a hybrid of two context matching strategies. Extensive experimental results on the INEX 2009 entity ranking task demonstrate that the proposed model achieves much better empirical performance than existing solutions, improving xinfAP from 0.27 to 0.39.

Our main contributions in this paper are threefold: 1) we apply and extend the existing context matching models, and effectively combine them using a result re-ranking technique; 2) we propose a novel approach to category matching that differs from existing language-model-based solutions; 3) we propose an entity model that effectively combines the proposed context matching and category matching models in a way other than a linear combination.
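This section does not spell out the re-ranking strategy, so the following is only a generic two-stage re-ranking sketch and not the method proposed in the paper. It assumes one scorer produces an initial ranking and a second scorer re-orders the top `k` results; the name `rerank_top_k`, the parameter `k`, and the toy scores are hypothetical.

```python
def rerank_top_k(first_stage_scores, second_stage_score, k=3):
    """Re-order the top-k results of a first ranking with a second scorer."""
    # Initial ranking from the first scorer; ties are broken alphabetically.
    first_ranking = sorted(
        first_stage_scores, key=lambda e: (-first_stage_scores[e], e)
    )
    head, tail = first_ranking[:k], first_ranking[k:]

    # Only the head is re-scored; the tail keeps its first-stage order.
    head = sorted(head, key=lambda e: (-second_stage_score(e), e))
    return head + tail


first = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}
second = {"A": 0.1, "B": 0.5, "C": 0.9, "D": 1.0}
result = rerank_top_k(first, lambda e: second[e], k=3)
print(result)  # ['C', 'B', 'A', 'D']
```

In this toy run, "D" stays last even though the second scorer prefers it, because it never entered the top `k` of the first stage.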

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence
