
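### Personalized Maps: Mining User Preferences from Geo-textual Data

#### Abstract

This paper proposes a new approach to building personalized maps by analyzing rich online geo-textual data. The approach learns several types of user preferences, and a prototype system (called PreMiner) is developed on top of it to support personalized map services. Unlike existing recommender systems and data analysis systems, PreMiner highly customizes the user's map experience and supports several applications, such as user mobility and interest mining, opinion mining in regions, user recommendation, point-of-interest recommendation, and querying and subscribing on geo-textual data.

#### Introduction

With the spread of mobile devices, especially GPS-equipped ones, people post large amounts of content online every day. This content usually carries geographical coordinates (longitude and latitude), and some of it is also associated with semantically meaningful places, i.e., points-of-interest (POIs). The content also contains words that reflect particular topics (see Figure 1(a)), or words that reflect users' opinions on different aspects of a POI (see Figure 1(b)). Because geo-textual posts contain several types of information, they offer an opportunity to mine different kinds of user preferences, including preferences on topic, region, POI aspect, and category. For example, a user who leans towards sports topics may often mention words such as "shoot" and "goal" in their posts; another user may frequently visit shops in a shopping area they like (i.e., a preference on region). However, building a unified model that captures different types of user preferences is challenging in three main respects:

1. **Different types of latent variables** (e.g., aspect, sentiment, region, topic): their interactions with observable variables (e.g., text, time, category, POI) are unclear.
2. **Diverse data formats**: the data may come from different sources, such as Yelp and Foursquare, and may be either continuous or discrete.
3. **Model complexity**: several types of data and preferences have to be handled at the same time, which places higher demands on algorithm design.

#### Methodology

To address these challenges, the study proposes a two-stage approach to modeling and mining user preferences:

1. **User behavior models**: two user behavior models are designed to learn several types of user preferences from geo-textual data. This includes using statistical methods and machine learning techniques to identify users' points-of-interest, topic preferences, and region preferences.
2. **The PreMiner prototype system**: built on top of the user preference models, PreMiner supports several applications and services, such as:
   - **User mobility and interest mining**: by analyzing users' historical location records and posted text, the system can identify their travel habits and interests.
   - **Opinion mining in regions**: text analysis techniques are used to extract, from users' posts in a given region, their attitudes and opinions on a topic.
   - **User recommendation**: other content or users that may be of interest are recommended according to the user's interests and behavior patterns.
   - **Point-of-interest recommendation**: nearby POIs are recommended based on the user's location and preferences.
   - **Querying and subscribing on geo-textual data**: users can query relevant data by location, topic, and other conditions, and can subscribe to updates of interest.

#### Conclusion

The proposed approach offers a new perspective for understanding and mining user preferences, and the PreMiner system puts personalized maps into practice. Future work could further extend the dimensions of the user preference models, explore a wider range of data sources, and improve the scalability and practicality of the system.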


Towards Personalized Maps: Mining User Preferences from Geo-textual Data

Kaiqi Zhao, Yiding Liu, Quan Yuan, Lisi Chen, Zhida Chen, Gao Cong

Nanyang Technological University
{kzhao002@e., ydliu@, qyuan1@e., chen0936@e., gaocong@}ntu.edu.sg

Hong Kong Baptist University
chenlisi@comp.hkbu.edu.hk

ABSTRACT

Rich geo-textual data is available online, and the volume of data keeps growing rapidly. We propose two user behavior models to learn several types of user preferences from geo-textual data, and a prototype system built on top of the user preference models for mining and searching geo-textual data (called PreMiner) to support personalized maps. Different from existing recommender systems and data analysis systems, PreMiner highly personalizes the user experience on maps and supports several applications, including user mobility & interests mining, opinion mining in regions, user recommendation, point-of-interest recommendation, and querying and subscribing on geo-textual data.

1. INTRODUCTION

People post a variety of content to the internet every day through GPS-equipped mobile devices. Such posts are associated with geographical coordinates (latitude and longitude), and some of them are associated with semantic places, i.e., points-of-interest (POIs). They also contain words that imply semantic topics (see Figure 1(a)), or words that imply users' opinions on different aspects of a POI (see Figure 1(b)). With multiple types of information available from geo-textual posts, we face a great opportunity to mine different kinds of user preferences, including preferences on topic, region, POI aspect, and category. For example, a user who prefers the topic “sports” may often mention words like “shoot” and “goal” in their posts. As another example, a user may frequently visit shops in a shopping area she likes (i.e., preferences on region).
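As a rough illustration of the kind of record described above, the following sketch represents a geo-textual post and estimates a crude topic preference by keyword matching. The class name `GeoTextualPost`, the field names, the seed vocabulary in `TOPIC_KEYWORDS`, and the sample posts are all assumptions made for this example; the paper's models learn topics probabilistically rather than from a fixed word list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GeoTextualPost:
    """One geo-tagged post: who posted it, where, and what it says."""
    user_id: str
    latitude: float
    longitude: float
    text: str
    poi_id: Optional[str] = None  # set only when the post is tied to a POI


# Hypothetical seed words per topic; a real model would learn these from data.
TOPIC_KEYWORDS = {
    "sports": {"shoot", "goal"},
    "eat & drink": {"coffee", "dinner"},
}


def topic_preference(posts: List[GeoTextualPost]) -> Dict[str, float]:
    """Share of keyword-matching posts per topic (a crude preference proxy)."""
    counts = Counter()
    for post in posts:
        words = set(post.text.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                counts[topic] += 1
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}


# Made-up posts by a single user, for illustration only.
posts = [
    GeoTextualPost("u1", 1.3483, 103.6831, "what a goal tonight"),
    GeoTextualPost("u1", 1.3000, 103.8000, "coffee after the match", poi_id="cafe-42"),
    GeoTextualPost("u1", 1.3500, 103.6800, "he should shoot more often"),
]
prefs = topic_preference(posts)
```

Here two of the three sample posts mention a sports keyword, so the estimated preference leans towards “sports”. A region preference could be approximated in the same spirit by grouping posts by coordinates instead of by words.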

Figure 1: Screenshot of short text (e.g., Foursquare check-ins) and long text (e.g., Yelp reviews) data. (a) Words about topic “eat & drink” in a check-in post. (b) Words about different aspects (environment, food and service) and their corresponding sentiments in a review.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Proceedings of the VLDB Endowment, Vol. 9, No. 13

However, building a unified model that captures different types of user preferences poses three main challenges. First, the interactions among different types of latent variables (e.g., aspect, sentiment, region, topic) and observable variables (e.g., text, time, category, POI) are unclear. Second, the data could be in different formats (continuous and discrete) from different data sources (e.g., Yelp and Foursquare). The variety of data makes the modeling and parameter learning complicated. Third, the latent variables in different scopes further complicate the model learning. For example, each sentence in a review is often related to one aspect, and the whole review should be posted in some latent region. This implies that aspect and sentiment are often modeled in the scope of a sentence, while region is modeled in the scope of a document.
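The difference in scope can be made concrete with a small data-structure sketch: aspect and sentiment are attached to each sentence, while region is attached to the review as a whole. The names `Sentence`, `Review`, `make_review`, and `latent_slots`, the punctuation-based sentence splitter, and the sample review are assumptions for illustration only and are not taken from the paper.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sentence:
    text: str
    aspect: Optional[str] = None     # latent, sentence scope
    sentiment: Optional[str] = None  # latent, sentence scope


@dataclass
class Review:
    poi_id: str
    sentences: List[Sentence] = field(default_factory=list)
    region: Optional[int] = None     # latent, document scope


def make_review(poi_id: str, text: str) -> Review:
    """Split raw text into sentences; all latent fields start unassigned."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return Review(poi_id, [Sentence(p) for p in parts if p])


def latent_slots(review: Review) -> int:
    """Two latent variables per sentence plus one for the whole review."""
    return 2 * len(review.sentences) + 1


# Made-up review text, for illustration only.
review = make_review(
    "poi-1", "The room is cozy. The noodles were bland! Staff were friendly."
)
```

With three sentences, this review has six sentence-level latent slots and one document-level slot, which is the mixed scoping an inference procedure would have to handle.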

To tackle these challenges, we design two probabilistic models, namely the Who, Where, When, What (W4) model [7, 8] and the Sentiment, Aspect, Region (SAR) model [9], for short text and long text data, respectively. W4 mines user preferences on topics and regions from short geo-textual documents with temporal information (e.g., check-ins), while SAR mines user preferences on aspects, categories and regions from geo-textual documents in which temporal information is not available but the text is long enough for sentiment analysis (e.g., geo-tagged reviews). Both user behavior models support mining several types of user preferences and hence cater to the needs of various applications. The proposed models achieve better performance than other models in many applications. For example, SAR achieves at least 60% higher accuracy than other models in POI recommendation, and W4 is at least 80% more accurate than other models in location prediction.

In this demonstration, we propose a prototype system, namely PreMiner (available at http://spatialkeyword.sce.ntu.edu.sg/PreMiner), which is built on top of the two user behavior models. Our system supports querying and mining geo-textual data for personalized map services, based on the two models and techniques proposed in our previous work [1, 3]. It supports, but is not limited to, the following applications:
