Periodic Pattern Mining: CSDN Library Resource

Uploaded 2013-01-15 18:22:09 · 679 KB · PDF

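### Periodic Pattern Mining Explained

#### I. Introduction

In data mining, periodic pattern mining is an extension of frequent pattern mining that has attracted wide attention in recent years. It looks not only at which patterns occur frequently in the data, but also at whether those patterns recur with a regular period. This helps uncover temporal structure hidden in large volumes of data, which matters for applications such as predictive analytics and market basket analysis.

#### II. Basic Concepts of Periodic Pattern Mining

##### 1. Definition and Background

The goal of periodic pattern mining is to find, in a transactional database, the patterns that occur frequently and at user-specified periodic intervals. "Frequent" means that the support of the pattern (the number, or equivalently the proportion, of transactions that contain it) reaches a user-specified minimum support threshold. "Periodic" means that the interval between any two consecutive occurrences of the pattern does not exceed a user-specified maximum periodicity threshold.

##### 2. Single-Constraint and Multiple-Constraint Models

Periodic pattern mining was first formulated with a single-constraint model: one minimum support and one maximum periodicity apply to every pattern. In practice this runs into the "rare item problem." Thresholds strict enough for frequent items miss patterns that contain rare items, while thresholds loose enough to capture rare items admit a large number of uninteresting patterns. A multiple-constraint model was proposed to address this by letting the support and periodicity constraints vary across patterns. It improves the quality of the results to some extent, but the patterns it finds do not satisfy the downward closure property, so mining is computationally expensive.

#### III. Efficient Periodic Pattern Mining

##### 1. An Improved Multiple-Constraint Model

To address these shortcomings, the authors propose a new multiple-constraint model aimed at the efficiency of periodic pattern mining. By choosing the constraints appropriately, the new model both avoids the "rare item problem" and guarantees that periodic-frequent patterns satisfy the downward closure property, which allows the search space to be pruned and mining to proceed more efficiently.

##### 2. A Pattern-Growth Algorithm

A dedicated pattern-growth algorithm accompanies the improved model. It stores and processes the data in an FP-tree-like structure, which cuts down unnecessary search paths and speeds up mining. According to the reported experiments, the new model and algorithm together are markedly more efficient than the earlier approach and generate fewer uninteresting patterns.

#### IV. Case Study

The work summarized here is by Akshat Surana, R. Uday Kiran and P. Krishna Reddy. Their paper, "An Efficient Approach to Mine Periodic-Frequent Patterns in Transactional Databases," describes in detail how to mine periodic-frequent patterns efficiently with the improved multiple-constraint model and the pattern-growth algorithm. Their experiments indicate that the new model overcomes limitations of the existing methods and improves computational efficiency while keeping the quality of the results high.

#### V. Conclusions and Outlook

Periodic pattern mining is an important data mining technique with broad application prospects. Improvements over the traditional methods make it possible to find periodic-frequent patterns more accurately and efficiently. Future work could explore combining periodic pattern mining with other data mining techniques to handle more complex real-world problems, and could consider using machine learning to tune the constraints automatically for different application scenarios.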


An Efficient Approach to Mine Periodic-Frequent Patterns in Transactional Databases

Akshat Surana, R. Uday Kiran and P. Krishna Reddy

Center for Data Engineering,

International Institute of Information Technology-Hyderabad,

Hyderabad, Andhra Pradesh, India - 500032.

{akshat.surana,uday_rage}@research.iiit.ac.in,pkreddy@iiit.ac.in

Abstract. Recently, temporal occurrences of the frequent patterns in a transactional database have been exploited as an interestingness criterion to discover a class of user-interest-based frequent patterns, called periodic-frequent patterns. Informally, a frequent pattern is said to be periodic-frequent if it occurs at regular intervals specified by the user throughout the database. The basic model of periodic-frequent patterns is based on the notion of “single constraints.” Using this model to mine periodic-frequent patterns containing both frequent and rare items leads to a dilemma called the “rare item problem.” To confront the problem, an alternative model based on the notion of “multiple constraints” has been proposed in the literature. The periodic-frequent patterns discovered with this model do not satisfy the downward closure property. As a result, it is computationally expensive to mine periodic-frequent patterns with the model. Furthermore, it has been observed that this model still generates some uninteresting patterns as periodic-frequent patterns. With this motivation, we propose an efficient model based on the notion of “multiple constraints.” The periodic-frequent patterns discovered with this model satisfy the downward closure property. Hence, periodic-frequent patterns can be efficiently discovered. A pattern-growth algorithm is also discussed for the proposed model. Experimental results show that the proposed model is effective.

Keywords: Data mining, frequent pattern, rare item problem, multiple constraints.

1 Introduction

Periodic-frequent patterns were introduced in [7]. A frequent pattern is said to be periodic-frequent if it occurs at regular intervals specified by the user in the database. Technically, a pattern (i.e., a set of items or itemset) is considered periodic-frequent if it satisfies the user-specified minimum support (minsup) and maximum periodicity (maxprd) constraints. Minsup controls the minimum number of transactions in which a pattern must appear in the database. Maxprd controls the maximum time difference between two consecutive appearances of a pattern in the database.
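The two constraints can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' code: the function names, the toy database, the use of transaction-ids as timestamps, and in particular the convention that the gap before the first appearance (measured from 0) and the gap after the last appearance (measured to the largest transaction-id) also count toward periodicity.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: (tid used as a timestamp, set of items).
Transaction = Tuple[int, FrozenSet[str]]


def occurrences(db: List[Transaction], pattern: FrozenSet[str]) -> List[int]:
    """Return the sorted tids of the transactions that contain `pattern`."""
    return sorted(tid for tid, items in db if pattern <= items)


def support(db: List[Transaction], pattern: FrozenSet[str]) -> int:
    """Number of transactions in which `pattern` appears."""
    return len(occurrences(db, pattern))


def periodicity(db: List[Transaction], pattern: FrozenSet[str]) -> int:
    """Largest time difference between consecutive appearances.

    Assumption: the gaps from 0 to the first appearance and from the
    last appearance to the largest tid are included as well.
    """
    last_tid = max(tid for tid, _ in db)
    points = [0] + occurrences(db, pattern) + [last_tid]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def is_periodic_frequent(db, pattern, minsup: int, maxprd: int) -> bool:
    """Check the minsup and maxprd constraints for one pattern."""
    return support(db, pattern) >= minsup and periodicity(db, pattern) <= maxprd


# Toy database (made up for this sketch).
db = [
    (1, frozenset({"bread", "jam"})),
    (2, frozenset({"bread", "soap"})),
    (3, frozenset({"bread", "jam"})),
    (4, frozenset({"jam"})),
    (5, frozenset({"bread", "jam", "soap"})),
    (6, frozenset({"bread", "jam"})),
]

bread_jam = frozenset({"bread", "jam"})
soap = frozenset({"soap"})
print(support(db, bread_jam), periodicity(db, bread_jam))
print(is_periodic_frequent(db, bread_jam, minsup=3, maxprd=2))
print(is_periodic_frequent(db, soap, minsup=3, maxprd=2))
```

With this toy data, {bread, jam} appears in 4 transactions with a largest gap of 2, so it passes minsup = 3 and maxprd = 2, while {soap} appears only twice and fails the same thresholds.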

Since only a single minsup and a single maxprd value are used for the entire database, the basic model implicitly assumes that all items in a database have uniform frequencies and similar occurrence behavior. However, this is often not the case in real-world databases. In a real-world database, some items reoccur frequently while others reoccur relatively infrequently (or rarely). Furthermore, the rare items may have a larger reoccurrence interval than the frequent items. If the items’ frequencies in a database vary widely, using the single minsup and single maxprd framework to discover periodic-frequent patterns containing both frequent and rare items leads to the “rare item problem” [8] (discussed in Section 2). It has been shown in the literature that periodic-frequent patterns consisting of rare items can provide useful information.
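The effect of a single threshold pair on data with widely varying item frequencies can be seen in a small brute-force experiment. The sketch below is illustrative only: the toy database, the helper names `stats` and `mine`, and the periodicity convention (gaps are measured on transaction-ids, including the gap from 0 to the first appearance and from the last appearance to the final transaction) are assumptions, and exhaustive enumeration is used purely for clarity, not as a practical mining method.

```python
from itertools import combinations


def stats(db, pattern):
    """Return (support, largest gap) of `pattern` in a tid-ordered database."""
    tids = [tid for tid, items in db if pattern <= items]
    points = [0] + tids + [db[-1][0]]
    return len(tids), max(b - a for a, b in zip(points, points[1:]))


def mine(db, minsup, maxprd):
    """Enumerate every itemset and keep those meeting both thresholds."""
    items = sorted(set().union(*(t for _, t in db)))
    found = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            sup, prd = stats(db, frozenset(combo))
            if sup >= minsup and prd <= maxprd:
                found.append(set(combo))
    return found


# Toy data: bread and jam are frequent, soap and shampoo are rare.
db = [
    (1, {"bread", "jam"}),
    (2, {"bread", "jam"}),
    (3, {"bread"}),
    (4, {"bread", "jam", "soap", "shampoo"}),
    (5, {"jam"}),
    (6, {"bread", "jam"}),
    (7, {"bread", "jam"}),
    (8, {"bread"}),
    (9, {"jam", "soap", "shampoo"}),
    (10, {"bread", "jam"}),
]

strict = mine(db, minsup=6, maxprd=3)  # tuned for the frequent items
loose = mine(db, minsup=2, maxprd=5)   # relaxed to reach the rare items
print(len(strict), {"soap", "shampoo"} in strict)
print(len(loose), {"soap", "shampoo"} in loose)
```

On this data the strict setting keeps 3 patterns and misses {soap, shampoo}. The loose setting recovers {soap, shampoo} but returns 9 patterns, among them combinations such as {jam, soap} that qualify mainly because jam appears almost everywhere.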

Example 1. In a supermarket, costly and/or durable goods such as soap and shampoo are purchased less frequently than cheaper and/or perishable goods such as bread and jam. However, soap and shampoo together can generate more revenue per unit than bread and jam. Furthermore, the duration between two consecutive purchases of soap and shampoo is generally longer than that between two consecutive purchases of bread and jam.

A model based on the multiple minsups and multiple maxprds framework has been proposed in [5] to confront the “rare item problem.” However, this model is computationally expensive to implement because the periodic-frequent patterns it mines do not satisfy the downward closure property, i.e., not all non-empty subsets of a periodic-frequent pattern are periodic-frequent.
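To make the downward closure property concrete, the following Python sketch checks whether a given collection of patterns contains every non-empty proper subset of each of its members. The function name and the example collections are hypothetical and serve only to illustrate the definition.

```python
from itertools import combinations


def closure_violations(patterns):
    """Return (pattern, missing_subset) pairs that break downward closure.

    A collection is downward closed when every non-empty proper subset
    of each pattern in it is itself in the collection.
    """
    known = {frozenset(p) for p in patterns}
    violations = []
    for pattern in known:
        for size in range(1, len(pattern)):
            for subset in combinations(sorted(pattern), size):
                if frozenset(subset) not in known:
                    violations.append((pattern, frozenset(subset)))
    return violations


closed = [{"a"}, {"b"}, {"a", "b"}]
not_closed = [{"a"}, {"a", "b"}]  # {"b"} is missing

print(closure_violations(closed))
print(closure_violations(not_closed))
```

When the property holds, a search can discard every superset of a pattern as soon as that pattern fails the constraints. Without it, a failing pattern may still have qualifying supersets, so far more candidates have to be examined.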

In this paper, we propose an improved model to mine periodic-frequent patterns with the multiple minsups and multiple maxprds framework. A pattern-growth approach is also proposed for efficient mining of periodic-frequent patterns. The periodic-frequent patterns mined with the proposed model satisfy the downward closure property. As a result, the proposed model is computationally more efficient than the model discussed in [5]. Experimental results show that the proposed approach is efficient in mining periodic-frequent patterns containing both frequent and rare items.

The rest of the paper is organized as follows. In Section 2, we discuss the background on mining periodic-frequent patterns in transactional databases. In Section 3, we discuss the motivation and introduce the proposed model. A pattern-growth approach to mine periodic-frequent patterns is discussed in Section 4. Experimental results are reported in Section 5. The last section contains conclusions.

2 Background

2.1 Periodic-Frequent Pattern Model

Periodic-frequent patterns [7] are a class of user-interest-based frequent patterns that exist in a database. The basic model of periodic-frequent pattern mining is as follows.

Let $I = \{i_1, i_2, \cdots, i_n\}$ be a set of items. A set of items $X$, where $X \subseteq I$, is called a pattern (or an itemset). A transaction $t = (tid, Y)$ is a tuple, where $tid$ represents a transaction-id (or a timestamp) and $Y$ is a pattern. A transactional database $T$ over $I$ is a set of transactions, $T = \{t_1, \cdots, t_m\}$, $m = |T|$, where $|T|$ is the size of $T$ in total number of transactions. If $X \subseteq Y$, it is said that $t$ contains $X$ or $X$ occurs in $t$, and such a transaction-id is denoted as $t_j^X$, $j \in [1, m]$. Let $T^X = \{t_k^X, \cdots, t_l^X\} \subseteq T$, where $k \le l$ and $k, l \in [1, m]$, be the ordered set of transactions in which pattern $X$ has occurred. Let t

(The remaining 11 pages are not shown in this preview.)


Uploader: xushouquan
