Last Updated 4/2/2024 2:52:17 PM Page 1
Data Mining with IBM SPSS Modeler 14.2
University of Arkansas
David Douglas
Association Analysis
Notes on Association Analysis using IBM SPSS Modeler 14.2
Association Rules Using Clementine
IBM SPSS Modeler 14.2 has three different algorithms for generating association rules. Input data format
can be either tabular or transactional. The models are:
• Apriori: all data must be categorical
• Carma: categorical consequents, but can have numeric inputs
• Sequential: sequential association rules
Apriori produces association rules very efficiently. It also offers a choice of criterion measures used to
guide rule detection. Its major disadvantage is that only categorical (symbolic) fields are allowed as
inputs.
Carma, unlike GRI and Apriori, offers rule-detection options based on support for both the antecedent
and the consequent, and it handles data in transactional format. Additionally, it allows rules with multiple
consequents (outcomes) and is not limited to categorical data.
Sequential association analysis takes into account the sequence of events. It works with either transactional
or tabular data.
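The measures that guide rule detection can be illustrated with a small calculation. The following is a minimal Python sketch, not Modeler's implementation: the baskets, the function names support and rule_measures, and the example rules are all assumptions made for illustration. It computes antecedent support, rule support (antecedent and consequent together), and confidence for a single rule.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical baskets for illustration only; not taken from the handout's data files.
baskets: List[FrozenSet[str]] = [
    frozenset({"Broccoli", "Green Peppers", "Corn"}),
    frozenset({"Asparagus", "Squash", "Corn"}),
    frozenset({"Corn", "Tomatoes", "Beans", "Squash"}),
    frozenset({"Green Peppers", "Corn", "Tomatoes", "Beans"}),
]


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Fraction of baskets that contain every item in itemset."""
    wanted = frozenset(itemset)
    return sum(1 for basket in baskets if wanted <= basket) / len(baskets)


def rule_measures(antecedent: Iterable[str], consequent: Iterable[str],
                  baskets: List[FrozenSet[str]]) -> Dict[str, float]:
    """Antecedent support, rule support, and confidence for antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    antecedent_support = support(antecedent, baskets)
    rule_support = support(antecedent | consequent, baskets)
    # Confidence is the share of antecedent baskets that also hold the consequent.
    confidence = rule_support / antecedent_support if antecedent_support > 0 else 0.0
    return {
        "antecedent_support": antecedent_support,
        "rule_support": rule_support,
        "confidence": confidence,
    }


squash_to_corn = rule_measures({"Squash"}, {"Corn"}, baskets)
corn_to_tomatoes = rule_measures({"Corn"}, {"Tomatoes"}, baskets)
print("Squash -> Corn:", squash_to_corn)
print("Corn -> Tomatoes:", corn_to_tomatoes)
```

In this toy data, every basket that contains Squash also contains Corn, so that rule has confidence 1.0, while only half of the Corn baskets contain Tomatoes.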
Notes on Data Formats for Association Analysis
Market basket data is a natural fit for association analysis, and there are two general formats of data
representation for market basket analysis. The first is sometimes referred to as the transactional data format
and the second is the tabular data format. The transactional data format requires only two fields: an id field
and a content field. For example (ignore quantities purchased for now),
Transaction ID   Items
1                Broccoli
1                Green Peppers
1                Corn
2                Asparagus
2                Squash
2                Corn
3                Corn
3                Tomatoes
…                …
Note that in this format a single transaction requires several records. SAS EM 6.1 requires this format unless
you have its data warehousing software, which we do not have.
In the tabular data format, each record is a transaction (again ignoring quantities purchased for now), with
one field per item holding a flag (0/1 or T/F) that indicates whether the item was purchased. For example,
Trans Id   Asparagus   Beans   Broccoli   Corn   Green Peppers   Squash   Tomatoes
1          0           0       1          1      1               0        0
2          1           0       0          1      0               1        0
3          0           1       0          1      0               1        1
…          …           …       …          …      …               …        …
n          1           1       0          0      1               0        1
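The relationship between the two formats can be sketched in Python. This is an illustrative sketch only: the function name to_tabular and the variable names are assumptions, and the records are just the eight rows shown in the transactional example (transaction 3 is truncated there, so its row here is incomplete). It groups records by transaction id, emits one 0/1 flag per item, and reports how sparse the resulting matrix is.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Set, Tuple

# The eight records shown in the transactional example. Transaction 3 is
# truncated in the handout, so only the rows printed there are included.
records: List[Tuple[int, str]] = [
    (1, "Broccoli"), (1, "Green Peppers"), (1, "Corn"),
    (2, "Asparagus"), (2, "Squash"), (2, "Corn"),
    (3, "Corn"), (3, "Tomatoes"),
]

# Column order taken from the tabular example's header row.
ITEMS = ["Asparagus", "Beans", "Broccoli", "Corn",
         "Green Peppers", "Squash", "Tomatoes"]


def to_tabular(records: Sequence[Tuple[int, str]],
               items: Optional[Sequence[str]] = None):
    """Pivot (transaction id, item) records into one 0/1 flag row per transaction."""
    baskets: Dict[int, Set[str]] = defaultdict(set)
    for trans_id, item in records:
        baskets[trans_id].add(item)
    # Use the supplied column order, or fall back to the sorted observed items.
    columns = list(items) if items is not None else sorted({i for _, i in records})
    table = [
        [trans_id] + [1 if col in baskets[trans_id] else 0 for col in columns]
        for trans_id in sorted(baskets)
    ]
    return columns, table


columns, table = to_tabular(records, ITEMS)
flags = [value for row in table for value in row[1:]]
density = sum(flags) / len(flags)  # share of cells that are 1

print(["Trans Id"] + columns)
for row in table:
    print(row)
print(f"density: {density:.2f}")
```

Even in this tiny example most cells are 0, and the share of zeros grows as more products are added.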
Note that the tabular data format can become very cumbersome for a large number of products and a large
number of transactions, and it will typically be a very sparse matrix. Thus, two approaches for mining a large
number of transactions with a large number of products are generally taken.
1. SQL is used to create a transactional data format file, which is then used for the market basket
   association analysis.
2. A software product is used that does in-database data mining.
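The first approach can be sketched with SQLite from Python's standard library. This is an illustration under stated assumptions, not the procedure used in the course: the table name baskets_tabular, its column names, and the in-memory database are hypothetical, and the three rows are those of the tabular example. One SELECT per flag column, stacked with UNION ALL, turns the tabular rows into transactional (id, item) records.

```python
import sqlite3

# Hypothetical in-memory table holding rows 1 to 3 of the tabular example.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE baskets_tabular (
        trans_id INTEGER PRIMARY KEY,
        asparagus INTEGER, beans INTEGER, broccoli INTEGER, corn INTEGER,
        green_peppers INTEGER, squash INTEGER, tomatoes INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO baskets_tabular VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 0, 0, 1, 1, 1, 0, 0),
        (2, 1, 0, 0, 1, 0, 1, 0),
        (3, 0, 1, 0, 1, 0, 1, 1),
    ],
)

# (column name, item label) pairs; both are assumptions for this sketch.
item_columns = [
    ("asparagus", "Asparagus"), ("beans", "Beans"), ("broccoli", "Broccoli"),
    ("corn", "Corn"), ("green_peppers", "Green Peppers"),
    ("squash", "Squash"), ("tomatoes", "Tomatoes"),
]

# One SELECT per flag column keeps only purchased items; UNION ALL stacks them.
selects = [
    f"SELECT trans_id, '{label}' AS item FROM baskets_tabular WHERE {column} = 1"
    for column, label in item_columns
]
query = " UNION ALL ".join(selects) + " ORDER BY trans_id, item"

rows = conn.execute(query).fetchall()
for trans_id, item in rows:
    print(trans_id, item)
conn.close()
```

The column names are interpolated from a fixed list in the script, so no user input reaches the SQL text. Against a production database the same query shape would normally be run in the database itself and its result exported.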
IBM SPSS Modeler 14.2 does in-database mining via ODBC and with database vendors’ products. For
example, IBM SPSS Modeler 14.2 can be used for in-database data mining for DB2 and likewise works with
SQL Server and Oracle.
IBM SPSS Modeler 14.2 for Association Analysis
Apriori & Carma Models in Tabular Format
The Apriori model will be illustrated first. Place an Excel node on the model canvas as shown above, open
the edit window and import the Baskets1n.xls file. Click the Types tab and click the Read Values button.