EMAIL DATA MINING:
An Approach to construct an organization position
wise structure while performing Email Analysis
A Writing Project Presented to
The Faculty of the Department of Computer Science
San Jose State University
California
In Partial Fulfillment of the Requirements for the Degree
Master of Science
By
Bhargav Vadher
Spring 2010
Copyright 2010
Bhargav Vadher
All Rights Reserved
ABSTRACT
In this age of social networking, it is necessary to define the relationships among
the members of a social network. Various techniques are already available to define
user-to-user relationships across a network. Over time, many algorithms and machine
learning techniques have been applied to find relationships in social networks, yet very
few techniques, and little information, are available for defining relationships directly
over raw email data. A few academic groups have developed ways to mine email log files and
have found the inter-relations among users by means of clusters. Still, no reliable
technique is available that can accurately predict the rank of each user within an
organization by mining their email transaction logs. In this report, the author
presents a technique for mining email log files in order to determine the position-wise
structure of an organization. The author also discusses send-receive analysis,
statistical analysis, semantic analysis, and temporal analysis of the data, and applies
them to test cases. Throughout the research, the author uses the Enron employees'
email log files, which were made public in 2001.
ACKNOWLEDGEMENTS
I thank my advisor, Dr. Robert Chun, whose guidance, support, and dedication are
priceless. Dr. Chun is an educator in the truest sense of the word, with a very kind
personality well suited to students. I sincerely appreciate Dr. Mark Stamp's and Dr. Chris
Pollett's participation as project committee members. The committee has provided me with
enlightening insight, guiding and polishing the details presented in this report. All of
them, committee members and advisor, have proved to be precious assets to the computer
science department.
I would especially like to thank the IEEE (Institute of Electrical and Electronics
Engineers, Inc.) and the SJSU research articles database [27] from Sjlibrary for providing
recent information about my topic.
It has certainly been a challenging, yet rewarding and fruitful journey, which I
could not have completed alone, and I am thankful for all your support.
Thank you.
Table of Contents
1.0 Introduction
1.1 Overview
1.2 Email Log Mining?
2.0 Related Work
2.1 Related Research
2.2 Data Gathering
3.0 Design
3.1 Cleaning the Junk Data
3.2 Database
4.0 Analysis
4.1 S-R Analysis and Statistical Analysis
4.1.1 Finding the Root
4.1.1.1 Send-Receive Analysis
4.1.1.1.1 Bottom-Up Explanation
4.1.1.1.2 Filtering Process
4.1.2 Finding 2nd Level Nodes
4.1.3 Finding Lower Level Nodes
4.2 Semantic Analysis
4.2.1 Need for Semantic Analysis
4.2.2 Methodology
4.2.3 Inspection of Questions
4.2.4 Integration with Statistical Analysis
4.2.5 Result of Semantic Analysis
4.3 Temporal Analysis
4.3.1 Need for Temporal Analysis
4.3.2 Methodology
5.0 Software and Tools Used
6.0 Experimental Results and Test Cases
6.1 Related Experiments
6.2 Experimental Results