北京交通大学硕士学位论文
Abstract
Weka has gradually been a world well-known data mining platform,
which is attracting more and more users to join in because of Weka's
characters of open source code and free using, lots of algorithms, standard
architecture, good compatibility. With the excellent behavior of data mining
technology play in great data processing, developing a new data mining
system will make count, but every commerce dm tools almost is kept secret in
designing and source code, just like more and more users are interested in
Linux's core, Weka will be a wisdom selection.
The paper lucubrates in the architecture of Weka platform, generally
anatomizes each package and detailedly analyses core files of Weka System.
What's more, it briefly summarizes the region, function, usage, input&output,
visualization, custom development, related projects; points out the problem in
face of Weka; and presents a method to enhance the function of data
preprocessing of weka.
Weka usually impresses one numerous and jumbled, that is it, there are
multifarious functions and algorithms which is terribly complex, thirty
thousand lines and eight handred files of java ext. It is a significative job to
dispart the core of Weka platform. This paper will reduce the system in
classifier, eg. Only thirty one files support the NavieBayesSimple classifier.
Weka belongs to untight couple DM tools, usually spents much time on
searching, extracting, cleaning, transforming in the phase of data
preprocessing, what's more, untight couple DM tool recurs to other methods
to extract data, and is hard to integrate with information process systems. As
far as time consuming phase of data mining, data preprocessing takes up
60%-80% the whole time, all that make it is important to improve the
intelligence of data processing. At last, after combining with weka platform
and JDBCWrapper the paper analyses and designs the couple between weka
- 1
- 2
- 3
- 4
前往页