weka-src.rar_wekasourcecode_FilteredAssociator_filteredassoc resource (CSDN Library)

1826 files in total

java: 1351

gif: 387

properties: 32

Uploaded 2022-09-20 18:32:43, 5.74 MB, RAR

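Understanding the Weka 3.6 Source Code: An Analysis of the Filtered Associator

Weka (Waikato Environment for Knowledge Analysis) is open-source machine learning software that is widely used in data mining. It provides a broad set of learning algorithms and data preprocessing tools. This article focuses on the Weka 3.6 source code, and in particular on `FilteredAssociator`, to help beginners understand how it works internally.

`FilteredAssociator` combines a filter with an association rule miner. In data mining, association rules describe interesting relationships between itemsets in a dataset, such as "customers who buy diapers are also likely to buy beer". Filters preprocess the data: they remove noise, reduce dimensionality, or adjust the data distribution so that the downstream model performs better.

In the Weka source code, `FilteredAssociator` covers the following:

1. **Data preprocessing**: `FilteredAssociator` passes the raw data through a filter that cleans and transforms it. The filter is user-selectable: for example, `RemoveUseless` drops attributes that carry no information, and `Normalize` rescales numeric attributes so that values measured on different scales become comparable.
2. **Association rule mining**: The mining itself is done by a wrapped base associator, typically an algorithm such as Apriori or FP-Growth. These algorithms find frequent itemsets and then generate, from those itemsets, the rules that meet the minimum support and minimum confidence thresholds.
3. **Combining the filter with the associator**: What distinguishes `FilteredAssociator` is the order of operations: it applies the filter first and then runs the association rule algorithm on the filtered data. This avoids mining rules over large amounts of useless or redundant data and therefore improves efficiency.
4. **Detailed comments**: The Weka source code is thoroughly commented, which makes it a valuable resource for beginners. The comments explain the purpose and implementation of each step and how the class interacts with other Weka components.

In the `weka-src` directory you can find the Java classes related to the Filtered Associator, for example `weka.associations.FilteredAssociator`. This class is an associator, not a classifier: it extends `SingleAssociatorEnhancer` and through it implements the `weka.associations.Associator` interface shared by Weka's association rule learners, while classifiers belong to the separate `weka.classifiers` hierarchy. In this class you can see how the filter and the base associator are configured and how mining is run on the filtered data.

Studying the Weka 3.6 source code, and `FilteredAssociator` in particular, teaches the basic concepts of data preprocessing and association rule mining and also shows how to apply and customize these algorithms in real projects. For developers who want to go deeper into machine learning, the Weka source code is valuable study material.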
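The filter-then-mine order described above can be illustrated with a small, self-contained sketch. It deliberately does not use the Weka API: the class `FilteredMiningSketch` and all of its methods are hypothetical stand-ins, the "filter" is a simplified analogue of removing uninformative attributes (it drops items that occur in every transaction), and the "associator" only produces single-item rules rather than running full Apriori.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of "filter first, then mine"; not Weka code. */
final class FilteredMiningSketch {

    /** Convenience helper that builds one transaction (a set of items). */
    static Set<String> tx(String... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    /** Filter step: drop items that occur in every transaction. */
    static List<Set<String>> removeConstantItems(List<Set<String>> data) {
        Set<String> constant = new HashSet<>();
        if (!data.isEmpty()) {
            constant.addAll(data.get(0));
        }
        for (Set<String> t : data) {
            constant.retainAll(t); // keep only items seen in all transactions
        }
        List<Set<String>> filtered = new ArrayList<>();
        for (Set<String> t : data) {
            Set<String> copy = new TreeSet<>(t);
            copy.removeAll(constant);
            filtered.add(copy);
        }
        return filtered;
    }

    /** Support: fraction of transactions that contain all the given items. */
    static double support(List<Set<String>> data, String... items) {
        if (data.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (Set<String> t : data) {
            if (t.containsAll(Arrays.asList(items))) {
                hits++;
            }
        }
        return (double) hits / data.size();
    }

    /** Mining step: rules "a => b" that meet both thresholds. */
    static List<String> mineRules(List<Set<String>> data,
                                  double minSupport, double minConfidence) {
        Set<String> items = new TreeSet<>();
        for (Set<String> t : data) {
            items.addAll(t);
        }
        List<String> rules = new ArrayList<>();
        for (String a : items) {
            for (String b : items) {
                if (a.equals(b)) {
                    continue;
                }
                double both = support(data, a, b);
                // support(data, a) > 0 because a occurs in at least one transaction
                double confidence = both / support(data, a);
                if (both >= minSupport && confidence >= minConfidence) {
                    rules.add(a + " => " + b);
                }
            }
        }
        return rules;
    }

    /** The wrapper: apply the filter, then mine the filtered data. */
    static List<String> filterThenMine(List<Set<String>> data,
                                       double minSupport, double minConfidence) {
        return mineRules(removeConstantItems(data), minSupport, minConfidence);
    }

    public static void main(String[] args) {
        List<Set<String>> data = Arrays.asList(
            tx("bag", "diapers", "beer"),
            tx("bag", "diapers", "beer", "milk"),
            tx("bag", "diapers", "milk"),
            tx("bag", "bread"));
        System.out.println(filterThenMine(data, 0.5, 0.6));
    }
}
```

In this toy data every basket contains `bag`, so mining the unfiltered data would also report rules such as `diapers => bag` that hold trivially. Filtering first removes them before the miner ever sees them, which is the efficiency argument made in point 3.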


weka-src.rar_ weka source code_Filtered Associator_filteredassoc (1826 files)

anneal.arff 140KB

heart-c.arff 1KB

FilterTest.arff 1KB

InstancesTest.arff 865B

ClassifierTest.arff 864B

Elnino_small.arff 853B

iris.arff 568B

ClassifierTest.cost 55B

Parser.cup 11KB

Parser.cup 9KB

GenericPropertiesCreator.excludes 1KB

weka3.gif 30KB

weka_splash.gif 8KB

weka_background.gif 7KB

DefaultClassifier_animated.gif 3KB

DefaultClassifier.gif 3KB

PaceRegression.gif 3KB

PaceRegression_animated.gif 3KB

SMO_animated.gif 3KB

SMO.gif 3KB

SimpleLinearRegression_animated.gif 3KB

SimpleLinearRegression.gif 3KB

DataNearBalancedND.gif 3KB

DataNearBalancedND_animated.gif 3KB

SimpleLogistic_animated.gif 3KB

SimpleLogistic.gif 3KB

LinearRegression_animated.gif 3KB

LinearRegression.gif 3KB

ClassificationViaRegression.gif 3KB

ClassificationViaRegression_animated.gif 3KB

filters.unsupervised.attribute.NumericToBinary_animated.gif 3KB

LeastMedSq_animated.gif 3KB

LeastMedSq.gif 3KB

filters.unsupervised.attribute.NumericToBinary.gif 3KB

SMOreg.gif 3KB

filters.unsupervised.attribute.StringToNominal_animated.gif 3KB

filters.unsupervised.attribute.StringToNominal.gif 3KB

RegressionByDiscretization_animated.gif 3KB

RegressionByDiscretization.gif 3KB

SMOreg_animated.gif 3KB

END_animated.gif 3KB

END.gif 3KB

ClassBalancedND_animated.gif 3KB

ClassBalancedND.gif 3KB

Default_miscClassifier_animated.gif 3KB

Default_miscClassifier.gif 3KB

filters.unsupervised.instance.Normalize.gif 3KB

filters.unsupervised.instance.Normalize_animated.gif 3KB

IncrementalClassifierEvaluator.gif 3KB

NaiveBayes_animated.gif 3KB

IncrementalClassifierEvaluator_animated.gif 3KB

NaiveBayes.gif 3KB

ComplementNaiveBayes.gif 3KB

ComplementNaiveBayes_animated.gif 3KB

filters.supervised.instance.Resample_animated.gif 3KB

RandomCommittee_animated.gif 3KB

RandomCommittee.gif 3KB

filters.supervised.instance.Resample.gif 3KB

RandomForest_animated.gif 3KB

filters.unsupervised.attribute.AddNoise_animated.gif 3KB

RandomForest.gif 3KB

filters.unsupervised.attribute.AddNoise.gif 3KB

Logistic_animated.gif 3KB

Logistic.gif 3KB

CVParameterSelection.gif 3KB

CVParameterSelection_animated.gif 3KB

LibSVM_animated.gif 3KB

LibSVM.gif 3KB

filters.unsupervised.attribute.Normalize_animated.gif 3KB

filters.unsupervised.attribute.RemoveType.gif 3KB

filters.unsupervised.attribute.RemoveType_animated.gif 3KB

filters.unsupervised.attribute.Normalize.gif 3KB

ThresholdSelector_animated.gif 3KB

ThresholdSelector.gif 3KB

HNB.gif 3KB

Default_lazyClassifier.gif 3KB

filters.unsupervised.attribute.FirstOrder_animated.gif 3KB

filters.unsupervised.attribute.AddExpression_animated.gif 3KB

filters.unsupervised.attribute.FirstOrder.gif 3KB

filters.unsupervised.attribute.AddExpression.gif 3KB

Default_lazyClassifier_animated.gif 3KB

HNB_animated.gif 3KB

MultilayerPerceptron_animated.gif 3KB

MultilayerPerceptron.gif 3KB

LogitBoost_animated.gif 3KB

AdditiveRegression.gif 3KB

LogitBoost.gif 3KB

DecisionStump_animated.gif 3KB

filters.unsupervised.attribute.Standardize_animated.gif 3KB

AdditiveRegression_animated.gif 3KB

filters.unsupervised.attribute.Standardize.gif 3KB

DecisionStump.gif 3KB

NaiveBayesMultinomial.gif 3KB

NaiveBayesMultinomial_animated.gif 3KB

NaiveBayesSimple_animated.gif 3KB

NaiveBayesSimple.gif 3KB

filters.unsupervised.attribute.ReplaceMissingValues_animated.gif 3KB

filters.unsupervised.attribute.ReplaceMissingValues.gif 3KB

filters.unsupervised.attribute.MakeIndicator.gif 3KB

LBR.gif 3KB


/*
 *    This program is free software; you can redistribute it and/or modify
 *    it under the terms of the GNU General Public License as published by
 *    the Free Software Foundation; either version 2 of the License, or
 *    (at your option) any later version.
 *
 *    This program is distributed in the hope that it will be useful,
 *    but WITHOUT ANY WARRANTY; without even the implied warranty of
 *    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *    GNU General Public License for more details.
 *
 *    You should have received a copy of the GNU General Public License
 *    along with this program; if not, write to the Free Software
 *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/*
 *    Evaluation.java
 *    Copyright (C) 1999 University of Waikato, Hamilton, New Zealand
 *
 */

package weka.classifiers;

import weka.classifiers.evaluation.NominalPrediction;
import weka.classifiers.evaluation.ThresholdCurve;
import weka.classifiers.pmml.consumer.PMMLClassifier;
import weka.classifiers.xml.XMLClassifier;
import weka.core.Drawable;
import weka.core.FastVector;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.Option;
import weka.core.OptionHandler;
import weka.core.Range;
import weka.core.RevisionHandler;
import weka.core.RevisionUtils;
import weka.core.Summarizable;
import weka.core.Utils;
import weka.core.Version;
import weka.core.converters.ConverterUtils.DataSink;
import weka.core.converters.ConverterUtils.DataSource;
import weka.core.pmml.PMMLFactory;
import weka.core.pmml.PMMLModel;
import weka.core.xml.KOML;
import weka.core.xml.XMLOptions;
import weka.core.xml.XMLSerialization;
import weka.estimators.Estimator;
import weka.estimators.KernelEstimator;

import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.MethodDescriptor;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Reader;
import java.lang.reflect.Method;
import java.util.Date;
import java.util.Enumeration;
import java.util.Random;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/**
 * Class for evaluating machine learning models.
 *
 * -------------------------------------------------------------------
 *
 * General options when evaluating a learning scheme from the command-line:
 *
 * -t filename
 * Name of the file with the training data. (required)
 *
 * -T filename
 * Name of the file with the test data. If missing a cross-validation
 * is performed.
 *
 * -c index
 * Index of the class attribute (1, 2, ...; default: last).
 *
 * -x number
 * The number of folds for the cross-validation (default: 10).
 *
 * -no-cv
 * No cross validation. If no test file is provided, no evaluation
 * is done.
 *
 * -split-percentage percentage
 * Sets the percentage for the train/test set split, e.g., 66.
 *
 * -preserve-order
 * Preserves the order in the percentage split instead of randomizing
 * the data first with the seed value ('-s').
 *
 * -s seed
 * Random number seed for the cross-validation and percentage split
 * (default: 1).
 *
 * -m filename
 * The name of a file containing a cost matrix.
 *
 * -l filename
 * Loads classifier from the given file. In case the filename ends with ".xml",
 * a PMML file is loaded or, if that fails, options are loaded from XML.
 *
 * -d filename
 * Saves classifier built from the training data into the given file. In case
 * the filename ends with ".xml" the options are saved XML, not the model.
 *
 * -v
 * Outputs no statistics for the training data.
 *
 * -o
 * Outputs statistics only, not the classifier.
 *
 * -i
 * Outputs information-retrieval statistics per class.
 *
 * -k
 * Outputs information-theoretic statistics.
 *
 * -p range
 * Outputs predictions for test instances (or the train instances if no test
 * instances provided and -no-cv is used), along with the attributes in the specified range
 * (and nothing else). Use '-p 0' if no attributes are desired.
 *
 * -distribution
 * Outputs the distribution instead of only the prediction
 * in conjunction with the '-p' option (only nominal classes).
 *
 * -r
 * Outputs cumulative margin distribution (and nothing else).
 *
 * -g
 * Only for classifiers that implement "Graphable." Outputs
 * the graph representation of the classifier (and nothing
 * else).
 *
 * -xml filename | xml-string
 * Retrieves the options from the XML-data instead of the command line.
 *
 * -threshold-file file
 * The file to save the threshold data to.
 * The format is determined by the extensions, e.g., '.arff' for ARFF
 * format or '.csv' for CSV.
 *
 * -threshold-label label
 * The class label to determine the threshold data for
 * (default is the first label)
 *
 * -------------------------------------------------------------------
 *
 * Example usage as the main of a classifier (called FunkyClassifier):
 * <code> <pre>
 * public static void main(String [] args) {
 *   runClassifier(new FunkyClassifier(), args);
 * }
 * </pre> </code>
 *
 *
 * ------------------------------------------------------------------
 *
 * Example usage from within an application:
 * <code> <pre>
 * Instances trainInstances = ... instances got from somewhere
 * Instances testInstances = ... instances got from somewhere
 * Classifier scheme = ... scheme got from somewhere
 *
 * Evaluation evaluation = new Evaluation(trainInstances);
 * evaluation.evaluateModel(scheme, testInstances);
 * System.out.println(evaluation.toSummaryString());
 * </pre> </code>
 *
 *
 * @author Eibe Frank (eibe@cs.waikato.ac.nz)
 * @author Len Trigg (trigg@cs.waikato.ac.nz)
 * @version $Revision: 6346 $
 */
public class Evaluation
  implements Summarizable, RevisionHandler {

  /** The number of classes. */
  protected int m_NumClasses;

  /** The number of folds for a cross-validation. */
  protected int m_NumFolds;

  /** The weight of all incorrectly classified instances. */
  protected double m_Incorrect;

  /** The weight of all correctly classified instances. */
  protected double m_Correct;

  /** The weight of all unclassified instances. */
  protected double m_Unclassified;

  /*** The weight of all instances that had no class assigned to them. */
  protected double m_MissingClass;

  /** The weight of all instances that had a class assigned to them. */
  protected double m_WithClass;

  /** Array for storing the confusion matrix. */
  protected double [][] m_ConfusionMatrix;

  /** The names of the classes. */
  protected String [] m_ClassNames;

  /** Is the class nominal or numeric? */
  protected boolean m_ClassIsNominal;

  /** The prior probabilities of the classes */
  protected double [] m_ClassPriors;

  /** The sum of counts for priors */
  protected double m_ClassPriorsSum;

  /** The cost matrix (if given). */
  protected CostMatrix m_CostMatrix;

  /** The total cost of predictions (includes instance weights) */
  protected double m_TotalCost;

  /** Sum of errors. */
  protected double m_SumErr;

  /** Sum of absolute errors. */
  protected double m_SumAbsErr;

  /** Sum of squared errors. */
  protected double m_SumSqrErr;

  /** Sum of class values. */
  protected double m_SumClass;

  /** Sum of squared class values. */
  protected double m_SumSqrClass;

  /*** Sum of predicted values. */
  protected double m_SumPredicted;

  /** Sum of squared predicted values. */
  protected double m_SumSqrPredicted;

  /** Sum of predicted * class values
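The preview of `Evaluation.java` stops before any method that uses these fields. As a rough illustration of why a class would keep running sums such as "sum of absolute errors" and "weight of all instances that had a class", the sketch below shows how weighted accumulators of that kind can be turned into a mean absolute error and a root mean squared error. It is an assumption about the general technique, not a copy of Weka's implementation; the class `ErrorSumsSketch` and its method names are hypothetical.

```java
/** Hypothetical running-sum accumulator for numeric prediction errors; not Weka code. */
final class ErrorSumsSketch {

    private double sumAbsErr;   // weighted sum of |predicted - actual|
    private double sumSqrErr;   // weighted sum of (predicted - actual)^2
    private double totalWeight; // total weight of the instances seen so far

    /** Adds one prediction to the running sums. */
    void update(double predicted, double actual, double weight) {
        double err = predicted - actual;
        sumAbsErr += weight * Math.abs(err);
        sumSqrErr += weight * err * err;
        totalWeight += weight;
    }

    /** Mean absolute error over everything seen so far. */
    double meanAbsoluteError() {
        return sumAbsErr / totalWeight;
    }

    /** Root mean squared error over everything seen so far. */
    double rootMeanSquaredError() {
        return Math.sqrt(sumSqrErr / totalWeight);
    }

    public static void main(String[] args) {
        ErrorSumsSketch sums = new ErrorSumsSketch();
        sums.update(1.0, 0.0, 3.0); // error 1 with weight 3
        sums.update(5.0, 0.0, 1.0); // error 5 with weight 1
        System.out.println(sums.meanAbsoluteError());
        System.out.println(sums.rootMeanSquaredError());
    }
}
```

Keeping only the sums means the metrics can be computed at any point without storing the individual predictions, which suits evaluation over many cross-validation folds.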
