Java Data Mining: Strategy,
Standard, and Practice
Java Data Mining: Strategy, Standard, and Practice
Mark F. Hornick, Erik Marcadé, and Sunil Venkayala
Joe Celko’s Analytics and OLAP in SQL
Joe Celko
Data Preparation for Data Mining Using SAS
Mamdouh Refaat
Querying XML: XQuery, XPath, and SQL/XML in
Context
Jim Melton and Stephen Buxton
Data Mining: Concepts and Techniques, Second
Edition
Jiawei Han and Micheline Kamber
Database Modeling and Design: Logical Design,
Fourth Edition
Toby J. Teorey, Sam S. Lightstone, and Thomas
P. Nadeau
Foundations of Multidimensional and Metric Data
Structures
Hanan Samet
Joe Celko’s SQL for Smarties: Advanced SQL
Programming, Third Edition
Joe Celko
Moving Objects Databases
Ralf Hartmut Güting and Markus Schneider
Joe Celko’s SQL Programming Style
Joe Celko
Data Mining, Second Edition: Concepts and
Techniques
Ian Witten and Eibe Frank
Fuzzy Modeling and Genetic Algorithms for Data
Mining and Exploration
Earl Cox
Data Modeling Essentials, Third Edition
Graeme C. Simsion and Graham C. Witt
Location-Based Services
Jochen Schiller and Agnès Voisard
Database Modeling with Microsoft® Visio for
Enterprise Architects
Terry Halpin, Ken Evans, Patrick Hallock, and
Bill Maclean
Designing Data-Intensive Web Applications
Stefano Ceri, Piero Fraternali, Aldo Bongio, Marco
Brambilla, Sara Comai, and Maristella Matera
Mining the Web: Discovering Knowledge from
Hypertext Data
Soumen Chakrabarti
Advanced SQL: 1999—Understanding Object-
Relational and Other Advanced Features
Jim Melton
Database Tuning: Principles, Experiments, and
Troubleshooting Techniques
Dennis Shasha and Philippe Bonnet
SQL: 1999—Understanding Relational Language
Components
Jim Melton and Alan R. Simon
Information Visualization in Data Mining and
Knowledge Discovery
Edited by Usama Fayyad, Georges G. Grinstein,
and Andreas Wierse
Transactional Information Systems: Theory,
Algorithms, and Practice of Concurrency Control and
Recovery
Gerhard Weikum and Gottfried Vossen
Spatial Databases: With Application to GIS
Philippe Rigaux, Michel Scholl, and Agnès Voisard
Information Modeling and Relational Databases:
From Conceptual Analysis to Logical Design
Terry Halpin
Component Database Systems
Edited by Klaus R. Dittrich and Andreas Geppert
Managing Reference Data in Enterprise Databases:
Binding Corporate Data to the Wider World
Malcolm Chisholm
Understanding SQL and Java Together: A Guide to
SQLJ, JDBC, and Related Technologies
Jim Melton and Andrew Eisenberg
Database: Principles, Programming, and Performance,
Second Edition
Patrick and Elizabeth O’Neil
The Object Data Standard: ODMG 3.0
Edited by R. G. G. Cattell and Douglas K. Barry
Data on the Web: From Relations to Semistructured
Data and XML
Serge Abiteboul, Peter Buneman, and Dan Suciu
Data Mining: Practical Machine Learning Tools and
Techniques with Java Implementations
Ian Witten and Eibe Frank
Joe Celko’s SQL for Smarties: Advanced SQL
Programming, Second Edition
Joe Celko
Joe Celko’s Data and Databases: Concepts in Practice
Joe Celko
Developing Time-Oriented Database Applications
in SQL
Richard T. Snodgrass
Web Farming for the Data Warehouse
Richard D. Hackathorn
Management of Heterogeneous and Autonomous
Database Systems
Edited by Ahmed Elmagarmid, Marek
Rusinkiewicz, and Amit Sheth
Object-Relational DBMSs: Tracking the Next Great
Wave, Second Edition
Michael Stonebraker and Paul Brown, with
Dorothy Moore
A Complete Guide to DB2 Universal Database
Don Chamberlin
Universal Database Management: A Guide to Object/
Relational Technology
Cynthia Maro Saracco
Readings in Database Systems, Third Edition
Edited by Michael Stonebraker and Joseph M.
Hellerstein
Understanding SQL’s Stored Procedures: A Complete
Guide to SQL/PSM
Jim Melton
Principles of Multimedia Database Systems
V. S. Subrahmanian
Principles of Database Query Processing for Advanced
Applications
Clement T. Yu and Weiyi Meng
Advanced Database Systems
Carlo Zaniolo, Stefano Ceri, Christos Faloutsos,
Richard T. Snodgrass, V. S. Subrahmanian, and
Roberto Zicari
Principles of Transaction Processing
Philip A. Bernstein and Eric Newcomer
Using the New DB2: IBM’s Object-Relational
Database System
Don Chamberlin
Distributed Algorithms
Nancy A. Lynch
Active Database Systems: Triggers and Rules For
Advanced Database Processing
Edited by Jennifer Widom and Stefano Ceri
Migrating Legacy Systems: Gateways, Interfaces, & the
Incremental Approach
Michael L. Brodie and Michael Stonebraker
Atomic Transactions
Nancy Lynch, Michael Merritt, William Weihl, and
Alan Fekete
Query Processing for Advanced Database Systems
Edited by Johann Christoph Freytag, David Maier,
and Gottfried Vossen
Transaction Processing: Concepts and Techniques
Jim Gray and Andreas Reuter
Building an Object-Oriented Database System: The
Story of O2
Edited by François Bancilhon, Claude Delobel, and
Paris Kanellakis
Database Transaction Models for Advanced
Applications
Edited by Ahmed K. Elmagarmid
A Guide to Developing Client/Server SQL
Applications
Setrag Khoshafian, Arvola Chan, Anna Wong, and
Harry K. T. Wong
The Benchmark Handbook for Database and
Transaction Processing Systems, Second Edition
Edited by Jim Gray
Camelot and Avalon: A Distributed Transaction Facility
Edited by Jeffrey L. Eppinger, Lily B. Mummert,
and Alfred Z. Spector
Readings in Object-Oriented Database Systems
Edited by Stanley B. Zdonik and David Maier
The Morgan Kaufmann Series in Data Management Systems
Series Editor: Jim Gray, Microsoft Research
Java Data Mining: Strategy,
Standard, and Practice
A Practical Guide for Architecture,
Design, and Implementation
Mark F. Hornick
Erik Marcadé
Sunil Venkayala
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Publisher: Diane D. Cerra
Publishing Services Manager: George Morrison
Project Manager: Marilyn E. Rash
Assistant Editor: Asma Palmeiro
Cover Design: Brian May, Maycreate LLC
Production Services: Graphic World Inc.
Composition: diacriTech
Illustration: diacriTech
Interior Printer: The Maple-Vail Book Manufacturing Group
Cover Printer: Phoenix Color Corp
Morgan Kaufmann Publishers is an imprint of Elsevier.
500 Sansome Street, Suite 400, San Francisco, CA 94111
This book is printed on acid-free paper.
© 2007 by Elsevier Inc. All rights reserved.
Designations used by companies to distinguish their products are often claimed as trademarks or registered
trademarks. In all instances in which Morgan Kaufmann Publishers is aware of a claim, the product names
appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies
for more complete information regarding trademarks and registration.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means—electronic, mechanical, photocopying, scanning, or otherwise—without prior written
permission of the publisher.
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford,
UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.com. You may
also complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting
“Support & Contact” then “Copyright and Permission” and then “Obtaining Permissions.”
Java Specification Request 73. Copyright © 2004. Oracle Corporation. Used with permission.
Java Specification Request 274. Copyright © 2005. Oracle Corporation. Used with permission.
Library of Congress Cataloging-in-Publication Data
Hornick, Mark F.
Java data mining : strategy, standard, and practice : a practical guide for architecture, design,
and implementation / Mark F. Hornick, Erik Marcadé, Sunil Venkayala.
p. cm.—(The Morgan Kaufmann series in data management systems)
Includes bibliographical references and index.
ISBN 0-12-370452-9 (acid-free paper)
1. Data mining. 2. Java (Computer program language) I. Marcadé, Erik.
II. Venkayala, Sunil. III. Title.
QA76.9.D343.H67 2007
005.74—dc22 2006050783
ISBN-10: 0-12-370452-9
ISBN-13: 978-0-12-370452-8
For information on all Morgan Kaufmann publications,
visit our Web site at www.mkp.com or www.books.elsevier.com
Printed in the United States of America