Statistics for Biology and Health
Series Editors
W. Wong, M. Gail, K. Krickeberg, A. Tsiatis, J. Samet
Robert Gentleman
Vincent J. Carey
Wolfgang Huber
Rafael A. Irizarry
Sandrine Dudoit
Editors
Bioinformatics and
Computational Biology
Solutions Using R
and Bioconductor
With 128 Illustrations
Editors
Robert Gentleman
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024 USA
rgentlem@fhcrc.org
Vincent J. Carey
Channing Laboratory
Brigham and Women’s Hospital
Harvard Medical School
181 Longwood Ave Boston MA 02115 USA
stvjc@channing.harvard.edu
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology
Laboratory
Cambridge, CB10 1SD UK
huber@ebi.ac.uk
Rafael A. Irizarry
Department of Biostatistics
Johns Hopkins Bloomberg
School of Public Health
615 North Wolfe Street
Baltimore, MD 21205 USA
rafa@jhu.edu
Sandrine Dudoit
Division of Biostatistics
School of Public Health
University of California,
Berkeley
140 Earl Warren Hall, #7360
Berkeley, CA 94720-7360
USA
sandrine@stat.berkeley.edu
Series Editors
Wing Wong
Department of Statistics
Stanford University
Stanford, CA 94305
USA
M. Gail
National Cancer Institute
Rockville, MD 20892
USA
K. Krickeberg
Le Châtelet
F-63270 Manglieu
France
A. Tsiatis
Department of Statistics
North Carolina State University
Raleigh, NC 27695
USA
J. Samet
Department of Epidemiology
School of Public Health
Johns Hopkins University
615 Wolfe Street
Baltimore, MD 21205
USA
Library of Congress Control Number: 2005923843
ISBN-10: 0-387-25146-4
Printed on acid-free paper.
ISBN-13: 978-0387-25146-2
© 2005 Springer Science+Business Media, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY
10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection
with any form of information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not they are
subject to proprietary rights.
Printed in China. (EVB)
987654321
springeronline.com
Preface
During the past few years, there have been enormous advances in genomics
and molecular biology, which carry the promise of understanding the
functioning of whole genomes in a systematic manner. The challenge of
interpreting the vast amounts of data from microarrays and other
high-throughput technologies has led to the development of new tools in the
fields of computational biology and bioinformatics, and has opened exciting
new connections to areas such as chemometrics, exploratory data analysis,
statistics, machine learning, and graph theory.
The Bioconductor project is an open source and open development software
project for the analysis and comprehension of genomic data. It is
rooted in the open source statistical computing environment R. This book’s
coverage is broad and ranges across most of the key capabilities of the
Bioconductor project. Thanks to the hard work and dedication of many
developers, a responsive and enthusiastic user community has formed.
Although this book is self-contained with respect to the data processing and
data analytic tasks covered, readers of this book are advised to acquaint
themselves with other aspects of the project by touring the project web
site, www.bioconductor.org.
This book represents an innovative approach to publishing about scientific
software. We made a commitment at the outset to have a fully
computable book. Tables, figures, and other outputs are dynamically generated
directly from the experimental data. Through the companion web
site, www.bioconductor.org/mogr, readers have full access to the source
code and necessary supporting libraries and hence will be able to see how
every plot and statistic was computed. They will be able to reproduce those
calculations on their own computers and should be able to extend most of
those computations to address their own needs.
Acknowledgments
This book, like so many projects in bioinformatics and computational
biology, is a large collaborative effort. The editors would like to thank the
chapter authors for their dedication and their efforts in producing widely
used software, and also in producing well-written descriptions of how to
use that software.
We would like to thank the developers of R, without whom there would
be no Bioconductor project. Many of these developers have provided
additional help and engaged in discussions about software development and
design. We would like to thank the many Bioconductor developers and
users who have helped us to find bugs and think differently about problems,
and whose enthusiasm has made the long hours somewhat more bearable.
We would also like to thank Dorit Arlt, Michael Boutros, Sabina
Chiaretti, James MacDonald, Meher Majety, Annemarie Poustka, Jerome
Ritz, Mamatha Sauermann, Holger Sültmann, Stefan Wiemann, and Seth
Falcon, who have contributed in many different ways to the production of
this monograph. Much of the preliminary work on the MLInterfaces package,
described in Chapter 16, was carried out by Jess Mar, Department
of Biostatistics, Harvard School of Public Health. Ms Mar’s efforts were
supported in part by a grant from Insightful Corporation.
The Bioconductor project is supported by grant 1R33 HG002708 from
the NIH as well as by institutional funds at both the Dana Farber Cancer
Institute and the Fred Hutchinson Cancer Research Center. W.H. received
project-related funding from the German Ministry for Education and
Research through the National Genome Research Network (NGFN) grant FKZ
01GR0450.
Seattle Robert Gentleman
Boston Vincent Carey
Cambridge (UK) Wolfgang Huber
Baltimore Rafael Irizarry
Berkeley Sandrine Dudoit
February 2005