Graph-based Natural Language Processing and Information Retrieval

2014-02-22 15:50:17, 1.54 MB, PDF

Graphs are ubiquitous. There is hardly any domain in which objects and their relations cannot be intuitively represented as nodes and edges in a graph. Graph theory is a well-studied sub-discipline of mathematics, with a large body of results and a large number of efficient algorithms that operate on graphs. Like many other disciplines, the fields of natural language processing (NLP) and information retrieval (IR) also deal with data that can be represented as a graph. In this light, it is somewhat surprising that only in recent years the applicability of graph-theoretical frameworks to language technology became apparent and increasingly found its way into publications in the field of computational linguistics. Using algorithms that take the overall graph structure of a problem into account, rather than characteristics of single objects or (unstructured) sets of objects, graph-based methods have been shown to improve a wide range of NLP tasks.

In a short but comprehensive overview of the field of graph-based methods for NLP and IR, Rada Mihalcea and Dragomir Radev list an extensive number of techniques and examples from a wide range of research papers by a large number of authors. This book provides an excellent review of this research area, and serves both as an introduction and as a survey of current graph-based techniques in NLP and IR. Because the few existing surveys in this field concentrate on particular aspects, such as graph clustering (Lancichinetti and Fortunato 2009) or IR (Liu 2006), a textbook on the topic was very much needed and this book surely fills this gap.

The book is organized in four parts and contains a total of nine chapters. The first part gives an introduction to notions of graph theory, and the second part covers natural and random networks. The third part is devoted to graph-based IR, and part IV covers graph-based NLP. Chapter 1 lays the groundwork for the remainder of the book by introducing all necessary concepts in graph theory, inc
GRAPH-BASED NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL

Graph theory and the fields of natural language processing and information retrieval are well-studied disciplines. Traditionally, these areas have been perceived as distinct, with different algorithms, different applications, and different potential end-users. However, recent research has shown that these disciplines are intimately connected, with much variety in the way that natural language processing and information retrieval applications find efficient solutions within graph-theoretical frameworks.

This book is a comprehensive description of the use of graph-based algorithms for natural language processing and information retrieval. It brings together topics as diverse as lexical semantics, text summarization, text mining, ontology construction, text classification, and text retrieval, which are connected by the common underlying theme of the use of graph-theoretical methods for text- and information-processing tasks. Readers will gain a firm understanding of the major methods and applications in natural language processing and information retrieval that rely on graph-based representations and algorithms.

Rada Mihalcea is an Associate Professor in the Department of Computer Science and Engineering at the University of North Texas, where she leads the Language and Information Technologies research group. In 2009, she received the Presidential Early Career Award for Scientists and Engineers, awarded by President Barack Obama. She served on the editorial board of several journals, including Computational Linguistics, Journal of Natural Language Engineering, and Language Resources and Evaluations, and she cochaired the Empirical Methods in Natural Language Processing Conference in 2009 and the Association for Computational Linguistics Conference in 2011. She has been published in IEEE Intelligent Systems, Journal of Natural Language Engineering, Journal of Machine Translation, Computational Intelligence, International Journal of Semantic Computing, and Artificial Intelligence Magazine.

Dragomir Radev is a Professor in the School of Information, the Department of Electrical Engineering and Computer Science, and the Department of Linguistics at the University of Michigan, where he is the leader of the Computational Linguistics and Information Retrieval (CLAIR) research group. He has had more than 100 publications in conferences and journals such as Communications of the Association for Computing Machinery (ACM), Journal of Artificial Intelligence Research, Bioinformatics, Computational Linguistics, Information Processing and Management, and American Journal of Political Science, among others. He is on the editorial boards of Information Retrieval, Journal of Natural Language Engineering, and Journal of Artificial Intelligence Research. Professor Radev is an ACM distinguished scientist as well as coach of the U.S. high school team in computational linguistics. He is also an Adjunct Professor in computer science at Columbia University.

RADA MIHALCEA
University of North Texas, Department of Computer Science and Engineering

DRAGOMIR RADEV
University of Michigan, School of Information, Department of Electrical Engineering and Computer Science, Department of Linguistics

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Tokyo, Mexico City

Cambridge University Press
32 Avenue of the Americas, New York, NY 10013-2473, USA
www.cambridge.org
Information on this title: www.cambridge.org/9780521896139

© Rada Mihalcea and Dragomir Radev 2011

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2011
Printed in the United States of America

A catalog record for this publication is available from the British Library.

Library of Congress Cataloging in Publication data
Mihalcea, Rada, 1974-
Graph-Based Natural Language Processing and Information Retrieval / Rada Mihalcea, Dragomir Radev.
Includes bibliographical references and index.
ISBN 978-0-521-89613-9
1. Natural language processing (Computer science) 2. Graphical user interfaces (Computer systems) I. Radev, Dragomir, 1968- II. Title.
QA76.9.N38M53 2011
005.4'37 dc22
2010044578

ISBN 978-0-521-89613-9 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet Web sites referred to in this publication and does not guarantee that any content on such Web sites is, or will remain, accurate or appropriate.

Contents

Introduction
  0.1 Background
  0.2 Book Organization
  0.3 Acknowledgments

Part I. Introduction to Graph Theory

1 Notations, Properties, and Representations
  1.1 Graph Terminology and Notations
  1.2 Graph Properties
  1.3 Graph Types
  1.4 Representing Graphs as Matrices
  1.5 Using Matrices to Compute Graph Properties
  1.6 Representing Graphs as Linked Lists
  1.7 Eigenvalues and Eigenvectors
2 Graph-Based Algorithms
  2.1 Depth-First Graph Traversal
  2.2 Breadth-First Graph Traversal
  2.3 Minimum Spanning Trees
  2.4 Shortest-Path Algorithms
  2.5 Cuts and Flows
  2.6 Graph Matching
  2.7 Dimensionality Reduction
  2.8 Stochastic Processes on Graphs
  2.9 Harmonic Functions
  2.10 Random Walks
  2.11 Spreading Activation
  2.12 Electrical Interpretation of Random Walks
  2.13 Power Method
  2.14 Linear Algebra Methods for Computing Harmonic Functions
  2.15 Method of Relaxations
  2.16 Monte Carlo Method

Part II. Networks

3 Random Networks
  3.1 Networks and Graphs
  3.2 Random Graphs
  3.3 Degree Distributions
  3.4 Power Laws
  3.5 Zipf's Law
  3.6 Preferential Attachment
  3.7 Giant Component
  3.8 Clustering Coefficient
  3.9 Small Worlds
  3.10 Assortativity
  3.11 Centrality
  3.12 Degree Centrality
  3.13 Closeness Centrality
  3.14 Betweenness Centrality
  3.15 Network Example
  3.16 Dynamic Processes: Percolation
  3.17 Strong and Weak Ties
  3.18 Assortative Mixing
  3.19 Structural Holes
4 Language Networks
  4.1 Co-Occurrence Networks
  4.2 Syntactic Dependency Networks
  4.3 Semantic Networks
  4.4 Similarity Networks

Part III. Graph-Based Information Retrieval

5 Link Analysis for the World Wide Web
  5.1 The Web as a Graph
  5.2 PageRank
  5.3 Undirected Graphs
  5.4 Weighted Graphs
  5.5 Combining PageRank with Content Analysis
  5.6 Topic-Sensitive Link Analysis
  5.7 Query-Dependent Link Analysis
  5.8 Hyperlinked-Induced Topic Search
  5.9 Document Reranking with Induced Links
6 Text Clustering
  6.1 Graph-Based Clustering
  6.2 Spectral Methods
  6.3 The Fiedler Method
  6.4 The Kernighan-Lin Method
  6.5 Betweenness-Based Clustering
  6.6 Min-Cut Clustering
  6.7 Text Clustering Using Random Walks

Part IV. Graph-Based Natural Language Processing

7 Semantics
  7.1 Semantic Classes
  7.2 Synonym Detection
  7.3 Semantic Distance
  7.4 Textual Entailment
  7.5 Word-Sense Disambiguation
  7.6 Name Disambiguation
  7.7 Sentiment and Subjectivity
8 Syntax
  8.1 Part-of-Speech Tagging
  8.2 Dependency Parsing
  8.3 Prepositional-Phrase Attachment
  8.4 Co-Reference Resolution
9 Applications
  9.1 Summarization
  9.2 Semi-supervised Passage Retrieval
  9.3 Keyword Extraction
  9.4 Topic Identification
  9.5 Topic Segmentation
  9.6 Discourse
  9.7 Machine Translation
  9.8 Cross-Language Information Retrieval
  9.9 Information Extraction
  9.10 Question Answering
  9.11 Term Weighting

Bibliography
Index

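sinat_27713177 A very good book, thanks for sharing!
2016-04-27
CrawlingForward Excellent. I searched for a long time and finally found it on CSDN.
2015-08-31
zhc199 Well worth studying.
2015-06-03
嘉和的空间 A very good book; the part on constructing natural language networks is explained in great detail. Thanks for sharing.
2014-05-26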