[Book] Text Mining with Python

Points/C-coins required: 22 | 2017-09-28 15:22:07 | 1.25 MB | PDF

English-language edition. The book explains how to do text mining with Python; readers comfortable with English can download it. PDF format.

Contents (excerpt)

2.7 Documentation
2.8 Testing
2.8.1 Testing for type
2.8.2 Zero-one-some testing
2.8.3 Test layout and test discovery
2.8.4 Test coverage
2.8.5 Testing in different environments
2.9 Profiling
2.10 Coding style
2.10.1 Where is private and public
2.11 Command-line interface scripting
2.11.1 Distinguishing between module and script
2.11.2 Argument parsing
2.11.3 Exit status
2.12 Debugging
2.12.1 Logging
2.13 Advices

3 Python for data mining
3.1 Numpy
3.2 Plotting
3.2.1 3D plotting
3.2.2 Real-time plotting
3.2.3 Plotting for the Web
3.2.4 Vispy
3.3 Pandas
3.3.1 Pandas data ty
3.3.2 Pandas indexing
3.3.3 Pandas joining, merging and concatenations
3.3.4 Simple statistics
3.4 SciPy
3.4.1 scipy.linalg
3.4.2 scipy.fftpack
3.5 Statsmodels
3.6 S
3.7 Machine learning
3.7.1 Scikit-learn
3.8 Text mining
3.8.1 Regular expressions
3.8.2 Extracting from webpages
3.8.3 NLTK
3.8.4 Tokenization and part-of-speech tagging
3.8.5 Language detection
3.8.6 Sentiment analysis
3.9 Network mining
3.10 Miscellaneous issues
3.10.1 Lazy computation
3.11 Testing data mining code

4 Case: Pure Python matrix library
4.1 Code listing

5 Case: Pima data set
5.1 Problem description and objectives
5.2 Descriptive statistics and plotting
5.3 Statistical tests
5.4 Predicting diabetes type

6 Case: Data mining a database
6.1 Problem description and objectives
6.2 Reading the data
6.3 Graphical overview on the connections between the tables
6.4 Statistics on the number of tracks sold

7 Case: Twitter information diffusion
7.1 Problem description and objectives
7.2 Building a news classifier

8 Case: Big data
8.1 Problem description and objectives
8.2 Stream processing of JSON

Bibliography
Index

Preface

Python has grown to become one of the central languages in data mining, offering both a general programming language and libraries specifically targeted at numerical computations. This book is continuously being written and grew out of a course given at the Technical University of Denmark.

List of Figures

1.1 The Python hierarchy.
2.1 Overview of methods and attributes in the common Python 2 built-in data types plotted as a formal concept analysis lattice graph. Only a small subset of methods and attributes is shown.
3.1 Comorbidity for ICD-10 disease code (appendicitis)
5.1 Seaborn correlation plot on the Pima data set
6.1 Database tables graph

List of Tables

2.1 Basic built-in and Numpy and Pandas datatypes
2.2 Class methods and attributes
2.3 Testing concepts
3.1 Function for generation of Numpy data structures
3.2 Some of the subpackages of SciPy
3.3 Python machine learning packages
3.4 Scikit-learn methods
3.5 sklearn classifiers
3.6 Metacharacters and character classes
3.7 NLTK submodules
5.1 Variables in the Pima data set

luyu8709
