Natural-Language-Processing-with-Python-Cookbook.pdf.pdf
Uploaded 2019-09-16 01:56:00. PDF, 32.13 MB, 301 pages.
Krishna Bhavsar
Naresh Kumar
Pratap Dangeti
BIRMINGHAM - MUMBAI
Natural Language Processing
with Python Cookbook
Over 60 recipes to implement text analytics solutions
using deep learning principles
Contents

Preface

Chapter 1: Corpus and WordNet
    Introduction
    Accessing in-built corpora
        How to do it...
    Download an external corpus, load it, and access it
        Getting ready
        How to do it...
        How it works...
    Counting all the wh words in three different genres in the Brown corpus
        Getting ready
        How to do it...
        How it works...
    Explore frequency distribution operations on one of the web and chat text corpus files
        Getting ready
        How to do it...
        How it works...
    Take an ambiguous word and explore all its senses using WordNet
        Getting ready
        How to do it...
        How it works...
    Pick two distinct synsets and explore the concepts of hyponyms and hypernyms using WordNet
        Getting ready
        How to do it...
        How it works...
    Compute the average polysemy of nouns, verbs, adjectives, and adverbs according to WordNet
        Getting ready
        How to do it...
        How it works...

Chapter 2: Raw Text, Sourcing, and Normalization
    Introduction
    The importance of string operations
        Getting ready
        How to do it...
        How it works...
    Getting deeper with string operations
        How to do it...
        How it works...
    Reading a PDF file in Python
        Getting ready
        How to do it...
        How it works...
    Reading Word documents in Python
        Getting ready
        How to do it...
        How it works...
    Taking PDF, DOCX, and plain text files and creating a user-defined corpus from them
        Getting ready
        How to do it...
        How it works...
    Read contents from an RSS feed
        Getting ready
        How to do it...
        How it works...
    HTML parsing using BeautifulSoup
        Getting ready
        How to do it...
        How it works...

Chapter 3: Pre-Processing
    Introduction
    Tokenization – learning to use the inbuilt tokenizers of NLTK
        Getting ready
        How to do it...
        How it works...
    Stemming – learning to use the inbuilt stemmers of NLTK
        Getting ready
        How to do it...
        How it works...
    Lemmatization – learning to use the WordnetLemmatizer of NLTK
        Getting ready
        How to do it...
        How it works...
    Stopwords – learning to use the stopwords corpus and seeing the difference it can make
        Getting ready
        How to do it...
        How it works...
    Edit distance – writing your own algorithm to find edit distance between two strings
        Getting ready
        How to do it...
        How it works...
    Processing two short stories and extracting the common vocabulary between two of them
        Getting ready
        How to do it...
        How it works...

Chapter 4: Regular Expressions
    Introduction
    Regular expression – learning to use *, +, and ?
        Getting ready
        How to do it...
        How it works...
    Regular expression – learning to use $ and ^, and the non-start and non-end of a word
        Getting ready
        How to do it...
        How it works...
    Searching multiple literal strings and substring occurrences
        Getting ready
        How to do it...
        How it works...
    Learning to create date regex and a set of characters or ranges of character
        How to do it...
        How it works...
    Find all five-character words and make abbreviations in some sentences