BeautifulSoup 4.2 Chinese Documentation 1
Uploaded 2022-08-04 12:07:32, PDF, 1.72 MB, 54 pages
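Beautiful Soup works with your preferred parser to provide idiomatic ways of navigating, searching, and modifying a document, and it can save you hours of work. This document introduces the main features of Beautiful Soup 4, with examples. It shows what the library is good for, how it works, and how to use it.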
Beautiful Soup 4.2.0
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""
Running this document through BeautifulSoup gives a BeautifulSoup object:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc)
Beautiful Soup 4.2.0 文档 — Beautiful Soup 4.2.0... https://www.crummy.com/software/BeautifulSoup/bs4...
Page 1 of 54, 2018/2/22 11:05 AM
print(soup.prettify())
# <html>
#  <head>
#   <title>
#    The Dormouse's story
#   </title>
#  </head>
#  <body>
#   <p class="title">
#    <b>
#     The Dormouse's story
#    </b>
#   </p>
#   <p class="story">
#    Once upon a time there were three little sisters; and their names were
#    <a class="sister" href="http://example.com/elsie" id="link1">
#     Elsie
#    </a>
#    ,
#    <a class="sister" href="http://example.com/lacie" id="link2">
#     Lacie
#    </a>
#    and
#    <a class="sister" href="http://example.com/tillie" id="link3">
#     Tillie
#    </a>
#    ; and they lived at the bottom of a well.
#   </p>
#   <p class="story">
#    ...
#   </p>
#  </body>
# </html>
soup.title
# <title>The Dormouse's story</title>

soup.title.name
# u'title'

soup.title.string
# u'The Dormouse's story'

soup.title.parent.name
# u'head'
soup.p
# <p class="title"><b>The Dormouse's story</b></p>

soup.p['class']
# [u'title']
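The attribute lookup above can be sketched a little further. This is a minimal illustration, not part of the original document: the markup and variable names are made up, the parser is named explicitly as "html.parser", and the code assumes Python 3 with bs4 installed. It shows that a tag can be indexed like a dictionary, and that a multi-valued attribute such as class comes back as a list while id comes back as a string.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, chosen only to illustrate attribute access.
snippet = '<p class="title big" id="intro"><b>Hi</b></p>'
soup = BeautifulSoup(snippet, "html.parser")
p = soup.p

classes = p["class"]   # class is multi-valued, so this is a list of strings
ident = p["id"]        # id is single-valued, so this is a plain string
lang = p.get("lang")   # .get() returns None when the attribute is missing
attrs = dict(p.attrs)  # every attribute of the tag as a plain dict

print(classes, ident, lang)
```

Using .get() rather than indexing avoids a KeyError when an attribute may be absent.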
soup.a
# <a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>

soup.find_all('a')
# [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
#  <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
#  <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

soup.find(id="link3")
# <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>
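The searches above can be narrowed with extra filters. The following is a hedged sketch under a few assumptions: Python 3 with bs4 installed, an explicit "html.parser" parser, and a shortened document in which the third link (class "friend", id "link9") is invented for contrast and does not appear in the original.

```python
from bs4 import BeautifulSoup

# Shortened sample; the "friend" link is a hypothetical addition.
html = """
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a>
<a href="http://example.com/other" class="friend" id="link9">Other</a>
</p>
"""
soup = BeautifulSoup(html, "html.parser")

# class is a reserved word in Python, so the keyword is spelled class_.
sisters = soup.find_all("a", class_="sister")
sister_ids = [a["id"] for a in sisters]

# limit stops the search after the given number of matches.
first_only = soup.find_all("a", limit=1)

# find() returns None, not an empty list, when nothing matches.
missing = soup.find(id="nope")

print(sister_ids, len(first_only), missing)
```

Because find() returns None on a miss, code that chains further lookups onto its result should check for None first.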
for link in soup.find_all('a'):
    print(link.get('href'))
# http://example.com/elsie
# http://example.com/lacie
# http://example.com/tillie
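The link-extraction loop above can be wrapped into a small helper. This is only a sketch: the function name extract_links, the sample markup, and the base URL are assumptions made for illustration, and the code targets Python 3 (urllib.parse) with an explicit "html.parser" parser. It skips anchors that have no href and resolves relative URLs against a base.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return (link text, absolute URL) pairs for every <a> that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True matches only tags that actually carry an href attribute.
    for a in soup.find_all("a", href=True):
        text = a.get_text(strip=True)
        links.append((text, urljoin(base_url, a["href"])))
    return links


# Hypothetical input: one relative link, one anchor without href, one absolute link.
sample = (
    '<a href="/elsie">Elsie</a> '
    '<a name="x">no href</a> '
    '<a href="http://example.com/lacie"> Lacie </a>'
)
links = extract_links(sample, "http://example.com/")
print(links)
```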
print(soup.get_text())
# The Dormouse's story
#
# The Dormouse's story
#
# Once upon a time there were three little sisters; and their names were
# Elsie,
# Lacie and
# Tillie;
# and they lived at the bottom of a well.
#
# ...
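The output above keeps the whitespace of the source document. As a hedged sketch (Python 3, bs4 installed, explicit "html.parser", made-up sample markup), get_text also accepts a separator and a strip flag, which gives a compact single-line result:

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with untidy whitespace.
html = "<p>Once upon a time <b>three</b>\n little sisters</p><p> ... </p>"
soup = BeautifulSoup(html, "html.parser")

# Default: text nodes are concatenated exactly as they appear.
raw = soup.get_text()

# Strip each text node, drop the empty ones, and join with a single space.
compact = soup.get_text(" ", strip=True)

# The same pieces, one string per text node.
pieces = list(soup.stripped_strings)

print(repr(raw))
print(compact)
```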
Installing Beautiful Soup
$ apt-get install python-bs4

Beautiful Soup 4 is also published on PyPI, so it can be installed with easy_install or pip. The package name is beautifulsoup4:

$ easy_install beautifulsoup4
$ pip install beautifulsoup4
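PyPI also has a package named BeautifulSoup, but that is the Beautiful Soup 3 release. New projects should install beautifulsoup4. If neither easy_install nor pip is available, download the source and install it with setup.py: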
$ python setup.py install
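An ImportError after installation usually means the Python 2 version of the code is being run under Python 3, or the Python 3 version under Python 2. A SyntaxError means the Python 2 code still needs to be converted to Python 3. Either reinstall the package:

$ python3 setup.py install

or run the conversion script over the bs4 directory:

$ 2to3-3.2 -w bs4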
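A quick way to tell which of these situations applies is to attempt the import and report what happened. The helper below is a minimal sketch, not part of the original document: the name describe_bs4_install and the message wording are assumptions, and it only distinguishes the two exception types mentioned above.

```python
import sys


def describe_bs4_install():
    """Return a one-line description of whether bs4 can be imported."""
    try:
        import bs4
    except ImportError as exc:
        # Package missing, or built for the other major Python version.
        return "bs4 could not be imported (%s); try reinstalling beautifulsoup4" % exc
    except SyntaxError as exc:
        # Python 2 source that was never converted for Python 3.
        return "bs4 source does not match this Python (%s); reinstall or convert it" % exc
    version = getattr(bs4, "__version__", "unknown")
    return "bs4 %s imported on Python %d.%d" % (
        version, sys.version_info[0], sys.version_info[1])


status = describe_bs4_install()
print(status)
```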
Installing the lxml parser:

$ apt-get install python-lxml
$ easy_install lxml
$ pip install lxml

Installing the html5lib parser:

$ apt-get install python-html5lib
$ easy_install html5lib
$ pip install html5lib
Parser                    Typical usage
Python standard library   BeautifulSoup(markup, "html.parser")
lxml HTML parser          BeautifulSoup(markup, "lxml")
lxml XML parser           BeautifulSoup(markup, ["lxml", "xml"])
                          BeautifulSoup(markup, "xml")
html5lib                  BeautifulSoup(markup, "html5lib")
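Since lxml and html5lib are optional installs, a script may want to choose a parser name based on what is available. The following is a hedged sketch using only the standard library: the function name pick_parser and the preference order are assumptions, and it relies on the parser names "lxml" and "html5lib" matching their importable module names.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first installed parser name, else the built-in "html.parser"."""
    for name in preferred:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"


parser_name = pick_parser()
print(parser_name)

# Usage, assuming bs4 is installed:
#   from bs4 import BeautifulSoup
#   soup = BeautifulSoup("<p>hi</p>", pick_parser())
```

Naming the parser explicitly keeps results consistent across machines, because different parsers can build different trees from the same invalid markup.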