Beautiful Soup is a library that makes it easy to scrape information
from web pages. It sits atop an HTML or XML parser, providing Pythonic
idioms for iterating, searching, and modifying the parse tree.
# Quick start
```
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML")
>>> print(soup.prettify())
<html>
 <body>
  <p>
   Some
   <b>
    bad
    <i>
     HTML
    </i>
   </b>
  </p>
 </body>
</html>
>>> soup.find(text="bad")
u'bad'
>>> soup.i
<i>HTML</i>
#
>>> soup = BeautifulSoup("<tag1>Some<tag2/>bad<tag3>XML", "xml")
#
>>> print(soup.prettify())
<?xml version="1.0" encoding="utf-8"?>
<tag1>
 Some
 <tag2 />
 bad
 <tag3>
  XML
 </tag3>
</tag1>
```
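The three operations named in the introduction (iterating, searching, and modifying the parse tree) can be sketched as follows. This is a minimal illustration written for Python 3, assuming the standard library's `html.parser` backend; the markup string and variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which
# optional parsers (lxml, html5lib) happen to be installed.
markup = '<p>Some <b>bad</b> <a href="/docs">HTML</a></p>'  # illustrative markup
soup = BeautifulSoup(markup, "html.parser")

# Searching: find_all() returns every tag with the given name.
links = [a["href"] for a in soup.find_all("a")]

# Iterating: stripped_strings yields each text node with surrounding
# whitespace removed, skipping nodes that are whitespace only.
words = list(soup.stripped_strings)

# Modifying: rename a tag and replace its text in place.
soup.b.name = "strong"
soup.strong.string = "good"

print(links)
print(words)
print(soup)
```

Passing the parser name as the second argument mirrors the `"xml"` argument in the session above and keeps the parse tree the same from one machine to the next.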
To go beyond the basics, [comprehensive documentation is available](http://www.crummy.com/software/BeautifulSoup/bs4/doc/).
# Links
* [Homepage](http://www.crummy.com/software/BeautifulSoup/bs4/)
* [Documentation](http://www.crummy.com/software/BeautifulSoup/bs4/doc/)
* [Discussion group](http://groups.google.com/group/beautifulsoup/)
* [Development](https://code.launchpad.net/beautifulsoup/)
* [Bug tracker](https://bugs.launchpad.net/beautifulsoup/)
* [Complete changelog](https://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/CHANGELOG)
# Note on Python 2 sunsetting
Since 2012, Beautiful Soup has been developed as a Python 2 library
which is automatically converted to Python 3 code as necessary. This
makes it impossible to take advantage of some features of Python 3.
For this reason, I plan to discontinue Beautiful Soup's Python 2
support at some point after December 31, 2020: one year after the
sunset date for Python 2 itself. Beyond that point, new Beautiful Soup
development will exclusively target Python 3. Of course, older
releases of Beautiful Soup, which support both versions, will continue
to be available.
# Supporting the project
If you use Beautiful Soup as part of your professional work, please consider a
[Tidelift subscription](https://tidelift.com/subscription/pkg/pypi-beautifulsoup4?utm_source=pypi-beautifulsoup4&utm_medium=referral&utm_campaign=readme).
This will support many of the free software projects your organization
depends on, not just Beautiful Soup.
If you use Beautiful Soup for personal projects, the best way to say
thank you is to read
[Tool Safety](https://www.crummy.com/software/BeautifulSoup/zine/), a zine I
wrote about what Beautiful Soup has taught me about software
development.
# Building the documentation
The bs4/doc/ directory contains full documentation in Sphinx
format. Run `make html` in that directory to create HTML
documentation.
# Running the unit tests
Beautiful Soup supports unit test discovery from the project root directory, using either nose or the standard library's `unittest` module:
```
$ nosetests
```
```
$ python -m unittest discover -s bs4
```
If you checked out the source tree, you should see a script in the
project root directory called `test-all-versions`. This script will run
the unit tests under Python 2, then create a temporary Python 3
conversion of the source and run the unit tests again under Python 3.