Scraping the names of ancient poets with Python and requests


Uploaded 2024-06-26 09:49:26. RAR archive, 588 B.


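This script uses basic Python web-scraping techniques to collect poet names from the Gushiwen site (古诗文网, gushiwen.cn). It relies on two packages, requests and bs4: requests fetches the page, and bs4 parses the HTML. The script is short and easy to follow, a few dozen lines in all, which makes it a handy first introduction to web scraping in Python.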
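The parsing half of that description can be tried without touching the network. The sketch below is only an illustration, not the uploaded script: the function name `extract_poet_names` and the sample markup are made up for this example (the first link follows the format quoted in the script's comment, the others are placeholders), and it assumes bs4 is installed.

```python
import bs4

# Hypothetical sample markup. Only the first link mirrors the format quoted
# in the script's comment; the other URLs are placeholders for illustration.
SAMPLE_HTML = """
<div>
  <a href="https://so.gushiwen.cn/authorv_b90660e3e492.aspx" target="_blank">李白</a>
  <a href="https://example.com/authorv_placeholder.aspx" target="_blank">杜甫</a>
  <a href="https://example.com/other.aspx" target="_blank">Other page</a>
  <a target="_blank">Link without href</a>
</div>
"""


def extract_poet_names(html):
    """Return the text of every target="_blank" link whose href contains "authorv"."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    names = []
    for a in soup.find_all("a", {"target": "_blank"}):
        # .get() returns "" for links that have no href instead of raising KeyError
        href = a.get("href", "")
        if "authorv" in href:
            names.append(a.get_text(strip=True))
    return names


names = extract_poet_names(SAMPLE_HTML)
print(names)
```

Keeping the parsing in a function that takes an HTML string makes it easy to check against a saved page before pointing it at the live site.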


爬虫_诗人.rar (1 file)

爬虫_诗人.py 757B


import requests  # sends HTTP requests, much as a browser would
import bs4  # parses the returned HTML

url = "https://www.gushiwen.cn/"


def get_gushiwen():
    response = requests.get(url)
    print(response.status_code)
    data = response.text
    # data2 = response.content
    # data3 = data2.decode()
    # print(type(data), type(data2), type(data3))
    ret = bs4.BeautifulSoup(data, "html.parser")
    all_a = ret.find_all("a", {"target": "_blank"})
    for i in all_a:
        # Example: <a href="https://so.gushiwen.cn/authorv_b90660e3e492.aspx" target="_blank">李白</a>
        # Keep only the links whose href contains "authorv", i.e. the poet pages
        href = i.get("href", "")  # avoids a KeyError on links that have no href
        if "authorv" in href:
            print(i.text, end=" ")


get_gushiwen()
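The script picks out poet links by checking whether "authorv" appears anywhere in the href. A slightly stricter variant is sketched below using only the standard library. The helper names `is_author_link` and `unique_in_order` are hypothetical, and the `/authorv_....aspx` path pattern is an assumption drawn from the single example link in the script's comment, so it may not cover every author URL on the site.

```python
from urllib.parse import urlparse


def is_author_link(href):
    """Return True when the URL path looks like /authorv_<id>.aspx.

    Assumption: author pages follow the pattern of the one example link
    quoted in the script's comment.
    """
    path = urlparse(href).path
    return path.startswith("/authorv_") and path.endswith(".aspx")


def unique_in_order(names):
    """Drop repeated names while keeping first-seen order."""
    # dict preserves insertion order, so this removes duplicates stably
    return list(dict.fromkeys(names))


print(is_author_link("https://so.gushiwen.cn/authorv_b90660e3e492.aspx"))
print(unique_in_order(["李白", "杜甫", "李白"]))
```

Checking the parsed path rather than the whole string avoids matching a URL that merely carries "authorv" in its query string, and the order-preserving de-duplication helps if the same poet is linked more than once on the page.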
