Crawler development: scraping the trending GitHub projects for a given language with Python.zip

1 file in total

py: 1

web crawler

python

crawler development

Points required: 1 · Downloads: 0 · Views: 84 · Uploaded: 2024-04-03 12:01:14 · 1 KB ZIP

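Crawler development: scraping the trending GitHub projects for a given language with Python.zip (1 file inside)

Crawler development: scraping the trending GitHub projects for a given language with Python

github_hot.py 798 B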

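import re

import pandas as pd
import requests


def hot_github(keyword):
    """Save the trending repositories for one language to ./github_hot.csv."""
    # keyword is the language slug used in the trending URL, e.g. "python".
    url = 'https://github.com/trending/{0}'.format(keyword)
    main_url = 'https://github.com{0}'
    html = requests.get(url).content.decode('utf-8')

    # The class names below are the ones this script was written against;
    # the live page markup may differ, in which case both lists come back empty.
    # Raw strings keep "\s" from being read as a string escape.
    reg_hot_url = re.compile(r'<h3 class="repo-list-name">\s*<a href="(.*?)">')
    hot_url = [main_url.format(i) for i in re.findall(reg_hot_url, html)]
    url_abstract_reg = re.compile(r'<p class="repo-list-description">\s*(.*?)\s*</p>')
    summary_text = re.findall(url_abstract_reg, html)

    # Both lists must have the same length, otherwise pandas raises ValueError.
    hotDF = pd.DataFrame()
    hotDF['项目简介'] = summary_text  # column header: "project summary"
    hotDF['项目地址'] = hot_url  # column header: "project URL"
    hotDF.to_csv('./github_hot.csv', index=False)


if __name__ == '__main__':
    # Prompt text: "Enter the trending language to look up:"
    keyword = input('请输入查找的热门语言:')
    hot_github(keyword)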
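The extraction step in github_hot.py can be tried without any network access. The sketch below reuses the script's two regular expressions and runs them on a small HTML fragment. The fragment (SAMPLE_HTML), the repository names in it, and the helper name parse_trending are all hypothetical. They are shaped after the class names the script looks for and are not taken from a real GitHub response.

```python
import re

# The same two patterns as github_hot.py, written as raw strings.
REPO_LINK = re.compile(r'<h3 class="repo-list-name">\s*<a href="(.*?)">')
REPO_DESC = re.compile(r'<p class="repo-list-description">\s*(.*?)\s*</p>')

# Hypothetical fragment modelled on the markup the script expects.
SAMPLE_HTML = """
<li>
  <h3 class="repo-list-name">
    <a href="/octocat/hello-world">octocat/hello-world</a>
  </h3>
  <p class="repo-list-description">
    A first example repository
  </p>
</li>
<li>
  <h3 class="repo-list-name">
    <a href="/octocat/spoon-knife">octocat/spoon-knife</a>
  </h3>
  <p class="repo-list-description">
    A second example repository
  </p>
</li>
"""


def parse_trending(html, base='https://github.com{0}'):
    """Return (description, absolute URL) pairs found in html."""
    links = [base.format(path) for path in REPO_LINK.findall(html)]
    descriptions = REPO_DESC.findall(html)
    # zip stops at the shorter list, so a repository without a
    # description paragraph would shift or drop later pairs.
    return list(zip(descriptions, links))


rows = parse_trending(SAMPLE_HTML)
for description, link in rows:
    print(description, link)
```

Pairing the two lists by position only works when every repository entry has a description paragraph. Where that cannot be guaranteed, matching each list item as a unit would likely be more reliable than two independent findall calls.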
