Python_Spidder.rar_python files_python spidder

7 files in total

py: 4

project: 1

prefs: 1


Uploaded 2022-09-14 23:54:56, 2KB RAR, 0 downloads, 131 views


Uses Python to download files of a specific format from a specific website and save them to a specified folder.


Python_Spidder.rar (7 files)

Python_Spidder
    .pydevproject 432B
    .settings
        org.eclipse.core.resources.prefs 132B
    src
        getfilelist.py 172B
        saveimg.py 1KB
        getimage.py 912B
        __init__.py 0B
    .project 385B


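# -*- coding:utf-8 -*-
import os
import re
import urllib.error
import urllib.request as req

# Folder that receives the downloaded images.
write_file_path="C:\\Users\\testing\\Spider\\write_file\\"
# Folder holding the saved files that are scanned for image URLs.
file_root_path="C:\\Users\\testing\\Spider\\root_folder\\"

# os.walk yields (dirpath, dirnames, filenames); [0][2] is the list of
# files directly inside file_root_path.
file_list=tuple(os.walk(file_root_path))
for file_name in file_list[0][2]:
    print(file_name)
    file_path=file_root_path+file_name
    # Read the whole file; the with-block closes it even on error.
    with open(file_path) as file_object:
        file_content=file_object.read()
    # Non-greedy match from "http://bbs" up to the first "jpg".
    restr = r'http://bbs.*?jpg'
    urllist = re.findall(restr,file_content,re.S)
    #print(urllist)
    for url in urllist:
        # Name the local file after the last path segment of the URL.
        split_name=url.split('/')[-1]
        writed_file_name=write_file_path+split_name
        #print(writed_file_name)
        try:
            image_detail=req.urlopen(url).read()
        except urllib.error.HTTPError as e:
            # Report the failing URL and continue with the next one.
            print(e.code)
            print(url)
            continue
        # Write only after a successful download, so a failed request
        # does not leave an empty file behind.
        with open(writed_file_name, 'wb') as file:
            file.write(image_detail)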
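The URL-extraction and file-naming steps of the script above can be tried without any network access. The following is a minimal sketch, not part of the original archive: the helper names `extract_image_urls` and `target_path`, the `bbs.example.com` sample text, and the relative `write_file` folder are all assumptions made for illustration. It reuses the script's regular expression and additionally skips duplicate URLs, which the original does not do.

```python
import re
from pathlib import Path
from urllib.parse import urlsplit

# Same pattern as the script above: non-greedy match from "http://bbs"
# up to the first "jpg".
IMAGE_URL_PATTERN = re.compile(r'http://bbs.*?jpg', re.S)


def extract_image_urls(text):
    """Return matching URLs in order of first appearance, without duplicates.

    Hypothetical helper; the original script keeps duplicates.
    """
    seen = set()
    urls = []
    for url in IMAGE_URL_PATTERN.findall(text):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def target_path(url, write_dir):
    """Build the local path for a URL from the last segment of its path.

    Hypothetical helper mirroring url.split('/')[-1] in the script.
    """
    name = urlsplit(url).path.split('/')[-1]
    return Path(write_dir) / name


# Made-up sample text standing in for the content of a saved page.
sample = (
    '<img src="http://bbs.example.com/a/1.jpg"> '
    '<img src="http://bbs.example.com/a/1.jpg"> '
    '<img src="http://bbs.example.com/b/2.jpg">'
)
urls = extract_image_urls(sample)
paths = [target_path(u, "write_file") for u in urls]
print(urls)
print(paths)
```

Keeping these two steps in small functions makes the pattern easy to check against sample text before any download is attempted.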