2020-04-23 10:16:30 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: LianjiaSpider)
2020-04-23 10:16:30 [scrapy.utils.log] INFO: Versions: lxml 4.3.4.0, libxml2 2.9.9, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 18.9.0, Python 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1e 17 Mar 2020), cryptography 2.7, Platform Windows-10-10.0.18362-SP0
2020-04-23 10:16:30 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'LianjiaSpider', 'DOWNLOAD_DELAY': 0.25, 'LOG_FILE': 'Spiderlog20200423.txt', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'LianjiaSpider.spiders', 'SPIDER_MODULES': ['LianjiaSpider.spiders']}
2020-04-23 10:16:30 [scrapy.extensions.telnet] INFO: Telnet Password: 6b0540ee8c8de579
2020-04-23 10:16:30 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2020-04-23 10:16:31 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'LianjiaSpider.middlewares.ProxyMiddleware',
'LianjiaSpider.middlewares.UserAgentMiddeleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-04-23 10:16:31 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-04-23 10:16:31 [scrapy.middleware] INFO: Enabled item pipelines:
['LianjiaSpider.pipelines.CsvPipeline',
'LianjiaSpider.pipelines.MongodbPipeline']
2020-04-23 10:16:31 [scrapy.core.engine] INFO: Spider opened
2020-04-23 10:16:31 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-04-23 10:16:31 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-04-23 10:16:53 [root] ERROR: https://zz.lianjia.com/ershoufang/104102188631.html save CSV Field error
2020-04-23 10:17:04 [root] ERROR: https://zz.lianjia.com/ershoufang/104103786789.html save CSV Field error
2020-04-23 10:17:04 [root] ERROR: https://zz.lianjia.com/ershoufang/104103785799.html save CSV Field error
2020-04-23 10:17:05 [root] ERROR: https://zz.lianjia.com/ershoufang/104103785683.html save CSV Field error
2020-04-23 10:17:06 [scrapy.core.downloader.handlers.http11] WARNING: Got data loss in https://zz.lianjia.com/ershoufang/baisha/pg72/. If you want to process broken responses set the setting DOWNLOAD_FAIL_ON_DATALOSS = False -- This message won't be shown in further requests
2020-04-23 10:17:06 [root] ERROR: https://zz.lianjia.com/ershoufang/104103783057.html save CSV Field error
2020-04-23 10:17:06 [root] ERROR: https://zz.lianjia.com/ershoufang/104103783011.html save CSV Field error
2020-04-23 10:17:07 [root] ERROR: https://zz.lianjia.com/ershoufang/104103782964.html save CSV Field error
2020-04-23 10:17:12 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://zz.lianjia.com/ershoufang/104103670134.html>: HTTP status code is not handled or not allowed
2020-04-23 10:17:31 [scrapy.extensions.logstats] INFO: Crawled 186 pages (at 186 pages/min), scraped 136 items (at 136 items/min)
2020-04-23 10:17:34 [root] ERROR: https://zz.lianjia.com/ershoufang/104103909501.html save CSV Field error
2020-04-23 10:17:51 [scrapy.core.scraper] ERROR: Error processing {'city': '郑州',
'detail_url': 'https://zz.lianjia.com/ershoufang/104103307861.html',
'house_info_dict': {'上次交易': '暂无数据',
'交易权属': '商品房',
'产权所属': '暂无数据',
'供暖方式': '集中供暖',
'关注人数': '1',
'单价': '8195元/平米',
'套内面积': '暂无数据',
'小区名称': '金程•名湖山庄',
'建筑类型': '板楼',
'建筑结构': '混合结构',
'建筑面积': '144㎡',
'总价': '118万',
'户型结构': '平层',
'房屋年限': '暂无数据',
'房屋户型': '3室2厅1厨2卫',
'房屋朝向': '南',
'房屋用途': '普通住宅',
'房本备件': '未上传房本照片',
'所在区域': '中牟县',
'所在楼层': '高楼层 (共5层)',
'抵押信息': '无抵押',
'挂牌时间': '2019-11-24',
'梯户比例': '一梯两户',
'装修情况': '其他',
'配备电梯': '无'},
'street': '中牟县',
'street_page_url': 'https://zz.lianjia.com/ershoufang/zhongmuxian1/pg99/'}
Traceback (most recent call last):
File "C:\Users\Shinelon\Anaconda3\lib\site-packages\twisted\internet\defer.py", line 654, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\Shinelon\Desktop\bishe_websheji\LianjiaSpider\LianjiaSpider\pipelines.py", line 105, in process_item
self.writer.writerow(data)
UnicodeEncodeError: 'gbk' codec can't encode character '\u2022' in position 2: illegal multibyte sequence
2020-04-23 10:17:51 [scrapy.core.scraper] ERROR: Error processing {'city': '郑州',
'detail_url': 'https://zz.lianjia.com/ershoufang/104103307403.html',
'house_info_dict': {'上次交易': '暂无数据',
'交易权属': '商品房',
'产权所属': '非共有',
'供暖方式': '集中供暖',
'关注人数': '18',
'单价': '7155元/平米',
'套内面积': '暂无数据',
'小区名称': '金程•名湖山庄',
'建筑类型': '板楼',
'建筑结构': '混合结构',
'建筑面积': '123㎡',
'总价': '88万',
'户型结构': '平层',
'房屋年限': '暂无数据',
'房屋户型': '3室1厅1厨1卫',
'房屋朝向': '南 北',
'房屋用途': '普通住宅',
'房本备件': '未上传房本照片',
'所在区域': '中牟县',
'所在楼层': '高楼层 (共6层)',
'抵押信息': '有抵押 25万元',
'挂牌时间': '2019-11-24',
'梯户比例': '一梯两户',
'装修情况': '精装',
'配备电梯': '无'},
'street': '中牟县',
'street_page_url': 'https://zz.lianjia.com/ershoufang/zhongmuxian1/pg99/'}
Traceback (most recent call last):
File "C:\Users\Shinelon\Anaconda3\lib\site-packages\twisted\internet\defer.py", line 654, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\Shinelon\Desktop\bishe_websheji\LianjiaSpider\LianjiaSpider\pipelines.py", line 105, in process_item
self.writer.writerow(data)
UnicodeEncodeError: 'gbk' codec can't encode character
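Note on the UnicodeEncodeError tracebacks above: the failing value is the community name '金程•名湖山庄', whose bullet character U+2022 cannot be represented in GBK. The code in pipelines.py is not shown in this log, so the following is only a minimal sketch under the assumption that the CSV file was opened without an explicit encoding and therefore fell back to the Windows locale default (GBK). The helper names `can_encode`, `write_rows`, and `read_rows`, and the file name `houses.csv`, are hypothetical and do not come from the project.

```python
import csv
import os
import tempfile

# Sample values copied from the item shown in the log above.
ROW = ["郑州", "金程\u2022名湖山庄", "118万"]


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_rows(path, rows, encoding="utf-8-sig"):
    """Write rows to a CSV file with an explicit encoding.

    utf-8-sig prepends a byte order mark, which helps spreadsheet
    tools on Windows detect the file as UTF-8.
    """
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, "w", newline="", encoding=encoding) as f:
        csv.writer(f).writerows(rows)


def read_rows(path, encoding="utf-8-sig"):
    """Read the CSV back as a list of rows (the BOM is stripped on read)."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


if __name__ == "__main__":
    print(can_encode(ROW[1], "gbk"))    # GBK has no mapping for U+2022
    print(can_encode(ROW[1], "utf-8"))  # UTF-8 can encode any code point
    path = os.path.join(tempfile.mkdtemp(), "houses.csv")  # hypothetical file name
    write_rows(path, [ROW])
    print(read_rows(path) == [ROW])
```

If the output must stay in GBK, one alternative is to pass `errors="replace"` to `open()`, which substitutes `?` for unencodable characters at the cost of losing the original text.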
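[Resource description] Second-hand housing data collection, analysis, and visualization system based on Python + Scrapy + Flask: graduation project source code and usage documentation (high-scoring project).
[Notes]
1. The project code in this resource was uploaded only after it had been tested, ran successfully, and its features worked.
2. The project is intended for students, teachers, and company employees in computer-related fields (such as software engineering, computer science, artificial intelligence, communications engineering, automation, and electronic information). It can also serve as a graduation project, course project, assignment, or early-stage project demo, and is suitable for beginners who want to improve.
3. Readers with some background can modify the code to implement other features, or use it directly for a graduation project, course project, or assignment.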
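Files in the resource package (second-hand housing collection, analysis, and visualization system based on Python + Scrapy + Flask: graduation project source code and usage documentation), 140 files in total: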
scrapy.cfg 269B
bootstrap.min.css 118KB
animate.min.css 74KB
pick-a-color-1.2.3.min.css 26KB
pick-a-color-1.2.3.min.css 26KB
bootstrap-theme.min.css 23KB
main.css 12KB
shCoreDefault.css 9KB
style.css 6KB
dashboard.css 2KB
barrager.css 1KB
header.css 1KB
glyphicons-halflings-regular.eot 20KB
.gitignore 2KB
price_element.html 17KB
housetype.html 14KB
data_analysis_index.html 14KB
hot_analyze.html 13KB
realtime.html 12KB
show_data.html 11KB
contrast.html 9KB
area_price.html 8KB
index.html 3KB
footer.html 3KB
layout.html 1KB
header.html 1KB
importJS.html 694B
ajax_receive.html 640B
importCSS.html 342B
JD.ico 25KB
weibo.ico 10KB
tmall.ico 1KB
flask_echarts.iml 602B
house1.jpg 147KB
house.jpg 24KB
timg (1).jpg 24KB
1.jpg 3KB
4.jpg 3KB
3.jpg 3KB
5.jpg 3KB
2.jpg 3KB
6.jpg 3KB
echarts.js 2.24MB
unitprice.js 1.2MB
echarts-wordcloud.min.js 125KB
jquery.min.js 95KB
isotope.pkgd.min.js 48KB
bootstrap-table.min.js 47KB
bootstrap.min.js 36KB
worldcloud.js 28KB
pick-a-color-1.2.3.min.js 24KB
shCore.js 16KB
tinycolor-0.9.15.min.js 14KB
shBrushPhp.js 5KB
wow.min.js 5KB
main.js 4KB
dataTool.js 3KB
jquery.countTo.js 2KB
jquery.barrager.js 2KB
jquery.barrager.min.js 2KB
shLegacy.js 2KB
bootstrap-table-mobile.min.js 2KB
shBrushJScript.js 2KB
shAutoloader.js 1KB
yqyTable.js 1KB
bootstrap-table-zh-CN.min.js 853B
npm.js 484B
410100.json 15KB
LICENSE 1KB
chart.png 622KB
cnki.png 105KB
btos.png 73KB
analyse.png 34KB
clients.png 13KB
footer.png 12KB
under.png 8KB
barrager.png 6KB
tour-icon1.png 5KB
cycle.png 5KB
icon1.png 4KB
close.png 4KB
icon3.png 4KB
icon2.png 4KB
tour-bg.png 3KB
slider-bg.png 3KB
left.png 537B
client4.png 536B
client1.png 536B
client6.png 536B
client5.png 536B
client3.png 536B
client2.png 536B
right.png 528B
icon.png 356B
activeicon.png 356B
profile1.png 351B
profile2.png 351B
lian_jia.py 10KB
settings.py 6KB
analyze.py 5KB