# PatentCrawler
A patent crawler.
See the [WIKI](https://github.com/will4906/PatentCrawler/wiki) for usage instructions.
### ReleaseNote
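* V2.0
    * Crawls with the Scrapy framework
    * Greatly reduced the amount of code
    * Faster crawling
    * FixBug: fixed the issue where the first crawl always failed
* V1.0
    * Crawls by simulating a browser with Selenium
    * JavaScript parsing
    * Brief introduction on the CSDN blog: [http://blog.csdn.net/will4906/article/details/68955619](http://blog.csdn.net/will4906/article/details/68955619)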
### License
PatentCrawler is released under the Apache 2.0 license.
```
Copyright 2017 willshuhua.me.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```
### Thanks for your support
<table width="100%">
<tr><td align="center" colspan="2">Donate</td></tr>
<tr>
<td align="center">
<img src="http://img.blog.csdn.net/20170521121423299?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvd2lsbDQ5MDY=/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast" width="200px" alt="WeChat Pay">
</td>
<td align="center">
<img src="http://img.blog.csdn.net/20170521131930503?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvd2lsbDQ5MDY=/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast" width="200px" alt="Alipay">
</td>
</tr>
<tr>
<td align="center">WeChat</td>
<td align="center">Alipay</td>
</tr>
</table>