# CarCrawlers
[![LICENSE](https://img.shields.io/badge/License-MIT-%23FF4D5B.svg?style=flat-square)](https://github.com/DolorHunter/AutoTBOXDataSystem/blob/master/LICENSE)
[![Python](https://img.shields.io/badge/Python-v3.9.0-blue.svg?style=flat-square)](https://github.com/DolorHunter/1p3aMSCSAdminReport/releases)
[![BeautifulSoup](https://img.shields.io/badge/BeautifulSoup-v4.9.3-yellow.svg?style=flat-square)](https://github.com/DolorHunter/1p3aMSCSAdminReport/releases)
## Info
* Data source [cars-data.com](https://www.cars-data.com)
* Crawler [src/main.py](src/main.py)
* Data (23,351 rows, 18.9 MB) [data/car.csv](data/car.csv)
## Requirements
Python 3
```plain
$ pip install requests beautifulsoup4
```
## About CarCrawlers
* [cars-data.com](https://www.cars-data.com) does not appear to have any crawler protection, and its page format is easy to parse. Have fun crawling.
* [data/car.csv](data/car.csv) only includes data from pages 1 to 36. The crawl stopped at [Mercedes E 220 CDI Classic tech specs](https://www.cars-data.com/en/mercedes-e-220-cdi-classic-specs/24161/tech), which is on [page 36](https://www.cars-data.com/en/all-cars/page36.html), model [2006 Mercedes-Benz E-class specs](https://www.cars-data.com/en/mercedes-benz-e-class-2006/1503), type [2006 Mercedes E 220 CDI Classic 170 hp, diesel, 6 s., manual](https://www.cars-data.com/en/mercedes-e-220-cdi-classic-specs/24161/tech).
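* I think the failure was caused by a bad connection, or I may need to switch my IP during the crawl. The traceback passes through urllib3's `_prepare_proxy`, so the request was going through a proxy, which may also have played a part. The __[Full Traceback Info](#Traceback-Info)__ is below, preceded by a __[Brief Traceback](#Brief-Traceback)__.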
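
Because the crawl stopped partway through page 36, one way to avoid starting over is to record the last listing page that completed and resume from there. The sketch below is only an illustration under assumptions: the helper names (`load_checkpoint`, `save_checkpoint`, `listing_pages`) and the checkpoint file are hypothetical and are not part of `src/main.py`. The URL pattern is taken from the page 36 link above, and the last page number has to be supplied by the caller.

```python
from pathlib import Path

# URL pattern taken from the page 36 link in this README.
LISTING_URL = "https://www.cars-data.com/en/all-cars/page{page}.html"


def load_checkpoint(path, default=1):
    """Return the page number stored in the checkpoint file, or `default`."""
    checkpoint = Path(path)
    if not checkpoint.exists():
        return default
    text = checkpoint.read_text(encoding="utf-8").strip()
    return int(text) if text else default


def save_checkpoint(path, page):
    """Store the page number the crawler should resume from."""
    Path(path).write_text(str(page), encoding="utf-8")


def listing_pages(start_page, last_page):
    """Yield (page_number, url) for every listing page in the closed range."""
    for page in range(start_page, last_page + 1):
        yield page, LISTING_URL.format(page=page)
```

A crawl loop could call `save_checkpoint` before it starts each page and `load_checkpoint` on startup. Since page 36 was only partly written to `data/car.csv`, resuming at page 36 would need to skip or de-duplicate the rows already saved.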
### Brief Traceback
```plain
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.cars-data.com', port=443): Max retries exceeded with url: /en/mercedes-e-220-cdi-classic-specs/24161/tech (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1122)')))
&
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.cars-data.com', port=443): Max retries exceeded with url: /en/mercedes-e-220-cdi-classic-specs/24161/tech (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1122)')))
```
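
A transient error like this one is usually handled by retrying the request after a pause instead of letting the whole crawl exit. The following is a minimal sketch, not the crawler's actual code: `fetch_with_retry` and all of its parameters are hypothetical names introduced for illustration. It relies on the fact that `requests.exceptions.RequestException` (and therefore `SSLError`) is a subclass of `OSError`.

```python
import time


def fetch_with_retry(fetch, url, attempts=4, base_delay=2.0,
                     retry_on=(OSError,), sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on failure.

    `fetch` is any callable that takes a URL and returns the page body.
    requests' exceptions derive from OSError, so the default `retry_on`
    also covers requests.exceptions.SSLError.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

In `get_html`, the `session.get(url, headers=headers)` call seen in the traceback could be wrapped as `fetch_with_retry(lambda u: session.get(u, headers=headers).text, url)`. Whether retrying is enough, or a different IP or proxy is also needed, depends on why the server closed the connection.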
### Traceback Info
```plain
Traceback (most recent call last):
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
    self._prepare_proxy(conn)
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 932, in _prepare_proxy
    conn.connect()
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 362, in connect
    self.sock = ssl_wrap_socket(
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 386, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
    self.do_handshake()
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1122)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 439, in send
    resp = conn.urlopen(
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\retry.py", line 446, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.cars-data.com', port=443): Max retries exceeded with url: /en/mercedes-e-220-cdi-classic-specs/24161/tech (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1122)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\*\AutoTBOXDataSystem\CarCrawlers\src\main.py", line 137, in <module>
    car_data = get_car_data(model_type_url)
  File "D:\*\AutoTBOXDataSystem\CarCrawlers\src\main.py", line 26, in get_car_data
    raw_html = get_html(url)
  File "D:\*\AutoTBOXDataSystem\CarCrawlers\src\main.py", line 17, in get_html
    response = session.get(url, headers=headers)
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 543, in get
    return self.request('GET', url, **kwargs)
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\*\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.cars-data.com', port=443): Max retries exceeded with url: /en/mercedes-e-220-cdi-classic-specs/24161/tech (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1122)')))

Process finished with exit code 1
```
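## About AutoTBOXDataSystem
* Project name: Design and Implementation of an Automotive TBOX Data Collection and Analysis System.
* Intended audience: beginners or more advanced learners who want to study different technical areas. It can serve as a graduation project, course design, major assignment, engineering practicum, or early-stage project proposal.
* Database: install MySQL locally or on a server (other databases should also work, as long as JPA supports them).
* Backend: install Maven (package manager), install the Spring dependencies (downloaded automatically when Maven refreshes), and install Tomcat (server).
* Frontend: install Node.js, then yarn or npm.
* Run `main.py` in `HttpClient` (data collection and analysis) to generate the visual chart data. Running it once a day is enough.
* Start Spring and React separately. For React, run `yarn start` with yarn or `npm start` with npm.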