Simple Python Web Crawler Example Resource - CSDN Library

69 files in total

py: 31

pyc: 25

html: 9

python

web crawler

Points required: 2 | Views: 6 | Uploaded: 2023-05-18 23:50:38 | 50KB ZIP


web-crawler-master.zip (69 files)

web-crawler-master

first_project

db.sqlite3 128KB

data_crawler

admin.py 505B

migrations

__init__.py 0B

0001_initial.py 1KB

__pycache__

0001_initial.cpython-38.pyc 880B

__init__.cpython-38.pyc 172B

models.py 999B

urls.py 110B

__pycache__

models.cpython-38.pyc 2KB

admin.cpython-38.pyc 786B

first_project

__init__.py 0B

wsgi.py 403B

urls.py 850B

settings.py 3KB

__pycache__

wsgi.cpython-38.pyc 577B

urls.cpython-38.pyc 1KB

settings.cpython-38.pyc 2KB

__init__.cpython-38.pyc 162B

asgi.py 403B

manage.py 637B

news

admin.py 516B

migrations

__init__.py 0B

0001_initial.py 1023B

__pycache__

0001_initial.cpython-38.pyc 962B

__init__.cpython-38.pyc 164B

models.py 432B

templates

news

article_list.html 440B

user_list.html 153B

article_detail.html 41B

year_archive.html 366B

month_archive.html 40B

base.html 198B

urls.py 343B

__pycache__

models.cpython-38.pyc 959B

urls.cpython-38.pyc 488B

admin.cpython-38.pyc 809B

views.cpython-38.pyc 1KB

views.py 1KB

polls

__init__.py 0B

tests.py 60B

admin.py 413B

migrations

__init__.py 0B

0001_initial.py 1KB

__pycache__

0001_initial.cpython-38.pyc 1018B

__init__.cpython-38.pyc 165B

apps.py 85B

models.py 760B

templates

polls

detail.html 547B

index.html 392B

results.html 323B

urls.py 509B

__pycache__

models.cpython-38.pyc 1KB

urls.cpython-38.pyc 553B

admin.cpython-38.pyc 571B

apps.cpython-38.pyc 372B

__init__.cpython-38.pyc 154B

views.cpython-38.pyc 2KB

static

polls

style.css 167B

views.py 2KB

crawler

getData.py 2KB

connection.py 581B

function.py 5KB

__pycache__

function.cpython-38.pyc 3KB

connection.cpython-38.pyc 678B

bash.exe.stackdump 1KB

README.md 1008B

SonNhaDep

getData.py 719B

function.py 475B

__pycache__

function.cpython-38.pyc 660B

### Web Crawler with the requests and beautifulsoup4 (bs4) libraries in Python ###

#### Installing for Development: ####

* IDE:
```
- Download Python (https://www.python.org/downloads/)
- Install an IDE or editor that supports Python: VSCode, PyCharm, Sublime Text, ...
```
* Extensions for the IDE:
```
+ For VSCode, install these extensions to support Python development:
  . HTML CSS Support
  . Python
  . Remote Development
+ For other IDEs: search for more information on Google
```
* To run:
```
- OPEN TERMINAL:
  + cd crawler
  + pip install requests        // (or: pip3 install requests)
  + pip install beautifulsoup4  // (or: pip3 install beautifulsoup4)
- OPEN the getData.py file:
  ( * If you do not need to save data into a database:
      . Comment out the functions used to connect to the database, e.g. insertData..(), ...
      . You then do not need to worry about connecting to or handling databases.
  )
- Replace the existing "url" variable in the file with the URL you want to crawl.
- Reopen the terminal and run: python getData.py
```
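#### Example sketch: ####

The contents of getData.py are not shown on this page, so the following is only a minimal sketch of the kind of script the steps above describe: fetch a page with requests, then parse it with beautifulsoup4. The function names `fetch_html` and `parse_links`, the `url` value, and the choice to extract links are all assumptions made for illustration and are not taken from the project. No database code is included, which matches the "no database" path in the README.

```python
from typing import Dict, List
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text


def parse_links(html: str, base_url: str) -> List[Dict[str, str]]:
    """Return the text and absolute URL of every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        links.append({
            "text": anchor.get_text(strip=True),
            # Resolve relative hrefs against the page they were found on.
            "url": urljoin(base_url, anchor["href"]),
        })
    return links


if __name__ == "__main__":
    # Placeholder address; replace it with the page you want to crawl.
    url = "https://example.com/"
    for link in parse_links(fetch_html(url), url):
        print(link["text"], link["url"])
```

Keeping the download step separate from the parsing step means the parsing function can be tried on a saved HTML string without any network access.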
