[![license](https://img.shields.io/badge/license-MIT-success)](https://github.com/pdrm83/Rotten_Tomatoes_Scraper/blob/master/LICENSE)
# Rotten Tomatoes Scraper
This module extracts information about **movies** and **actors** listed on the Rotten Tomatoes website. Each movie
has metadata such as *Rating*, *Genre*, *Box Office*, *Studio*, and *Scores*. The *Genre* field has 20+
subcategories, which give more granular information about a movie. This metadata can be helpful for many data
science projects. For actors, you can extract the movies listed in the **highest-rated** or **filmography**
sections, depending on your needs. This module uses the BeautifulSoup package to parse HTML documents.
## Install
The module requires the following libraries:
* bs4
* requests
* lxml
Then, it can be installed using pip:
```bash
pip3 install rotten_tomatoes_scraper
```
## Usage
This module contains two classes: **MovieScraper** and **CelebrityScraper**.
You can use *CelebrityScraper* to extract the complete list of movies that a celebrity has participated in by calling
the `extract_metadata` method with `section='filmography'`. You can also extract the list of top-ranked movies by
calling the same method with `section='highest'`.
```python
from rotten_tomatoes_scraper.rt_scraper import CelebrityScraper
celebrity_scraper = CelebrityScraper(celebrity_name='jack nicholson')
celebrity_scraper.extract_metadata(section='highest')
movie_titles = celebrity_scraper.metadata['movie_titles']
print(movie_titles)
# ['On a Clear Day You Can See Forever', 'The Shooting', 'Chinatown', 'Broadcast News']
```
You can also use *MovieScraper* to extract movie metadata. To find out which movie genres an actor has participated
in, first extract the list of the actor's movies using `CelebrityScraper`. Then instantiate `MovieScraper` with a
`movie_title` and call the `extract_metadata` method. You can pass either `movie_url` or `movie_title` to extract
the movie metadata, as the code below shows.
```python
from rotten_tomatoes_scraper.rt_scraper import MovieScraper
movie_scraper = MovieScraper(movie_title='VICKY CRISTINA BARCELONA')
movie_scraper.extract_metadata()
print(movie_scraper.metadata)
# {'Score_Rotten': '81', 'Score_Audience': '74', 'Genre': ['comedy', 'drama', 'romance']}
```
```python
from rotten_tomatoes_scraper.rt_scraper import MovieScraper
movie_url = 'https://www.rottentomatoes.com/m/marriage_story_2019'
movie_scraper = MovieScraper(movie_url=movie_url)
movie_scraper.extract_metadata()
print(movie_scraper.metadata)
# {'Score_Rotten': '94', 'Score_Audience': '85', 'Genre': ['comedy', 'drama']}
```
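The genre workflow described above (collect an actor's movie titles, then look up each movie) could be sketched as follows. This is a minimal illustration, not part of the module: `count_genres` and `fetch_with_movie_scraper` are hypothetical helper names, and the sketch assumes each metadata dictionary carries a `'Genre'` list as in the printed outputs above.

```python
from collections import Counter


def count_genres(movie_titles, fetch_metadata):
    """Tally how often each genre appears across a list of movie titles.

    `fetch_metadata` is any callable that maps a title to a metadata dict;
    passing it in keeps the tallying logic independent of the network.
    """
    counts = Counter()
    for title in movie_titles:
        metadata = fetch_metadata(title)
        # Assumption: genres live under the 'Genre' key; skip movies without it.
        counts.update(metadata.get('Genre', []))
    return counts


def fetch_with_movie_scraper(title):
    """Hypothetical adapter around MovieScraper, using only the calls shown above."""
    # Imported here so the tallying helper works even without the package installed.
    from rotten_tomatoes_scraper.rt_scraper import MovieScraper

    scraper = MovieScraper(movie_title=title)
    scraper.extract_metadata()
    return scraper.metadata
```

With the list from `celebrity_scraper.metadata['movie_titles']`, calling `count_genres(movie_titles, fetch_with_movie_scraper)` would return a `Counter` of genres. Each title triggers a separate request, so long filmographies may take a while.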
This module doesn't give you full access to all the metadata available on the Rotten Tomatoes website. However,
you can easily use it to extract the most important fields.
And that's pretty much it!