Find more PowerPoint templates on prezentr.com!
Big Data Platforms
NETFLIX
Recommender Systems
Abhi Ghose, Akarsh Sahu, Markus Wehr
Outline
▪ Business Problem
▪ Methodology
▪ Data Sources
▪ Data Visualizations
▪ Models
▪ Evaluation
▪ Customer Profiles
▪ Recommendation Visualizations
▪ Challenges & Scope for Improvements
Business Problem
…RECOMMENDATION SYSTEM CRUCIAL TO NETFLIX’S SUCCESS!
The longer people watch:
▪ The lower the churn rate (more revenue from membership fees)
▪ The more placements can be shown per person (potential revenue + reduced marketing expenses)
Netflix Revenue Streams:
▪ Membership fees ($7.6B domestic, $7.8B international, $0.36B domestic DVD) [1]
▪ Potential future streams: product placement (e.g., Stranger Things season 3 alone had placements worth ~$15M) [2]
▪ Placements also help reduce marketing expenses by up to $1B per year [3] (e.g., KFC advertised Stranger Things because its products appear in season 2)
[1] Form 10-K, Q4 2018.
[2] https://www.fastcompany.com/90380266/more-product-placements-may-come-to-netflix-but-dont-call-them-ads
[3] https://www.businessofapps.com/data/netflix-statistics/
Methodology
Data Aggregation → Exploratory Data Analysis → Data Preprocess → k-means / ALS → Model Evaluation → Recommendation Visualizations
Additional diagram labels: Clusters, Watch History Recommendation, Top 200 Actors, Top 200 Directors
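The slides name ALS (alternating least squares) as one of the two models but give no configuration details. As a rough illustration of the idea only, and not the authors' implementation, the sketch below runs a rank-1 ALS with L2 regularization in plain Python on a tiny made-up ratings dictionary. The function names, the toy ratings, and the values of `reg` and `n_iters` are all hypothetical.

```python
# Minimal rank-1 ALS sketch. All data, names, and hyperparameters here are
# hypothetical; the deck does not describe the actual model setup.

def als_rank1(ratings, n_users, n_items, reg=0.1, n_iters=20):
    """Approximate r[u, i] by user[u] * item[i] using alternating
    closed-form ridge updates over the observed ratings only."""
    user = [1.0] * n_users
    item = [1.0] * n_items
    for _ in range(n_iters):
        # Hold item factors fixed; each user factor is a 1-D ridge solution.
        for u in range(n_users):
            num = sum(r * item[i] for (uu, i), r in ratings.items() if uu == u)
            den = reg + sum(item[i] ** 2 for (uu, i) in ratings if uu == u)
            user[u] = num / den
        # Hold user factors fixed; each item factor is a 1-D ridge solution.
        for i in range(n_items):
            num = sum(r * user[u] for (u, ii), r in ratings.items() if ii == i)
            den = reg + sum(user[u] ** 2 for (u, ii) in ratings if ii == i)
            item[i] = num / den
    return user, item


def rmse(ratings, user, item):
    """Root-mean-square error of the factor model on the observed ratings."""
    se = sum((r - user[u] * item[i]) ** 2 for (u, i), r in ratings.items())
    return (se / len(ratings)) ** 0.5


# Toy data: (customer index, movie index) -> rating on a 1-5 scale.
toy_ratings = {
    (0, 0): 5, (0, 1): 4,
    (1, 0): 4, (1, 2): 2,
    (2, 1): 5, (2, 2): 3,
}
user_f, item_f = als_rank1(toy_ratings, n_users=3, n_items=3)

# Score a movie the customer has not rated yet (customer 0, movie 2).
predicted = user_f[0] * item_f[2]
```

A real run at this data scale would more plausibly use a higher-rank factorization from a distributed library. The rank-1 version is chosen here only because each update reduces to a single division, which makes the alternating structure easy to see.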
Data Sources
Data Source         Data Structure                                                        Combined Size   # Files
IMDB Database       Individual .csv files for genres, actors, directors, ratings, etc.    6 GB            7
Netflix Database    4 combined .txt files with a single row for each customer             4 GB            4
Top 200 Actors      .csv file of popular Oscar-winning actors                             5 KB            1
Top 100 Directors   .csv file of popular Oscar-winning directors                          5 KB            1
Processed: 21 GB master dataset (6 distributed clusters on RCC Hadoop)
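The deck does not show the layout of the Netflix .txt files. Assuming they follow the public Netflix Prize layout (a `movie_id:` header line followed by `customer_id,rating,date` rows), which is an assumption and not stated in the slides, the aggregation step could flatten them into one record per rating roughly as sketched below. The function name and the sample rows are hypothetical.

```python
import io


def parse_ratings(lines):
    """Yield (movie_id, customer_id, rating, date) tuples.

    Assumed layout (not confirmed by the slides): a line such as "1:"
    starts a movie block, and each following line is
    "customer_id,rating,date".
    """
    movie_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Header line: remember which movie the next rows belong to.
            movie_id = int(line[:-1])
            continue
        if movie_id is None:
            raise ValueError("rating row found before any movie header")
        customer_id, rating, date = line.split(",")
        yield movie_id, int(customer_id), int(rating), date


# Hypothetical sample standing in for one of the combined .txt files.
sample = io.StringIO(
    "1:\n"
    "1488844,3,2005-09-06\n"
    "822109,5,2005-05-13\n"
    "2:\n"
    "2059652,4,2005-09-05\n"
)
rows = list(parse_ratings(sample))
```

Because `parse_ratings` is a generator, a multi-gigabyte file can be streamed line by line without loading it into memory. The same function accepts an open file handle in place of the `io.StringIO` sample.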