Join Strategies in Hive
Liyin Tang, Namit Jain
Software Engineer
Facebook
Agenda
1. Common Join
2. Map Join
3. Auto MapJoin
4. Bucket Map Join
5. Bucket Sort Merge Map Join
6. Skew Join
Common Join
[Figure: Common Join Task. In Task A, mappers read Table X and Table Y; their output goes through the shuffle to a reducer.]
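The flow in the Common Join figure (mappers over both tables, a shuffle, then a reducer) can be sketched in a few lines of Python. This is only an illustration of a reduce-side join, not Hive's code: the function names, the table tags "X" and "Y", and the choice of the first column as the join key are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import product


def map_phase(table_tag, rows, key_index=0):
    """Emit (join_key, (table_tag, row)) pairs, as a mapper would."""
    for row in rows:
        yield row[key_index], (table_tag, row)


def shuffle(pairs):
    """Group mapper output by join key; stands in for the shuffle."""
    groups = defaultdict(list)
    for key, tagged_row in pairs:
        groups[key].append(tagged_row)
    return groups


def reduce_phase(groups, left_tag="X", right_tag="Y"):
    """For each key, pair every left-table row with every right-table row."""
    joined = []
    for key in sorted(groups):
        left_rows = [row for tag, row in groups[key] if tag == left_tag]
        right_rows = [row for tag, row in groups[key] if tag == right_tag]
        for left, right in product(left_rows, right_rows):
            joined.append((key, left, right))
    return joined


def common_join(table_x, table_y):
    """Hypothetical driver: map both tables, shuffle, then reduce."""
    pairs = list(map_phase("X", table_x)) + list(map_phase("Y", table_y))
    return reduce_phase(shuffle(pairs))


# Made-up sample rows; the first column is the join key.
table_x = [(1, "a"), (2, "b"), (3, "c")]
table_y = [(1, "p"), (1, "q"), (3, "r")]
result = common_join(table_x, table_y)
```

Every row of both tables passes through the shuffle here, which is the cost the map-side strategies later in the agenda try to avoid.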
MapJoin
[Figure: MapJoin Task, shown between Task A and Task C. Each mapper reads records from the big table's data together with the small table data.]
1. Spawn mappers based on the big table.
2. All files of all small tables are replicated onto each mapper.
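The two steps above can be sketched as follows. This is a minimal Python illustration, assuming each mapper loads its copy of the small table into an in-memory hash table and probes it while streaming its split of the big table. The function names, the split layout, and the first-column join key are hypothetical and not taken from Hive.

```python
from collections import defaultdict


def build_hash_table(small_rows, key_index=0):
    """Index the small table by join key so lookups are constant time."""
    table = defaultdict(list)
    for row in small_rows:
        table[row[key_index]].append(row)
    return table


def map_join_mapper(big_split, small_hash, key_index=0):
    """Stream one split of the big table and probe the small-table hash."""
    output = []
    for row in big_split:
        for match in small_hash.get(row[key_index], []):
            output.append((row[key_index], row, match))
    return output


def map_join(big_splits, small_rows):
    """Hypothetical driver: one mapper per big-table split, no reducer."""
    results = []
    for split in big_splits:
        # Step 2: each mapper receives its own full copy of the small table.
        small_hash = build_hash_table(small_rows)
        results.extend(map_join_mapper(split, small_hash))
    return results


# Made-up sample data: the big table arrives as two splits.
big_splits = [[(1, "a"), (2, "b")], [(3, "c"), (1, "d")]]
small_table = [(1, "p"), (3, "r")]
result = map_join(big_splits, small_table)
```

No shuffle or reduce step appears in this sketch: the join finishes inside the mappers, at the price of holding the whole small table in memory on each one.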