Phoenix Rebirth:
Scalable MapReduce on a Large-Scale
Shared-Memory System
Richard Yoo, Anthony Romano, Christos Kozyrakis
Stanford University
http://mapreduce.stanford.edu
Yoo, Phoenix2 October 6, 2009
Talk in a Nutshell
Scaling a shared-memory MapReduce system on a 256-thread
machine with NUMA characteristics
Major challenges & solutions
• Memory management and locality => locality-aware task distribution
• Data structure design => mechanisms to tolerate NUMA latencies
• Interactions with the OS => thread pool and concurrent allocators
Results & lessons learnt
• Improved speedup by up to 19x (average 2.5x)
• Scalability of the OS still the major bottleneck
Background
MapReduce and Phoenix
MapReduce
• A functional parallel programming framework for large clusters
• Users only provide map / reduce functions
Map: processes input data to generate intermediate key / value pairs
Reduce: merges intermediate pairs with the same key
• Runtime for MapReduce
Automatically parallelizes computation
Manages data distribution / result collection
Phoenix: shared-memory implementation of MapReduce
• An efficient programming model for both CMPs and SMPs [HPCA’07]
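The map/reduce division of labor above can be sketched with a minimal word-count example. This is plain Python illustrating the semantics, not the Phoenix C API; a real runtime would execute the map and reduce tasks in parallel:

```python
from collections import defaultdict

def map_fn(chunk):
    # Map: emit an intermediate (word, 1) pair for every word in the chunk
    return [(word, 1) for word in chunk.split()]

def reduce_fn(key, values):
    # Reduce: merge all intermediate pairs that share the same key
    return (key, sum(values))

def mapreduce(chunks):
    # Sequential sketch of what the runtime automates: run map tasks,
    # group intermediate pairs by key, then run reduce tasks
    intermediate = defaultdict(list)
    for chunk in chunks:
        for key, value in map_fn(chunk):
            intermediate[key].append(value)
    return dict(reduce_fn(k, v) for k, v in intermediate.items())

counts = mapreduce(["the quick fox", "the lazy dog"])
# counts["the"] == 2; every other word appears once
```

The user writes only `map_fn` and `reduce_fn`; the runtime owns parallelization, data distribution, and result collection.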
Phoenix on a 256-Thread System
4 UltraSPARC T2+ chips connected by a single hub chip
1. Large number of threads (256 HW threads)
2. Non-uniform memory access (NUMA) characteristics
300 cycles to access local memory, an extra 100 cycles for remote memory
[Diagram: chips 0–3 connected through a central hub; each chip attaches to its own local memory (mem 0–3), so accesses to another chip's memory cross the hub]
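A hypothetical back-of-the-envelope model (not from the talk) shows why remote traffic on this machine hurts: with 300-cycle local and 400-cycle remote accesses, expected latency grows linearly with the fraction of accesses that cross the hub.

```python
LOCAL_CYCLES = 300    # local memory access latency (from the slide)
REMOTE_PENALTY = 100  # extra cycles for a remote (cross-hub) access

def avg_latency(remote_fraction):
    # Expected access latency given the fraction of accesses going remote
    return LOCAL_CYCLES + REMOTE_PENALTY * remote_fraction

local_only = avg_latency(0.0)   # all data in the local memory: 300 cycles
spread     = avg_latency(0.75)  # data spread over 4 sockets: 375 cycles
```

With data spread uniformly across the four memories, 3/4 of accesses are remote, a 25% latency increase over all-local placement; this is the gap that locality-aware task distribution targets.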