OpenMP in the Real World
Christian Terboven, Dieter an Mey
{terboven,anmey}@rz.rwth-aachen.de
Center for Computing and Communication
RWTH Aachen University, Germany
OpenMP Tutorial
May 12, Purdue University
OpenMP in the Real World 12.05.2008 – C. Terboven
Agenda
o Nested Parallelization
– FIRE: Pattern Recognition
– NestedCP: Computation of Critical Points
– Dynamic Thread Balancing for FLOWER
o OpenMP and C++
– DROPS: Navier-Stokes Solver
– VRFEM: Realtime FEM for VR
o CMP/CMT Architectures
o OpenMP on Windows
o Conclusion
FIRE: Image Retrieval System
o FIRE = Flexible Image Retrieval Engine
– Compare the performance of common features on different databases
– Analysis of correlation of different features
Thomas Deselaers and Daniel Keysers,
RWTH I6: Chair for Human Language Technology and Pattern Recognition
FIRE: Image Retrieval System
o Q: query image, X: set of database images
o Q_m, X_m: m-th feature of Q and X
o d_m: distance measure, w_m: weighting coefficient
o Return the k images with the lowest distance to the query image
o Well-suited for Shared-Memory parallelization: Data mining in a large image database!
o Three levels to exploit parallelism:
– Process multiple query images in parallel
– Process database comparison for one query image in parallel
– Computation of distance d might be parallelized as well
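The three levels listed above can be sketched in C++ with one OpenMP loop per level. This is a minimal illustration, not the FIRE code: the types `Feature` and `Image`, the functions `feature_distance`, `image_distance`, `retrieve` and `retrieve_all`, and the squared Euclidean distance are all hypothetical. The scoring formula is not given in the text, so the sketch assumes the overall distance is the weighted sum of w_m * d_m(Q_m, X_m) over all features m.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical data model: a feature is a vector of numbers,
// an image is a list of M features.
using Feature = std::vector<double>;
using Image = std::vector<Feature>;

// Level 3: one distance d_m, parallelized over the feature components.
// Squared Euclidean distance is an assumption made for this sketch.
double feature_distance(const Feature& q, const Feature& x) {
    assert(q.size() == x.size());
    double sum = 0.0;
    const long n = static_cast<long>(q.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        const double diff = q[i] - x[i];
        sum += diff * diff;
    }
    return sum;
}

// Assumed scoring: weighted sum of the per-feature distances.
double image_distance(const Image& q, const Image& x,
                      const std::vector<double>& w) {
    assert(q.size() == w.size() && x.size() == w.size());
    double total = 0.0;
    for (std::size_t m = 0; m < w.size(); ++m)
        total += w[m] * feature_distance(q[m], x[m]);
    return total;
}

// Level 2: compare one query image against the whole database in parallel,
// then return the indices of the k images with the lowest distance.
std::vector<std::size_t> retrieve(const Image& q, const std::vector<Image>& db,
                                  const std::vector<double>& w, std::size_t k) {
    std::vector<double> dist(db.size());
    const long n = static_cast<long>(db.size());
    #pragma omp parallel for
    for (long j = 0; j < n; ++j)
        dist[j] = image_distance(q, db[j], w);

    std::vector<std::size_t> order(db.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    k = std::min(k, order.size());
    std::partial_sort(order.begin(), order.begin() + k, order.end(),
                      [&dist](std::size_t a, std::size_t b) {
                          return dist[a] < dist[b];
                      });
    order.resize(k);
    return order;
}

// Level 1: process several query images in parallel.
std::vector<std::vector<std::size_t>> retrieve_all(
        const std::vector<Image>& queries, const std::vector<Image>& db,
        const std::vector<double>& w, std::size_t k) {
    std::vector<std::vector<std::size_t>> results(queries.size());
    const long n = static_cast<long>(queries.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        results[i] = retrieve(queries[i], db, w, k);
    return results;
}
```

Compiled without OpenMP support the pragmas are ignored and the code runs serially. With OpenMP, the inner regions only receive extra threads if nested parallelism is enabled in the runtime; otherwise each inner region runs on a single thread.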
FIRE: Nested OpenMP improves scalability
o How can Nested OpenMP improve the scalability?
– Scalability on the outer level is limited because of output synchronization
– OpenMP overhead increases with the number of threads
– Dataset might better fit the number of threads
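The last point can be made concrete with a small example: with 5 query images and 16 threads, an outer-only loop keeps at most 5 threads busy, while a 5 x 3 nested split uses 15. The sketch below shows one way to divide a thread budget between two levels with `num_threads` clauses. The names `ThreadSplit`, `split_threads` and `count_pairs_nested` and the splitting heuristic are hypothetical and are not taken from FIRE; the only OpenMP runtime call used is `omp_set_max_active_levels`, available since OpenMP 3.0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper type: threads for the outer and the inner level.
struct ThreadSplit {
    int outer;
    int inner;
};

// Hypothetical heuristic: give the outer level one thread per item
// (at most total_threads) and spread the remaining budget over the inner level.
ThreadSplit split_threads(int total_threads, int outer_items) {
    assert(total_threads >= 1 && outer_items >= 1);
    const int outer = std::min(total_threads, outer_items);
    const int inner = std::max(1, total_threads / outer);
    return ThreadSplit{outer, inner};
}

// Two-level loop nest with an explicit thread count per level.
// Returns the number of (i, j) pairs processed, as a stand-in for real work.
long count_pairs_nested(int outer_items, int inner_items, ThreadSplit split) {
#ifdef _OPENMP
    omp_set_max_active_levels(2);  // allow two active nested parallel regions
#endif
    std::vector<long> work(static_cast<std::size_t>(outer_items), 0L);
    #pragma omp parallel for num_threads(split.outer)
    for (int i = 0; i < outer_items; ++i) {
        long local = 0;
        #pragma omp parallel for num_threads(split.inner) reduction(+ : local)
        for (int j = 0; j < inner_items; ++j)
            local += 1;  // placeholder for one database comparison
        work[static_cast<std::size_t>(i)] = local;
    }
    long total = 0;
    for (long v : work) total += v;
    return total;
}
```

For example, `split_threads(16, 5)` yields 5 outer and 3 inner threads, and `split_threads(4, 100)` yields 4 outer threads and 1 inner thread, which degenerates to outer-only parallelization.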
Speedup on Sun Fire E25K, 72 dual-core UltraSPARC-IV processors

# Threads   Only outer level   Only inner level   Nested OpenMP
1           1.0                1.0                1.0
4           3.88               3.63               3.93
8           6.98               7.63               7.65
16          12.46              15.09              15.12
32          25.97              23.69              28.45
144         -                  -                  133.3
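To compare the three variants independently of the thread count, the speedups can be converted to parallel efficiency, here defined in the usual way as speedup divided by number of threads. The worked numbers below use only the 32-thread row of the table above; the symbols E, S and p are introduced for this illustration.

```latex
E(p) = \frac{S(p)}{p}, \qquad
E_{\text{outer}}(32) = \frac{25.97}{32} \approx 0.81, \quad
E_{\text{inner}}(32) = \frac{23.69}{32} \approx 0.74, \quad
E_{\text{nested}}(32) = \frac{28.45}{32} \approx 0.89
```

At 32 threads the nested variant therefore uses the machine more efficiently than either single-level variant.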