Microsoft HPC Pack 2008 SDK
Classic HPC Development using Visual C++
Developed by Pluralsight LLC, in partnership
with Microsoft Corp.
All rights reserved, ©2008
© Microsoft Corporation, September 2008
Developed by Pluralsight LLC
Table of Contents
Preface
1. Problem Domain
2. Data Parallelism and Contrast Stretching
3. A Sequential Version of the Contrast Stretching Application
3.1 Architecture of Sequential Version
3.2 Allocating Efficient 2D Arrays
3.3 Lab Exercise!
4. Working with Windows HPC Server 2008
4.1 Submitting a Job to the Cluster
4.2 Lab Exercise!
5. A Shared-Memory Parallel Version using OpenMP
5.1 Working with OpenMP in Visual Studio 2005/2008
5.2 Lab Exercise!
6. A Distributed-Memory Parallel Version using MPI
6.1 Installing and Configuring MSMPI
6.2 Working with MSMPI in Visual Studio 2005/2008
6.3 MSMPI and HPC Server 2008
6.4 Lab Exercise!
7. MPI Debugging, Profiling and Event Tracing
7.1 Profiling with ETW
7.2 Local vs. Cluster Profiling
7.3 Lab Exercise!
7.4 Don’t have Administrative Rights? Need targeted tracing?
7.5 MPI Debugging
7.6 Lab Exercise!
7.7 Remote MPI Debugging on the Cluster
7.8 Lab Exercise!
7.9 Other Debugging Tools
8. Using MPI’s Collective and Asynchronous Functions for an Improved Distributed-Memory Solution
8.1 Example
8.2 Lab Exercise!
9. Hybrid OpenMP + MPI Designs
10. Managed Solutions with MPI.NET
10.1 Lab Exercise!
11. Conclusions
11.1 References
11.2 Resources
Appendix A: Summary of Cluster and Developer Setup for HPC Server 2008
Appendix B: Troubleshooting HPC Server 2008 Job Execution
Appendix C: Screen Snapshots
Feedback
More Information and Downloads
Preface
This document is a tutorial on Windows HPC Server 2008. In particular, it presents a classic HPC development scenario centered on data
parallelism, using Visual C++, OpenMP, MPI and HPC Server 2008 to develop high-performance, parallel solutions. The complete tutorial includes
lab exercises, program solutions, and miscellaneous support files. Installation of the complete tutorial yields a folder with the following structure:
This document presents a classic HPC development scenario — data parallelism in the context of image processing. Written for the C and C++
developer, this tutorial walks you through the steps of designing, writing, debugging and profiling a parallel application for Windows HPC
Server 2008. When you complete the tutorial, you’ll have the skills and expertise necessary to deliver high-performance, cluster-wide
applications for Windows HPC Server 2008.
1. Problem Domain
Image processing is a compute-intensive domain. Given the
size of today’s images and the wide range of special effects, it is
not uncommon to consume hours of CPU time in the processing
of a single image. A representative example of a problem in this
domain is contrast stretching, where contrast is enhanced by
lightening or darkening pixels based on neighboring pixels. For
every pixel P, the typical approach is to determine the min and
max of its 8 neighbors, and then adjust P upward/downward
based on the ratio of lightness to darkness in relation to P.
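The per-pixel step described above can be sketched roughly as follows. This is only an illustration for a single color channel: the text does not give the exact adjustment formula, so the rule used here (nudge P toward whichever extreme it is already closer to) is an assumption, and the name stretchPixel and the step parameter are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical helper (not from the tutorial): adjusts one channel value p
// (0..255) based on the min and max of its 8 neighbors.
// ASSUMPTION: the adjustment rule below is a stand-in for the tutorial's
// "ratio of lightness to darkness"; the real formula may differ.
int stretchPixel(int p, const std::array<int, 8>& neighbors, int step = 1)
{
    assert(step >= 0);

    // Find the darkest and brightest of the 8 neighbors.
    auto range = std::minmax_element(neighbors.begin(), neighbors.end());
    const int darkest = *range.first;
    const int brightest = *range.second;

    // How far P sits above the darkest and below the brightest neighbor.
    const int aboveDarkest = p - darkest;
    const int belowBrightest = brightest - p;

    if (aboveDarkest > belowBrightest)
        p += step;   // P is relatively light: lighten it further
    else if (aboveDarkest < belowBrightest)
        p -= step;   // P is relatively dark: darken it further
    // Otherwise P is exactly in the middle and is left unchanged.

    // Keep the result in the valid channel range.
    return std::min(255, std::max(0, p));
}
```

A small step keeps each adjustment gentle, which fits an approach that changes the image gradually over many passes.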
The best result is obtained by adjusting the image slowly,
and repeating until either (a) the image converges (no longer
changes from one iteration to the next), or (b) the desired effect
has been achieved (by performing a specified number of
iterations). For example, consider the images to your right. The
upper image is the original, capturing a sailboat in a South
Pacific harbor at sunset. The lower image is the equivalent
image after contrast stretching for 75 iterations. Notice that the
stretching reveals more clearly the presence of other boats in the
harbor, in particular their white masts. Contrast stretching is
one of the many techniques used to enhance images.
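The iteration scheme described above (repeat until the image converges or a specified number of iterations has been performed) might look something like the sketch below. It makes several simplifying assumptions that are not from the tutorial: the image has a single channel stored as a vector of rows, border pixels are left unchanged, and the per-pixel rule is a placeholder. The names Image, stretchOnce and stretch are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// ASSUMPTION: a single-channel image with values 0..255, stored row by row.
typedef std::vector<std::vector<int> > Image;

// One pass over the interior pixels, reading from src and writing to dst.
// Returns the number of pixels whose value changed.
int stretchOnce(const Image& src, Image& dst, int step)
{
    const int rows = static_cast<int>(src.size());
    const int cols = rows > 0 ? static_cast<int>(src[0].size()) : 0;
    dst = src;  // border pixels are copied through unchanged
    int changed = 0;
    for (int r = 1; r + 1 < rows; ++r) {
        for (int c = 1; c + 1 < cols; ++c) {
            // Min and max of the 8 neighbors of pixel (r, c).
            int lo = 255, hi = 0;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    if (dr == 0 && dc == 0) continue;
                    lo = std::min(lo, src[r + dr][c + dc]);
                    hi = std::max(hi, src[r + dr][c + dc]);
                }
            }
            // Placeholder rule (an assumption): move toward the nearer extreme.
            const int p = src[r][c];
            int q = p;
            if (p - lo > hi - p)      q = std::min(255, p + step);
            else if (p - lo < hi - p) q = std::max(0, p - step);
            dst[r][c] = q;
            if (q != p) ++changed;
        }
    }
    return changed;
}

// Repeats passes until (a) a pass changes nothing, or (b) maxIterations
// passes have run. Returns the number of passes performed.
int stretch(Image& image, int maxIterations, int step = 1)
{
    assert(maxIterations >= 0);
    Image next;
    int iterations = 0;
    while (iterations < maxIterations) {
        const int changed = stretchOnce(image, next, step);
        image.swap(next);
        ++iterations;
        if (changed == 0) break;  // converged: no pixel changed in this pass
    }
    return iterations;
}
```

Each pass writes into a separate buffer so that every pixel is computed from the previous iteration's values rather than from partially updated neighbors. With this sketch, a call such as stretch(img, 75) would correspond to the 75-iteration run mentioned above.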
Programmatically, images are most easily treated as two-
dimensional arrays of integers. For simplicity, we’ll work with
bitmaps (.bmp), where each pixel is stored as 3 distinct integers
(0..255) representing the amount of blue, green and red at that
pixel. Thus, an M-by-N image (M pixels wide, N pixels high) contains
M*N pixels, or M*N*3 integers, and is represented by a 2D array with
N rows and M columns. In this approach, each element of the array denotes
a single pixel, which we’ll represent using a structure containing
3 fields, each field an unsigned char since the range is 0..255:
typedef struct {