All performance results included below are only examples of possible
observations. Depending on your platform, your results may and likely will vary.
These results are only included as examples of the type of measurements you
should perform when answering these questions, and not as the actual values you
should expect to see.
The example commands provided are for binaries included with CUDA 6.
The commands for other versions of CUDA will likely vary, though the
underlying concepts are the same.
Additionally, many of these questions can be answered in multiple ways. If
your solution does not perfectly match the provided example, but implements
the same functionality and demonstrates the same performance characteristics,
it would be considered an equivalent solution.
1 Refer to Figure 1-5 and finish the following patterns of data partition:
- Block partition along the x dimension for 2D data
- Cyclic partition along the y dimension for 2D data
- Cyclic partition along the z dimension for 3D data
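The three patterns above can also be expressed as index-to-owner mappings. The
following is only a minimal sketch in plain C, not the figure-based answer: it
assumes the data is split along a single axis among `parts` owners (threads or
blocks), and the function names are hypothetical, chosen for illustration.

```c
/* Block partition along x for 2D data: each owner gets one contiguous
 * chunk of columns. The chunk size is rounded up so every column is
 * covered even when nx is not a multiple of parts. iy does not affect
 * the owner because the split is along x only. */
static int block_owner_x(int ix, int iy, int nx, int parts)
{
    (void)iy;
    int chunk = (nx + parts - 1) / parts;
    return ix / chunk;
}

/* Cyclic partition along y for 2D data: rows are dealt out round-robin,
 * so row iy goes to owner iy mod parts. */
static int cyclic_owner_y(int ix, int iy, int parts)
{
    (void)ix;
    return iy % parts;
}

/* Cyclic partition along z for 3D data: whole xy-planes are dealt out
 * round-robin, so plane iz goes to owner iz mod parts. */
static int cyclic_owner_z(int ix, int iy, int iz, int parts)
{
    (void)ix;
    (void)iy;
    return iz % parts;
}
```

For example, with 8 columns and 4 owners, the block partition assigns columns
0-1 to owner 0, columns 2-3 to owner 1, and so on, while a cyclic partition
with 4 owners assigns indices 0, 4, 8, ... to owner 0.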
2 Remove the cudaDeviceReset function from the file, then compile
and run it to see what would happen.
When cudaDeviceReset is removed, none of the prints from the GPU are
displayed. Only the host-side output appears:
Hello World from CPU!