Master's Thesis, Huazhong University of Science and Technology
Abstract
Solving large-scale sparse linear systems is a common problem in high-performance
computing and arises widely in engineering practice, particularly in computer
simulation. Applying conventional dense methods to sparse matrices wastes a great
deal of computing resources. At present, research on sparse matrix computation on
general-purpose GPUs remains limited both in China and abroad, and most existing
work focuses on sparse matrix-vector multiplication (SpMV).
This thesis implements and optimizes sparse matrix-vector multiplication on the
GPU. To address the uneven distribution of non-zero elements in sparse matrices
and the inability of threads within a warp to access GPU memory in a coalesced
manner, an SpMV method based on the SC-CSR format is proposed. To address thread
stalls caused by load imbalance among the threads of a warp, as well as global
memory accesses that fail to meet the hardware's coalescing requirements, an SpMV
approach based on the VAB sparse storage format is proposed. For both methods,
global memory access patterns are optimized and texture memory and constant
memory are exploited. Solvers for sparse linear systems based on the Jacobi
iteration and the Generalized Minimal Residual (GMRES) method are implemented and
optimized on the GPU; the proposed optimizations generalize to other iterative
methods for solving sparse linear systems on the GPU. Finally, the solvers are
further accelerated by optimizing host-device communication and shared memory
access.
Experiments show that, compared with serial CPU code, the sparse linear system
solvers achieve speedups ranging from 10.3x to 74.0x.
Keywords: CUDA architecture, GPGPU, sparse linear equations, iterative methods