Master's Thesis, Huazhong University of Science and Technology
Abstract
Solving large-scale sparse linear systems is a common problem in high-performance
computing and arises widely in engineering practice, particularly in computer
simulation. Applying conventional dense methods to sparse matrices wastes a great
deal of computing resources. At present, research on sparse matrix computation on
general-purpose GPUs remains limited both in China and abroad, and most existing
work focuses on sparse matrix-vector multiplication (SpMV).
This thesis implements and optimizes sparse matrix-vector multiplication on the
GPU. To address the uneven distribution of non-zero elements in sparse matrices
and the inability of threads within a warp to access GPU memory in a coalesced
manner, an SpMV method based on the SC-CSR format is proposed. To address thread
stalls caused by load imbalance among the threads of a warp, as well as global
memory accesses that fail to meet the hardware's coalescing requirements, an SpMV
approach based on the VAB sparse storage format is proposed. For both methods,
global memory access patterns are optimized and texture memory and constant
memory are exploited. Solvers for sparse linear systems based on the Jacobi
iteration and the Generalized Minimal Residual (GMRES) method are implemented and
optimized on the GPU; the proposed optimizations generalize to other iterative
methods for solving sparse linear systems on the GPU. Finally, the solvers are
further accelerated by optimizing host-device communication and shared memory
access.
Experiments show that, compared with serial CPU code, the sparse linear system
solvers achieve speedups ranging from 10.3x to 74.0x.
Keywords: CUDA architecture, GPGPU, sparse linear equations, iterative methods