An Input-Aware Automatic Method for Parallel Sparse Matrix-Matrix Multiplication (.zip resource, CSDN Library)

70 files in total: 26 .h files, 16 .mtx files, 7 .h5 files


Uploaded 2023-04-13 23:42:07, 3.82 MB, ZIP

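Parallel sparse matrix-matrix multiplication (SpGEMM) is a computationally demanding kernel that plays a key role in high-performance computing and large-scale data analysis. It is widely used in scientific computing, graph processing, and machine learning. "Input-aware automatic" here means that the algorithm adapts the computation to the characteristics and structure of the input matrices in order to improve efficiency. This description outlines the "input-aware automatic method for parallel sparse matrix-matrix multiplication".

The starting point is what makes a matrix sparse: in large matrices most elements may be zero, so storing and operating on only the nonzero elements significantly reduces the amount of work. In a parallel environment, performance improves further when the workload is distributed sensibly and communication overhead is kept low.

1. Input awareness: The core of the method is to understand and exploit properties of the input matrices, such as the distribution of nonzero elements, the density (the number of nonzeros divided by the total number of elements), and symmetry. These properties affect the available parallelism and the communication requirements. For example, if the nonzeros of both matrices are highly concentrated, parallel execution may need more communication to coordinate work across processors, whereas sparse, evenly distributed nonzeros are easier to parallelize.

2. Automatic optimization: In practice, different data sets may call for different optimization strategies. An automatic optimization mechanism lets the algorithm adapt to each input without manual intervention. This usually involves analyzing the matrix properties at run time and using the result to choose a computation strategy, such as a partitioning scheme or a communication-avoiding scheme.

3. Parallel strategy: Parallel SpGEMM usually combines two main strategies, data partitioning and task assignment. Data partitioning splits the matrices into blocks and assigns them to processors; the split can be by rows, by columns, or by nonzero count. Task assignment maps the actual computation onto processors, either statically or through dynamic scheduling, to balance load and minimize waiting time.

4. Communication optimization: Communication overhead is one of the performance bottlenecks in parallel computing. An input-aware automatic method considers how to avoid unnecessary communication, for example through local computation and deferred merging. In some cases it may be worth giving up some computational efficiency to lower the communication cost and improve overall performance.

5. Implementation framework: IA-SpGEMM-master is probably an open-source implementation. It may provide a flexible interface that accepts user-supplied sparse matrices and may contain code optimized for different hardware platforms (such as CPU and GPU). A framework of this kind helps researchers and developers test new optimization strategies quickly and apply them to real problems.

6. Performance evaluation: The key metrics for evaluating an input-aware automatic method are computation speed, memory usage, communication efficiency, and scalability. Benchmarks and comparisons against other algorithms show how it behaves on data of different sizes and types.

In summary, the "input-aware automatic method for parallel sparse matrix-matrix multiplication" combines knowledge of the input matrices' properties with automatic optimization, aiming to exploit the available parallelism and reduce communication overhead so that large-scale sparse matrix operations run faster. This matters most when processing large data sets and complex computational problems.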
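The input properties named in item 1 (nonzero distribution, density, symmetry) can be made concrete with a small sketch. The function names `matrix_features` and `suggest_format`, the `ell_fill` and `dia_fill` features, and the 0.75 threshold are all assumptions chosen for illustration; in particular, the rule-based selector is only a stand-in for an automatic choice and is not the MatNet model shipped in the archive.

```python
from collections import Counter


def matrix_features(n_rows, n_cols, entries):
    """Return simple structural features of a sparse matrix.

    entries: iterable of (row, col, value) triples with 0-based indices.
    """
    coords = {(r, c) for r, c, v in entries if v != 0}
    if n_rows <= 0 or n_cols <= 0 or not coords:
        raise ValueError("need a non-empty matrix with positive dimensions")
    nnz = len(coords)
    row_counts = Counter(r for r, _ in coords)
    max_row_nnz = max(row_counts.values())
    # Each distinct offset (col - row) corresponds to one occupied diagonal.
    num_diagonals = len({c - r for r, c in coords})
    symmetric = n_rows == n_cols and all((c, r) in coords for r, c in coords)
    return {
        "nnz": nnz,
        "density": nnz / (n_rows * n_cols),
        "max_row_nnz": max_row_nnz,
        "num_diagonals": num_diagonals,
        # Share of stored slots holding real nonzeros if the matrix were
        # padded into ELL (rows x max_row_nnz) or DIA (diagonals x min dim).
        "ell_fill": nnz / (n_rows * max_row_nnz),
        "dia_fill": nnz / (num_diagonals * min(n_rows, n_cols)),
        "pattern_symmetric": symmetric,
    }


def suggest_format(features, fill_threshold=0.75):
    """Toy rule-based selector (hypothetical; NOT the repository's MatNet)."""
    if features["dia_fill"] >= fill_threshold:
        return "DIA"
    if features["ell_fill"] >= fill_threshold:
        return "ELL"
    return "CSR"


# A 4x4 tridiagonal matrix: few diagonals, almost no padding in DIA.
tridiagonal = (
    [(i, i, 2.0) for i in range(4)]
    + [(i, i + 1, -1.0) for i in range(3)]
    + [(i + 1, i, -1.0) for i in range(3)]
)
features = matrix_features(4, 4, tridiagonal)
print(features["density"], suggest_format(features))
```

The fill ratios measure how much padding a format would store for a given matrix, which is one plausible reason a format choice can depend on the input; a learned model would replace the fixed threshold.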


An Input-Aware Automatic Method for Parallel Sparse Matrix-Matrix Multiplication.zip (70 files)

IA-SpGEMM-master

model.png 27KB

IA-SPGEMM-CPU_release

detail

format.h 1KB

ell

common_ell.h 5KB

dia

common_dia.h 6KB

dense

common_dense.h 1KB

csr

common_csr.h 8KB

coo

common_coo.h 5KB

common.h 850B

utime.h 458B

utils.h 381B

MatNet.py 3KB

Makefile 360B

1.jpg 336KB

main.cpp 20KB

Inputs

sample.mtx 4KB

b1_ss.mtx 663B

relat3.mtx 1KB

Trec5.mtx 2KB

small.mtx 611B

ch3-3-b2.mtx 1KB

dia.mtx 607B

Ragusa18.mtx 1KB

LFAT5.mtx 1KB

NetWeights

Amd_weights.h5 214KB

Intel_weights.h5 214KB

__pycache__

MatNet.cpython-36.pyc 3KB

MatNet.cpython-35.pyc 3KB

imgs

img1.txt 32KB

img2.txt 32KB

mmio.h 16KB

spgemm-cpu 594KB

mmio.c 17KB

IA-SPGEMM-GPU_release

spgemm-gpu 5.56MB

main.cu 15KB

P100_weights.h5 214KB

2.jpg 580KB

detail

format.h 2KB

ell

common_ell.h 6KB

dia

common_dia.h 6KB

ell_dev

common_ell_dev.h 11KB

dia_dev

common_dia_dev.h 8KB

dense

common_dense.h 1KB

csr

common_csr.h 6KB

csr_dev

common_csr_dev.h 8KB

coo

common_coo.h 5KB

cusp

common_cusp.h 830B

common.h 2KB

coo_dev

common_coo_dev.h 23KB

utime.h 936B

cusparse

common_cusparse.h 5KB

utils.h 381B

MatNet.py 3KB

Makefile 977B

Inputs

sample.mtx 4KB

b1_ss.mtx 663B

relat3.mtx 1KB

Trec5.mtx 2KB

small.mtx 611B

ch3-3-b2.mtx 1KB

dia.mtx 607B

NetWeights

P100_weights.h5 214KB

__pycache__

MatNet.cpython-35.pyc 3KB

imgs

img1.txt 32KB

img2.txt 32KB

mmio.h 16KB

mmio.c 17KB

NetWeights

P100_weights.h5 214KB

Amd_weights.h5 214KB

Intel_weights.h5 214KB

README.md 2KB

# IA-SPGEMM

IA-SPGEMM is an input auto-tuning sparse general matrix-matrix multiplication (SpGEMM) system for multicore and manycore architectures. Currently supported components include:

- SpGEMM algorithms for the COO, DIA and ELL sparse storage formats
- Feature extraction and density representation
- MatNet (a mix of CNN and BP)

By default, all tests compute the square of the input matrix A. The tool extracts all of the features and the density representation and uses them as MatNet inputs. It is easy to use and provides a unified interface.

## Getting Started

In the IA-SPGEMM system, the goal is to search for the format and algorithm that minimize computing overhead. Setting up IA-SPGEMM is easy.

(1) Run the SpGEMM code on a CPU with auto-tuning in double precision

```bash
cd ./IA-SPGEMM-CPU_release;
make;
./spgemm-cpu Inputs/dia.mtx;
```

(2) Run the SpGEMM code on a GPU with auto-tuning in double precision

```bash
cd ./IA-SPGEMM-GPU_release;
make;
./spgemm-gpu Inputs/dia.mtx;
```

**Intel & AMD CPU example**

<img src="https://github.com/AnonymousPPOPP2019/IA-SPGEMM/blob/master/IA-SPGEMM-CPU_release/1.jpg"/>

**NVIDIA GPU example**

<img src="https://github.com/AnonymousPPOPP2019/IA-SPGEMM/blob/master/IA-SPGEMM-GPU_release/2.jpg"/>

## Requirement

- Intel MKL 2018
- CUSP v0.5.1
- cuSPARSE v8.0
- Python 3.6.2
- tensorflow 1.4.0
- keras 2.1.0

## MatNet

Details of the neural network: the weights are in IA-SPGEMM-CPU_release/NetWeights and IA-SPGEMM-GPU_release/NetWeights. The MatNet structure is shown below:

<img src="https://github.com/AnonymousPPOPP2019/IA-SPGEMM/blob/master/model.png"/>
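
The README states that the tests square the input matrix, which is read from a `.mtx` (Matrix Market) file. As a rough illustration of that computation only, the sketch below parses a small coordinate-format Matrix Market string and forms A × A with a row-wise (Gustavson-style) product. It is not the repository's C++/CUDA code and does not use its COO, DIA or ELL kernels; the helper names `parse_matrix_market` and `spgemm`, the dict-per-row layout, and the 3×3 sample matrix are assumptions made for this example.

```python
def parse_matrix_market(text):
    """Parse a coordinate-format Matrix Market string.

    Returns (n_rows, n_cols, rows) where rows[i] is a dict {col: value}.
    Only real/integer/pattern fields with general or symmetric symmetry
    are handled in this sketch.
    """
    lines = text.strip().splitlines()
    header = lines[0].lower().split()
    if header[:3] != ["%%matrixmarket", "matrix", "coordinate"]:
        raise ValueError("only coordinate-format matrices are supported")
    field, symmetry = header[3], header[4]
    if field not in ("real", "integer", "pattern"):
        raise ValueError("unsupported field: " + field)
    if symmetry not in ("general", "symmetric"):
        raise ValueError("unsupported symmetry: " + symmetry)
    # Drop comment lines (starting with '%') and blank lines.
    body = [ln for ln in lines[1:] if ln.strip() and not ln.lstrip().startswith("%")]
    n_rows, n_cols, nnz = (int(tok) for tok in body[0].split())
    rows = [dict() for _ in range(n_rows)]
    for ln in body[1:1 + nnz]:
        tok = ln.split()
        i, j = int(tok[0]) - 1, int(tok[1]) - 1  # file indices are 1-based
        v = 1.0 if field == "pattern" else float(tok[2])
        rows[i][j] = rows[i].get(j, 0.0) + v
        if symmetry == "symmetric" and i != j:
            rows[j][i] = rows[j].get(i, 0.0) + v  # mirror the stored triangle
    return n_rows, n_cols, rows


def spgemm(a_rows, b_rows):
    """Row-wise sparse product C = A * B on dict-per-row matrices."""
    c_rows = []
    for a_row in a_rows:
        acc = {}  # sparse accumulator for one output row
        for k, a_ik in a_row.items():
            for j, b_kj in b_rows[k].items():
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        c_rows.append(acc)
    return c_rows


EXAMPLE = """%%MatrixMarket matrix coordinate real general
% hypothetical 3x3 sample, not one of the files in Inputs/
3 3 4
1 1 2.0
1 3 1.0
2 2 3.0
3 1 4.0
"""

n_rows, n_cols, a = parse_matrix_market(EXAMPLE)
a_squared = spgemm(a, a)
for i, row in enumerate(a_squared):
    print(i, sorted(row.items()))
```

Each output row is built from only the rows of B that the corresponding row of A actually references, so the work scales with the nonzeros involved instead of with the full matrix dimensions.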
