Lecture 9
CSE 260 – Parallel Computation (Fall 2015)
Scott B. Baden
Performance modeling
Further improvements to matrix multiplication
Today’s lecture
• Performance modeling
• An improved matrix multiply
Scott B. Baden / CSE 260, UCSD / Fall '15 3
Performance modeling
• Given N, the application flop rate, and the peak rates of the hardware
  - Determine if the app is compute bound or communication bound
  - Predict the performance of the unblocked algorithm and account for the discrepancy with observation
• The naïve algorithm
  - N^3 multiply-adds
  - Without tiling, the algorithm loads N^3 double-precision words @ 8 bytes/word (we ignore C)
• The hardware
  - One GPU of the K80 can perform 832 MADs/cycle and transfer 240 GB/sec
  - Processor clock runs at 823.5 MHz
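The model above can be worked through numerically. The following is a minimal sketch, not the lecture's own code: the function name `naive_model` and the returned keys are hypothetical, and it assumes computation and memory traffic overlap perfectly, so the predicted run time is the larger of the two times. The hardware constants and the N^3 word count are the ones quoted on the slide.

```python
# Simple bound estimate for the naive (untiled) matrix multiply.
# Hardware constants are the slide's figures for one GPU of the K80.
MADS_PER_CYCLE = 832      # double-precision multiply-adds per cycle
CLOCK_HZ = 823.5e6        # processor clock
BANDWIDTH_BPS = 240e9     # device memory bandwidth in bytes/second
BYTES_PER_WORD = 8        # one double-precision word


def naive_model(n):
    """Estimate compute and memory time for an n x n naive multiply.

    Assumption (not from the slide): compute and memory transfers overlap
    perfectly, so the run time is max(t_compute, t_memory).
    """
    mads = n ** 3                          # multiply-adds performed
    bytes_moved = mads * BYTES_PER_WORD    # slide's count: N^3 words, C ignored
    t_compute = mads / (MADS_PER_CYCLE * CLOCK_HZ)
    t_memory = bytes_moved / BANDWIDTH_BPS
    bound = "communication" if t_memory > t_compute else "compute"
    t_total = max(t_compute, t_memory)
    return {
        "t_compute": t_compute,
        "t_memory": t_memory,
        "bound": bound,
        "gflops": 2 * mads / t_total / 1e9,  # 1 multiply-add = 2 flops
    }


if __name__ == "__main__":
    result = naive_model(1024)
    print(result["bound"], round(result["gflops"], 1))
```

Under these assumptions the memory time is roughly 22.8 times the compute time for any N, so the untiled algorithm is communication bound and the model predicts about 60 GFLOPS (240 GB/s divided by 8 bytes per multiply-add, times 2 flops).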
Tesla Kepler K80/K20m (GK210/GK110)
• Sorken has device capability 3.7, Stampede has 3.5 (Stampede's values are given in parentheses)
  - 11¼ (5) GB device memory (frame buffer) @ 240 (208) GB/s
  - 1.5 MB (1.25 MB) L2 cache, shared by all SMXs
  - 13 SMXs (2496 cores) on Sorken and Stampede
• Sorken's K80 (GK210 GPU) has more registers and larger shared memory per device than Stampede's K20m (GK110 GPU)
  - 192 SP cores, 64 DP cores, 32 SFUs, 32 load/store units
  - Each scalar core: fused multiply-adder, truncates intermediate result
  - 112 KB (64 KB) on-chip memory configurable as scratchpad memory + L1 cache
  - 128K (64K) × 32-bit registers, up to 255 per thread
  - 1 FMA/cycle = 2 flops/cycle/DP core × 64 DP cores/SMX × 13 SMXs = 1664 flops/cycle @ 823.5 MHz (705.5 MHz) = 1.37 TFLOPS per GPU, or 2.74 TFLOPS for the two GPUs of a K80 (1.17 TFLOPS)
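The peak-rate arithmetic in the last bullet can be checked with a few lines of Python. This is an illustrative sketch: the helper `peak_dp_flops` is a hypothetical name, and the inputs are the slide's figures (64 DP cores per SMX, 13 SMXs, one FMA per core per cycle counted as 2 flops).

```python
def peak_dp_flops(dp_cores_per_smx, smx_count, clock_hz, flops_per_fma=2):
    """Peak double-precision flop rate, assuming one FMA per DP core per cycle."""
    flops_per_cycle = flops_per_fma * dp_cores_per_smx * smx_count
    return flops_per_cycle * clock_hz


# One GK210 GPU of the K80 (Sorken) and one K20m (Stampede).
k80_gpu = peak_dp_flops(64, 13, 823.5e6)
k20m_gpu = peak_dp_flops(64, 13, 705.5e6)

if __name__ == "__main__":
    print(f"K80, one GPU : {k80_gpu / 1e12:.2f} TFLOPS")
    print(f"K80, two GPUs: {2 * k80_gpu / 1e12:.2f} TFLOPS")
    print(f"K20m         : {k20m_gpu / 1e12:.2f} TFLOPS")
```

With these inputs, 1664 flops/cycle at 823.5 MHz gives about 1.37 TFLOPS for one K80 GPU; the 2.74 TFLOPS figure corresponds to the two GPUs that make up a K80. The K20m value comes out to about 1.17 TFLOPS.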