AMD Accelerated Parallel Processing
OpenCL™ Programming Guide
rev2.4
December 2012
© 2012 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo,
AMD Accelerated Parallel Processing, the AMD Accelerated Parallel Processing logo, ATI,
the ATI logo, Radeon, FireStream, FirePro, Catalyst, and combinations thereof are
trademarks of Advanced Micro Devices, Inc. Microsoft, Visual Studio, Windows, and Windows
Vista are registered trademarks of Microsoft Corporation in the U.S. and/or other
jurisdictions. Other names are for informational purposes only and may be trademarks of their
respective owners. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by
permission by Khronos.
The contents of this document are provided in connection with Advanced Micro Devices,
Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the
accuracy or completeness of the contents of this publication and reserves the right to
make changes to specifications and product descriptions at any time without notice. The
information contained herein may be of a preliminary or advance nature and is subject to
change without notice. No license, whether express, implied, arising by estoppel or
otherwise, to any intellectual property rights is granted by this publication. Except as set forth
in AMD’s Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever,
and disclaims any express or implied warranty, relating to its products including, but not
limited to, the implied warranty of merchantability, fitness for a particular purpose, or
infringement of any intellectual property right.
AMD’s products are not designed, intended, authorized or warranted for use as
components in systems intended for surgical implant into the body, or in other applications
intended to support or sustain life, or in any other application in which the failure of AMD’s
product could create a situation where personal injury, death, or severe property or
environmental damage may occur. AMD reserves the right to discontinue or make changes to
its products at any time without notice.
Advanced Micro Devices, Inc.
One AMD Place
P.O. Box 3453
Sunnyvale, CA 94088-3453
www.amd.com
For AMD Accelerated Parallel Processing:
URL: developer.amd.com/appsdk
Developing: developer.amd.com/
Support: developer.amd.com/appsdksupport
Forum: developer.amd.com/openclforum
AMD ACCELERATED PARALLEL PROCESSING
Copyright © 2012 Advanced Micro Devices, Inc. All rights reserved.
Preface
About This Document
This document provides a basic description of the AMD Accelerated Parallel
Processing environment and components. It describes the basic architecture of
stream processors and provides useful performance tips. This document also
provides a guide for programmers who want to use AMD Accelerated Parallel
Processing to accelerate their applications.
Audience
This document is intended for programmers. It assumes prior experience in
writing code for CPUs and a basic understanding of threads (work-items). While
a basic understanding of GPU architectures is useful, this document does not
assume prior graphics knowledge. It further assumes an understanding of
chapters 1, 2, and 3 of the OpenCL Specification (for the latest version, see
http://www.khronos.org/registry/cl/).
Organization
This AMD Accelerated Parallel Processing document begins, in Chapter 1, with
an overview of the AMD Accelerated Parallel Processing programming models,
OpenCL, the AMD Compute Abstraction Layer (CAL), the AMD APP Kernel
Analyzer, and the AMD APP Profiler. Chapter 2 discusses compiling and
running OpenCL programs. Chapter 3 describes using the GNU debugger (GDB)
to debug OpenCL programs. Chapter 4 discusses general performance
and optimization considerations when programming for AMD Accelerated Parallel
Processing devices. Chapter 5 details performance and optimization
considerations specifically for Southern Islands devices. Chapter 6 details
performance and optimization considerations for Evergreen and Northern Islands
devices. Appendix A describes the supported optional OpenCL extensions.
Appendix B details the installable client driver (ICD) for OpenCL. Appendix C
details the compute kernel and contrasts it with a pixel shader. Appendix D lists
the device parameters. Appendix E describes the OpenCL binary image format
(BIF). Appendix F describes the OpenVideo Decode API. Appendix G describes
the interoperability between OpenCL and OpenGL. The book closes with a
glossary of acronyms and terms, followed by an index.
Conventions
The following conventions are used in this document.
mono-spaced font           A filename, file path, or code.
*                          Any number of alphanumeric characters in the name of a code
                           format, parameter, or instruction.
[1,2)                      A range that includes the left-most value (in this case, 1) but
                           excludes the right-most value (in this case, 2).
[1,2]                      A range that includes both the left-most and right-most values
                           (in this case, 1 and 2).
{x | y}                    One of the multiple options listed. In this case, x or y.
0.0f                       A single-precision (32-bit) floating-point value.
0.0                        A double-precision (64-bit) floating-point value.
1011b                      A binary value, in this example a 4-bit value.
7:4                        A bit range, from bit 7 to 4, inclusive. The high-order bit is
                           shown first.
italicized word or phrase  The first use of a term or concept basic to the understanding
                           of stream computing.
Related Documents
• The OpenCL Specification, Version 1.1, Published by Khronos OpenCL
Working Group, Aaftab Munshi (ed.), 2010.
• AMD, R600 Technology, R600 Instruction Set Architecture, Sunnyvale, CA,
est. pub. date 2007. This document includes the RV670 GPU instruction
details.
• ISO/IEC 9899:TC2 - International Standard - Programming Languages - C
• Kernighan, Brian W., and Ritchie, Dennis M., The C Programming Language,
Prentice-Hall, Inc., Upper Saddle River, NJ, 1978.
• I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P.
Hanrahan, “Brook for GPUs: stream computing on graphics hardware,” ACM
Trans. Graph., vol. 23, no. 3, pp. 777–786, 2004.
• AMD Compute Abstraction Layer (CAL) Intermediate Language (IL)
Reference Manual. Published by AMD.
• Buck, Ian; Foley, Tim; Horn, Daniel; Sugerman, Jeremy; Hanrahan, Pat;
Houston, Mike; Fatahalian, Kayvon. “BrookGPU”
http://graphics.stanford.edu/projects/brookgpu/
• Buck, Ian. “Brook Spec v0.2”. October 31, 2003.
http://merrimac.stanford.edu/brook/brookspec-05-20-03.pdf
• OpenGL Programming Guide, at http://www.glprogramming.com/red/
• Microsoft DirectX Reference Website, at http://msdn.microsoft.com/en-us/directx
• GPGPU: http://www.gpgpu.org, and Stanford BrookGPU discussion forum
http://www.gpgpu.org/forums/
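As a rough illustration of the notation conventions listed in this preface, the sketch below expresses the half-open range [1,2), the closed range [1,2], the binary value 1011b, and the bit range 7:4 in plain C. The helper names (in_half_open, in_closed, bit_range) and the constant EXAMPLE_1011B are hypothetical, chosen only for this example; they are not part of the guide or of OpenCL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* [lo,hi): includes the left-most value, excludes the right-most value. */
static bool in_half_open(double x, double lo, double hi) {
    return x >= lo && x < hi;
}

/* [lo,hi]: includes both the left-most and right-most values. */
static bool in_closed(double x, double lo, double hi) {
    return x >= lo && x <= hi;
}

/* Extract the inclusive bit range hi:lo (high-order bit first),
 * so the range written 7:4 corresponds to bit_range(value, 7, 4). */
static uint32_t bit_range(uint32_t value, unsigned hi, unsigned lo) {
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return (value >> lo) & mask;
}

/* 1011b written in hexadecimal, because C before C23 has no
 * standard binary literal syntax. */
static const uint32_t EXAMPLE_1011B = 0xBu;
```

For example, bit_range(0xB5, 7, 4) selects the upper four bits of the byte 10110101b and yields 1011b, while in_half_open(2.0, 1.0, 2.0) is false because 2 lies outside [1,2).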
Contact Information
URL: developer.amd.com/appsdk
Developing: developer.amd.com/
Support: developer.amd.com/appsdksupport
Forum: developer.amd.com/openclforum
REVISION HISTORY
Rev    Description
1.3e   Deleted encryption reference.
1.3f   Added basic guidelines to CL-GL Interop appendix.
       Corrected code in two samples in Chpt. 4.
1.3g   Numerous changes to CL-GL Interop appendix.
       Added subsections to Additional Performance Guidance on CPU Programmers
       Using OpenCL to Program CPUs and Using Special CPU Instructions in the
       Optimizing Kernel Code subsection.
2.0    Added ELF Header section in Appendix E.
2.1    New Profiler and KernelAnalyzer sections in chapter 4.
       New AMD gDEBugger section in chapter 3.
       Added extensions to Appendix A.
       Numerous updates throughout for Southern Islands, especially in Chapters 1
       and 5.
       Split original chapter 4 into three chapters. Now, chapter 4 is general
       considerations for Evergreen, Northern Islands, and Southern Islands;
       chapter 5 is specifically for Southern Islands devices; chapter 6 is for
       Evergreen and Northern Islands devices.
       Update of Device Parameters in Appendix D.
2.1a   Reinstated some supplemental compiler options in Section 2.1.4.
       Changes/additions to Table 4.3.
2.1b   Minor change in Section 1.8.3, indicating that LDS model has not changed
       from previous GPU families.
2.4    Addition of channel mapping information (chpt 5). Minor corrections
       throughout. Deletion of duplicate material from chpt 6. Inclusion of
       upgraded index. Minor rewording and corrections. Corrections in wording.
       Corrections to figure 1.8 for SI. Addition of AMD extensions. Memory object
       properties table delineated for VM enabled/disabled. Added table for
       comparison of CPU/GPU in AMD Trinity APU.