DO NOT DISTRIBUTE- COPYRIGHTED MATERIAL
Uncorrected proofs - for course adoption review only
Comp. by: ps0002Bla dhana lakshmi Stage: Proof ChapterID: 0001131926 Date:26/11/09
Time:17:04:50
Programming
Massively Parallel
Processors
A Hands-on Approach
B978-0-12-381472-2.00013-1, 00013
Kirk-Hwu, 978-0-12-381472-2
David Kirk and Wen-mei Hwu
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Morgan Kaufmann Publishers is an imprint of Elsevier
Morgan Kaufmann Publishers is an imprint of Elsevier.
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
This book is printed on acid-free paper.
© 2010 ELSEVIER Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, recording, or any information storage and retrieval system, without
permission in writing from the publisher. Details on how to seek permission, further information about
the Publisher’s permissions policies and our arrangements with organizations such as the Copyright
Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher
(other than as may be noted herein).
NVIDIA, the NVIDIA logo, CUDA, GeForce, Quadro, and Tesla are trademarks or registered
trademarks of NVIDIA Corporation in the U.S. and other countries.
OpenCL is a trademark of Apple Inc.
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience
broaden our understanding, changes in research methods, professional practices, or medical treatment
may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and
using any information, methods, compounds, or experiments described herein. In using such information
or methods they should be mindful of their own safety and the safety of others, including parties for
whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume
any liability for any injury and/or damage to persons or property as a matter of products liability,
negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas
contained in the material herein.
Library of Congress Cataloging-in-Publication Data
Application Submitted
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
ISBN: 978-0-12-381472-2
For information on all Morgan Kaufmann publications,
visit our Web site at www.mkp.com or www.elsevierdirect.com
Printed in United States of America
10 11 12 13 14 5 4 3 2 1
Contents
Preface
Acknowledgements
Dedication
CHAPTER 1 INTRODUCTION
1.1 GPUs as Parallel Computers
1.2 Architecture of a Modern GPU
1.3 Why More Speed or Parallelism?
1.4 Parallel Programming Languages and Models
1.5 Overarching Goals
1.6 Organization of the Book
CHAPTER 2 HISTORY OF GPU COMPUTING
2.1 Evolution of Graphics Pipelines
2.1.1 The Era of Fixed-Function Graphics Pipelines
2.1.2 Evolution of Programmable Real-Time Graphics
2.1.3 Unified Graphics and Computing Processors
2.1.4 GPGPU: An Intermediate Step
2.2 GPU Computing
2.2.1 Scalable GPUs
2.2.2 Recent Developments
2.3 Future Trends
CHAPTER 3 INTRODUCTION TO CUDA
3.1 Data Parallelism
3.2 CUDA Program Structure
3.3 A Matrix–Matrix Multiplication Example
3.4 Device Memories and Data Transfer
3.5 Kernel Functions and Threading
3.6 Summary
3.6.1 Function Declarations
3.6.2 Kernel Launch
3.6.3 Predefined Variables
3.6.4 Runtime API
CHAPTER 4 CUDA THREADS
4.1 CUDA Thread Organization
4.2 Using blockIdx and threadIdx
4.3 Synchronization and Transparent Scalability