OpenCL Programming Guide
By Aaftab Munshi, Benedict R. Gaster, Timothy G. Mattson, James Fung, Dan Ginsburg
Addison-Wesley Professional; 1st edition (July 25, 2011)
- Understanding OpenCL’s core models, concepts, terminology, goals, and rationale
- Discovering and preparing available resources
- Programming with OpenCL C, its built-in functions, and runtime API
- Using buffers, sub-buffers, images, samplers, and events
- Sharing and synchronizing data with OpenGL and Microsoft’s Direct3D
- Simplifying development with the C++ Wrapper API
- Using OpenCL Embedded Profiles to support devices ranging from cellphones to supercomputer nodes
- Building complete applications: image histograms, edge detection filters, physics simulations, Fast Fourier Transforms, optical flow, and more
- Using OpenCL with PyOpenCL, including issues in porting from C++ to Python
- Performing matrix multiplication and high-performance sparse matrix multiplication
Using the new OpenCL (Open Computing Language) standard and framework, you can write applications that access all available programming resources: CPUs, GPUs, accelerators, and even external processors. Already implemented by Apple, ATI, NVIDIA, and other leaders, OpenCL has outstanding potential for PCs, servers, handheld/embedded devices, high-performance computing, and even cloud systems. This is the first comprehensive, authoritative, and practical guide to OpenCL 1.1 specifically for working developers and software architects.
Authored by five leading OpenCL authorities, OpenCL Programming Guide covers the entire specification. It reviews key use cases, shows how OpenCL can express a wide range of parallel algorithms, and offers complete reference material on both the API and OpenCL C programming language.
Through complete case studies and downloadable code examples, the authors show how to write complex parallel programs that decompose workloads across many different devices. They also present all the essentials of OpenCL software performance optimization, including probing and adapting to hardware.
The book contains a wide selection of code samples, ranging from API usage and OpenCL C development to full sample applications, covering the OpenCL 1.1 specification. To support open development of these samples and allow additions over time, they are hosted as a Google Code project: opencl-book-samples.
Please email errors and other comments to firstname.lastname@example.org and enter errata into the “Issues” list of the Google Code project: opencl-book-samples.
Software engineers, programmers, hardware engineers, students / advanced students
Table of Contents
- Introduction to OpenCL
- HelloWorld: An OpenCL Example
- Platforms, Contexts, and Devices
- Programming with OpenCL C
- OpenCL C Built-in Functions
- Programs and Kernels
- Buffers and Sub-Buffers
- Images and Samplers
- Interoperability with OpenGL
- Interoperability with Direct3D
- C++ Wrapper API
- OpenCL Embedded Profile
- Image Histogram
- Sobel Edge Detection Filter
- Parallelizing Dijkstra’s Single Source Shortest Path Graph Algorithm
- Cloth Simulation in the Bullet Physics SDK
- Simulating the Ocean with Fast Fourier Transform
- Optical Flow
- Using OpenCL with PyOpenCL
- Matrix Multiplication with OpenCL
- Sparse Matrix-Vector Multiplication
- Summary of OpenCL 1.1
Aaftab Munshi is the spec editor for the OpenGL ES 1.1, OpenGL ES 2.0, and OpenCL specifications and co-author of the book OpenGL ES 2.0 Programming Guide (with Dan Ginsburg and Dave Shreiner, Addison-Wesley). He currently works at Apple.
Benedict R. Gaster is a software architect working on programming models for next-generation heterogeneous processors, in particular looking at high-level abstractions for parallel programming on the emerging class of processors that contain both CPUs and accelerators such as GPUs. Benedict has contributed extensively to OpenCL’s design and has represented AMD at the Khronos Group open standard consortium. Benedict has a Ph.D. in computer science for his work on type systems for extensible records and variants. He has been working at AMD since 2008.
Tim Mattson is an old-fashioned parallel programmer, starting in the mid-80’s with the Caltech Cosmic Cube and continuing up to the present. Along the way, he has worked with most classes of parallel computer (vector supercomputers, SMP, VLIW, NUMA, MPP, clusters, and many-core processors). Tim has published extensively, including the books Patterns for Parallel Programming (with Beverly Sanders and Berna Massingill, Addison-Wesley, 2004) and An Introduction to Concurrency in Programming Languages (with Matthew J. Sottile and Craig E. Rasmussen, CRC Press, 2009). Tim has a Ph.D. in chemistry for his work on molecular scattering theory. He has been working at Intel since 1993.
Dan Ginsburg currently works at Children’s Hospital Boston as a Principal Software Architect in the Fetal-Neonatal Neuroimaging and Development Science Center, where he uses OpenCL for accelerating neuroimaging algorithms. Previously, he worked for Still River Systems developing GPU-accelerated image registration software for the Monarch 250 proton beam radiotherapy system. Dan was also Senior Member of Technical Staff at AMD, where he worked for over eight years in a variety of roles, including the development of OpenGL drivers, the creation of desktop and handheld 3D demos, and leading the development of handheld GPU developer tools. Dan holds a B.S. in computer science from Worcester Polytechnic Institute and an M.B.A. from Bentley University.
James Fung has been developing computer vision on the GPU as it progressed from graphics to general purpose computation. James has a Ph.D. in Electrical and Computer Engineering from the University of Toronto and numerous IEEE and ACM publications in the areas of parallel GPU Computer Vision and Mediated Reality. He is currently a Developer Technology Engineer at NVIDIA, where he examines computer vision and image processing on graphics hardware.