IISWC-2011 November 6-8, 2011 Austin, TX, USA
While Moore's Law has continued to provide smaller semiconductor
devices, the effective end of uniprocessor performance scaling has
(finally) pushed mainstream computing to adopt parallel hardware
and software. Derived from high-performance
programmable graphics architectures, modern GPUs have emerged as the
world's most successful parallel architecture. Today, a single GPU
delivers a peak performance of over 650 GFlops and a memory bandwidth
of 175 GBytes/second. The combination of high compute density and energy
efficiency (GFlops/Watt) has led the world's fastest
supercomputers to employ GPUs, including three of the top five on the
June 2011 Top 500 list. This presentation will first describe the
fundamentals of contemporary GPU architectures and the
high-performance systems that are built around them. I will then
highlight three substantial challenges that face the design of future
parallel computing systems on the road to Exascale: (1) the power
wall, (2) the bandwidth wall, and (3) the programming wall. Finally, I
will describe NVIDIA's Echelon research project, which is developing
architectures and programming systems that aim to address these
challenges and drive continued performance scaling of parallel
computing from embedded systems to supercomputers.
Steve Keckler joined NVIDIA in December 2009, where he serves as
Director of Architecture Research. He is also Professor of both
Computer Science and Electrical and Computer Engineering at the
University of Texas at Austin, where he has served on the faculty
since 1998. His research team at UT-Austin developed scalable parallel
processor and memory system architectures, including non-uniform cache
architectures; explicit data graph execution processors, which merge
dataflow execution with sequential memory semantics; and
micro-interconnection networks to implement distributed processor
protocols. All of these technologies were demonstrated in the TRIPS
experimental computer system. Keckler was previously at the
Massachusetts Institute of Technology from 1990 to 1998, where he led
the development of the M-Machine experimental parallel computer
system. He is a Fellow of the IEEE, an Alfred P. Sloan Research Fellow,
and a recipient of the NSF CAREER Award, the ACM Grace Murray Hopper
Award, the President's Associates Teaching Excellence Award at
UT-Austin, and the Edith and Peter O'Donnell Award for Engineering. He
earned a BS in Electrical Engineering from Stanford University and an
MS and a Ph.D. in Computer Science from the Massachusetts Institute of
Technology.