IISWC-2011

November 6-8, 2011

Austin, TX, USA

KEYNOTE: GPU Computing and the Road to Extreme-Scale Parallel Systems

ABSTRACT
While Moore's Law has continued to provide smaller semiconductor devices, the effective end of uniprocessor performance scaling has (finally) pushed mainstream computing to adopt parallel hardware and software. Derived from high-performance programmable graphics architectures, modern GPUs have emerged as the world's most successful parallel architecture. Today, a single GPU has a peak performance of over 650 GFlops and 175 GBytes/second of memory bandwidth. This combination of high compute density and energy efficiency (GFlops/Watt) has led the world's fastest supercomputers to employ GPUs, including 3 of the top 5 on the June 2011 Top 500 list. This presentation will first describe the fundamentals of contemporary GPU architectures and the high-performance systems built around them. I will then highlight three substantial challenges facing the design of future parallel computing systems on the road to Exascale: (1) the power wall, (2) the bandwidth wall, and (3) the programming wall. Finally, I will describe NVIDIA's Echelon research project, which is developing architectures and programming systems that aim to address these challenges and drive continued performance scaling of parallel computing from embedded systems to supercomputers.

BIO SKETCH
Steve Keckler joined NVIDIA in December 2009, where he serves as Director of Architecture Research. He is also Professor of both Computer Science and Electrical and Computer Engineering at the University of Texas at Austin, where he has served on the faculty since 1998. His research team at UT-Austin developed scalable parallel processor and memory system architectures, including non-uniform cache architectures; explicit data graph execution processors, which merge dataflow execution with sequential memory semantics; and micro-interconnection networks to implement distributed processor protocols. All of these technologies were demonstrated in the TRIPS experimental computer system. Keckler was previously at the Massachusetts Institute of Technology from 1990 to 1998, where he led the development of the M-Machine experimental parallel computer system. He is a Fellow of the IEEE, an Alfred P. Sloan Research Fellow, and a recipient of the NSF CAREER Award, the ACM Grace Murray Hopper Award, the President's Associates Teaching Excellence Award at UT-Austin, and the Edith and Peter O'Donnell Award for Engineering. He earned a BS in Electrical Engineering from Stanford University and an MS and a Ph.D. in Computer Science from the Massachusetts Institute of Technology.