IISWC-2017

Oct 1-3, 2017

Seattle, Washington, USA


 

Program

 

 

 

Day 1, Oct 1st

8:00-8:30 Breakfast
8:45-12:00 Sunday Morning - Tutorial 1
dist-gem5: Modeling and Simulating a Distributed Computer System Using Multiple Simulation Hosts
10:15-10:45 Coffee Break
12:00-13:00 Lunch
13:00-17:00 Sunday Afternoon
Tutorial 2:
Deep Learning Acceleration on Mobile Platforms
14:45-15:15 Coffee Break

 

Day 2, Oct 2nd

8:00-8:45 Breakfast
8:45-9:00 Opening & Welcome
9:00-10:00 Keynote Address I by Jason Cong, UCLA
10:00-10:15 Coffee Break
10:15-12:15 Session 1: Datacenters and HPC
12:15-13:30 Lunch
13:30-15:00 Session 2: Memory Systems I
15:00-15:15 Coffee Break
15:15-16:45 Session 3: I/O, Storage and VMs
16:45-17:45 Poster Session

 

Day 3, Oct 3rd

8:00-8:30 Breakfast
8:30-9:30 Keynote Address II by Derek Chiou, Microsoft
9:30-10:30 Session 4: Tail Latency
10:30-10:45 Coffee Break
10:45-12:15 Session 5: Memory Systems II
12:15-13:30 Lunch
13:30-15:30 Session 6: Mobile Systems and GPUs
15:30-15:45 Coffee Break
15:45-17:45 Session 7: Benchmarks and Soft Errors

 

Program Details

 

Day 1, Oct 1st

8:00-8:30 Breakfast
8:30-12:00 Sunday Morning - Tutorial 1
dist-gem5: Modeling and Simulating a Distributed Computer System Using Multiple Simulation Hosts
Organizers: Nam Sung Kim and Mohammad Alian, University of Illinois, Urbana-Champaign
Abstract: Single-thread processor performance has improved only sluggishly over the past decade as Dennard scaling approaches its fundamental physical limit. Thus, the importance of efficiently running applications on a parallel/distributed computer system has continued to increase, and diverse applications based on parallel/distributed computing models such as MapReduce and MPI have thrived.

In a parallel/distributed computing system, the complex interplay among processor, node, and network architectures strongly affects performance and power efficiency. In particular, we observe that all the hardware and software aspects of the network, which encompass interface technology, switch/router capability, link bandwidth, topology, traffic patterns, and protocols, significantly impact processor and node activities. Therefore, to maximize performance and power efficiency, it is critical to develop optimization strategies that cut across processor, node, and network architectures, as well as their software stacks, which necessitates full-system simulation. However, our community lacks a proper research infrastructure to study the interplay of these subsystems. To address this challenge, we have released a gem5-based simulation infrastructure dubbed dist-gem5 to support full-system simulation of a parallel/distributed computer system using multiple simulation hosts. This tutorial will cover an introduction to dist-gem5, including relevant background knowledge.
Room: TBA
10:15-10:45 Coffee Break
12:00-13:00 Lunch
Room: TBD
13:00-17:00 Sunday Afternoon
Tutorial 2:
Deep Learning Acceleration on Mobile Platforms
Organizer: Yiran Chen, Duke University
Abstract: Although Deep Neural Networks (DNNs) are ubiquitously utilized in many applications, it is generally difficult to deploy DNNs on resource-constrained devices, e.g., mobile platforms. In practical use, both the testing (inference) phase and the sophisticated training (learning) phase are required, calling for efficient testing and training methods with higher accuracy and shorter convergence time. In this tutorial, we first introduce DNNs from a historical perspective, and then present some representative techniques to reduce the computation cost of DNNs, including network pruning, model compression, and low-precision design. In the last part of the tutorial, we will show some examples of performing and optimizing the training and testing of DNNs on distributed mobile systems.
Room: TBA
14:45-15:15 Coffee Break

 

Day 2, Oct 2nd

8:00-8:45 Breakfast
8:45-9:00 Opening & Welcome
9:00-10:00 Keynote Address I: Characterization and Acceleration for Genomic Sequencing and Analysis
Jason Cong
Distinguished Chancellor's Professor, UCLA Computer Science Department
Director, Center for Customizable Domain-Specific Computing
10:00-10:15 Coffee Break
10:15-12:15 Session 1: Datacenters and HPC
Session Chair: Amro Awad
MeNa: A Memory Navigator for Modern Hardware in Scale-out Environment.
Hosein Mohammadi Makrani, Houman Homayoun (George Mason University)

Evaluating Energy Storage for a Multitude of Uses in the Datacenter.
Narayanan (Penn State), Di Wang (Microsoft), Abdullah-al Mamun, Anand Sivasubramaniam, Hosam Fathy (Penn State), Sean James (Microsoft)

Co-Locating and Concurrent Fine-Tuning MapReduce Applications on Microservers for Energy Efficiency
Maria Malik, Houman Homayoun (George Mason University)

AutoMatch: An Automated Framework for Relative Performance Estimation and Workload Distribution on Heterogeneous HPC Systems
Ahmed E. Helal, Wu-chun Feng, Changhee Jung, Yasser Y. Hanafy (Virginia Tech)

12:15-13:30 Lunch
13:30-15:00 Session 2: Memory Systems I
Session Chair: Eric Chung
Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference.
Shin-Ying Lee, Carole-Jean Wu (Arizona State University)

A Graphics Tracing Framework for Exploring CPU+GPU Memory Systems.
Andreas Sembrant, Trevor E. Carlson, Erik Hagersten, David Black-Schaffer (Uppsala University)

Demystifying the Characteristics of 3D-Stacked Memories: A Case Study for Hybrid Memory Cube
Ramyad Hadidi, Bahar Asgari, Burhan Ahmad Mudassar, Saibal Mukhopadhyay, Sudhakar Yalamanchili, Hyesoon Kim (Georgia Tech)

15:00-15:15 Coffee Break
15:15-16:45 Session 3: I/O, Storage and VMs
Session Chair: Jieming Yin
Understanding System Characteristics of Online Erasure Coding on Scalable, Distributed and Large-Scale SSD Array Systems.
Sungjoon Koh, Jie Zhang, Miryeong Kwon, Jungyeon Yoon (SK Telecom), David Donofrio (Lawrence Berkeley National Laboratory), Nam Sung Kim (UIUC), Myoungsoo Jung (Yonsei University)

TraceTracker: Hardware/Software Co-Evaluation for Large-Scale I/O Workload Reconstruction.
Miryeong Kwon, Jie Zhang, Gyuyoung Park, Wonil Choi (Penn State), David Donofrio, John Shalf (Lawrence Berkeley Lab), Mahmut Kandemir (Penn State), Myoungsoo Jung (Yonsei University)

Cross-Layer Workload Characterization of Meta-Tracing JIT VMs.
Berkin Ilbeyi (Cornell University), Carl Friedrich Bolz-Tereick (Heinrich-Heine-Universität Düsseldorf), Christopher Batten (Cornell University)

16:45-17:45 Poster Session
Analyzing Graphics Workloads on Tile-based GPUs.
German Ceballos, Andreas Sembrant, Trevor E. Carlson, David Black-Schaffer (Uppsala University)

Understanding Power-performance Relationship of Energy-efficient Modern DRAM Devices for Chip Multiprocessor Workloads.
Sukhan Lee, Yuhwan Ro (Seoul National University), Young Hoon Son, Hyunyoon Cho (Samsung Electronics), Nam Sung Kim (UIUC), Jung Ho Ahn (Seoul National University)

Memory Requirements of Hadoop, Spark, and MPI Based Big Data Applications on Commodity Server Class Architecture.
Hosein Mohammadi Makrani, Houman Homayoun (George Mason University)

Fine-Grained Energy Profiling for Deep Convolutional Neural Network on the Jetson TX1.
Crefeda Faviola Rodrigues, Graham Riley, Mikel Lujan (The University of Manchester)

Approximeter: Automatically Finding and Quantifying Code Sections for Approximation.
Riad Akram, Abdullah Muzahid (University of Texas at San Antonio)

Determining Work Partitioning on Closely Coupled Heterogeneous Computing Systems Using Statistical Design of Experiments.
Yectli A. Huerta, Brent Swartz, David J. Lilja (University of Minnesota)

A Framework for Fast and Fair Evaluation of Automata Processing Hardware.
Xiaodong Yu, Kaixi Hou, Hao Wang, Wu-chun Feng (Virginia Tech)

Understanding the Thermal Challenges of High-Performance Mobile Devices with a Detailed Platform Temperature Model.
Ying-Ju Yu, Carole-Jean Wu (Arizona State University)

 

Day 3, Oct 3rd

8:00-8:30 Breakfast
8:30-9:30 Keynote Address II: The Microsoft Catapult Project
Derek Chiou, Microsoft
9:30-10:30 Session 4: Tail Latency
Session Chair: Changhee Jung
Workload Characterization of Interactive Cloud Services on Big and Small Server Platforms.
Shuang Chen, Shay Galon, Christina Delimitrou (Cornell University), Srilatha Manne (Cavium Inc.), José F. Martínez (Cornell University)

Why Do Programs Have Heavy Tails?
Hiroshi Sasaki, Fang-Hsiang Su (Columbia University), Teruo Tanimoto (Kyushu University), Simha Sethumadhavan (Columbia University)

10:30-10:45 Coffee Break
10:45-12:15 Session 5: Memory Systems II
Session Chair: Andrew Putnam
Congestion-Aware Memory Management on NUMA Platforms: A VMware ESXi Case Study.
Jagadish Kotra (Penn State), Seongbeom Kim (Google), Kamesh Madduri, Mahmut T. Kandemir (Penn State)

Work as a Team or Individual: Characterizing System-level Impacts of Main Memory Partitioning.
Jung Ho Ahn, Daejin Jung, Eojin Lee, Jongwook Chung, Sukhan Lee (Seoul National University), Sheng Li (Intel)

Exploring the Impact of Memory Block Permutation on Performance of a Crossbar ReRAM Main Memory.
Morteza Ramezani, Nima Elyasi (Penn State), Mohammad Arjomand (Georgia Tech), Mahmut T. Kandemir, Anand Sivasubramaniam (Penn State)

12:15-13:30 Lunch
13:30-15:30 Session 6: Mobile Systems and GPUs
Session Chair: Jieming Yin
Exploring Computation-Communication Tradeoffs in Camera Systems.
Amrita Mazumdar, Thierry Moreau, Sung Kim, Armin Alaghi, Luis Ceze, Mark Oskin, Visvesh Sathe (University of Washington)

Characterizing Diverse Handheld Apps for Customized Hardware Acceleration.
Prasanna Venkatesh Rengasamy, Haibo Zhang, Nachiappan Chidhambaram Nachiappan, Shulin Zhao, Anand Sivasubramaniam, Mahmut Kandemir, Chita R. Das (Penn State)

Moka: Model-based Concurrent Kernel Analysis.
Leiming Yu, Xun Gong, Yifan Sun, Qianqian Fang (Northeastern University), Norm Rubin (NVIDIA Research), David Kaeli (Northeastern University)

Understanding the Performance-Accuracy Tradeoffs of Floating-Point Arithmetic on GPUs.
Sruthikesh Surineni, Huyen Nguyen (University of Missouri), Ruidong Gu, Michela Becchi (North Carolina State University)

15:30-15:45 Coffee Break
15:45-17:45 Session 7: Benchmarks and Soft Errors
Session Chair: Michael Papamichael
LORE: A Loop Repository for the Evaluation of Compilers.
Zhi Chen (UC Irvine), Zhangxiaowen Gong, Justin Josef Szaday (UIUC), David C. Wong (Intel), David Padua (UIUC), Alexandru Nicolau, Alexander V. Veidenbaum, Neftali Watkinson (UC Irvine), Zehra Sura (IBM Research), Saeed Maleki (Microsoft Research), Josep Torrellas, Gerald DeJong (UIUC)

FLiT: Cross-Platform Floating-Point Result-Consistency Tester and Workload.
Geof Sawaya, Michael Bentley, Ian Briggs, Ganesh Gopalakrishnan (University of Utah), Dong H. Ahn (Lawrence Livermore National Laboratory)

HeteroSync: A Benchmark Suite for Fine-Grained Synchronization on Tightly Coupled GPUs.
Matthew D. Sinclair, Johnathan Alsop, Sarita V. Adve (UIUC)

Characterizing the Impact of Soft Errors Across Microarchitectural Structures and Implications for Predictability.
Bagus Wibowo, Abhinav Agrawal, James Tuck (North Carolina State University)