SUPERCOMPUTING FRONTIERS 2017
MARCH 13 – 16, 2017
Matrix@Biopolis, Singapore
SCF17 Tutorials offer attendees a variety of short courses on key topics and technologies relevant to high performance computing, programming & novel architectures. These tutorials also provide the opportunity to interact with recognized leaders in the field and to learn about the latest technology trends, theory, and practical techniques.
Our tutorials are open to all conference attendees except those on 1-Day passes, but registration for the tutorials is required. For those of you who are interested only in the tutorials and not the conference from 14 – 16 March 2017, we have introduced a special tutorials-only fee of SG$150. Please check the details on our registration page.
The rooms assigned for the tutorials will be announced on the morning of March 13. Please proceed to Level 4 of the Matrix to sign in for your selected tutorials and to find out where your tutorial will be held.
Our registration desk will be open from 8am onwards.
Please note that pre-registration is essential to guarantee your seat for the tutorials. If you have not registered for your tutorials, entry will only be granted if there is still available space.
Time: 9.00am – 5.00pm
Venue: Level 4, Matrix Building, Biopolis
Breaks:
Morning Tea Break – 10.30am – 11.00am
Lunch Break – 1.00pm – 2.00pm
Afternoon Tea Break – 3.30pm – 4.00pm
Presenter(s):
Nicolas Walker, Senior Solutions Architect, NVIDIA SEA
Jeff Adie, HPC / Deep Learning Solutions Architect, NVIDIA APJ
Aik Beng Ng, Deep Learning Solutions Architect, NVIDIA APJ
You can verify your system is supported by going to this test site and verifying the “WebSockets (Port 80)” section has all green check marks.
Abstract:
Learn the latest techniques on how to design, train, and deploy neural network-powered machine learning in your applications. You’ll explore widely used open-source frameworks and NVIDIA’s latest GPU-accelerated deep learning platforms. You will learn:
Lab 1:
Getting Started with Deep Learning
Learn how to leverage deep neural networks (DNN) within the deep learning workflow to solve a real-world image classification problem using NVIDIA DIGITS. You’ll walk through the process of data preparation, model definition, model training and troubleshooting, validation testing and strategies for improving model performance using GPUs. On completion of this lab, you will be able to use NVIDIA DIGITS to train a DNN on your own image classification application.
Pre-Requisite:
Lab 3:
Neural Network Deployment
Abstract: Once a deep neural network (DNN) has been trained using GPU acceleration, it needs to be deployed into production. The step after training is called inference, as it uses a trained DNN to make predictions from new data.
In this lab we will show different approaches to deploying a trained DNN for inference. The first approach is to use the inference functionality directly within a deep learning framework, in this case DIGITS and Caffe. The second approach is to integrate inference within a custom application by using a deep learning framework API, again using Caffe but this time through its Python API. The final approach is to use NVIDIA TensorRT™, which will automatically create an optimized inference runtime from a trained Caffe model and network description file. You will learn about the role of batch size in inference performance, as well as various optimizations that can be made in the inference process. You’ll also explore inference for a variety of different DNN architectures trained in other DLI labs.
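The effect of batch size is easiest to see in a framework-free sketch. The function below is not the Caffe or TensorRT API, just an illustrative stand-in: it runs one trained fully connected layer over an entire batch in a single call, which is the pattern that lets a deployed runtime amortize per-invocation overhead and keep the processor's vector units fed:

```c
#include <stddef.h>

/* Apply one trained fully connected layer to a whole batch at once.
 * Batching amortizes fixed per-call costs over many samples, which is
 * why batch size matters so much for inference throughput. */
void dense_forward(const float *W,        /* out_dim x in_dim, row-major */
                   const float *bias,     /* out_dim */
                   const float *batch_in, /* batch x in_dim */
                   float *batch_out,      /* batch x out_dim */
                   size_t batch, size_t in_dim, size_t out_dim)
{
    for (size_t n = 0; n < batch; n++)
        for (size_t o = 0; o < out_dim; o++) {
            float acc = bias[o];
            for (size_t i = 0; i < in_dim; i++)
                acc += W[o * in_dim + i] * batch_in[n * in_dim + i];
            batch_out[n * out_dim + o] = acc;
        }
}
```

A runtime such as TensorRT additionally fuses layers and picks kernels tuned for the chosen batch size; this sketch only shows the batching idea itself.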
Pre-Requisite:
Time: 9.00am – 5.00pm
Venue: Level 4, Matrix Building, Biopolis
Breaks:
Morning Tea Break – 10.30am – 11.00am
Lunch Break – 1.00pm – 2.00pm
Afternoon Tea Break – 3.30pm – 4.00pm
Presenter(s):
Rama Kishan Malladi, Application Engineer, Intel Software Service Group
Jyotsna Khemka, Technical Consulting Engineer, Intel Software Service Group
Abstract:
Slot 1: 9:00 – 10:30
Intel® Architecture for Software Developers
Learn about the parallel architecture, technical advances, and features of the latest and future Intel processors, especially the Xeon and Xeon Phi families (most recently Knights Landing). Includes a brief introduction to the latest Intel® architecture and an overview of the Intel® Xeon Phi™ coprocessor architecture.
Slot 2: 11:00 – 1:00
Vectorization for Intel® C++ & Fortran Compiler
• Introduction to SIMD for Intel® Architecture
• Vector Code Generation
• Compiler & Vectorization
• Validating Vectorization Success
• Reasons for Vectorization Failures
• Vectorization of Special Program Constructs & Loops
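As a concrete example of the kind of loop this session deals with, here is a SAXPY kernel written so that an auto-vectorizer can prove it safe to vectorize: unit-stride accesses, a countable trip count, no cross-iteration dependences, and `restrict` to rule out aliasing between the arrays. Whether the Intel compiler actually vectorized it can then be checked with its optimization reports (e.g. `-qopt-report`):

```c
#include <stddef.h>

/* SAXPY: y = a*x + y.  The 'restrict' qualifiers promise the compiler
 * that x and y do not overlap, removing the aliasing hazard that is a
 * common reason vectorization fails. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Without `restrict` (or an equivalent pragma), the compiler must assume `y` may overlap `x` and will often generate a runtime overlap check or fall back to scalar code.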
Slot 3: 12:00 – 1:00
Understanding vectorization and how it impacts performance: Vectorization Advisor
• Features
• Workflow
• Understanding the results
• Demo
Slot 4: 2:00 – 3:30
Intel® Advisor Roofline
A Roofline chart is a visual representation of application performance in relation to hardware limitations, including memory bandwidth and computational peaks.
The Roofline provides insight into:
• Where your performance bottlenecks are
• How much performance is left on the table because of them
• Which bottlenecks are possible to address, and which ones are worth addressing
• Why these bottlenecks are most likely occurring
• What your next steps should be
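The roofline bound itself is a single min(): attainable performance is capped either by the machine's compute peak or by what the memory system can deliver at the kernel's arithmetic intensity (FLOPs per byte moved). A minimal sketch, with all hardware numbers hypothetical:

```c
/* Roofline model: attainable GFLOP/s is the lesser of the compute peak
 * and bandwidth x arithmetic intensity.  Kernels below the "ridge
 * point" are memory-bound; above it, compute-bound. */
double roofline_gflops(double peak_gflops, double bandwidth_gbs,
                       double arithmetic_intensity /* FLOPs per byte */)
{
    double memory_bound = bandwidth_gbs * arithmetic_intensity;
    return memory_bound < peak_gflops ? memory_bound : peak_gflops;
}
```

For example, a kernel doing 0.25 FLOPs per byte on a (hypothetical) 100 GB/s, 1000 GFLOP/s machine is capped at 25 GFLOP/s by memory, no matter how well its arithmetic is optimized; that gap is exactly what the Advisor Roofline chart makes visible.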
New Era for OpenMP*: Beyond Traditional Shared Memory Parallel Programming
OpenMP* has been the de facto standard for parallel programming on shared memory systems. OpenMP 4.5 added major enhancements to the OpenMP API to keep pace with the latest technological trends in HPC and beyond. The new features in OpenMP 4.5 significantly improve the expressiveness of OpenMP on modern architectures and increase its applicability to complex application codes. This presentation describes the evolution of OpenMP and provides an overview of the major new elements, including support for accelerators/coprocessors/GPUs, SIMD extensions, doacross loops, taskloop, task priority, and thread affinity.
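Two of the features listed above can be sketched directly in C. The pragmas below are standard OpenMP (the `simd` directive and the 4.5 `taskloop` construct); compiled without OpenMP support they are simply ignored, so the functions also run correctly serially:

```c
#include <stddef.h>

/* 'omp simd' asks the compiler to execute the loop with SIMD lanes. */
void scale_simd(float *a, size_t n, float s)
{
    #pragma omp simd
    for (size_t i = 0; i < n; i++)
        a[i] *= s;
}

/* 'omp taskloop' (new in OpenMP 4.5) carves the loop's iterations into
 * tasks; one thread creates the taskloop inside a parallel region and
 * the whole team executes the generated tasks. */
void add_taskloop(const long *a, const long *b, long *c, size_t n)
{
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

Build with an OpenMP-enabled compiler (e.g. `icc -qopenmp` or `gcc -fopenmp`) to get the parallel behavior; without those flags the code degrades gracefully to serial execution.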
Slot 5: 4:00 – 5:00
Intel’s Machine Learning
Intel is committed to AI and is making major investments across technology, training, resources and R&D to advance AI for business and society. In this session, we cover the Intel portfolio – software and hardware solutions for machine learning in the data center, from general purpose (Xeon, Xeon Phi) to targeted silicon (FPGA and Nervana technology). At the edge, Intel also offers a portfolio of processors (Core, Atom, Joule, etc.) that utilize common intelligent APIs for distributed and collaborative intelligence.
Time: 9.00am – 5.00pm
Venue: Level 4, Matrix Building, Biopolis
Breaks:
Morning Tea Break – 10.30am – 11.00am
Lunch Break – 1.00pm – 2.00pm
Afternoon Tea Break – 3.30pm – 4.00pm
Presenter(s):
Patrick Wohlschlegel, Senior Engineering Manager, Enterprise & HPC Tools, ARM
The tutorials will increase your understanding of a range of time-saving techniques associated with these specialist tools for debugging, application analysis and code optimization, covering:
– General development best practices: cut down the time spent resolving one-off problems arising during code development
– Proactive optimization of HPC applications: lessons from the “Allinea Performance Roadmap”
– Failsafe automation: testing as part of your daily workflow to eradicate bugs and performance problems.
Time: 9.00am – 1.00pm
Venue: Level 4, Matrix Building, Biopolis
Breaks:
Morning Tea Break – 10.30am – 11.00am
Lunch Break – 1.00pm – 2.00pm
Presenter(s):
Jeremy Kerr, Open POWER Architect, IBM Australia
Abhisek Chatterjee, Program Director – OpenSource Technology Development & OpenPower Enablement, IBM Australia
Abstract:
The OpenPOWER project delivers the high-performance and highly parallel POWER processor architecture on an entirely open source software stack – including workload applications, the operating system, and even low-level system firmware.
This technically focused tutorial describes the evolution of the OpenPOWER hardware and software architecture, and our leading design principles.
We will explain the initialisation and management processes of OpenPOWER platforms, particularly in the context of high performance computing environments.
As OpenPOWER is specifically designed for Linux, we are able to make targeted optimisations for the Linux operating system and infrastructure.
We’ll cover the key details of the operating system implementation, and how it integrates with the platform firmware. In addition to implementing the base platform, we have been porting and optimising common HPC workloads and infrastructure to OpenPOWER.
The tutorial will provide detail on those efforts, and explain some of the deployments of those so far.
For those working with their own code, we’ll discuss some key guidelines and hints for running workloads efficiently on OpenPOWER, and the compilers and development tools that are available.
Time: 9.00am – 1.00pm
Venue: Matrix Building, Biopolis
Breaks:
Morning Tea Break – 10.30am – 11.00am
Lunch Break – 1.00pm – 2.00pm
Presenter(s):
Hui Liang, Lead Data Scientist, APJ Innovation Center, Hewlett Packard Enterprise, Singapore
Wei Ann Lim, Data Scientist, APJ Innovation Center, Hewlett Packard Enterprise, Singapore
Track 1: Machine Learning
Introduction to Machine Learning
Linear Regression
Logistic Regression
Clustering
Track 2: Text Analytics
Introduction to Text Analysis
Applications
Specific Techniques for Text Analysis
Track 3: HPC + Data Analytics
Introduction to Deep Learning
Application of Deep Learning in Natural Language Processing