High Performance I/O and in situ data processing using the ADIOS framework
- Scott A. Klasky, group leader for Scientific Data in the Computer Science and Mathematics Division, Oak Ridge National Laboratory
Norbert Podhorszki, team leader in the Scientific Data Group in the Computer Science and Mathematics Division, Oak Ridge National Laboratory
- As concurrency and complexity continue to increase on high-end machines, I/O performance is rapidly becoming a fundamental challenge to achieving exascale computing. Wider adoption of higher-level I/O abstractions will be critically important to address this challenge. Modern I/O libraries provide data models, portable APIs, storage abstractions, and self-describing data containers. They achieve high performance and scalability, allow data to be managed more effectively throughout the data life-cycle, and enable reproducible science.
Part I of this tutorial will provide an overview of parallel I/O systems and summarize the key techniques for obtaining high-performance I/O on high-performance computing (HPC) resources at scale. Part II introduces the ADIOS library, working through its usage models and examples and showing how to achieve high-performance, scalable I/O. Part III explains data compression. Part IV covers techniques for creating in situ analytics and teaches how to generate visualization services. More than half of this tutorial will be hands-on sessions, where we provide access to the software and go through live examples using a VirtualBox VM. Users will work with real examples including the heat equation, a Brusselator example, and the Gray-Scott reaction-diffusion model.
- About Presenters:
- Scott A. Klasky is a distinguished scientist and the group leader for Scientific Data in the Computer Science and Mathematics Division at the Oak Ridge National Laboratory. He holds appointments at the University of Tennessee and the Georgia Institute of Technology. He obtained his Ph.D. in Physics from the University of Texas at Austin (1994), specializing in general relativity. Dr. Klasky is a world expert in scientific computing and scientific data management and has co-authored over 300 papers.
Norbert Podhorszki is a senior scientist and team leader in the Scientific Data Group in the Computer Science and Mathematics Division at the Oak Ridge National Laboratory. He is one of the key developers of ADIOS, which won an R&D 100 award in 2013. His main research interest is in creating I/O and staging solutions for in situ processing of data on leadership-class computing systems. He received his Ph.D. in Information Technology from the Eötvös Loránd University of Budapest. He has worked in the fields of logic programming, performance monitoring and analysis of message-passing programs, application monitoring in Grid environments, application development in Desktop Grids, scientific workflow technologies in supercomputing projects, and high-performance I/O. He is the author or co-author of more than 100 scientific papers and book chapters. He has given many tutorials on ADIOS, including several at the Supercomputing and the International Supercomputing conferences.
Distributed computing and spatial data analysis in Julia
- Bogumił Kamiński, PhD, Associate Professor, SGH Warsaw School of Economics
Przemysław Szufel, PhD, Assistant Professor, SGH Warsaw School of Economics
- The goal of this course is to provide hands-on experience for computational scientists to leverage the power of distributed computing and simulations in the Julia language. You are invited to bring a laptop and install the software that will be used during the workshop (see the installation instructions below).
During the workshop we will present a practical case study requiring massive computations. Both thread-based and process-based parallelization in Julia will be discussed. We will also show how to create various types of distributed computational clusters.
The Julia use case presented during the workshop will be a multi-agent, large-scale vehicle routing simulation model that can be used to analyze commuters' behavior in large cities. The numerical simulations will be run on actual road transportation system data taken from the OpenStreetMap project.
The course assumes intermediate prior knowledge of scientific programming in at least one language (e.g. Python, GNU R, Java or Julia). In the first part of the tutorial we will provide an introduction to Julia programming. No significant prior knowledge of distributed computing or experience with large-scale simulation models is required.
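As a rough preview of the two parallelization styles mentioned above, the following minimal sketch contrasts a threaded loop with a process-based map. It is not taken from the workshop materials: the function names `threaded_squares` and `square` and the choice of two local worker processes are illustrative assumptions, and only the `Distributed` standard library that ships with Julia 1.x is used.

```julia
using Distributed  # standard library for process-based parallelism

# Thread-based: iterations are split across the threads Julia was started with
# (set the JULIA_NUM_THREADS environment variable before launching Julia).
function threaded_squares(n)
    out = zeros(Int, n)
    Threads.@threads for i in 1:n
        out[i] = i^2  # each iteration writes to its own slot, so no locking is needed
    end
    return out
end

# Process-based: start two local worker processes (an arbitrary choice for this sketch).
addprocs(2)

# Define the function on every process so that the workers can call it.
@everywhere square(x) = x^2

# pmap distributes the calls over the workers and collects the results in order.
results = pmap(square, 1:8)

println(sum(threaded_squares(8)))  # 204
println(sum(results))              # 204
```

The threaded version shares memory within one Julia process, while `pmap` sends work to separate processes; the latter is the style that extends to clusters spanning several machines.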
- Important Notes:
- Please download Julia from https://julialang.org/downloads/ and follow the installation instructions presented at https://julialang.org/downloads/platform.html
- Install the Julia packages that will be used throughout the course. In the command line (console) type julia and paste the following Julia code:
using Pkg
Pkg.add("IJulia") # optional for participants wanting to use Jupyter Notebooks
- A convenient programming environment for the Julia language is the Atom plugin named Juno. To install Atom with Juno, please follow the steps below:
- Download and install Atom (available at https://atom.io/).
- Start Atom and press Ctrl + , ( Ctrl key + comma key ) to open the Atom settings screen.
- Select the Install tab.
- In the Search packages field, type uber-juno and press Enter.
- You will see the uber-juno package developed by JunoLab. Click Install to install the package.
- Optionally, it is possible to use Julia within a Jupyter notebook.
To try Julia inside a Jupyter notebook, start the Julia console and run the following two commands:
using IJulia
notebook(dir=pwd()) # a new web browser tab should open with Jupyter Notebook
# showing the current Julia working directory
- About Presenters:
- Bogumił Kamiński is the Head of the Decision Analysis and Support Unit at SGH Warsaw School of Economics and an Adjunct Professor at the Data Science Laboratory, Ryerson University, Toronto. He is a member of the Management Committee of the European Social Simulation Association (ESSA) and Vice President of the Polish Chapter of the Institute for Operations Research and the Management Sciences (INFORMS). His field of expertise is operations research, with a special focus on industrial applications of forecasting, optimization and simulation. He has been involved in the development of the core Julia language and its packages related to the data science workflow. He is one of the top answerers for the julia-lang tag on StackOverflow.
Przemysław Szufel is an Assistant Professor in the Decision Support and Analysis Unit at Warsaw School of Economics. He is a member of the Management Committee of the European Social Simulation Association (ESSA). His current research focuses on methods for the execution of large-scale simulations for numerical experiments and optimization. He is an author or co-author of several open source tools for high-performance and numerical simulation (such as KissCluster, D-MASON, Isislab SOF, SilverDecisions, PyCX) and actively participates in their development. He is also a co-author of various algorithms for distributed simulation-optimization models (such as AKG, AOCBA).
Advanced scientific visualization with VisNow platform
- Brief agenda:
- Introduction to Scientific Visualization and Visual Analysis.
- Visualization systems and paradigms
- Generic data structures
- Introduction to VisNow
- Hands-on Session #1 – 2D data visualization
- Hands-on Session #2 – 3D data visualization
- Hands-on Session #3 – Vector data visualization
- Hands-on Session #4 – Unstructured data visualization.
- Bartosz Borucki, Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Poland
Krzysztof Nowiński, Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Poland
- Visual analysis is one of the most powerful tools for data exploration and interpretation. It takes advantage of visualization techniques and allows scientists to work with their research data in an interactive and intuitive way. In today's HPC environment and Big Data era, data analysis techniques, together with visualization, are gaining in importance. However, the amounts of data and the sizes of single datasets call for adequate software tools. In this tutorial we will address this problem by providing participants with a powerful tool for data processing, visualization and visual analysis: VisNow, an open source generic platform based on the data flow paradigm. The goal of this tutorial is to introduce the audience to the concept of visual analysis, to show the basic ideas of scientific visualization, and to go step by step through several case studies in hands-on sessions based on our platform. Visualization of common HPC data structures, including 2-D and 3-D, scalar and vector, regular and unstructured data, will be covered, and the relevant elements of the software will be described to give participants the basics of VisNow usage.
How to Bring Secure and Scalable HPC to Your Enterprise with Lenovo and SUSE Solutions
- Artur Duszczyk, Senior Technical Sales Representative, Lenovo
- Marcin Madey, General Manager, SUSE Polska
- High-performance computing (HPC) is no longer the sole domain of well-funded research institutions. Enterprises across industries such as finance, manufacturing and healthcare are employing HPC today to gain a competitive edge. Capturing these HPC benefits for your enterprise is now easier than ever. By teaming up with Lenovo and SUSE, not only do you get a powerful and efficient HPC solution with strong support, you also partner with organizations renowned for their HPC leadership. Attend our session and learn how Lenovo and SUSE together offer secure, scalable and performant HPC solutions designed for ease of use.
IBM Q – programming a quantum computer
- Tomasz Stopa, PhD, IBM-Q Ambassador, IBM Kraków Software Laboratory
- The session will be divided into two parts.
The first part (approximately 1h) will cover a short introduction to quantum computing and an overview of IBM's quantum computer architecture. During this part, both the physical implementation and the software environment of IBM-Q will be presented with examples.
The second part will be a hands-on workshop using a real IBM quantum computer in the cloud. Both the interactive composer and QISKit SDK examples will be used to implement simple quantum algorithms.
The course is designed as an introduction to the topic of quantum computation and as such does not require any specific prior technology or programming knowledge. However, basic knowledge of quantum physics and Python programming will help participants to fully benefit from the session.
- Important Notes:
- For the second part of the workshop, participants should have a Python 3.7 programming environment with qiskit installed on their computers. The easiest way to do that is to download and install the Anaconda distribution: https://www.anaconda.com/distribution/#download-section, followed by qiskit installation using ‘pip install qiskit’.
- About presenter:
- Tomasz Stopa works at the IBM Software Laboratory in Kraków, Poland. He obtained his Ph.D. in Physics from AGH University of Science and Technology in Kraków. With a background in theoretical solid state physics, he moved to software development 12 years ago. He currently works as a development manager in the software asset management space. Tomasz is also an IBM-Q Ambassador and IBM Master Inventor with several patents and publications.