Computing the Brain
Katrin Amunts
Human Brain Project,
Chair of the Science and Infrastructure Board / Scientific Research Director,
Institute for Neuroscience and Medicine, Structural and Functional
Organisation of the Brain, Forschungszentrum Juelich GmbH, Juelich, Germany
and
Institute for Brain Research, Heinrich Heine
University Duesseldorf, University Hospital Duesseldorf, Germany
Neuroscience research covers a broad spectrum of empirical and theoretical approaches, with increasing demands for computation, data handling, analytics and storage. This is true in particular for research targeting the human brain, with its incredibly high number of neurons forming complex networks. Demands for HPC arise from a heterogeneous portfolio of neuroscientific approaches: (i) studying human brains at the cellular and ultrastructural level, with petabytes of data for a single brain; (ii) the reconstruction of the human connectome, i.e., the totality of connections between nerve cells and the ways they interact; (iii) modeling and simulation at different levels of brain organization with ever finer detail, to make models biologically more realistic; (iv) the analysis of large, multimodal data sets using workflows that employ deep learning and machine learning, simulation, graph-based inference, etc.; (v) large cohort studies including many thousands of subjects, with data from neuroimaging, behavioral tests, genetics, biochemical markers, etc., to reveal relationships between genes, environment and the brain while accounting for large variations between subjects. To address these diverse requirements, the Human Brain Project is developing its digital research infrastructure EBRAINS, with FENIX as the HPC platform for Computing the Brain.
The Role of EOFS and the Future of Parallel
File Systems for HPC
Frank Baetke
EOFS European Open File System Organization
formerly Hewlett Packard Enterprise, Munich, GERMANY
Parallel file systems are an essential part of almost all HPC systems. The need for this architectural concept originated with the growing influence, and eventually the complete takeover, of the HPC spectrum by parallel computers, defined either as clusters or as MPPs in the nomenclature of the TOP500.
A major step towards parallel file systems for the high end of HPC occurred around 2001, when the US DoE funded the development of such an architecture, called LUSTRE, as part of the ASCI Path Forward project, with external contractors that included Cluster File Systems Inc. (CFS), Hewlett Packard and Intel. The acquisition of the assets of CFS by SUN Microsystems in 2007 and SUN's subsequent acquisition by ORACLE in 2010 led to a crisis, with the cancellation of future work on LUSTRE.
To save the assets and ensure further development, a few HPC-focused individuals founded organizations such as EOFS, OpenSFS and Whamcloud to move LUSTRE to community-driven development. In 2019, EOFS and OpenSFS jointly acquired the LUSTRE trademark, logo and related assets.
In Europe, development of a parallel file system focused on HPC began in 2005 at the German Fraunhofer Society, also as an open-source project, dubbed FhGFS (Fraunhofer Global Parallel File System). Driven by its spin-off ThinkParQ and renamed BeeGFS, it has since gained worldwide recognition and visibility.
In contrast to these community-driven open-source concepts, several proprietary parallel file systems are in wide use, with IBM's Spectrum Scale (originally known as GPFS) having the lead in HPC, with a significant number of installations in the upper ranks of the TOP500 list. But there are other interesting proprietary concepts with specific areas of focus and related benefits.
In this talk we will review the role of EOFS (European Open File Systems SCE) and provide an outlook on the future of the HPC parallel file system landscape.
Note: all trademarks are the property of their
respective owners
High Performance
Computing for Bioinformatics
Mario Cannataro
Department of Medical and Surgical Sciences,
University of Catanzaro, ITALY
Omics sciences (e.g. genomics, proteomics, and interactomics) are attracting increasing interest in the scientific community due to the availability of novel, high-throughput platforms for the investigation of the cell machinery, and they play a central role in so-called P4 (predictive, preventive, personalized and participatory) medicine, and in particular in cancer research. High-throughput experimental platforms and clinical diagnostic tools, such as next-generation sequencing, microarrays, mass spectrometry, and medical imaging, are producing overwhelming volumes of molecular and clinical data, and the storage, integration, and analysis of such data is today the main bottleneck of bioinformatics pipelines.
This Big Data trend in bioinformatics poses new challenges both for the efficient storage and integration of the data and for its efficient preprocessing and analysis. Thus, managing omics and clinical data requires both infrastructure and space for data storage as well as algorithms and software pipelines for data preprocessing, integration, analysis, and sharing. Moreover, as is already happening in several application fields, the service-oriented model enabled by the Cloud is spreading more and more in bioinformatics.
Parallel Computing offers the computational power to
face this Big Data trend, while Cloud Computing is a key technology to hide
the complexity of computing infrastructures, to reduce the cost of the data
analysis task, and to change the overall model of biomedical and
bioinformatics research towards a service-oriented model.
The talk introduces the main types of omics data (e.g. gene expression and SNPs, mass spectra, protein-protein interactions) and discusses some parallel and distributed bioinformatics tools and their application in real case studies in cancer research, as well as recent initiatives exploiting international Electronic Health Records to face COVID-19, including:
- preprocessing and mining of microarray data for pharmacogenomics applications,
- alignment of biological networks, community detection, and applications to the brain connectome,
- integrative bioinformatics, integration and enrichment of biological pathways,
- analysis of international Electronic Health Records to face the COVID-19 pandemic: the Consortium for Clinical Characterization of COVID-19 by EHR (4CE).
Short bio
Mario Cannataro is a Full Professor of computer engineering at the
University "Magna Græcia" of Catanzaro,
Italy, and the Director of the Data Analytics Research Center.
His current research interests include parallel computing, bioinformatics, health informatics, and artificial intelligence. He has published three books and more than 300 papers in international journals and conference proceedings.
Mario Cannataro is a Senior Member of ACM, ACM SIGBio,
IEEE, BITS (Bioinformatics Italian Society) and SIBIM (Italian Society of
Biomedical Informatics).
High Performance Computing and Cloud
Computing, key enablers for digital transformation
Carlo Cavazzoni
Leonardo S.p.A., Head of Cloud Computing,
Director High Performance Computing Lab, Chief Technology & Innovation
Office, Genova, Italy
For many industries, HPC is becoming a key technology for competitiveness and digitalization. In particular, every industry will have to apply digital technologies, and this determines a paradigm shift: the value of goods and services moves from the exploitation of physical systems to the exploitation of knowledge. AI, computer simulations and other digital technologies are tools that help mine out more knowledge, faster. The more, the better.
In this scenario HPC is a tool: a tool to process Big Data, enable AI and perform simulations, and it is more and more often combined with Cloud Computing services (virtual machines and containers, which are especially popular for Big Data and AI frameworks).
HPC can accelerate the creation of value thanks to
the capability to generate new knowledge and perform more accurate
predictions (e.g. developing Digital Twins).
While computational capacity is a fundamental resource for competitiveness, raw computational capacity alone is useless: software is the key to unlocking the value. This is why, besides the supercomputer, we need to build the capability to implement new applications or improve existing ones.
In the talk I will present how Leonardo, with the key contribution of the HPC Lab, intends to implement leadership-class software tools and a computational infrastructure able to add value to the company and ultimately transform it, to become more digital than physical.
A domain wall encoding of variables for
quantum annealing
Nicholas Chancellor
Department of Physics, Durham University,
United Kingdom
I will discuss the application of a relatively new
method for encoding discrete variables into binary ones on a quantum annealer. This encoding is based on the physics of domain
walls in frustrated Ising spin chains and can be
shown to perform better than the traditional one-hot encoding both in terms
of efficiency of embedding the problems into quantum annealers
and in terms of performance on actual devices.
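For readers unfamiliar with the two encodings, the following is a minimal, illustrative Python sketch (not code from the talk or the cited papers) of how a single discrete variable maps onto binary variables under each scheme. In practice the annealer additionally enforces validity through penalty terms (exactly one bit set for one-hot, a monotone chain of 1s followed by 0s for domain-wall), and the smaller variable count and simpler penalties of the domain-wall encoding are where its embedding advantage comes from.

```python
# Illustrative sketch only: a discrete variable with N values needs N binary
# variables in one-hot encoding but only N-1 in the domain-wall encoding.

def one_hot_encode(value, n_values):
    """One-hot: exactly one of N bits is set."""
    return [1 if i == value else 0 for i in range(n_values)]

def domain_wall_encode(value, n_values):
    """Domain-wall: N-1 bits forming a '1...10...0' string; the value is
    the position of the 1 -> 0 transition (the domain wall)."""
    return [1 if i < value else 0 for i in range(n_values - 1)]

def domain_wall_decode(bits):
    """For a valid monotone string, the value is the number of 1s."""
    return sum(bits)

# Example: a 4-valued variable taking the value 2.
print(one_hot_encode(2, 4))           # [0, 0, 1, 0]
print(domain_wall_encode(2, 4))       # [1, 1, 0]
print(domain_wall_decode([1, 1, 0]))  # 2
```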
I will first review this encoding strategy and contrast it with the one-hot technique, along with numerical evidence of an embedding advantage following the discussion in [Chancellor, Quantum Sci. Technol. 4 045004]. Next, I will discuss recent experimental evidence presented in [Chen, Stollenwerk, Chancellor, arXiv:2102.12224], which shows that this encoding can lead to a large improvement in the performance of quantum annealers on coloring problems. This improvement is large enough that using the domain-wall encoding on an older-generation D-Wave 2000Q quantum processing unit yields superior results to using the one-hot encoding on the more advanced Advantage QPU, indicating that better encoding can make a large difference in performance. Additionally, I will touch on some more recent work involving the quadratic assignment problem. Finally, I will discuss the importance of this encoding for the simulation of quantum field theories directly on transverse Ising model quantum annealers [Abel, Chancellor, Spannowsky, Phys. Rev. D 103, 016008].
Quantum Computer, dream or reality?
Daniele Dragoni
Leonardo
S.p.A., High Performance Computing Lab, Genova, ITALY
As the miniaturization of semiconductor transistors approaches its physical limits, the performance increase of microprocessors is slowing down to the point that the operating-frequency gain from one chip generation to the next is almost nil. In an attempt to keep up with Moore's law, computing architectures have evolved to take full advantage of parallelization schemes: vector units, multicore, GPUs, etc. Following the current trends, however, it will never be possible to efficiently address certain computational tasks of practical interest.
In this scenario, it is clear that any approach that radically overcomes the limitations of digital computing is highly interesting. In particular, quantum-computing devices, which operate by exploiting the principles of quantum physics, are believed to provide a route to such a paradigm shift. In practice, however, building a quantum computer is an almost unmatched engineering challenge (comparable to nuclear fusion). To date, quantum computers have been built with very few logical units (qubits), and it is not yet fully clear if and when they will prove superior to digital computers on concrete problems of practical interest. In the presentation we will introduce the research streams that we are following in the quantum computing domain, from quantum-inspired methods up to real quantum applications that we will test on simulated and physical quantum computers. Finally, we will analyze the elements and steps to consider for the introduction of quantum computing within our own infrastructure.
HPTMT:
High-Performance Data Science and Data Engineering based on Data-parallel
Tensors, Matrices, and Tables
Geoffrey Fox
School of Informatics, Computing and
Engineering, Department of Intelligent Systems Engineering; Digital Science Center and Data Science Program, Indiana University Bloomington, IN, USA
The continuously increasing size and complexity of
data-intensive applications demand high-performance but still highly usable
environments. We integrate a set of ideas developed in various data science
and data engineering frameworks. They employ a set of operators on specific
data abstractions that include vectors, matrices, arrays, tensors, graphs,
and tables. Our key concepts are inspired by systems like MPI, HPF (High-Performance Fortran), NumPy, Pandas, Spark, Modin, PyTorch, TensorFlow, RAPIDS (NVIDIA), and oneAPI (Intel). Further, it is crucial to support the different languages in everyday use in the Big Data arena, including Python, R, C++, and Java. We note the importance of Apache Arrow and Parquet for enabling language-agnostic high performance and interoperability. We identify the fundamental principles of an operator-based architecture for data-intensive applications that are needed for success in performance and usability. We illustrate these principles with a discussion of examples using our software environments Cylon and Twister2, which embody HPTMT. We also describe results from benchmarks being developed by MLCommons (MLPerf).
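To give a concrete flavor of the operator-on-table style described above, here is a small single-node sketch using pandas and PyArrow (both mentioned in the abstract); it is an illustrative analogue, not the Cylon or Twister2 API, and the input file names are hypothetical. Distributed frameworks apply the same operators (join, filter, group-by, aggregate) to partitioned tables across many workers.

```python
# Illustrative single-node analogue of operator-based data engineering;
# the file and column names are hypothetical placeholders.
import pandas as pd
import pyarrow.parquet as pq

# Parquet provides language-agnostic, columnar on-disk tables.
orders = pq.read_table("orders.parquet").to_pandas()
users = pq.read_table("users.parquet").to_pandas()

result = (
    orders.merge(users, on="user_id", how="inner")   # join operator
          .query("amount > 100")                     # filter operator
          .groupby("country", as_index=False)        # group-by operator
          .agg(total=("amount", "sum"))              # aggregation operator
)
print(result.head())
```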
Deep Learning for Time Series
Geoffrey Fox
School of Informatics, Computing and
Engineering, Department of Intelligent Systems Engineering; Digital Science Center and Data Science Program, Indiana University Bloomington, IN, USA
We show that one can study several sets of sequences or time series in terms of an underlying evolution operator, which can be learned with a deep learning network. We use the language of geospatial time series, as this is a common application type, but the series can be any sequence, and the sequences can be in any collection (bag), not just those in Euclidean space-time: we just need sequences labeled in some way, with properties that depend on this label (a position in an abstract space). This problem has been successfully tackled by deep learning in many ways and in many fields. Comparing deep learning for such time series with the coupled ordinary differential equations used to describe multi-particle systems motivates the introduction of an evolution operator that describes the time dependence of complex systems. With an appropriate training process, we interpret deep learning applied to spatial time series as a particular approach to finding the time evolution operator for the complex system giving rise to the spatial time series. Whimsically, we view this training process as determining hidden variables that represent the theory (as in Newton's laws) of the complex system. We apply these ideas to predicting COVID infections and earthquake occurrences.
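As a rough sketch of the evolution-operator idea (an illustrative PyTorch example, not the speaker's code; the data here are synthetic), one can train a network f_theta so that x_{t+1} ≈ f_theta(x_t) over all observed sequences and then iterate it forward to forecast:

```python
# Minimal illustrative sketch: learn a one-step evolution operator from
# sequences (synthetic data), then roll it forward to forecast.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
# 100 synthetic "spatial time series": 50 time steps at 8 locations.
series = np.cumsum(rng.normal(size=(100, 50, 8)).astype(np.float32), axis=1)
x_t = torch.tensor(series[:, :-1].reshape(-1, 8))   # states at time t
x_t1 = torch.tensor(series[:, 1:].reshape(-1, 8))   # states at time t+1

f_theta = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 8))
opt = torch.optim.Adam(f_theta.parameters(), lr=1e-3)
for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(f_theta(x_t), x_t1)
    loss.backward()
    opt.step()

# Forecast five steps ahead by repeatedly applying the learned operator.
state = torch.tensor(series[0, -1])
for _ in range(5):
    state = f_theta(state)
print(state.detach().numpy())
```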
An automated, self-service, multi-cloud
engineering simulation platform for a complex living heart simulation
workflow with ML
Wolfgang Gentzsch
The UberCloud,
Germany and Sunnyvale, CA, USA
Co-authors: Daniel Gruber, Director of
Architecture at UberCloud; Yaghoub
Dabiri, Scientist at 3DT Holdings; Julius Guccione, Professor of Surgery at the UCSF Medical Center, San Francisco; and Ghassan
Kassab, President at California Medical Innovations
Institute, San Diego.
Many companies are finding that replicating an existing on-premises HPC architecture in the Cloud does not lead to the desired breakthrough improvements. With this in mind, a fully automated, self-service, multi-cloud Engineering Simulation Platform has been developed from day one, resulting in greatly increased productivity for HPC engineers, significantly improved IT security, cloud costs and administrative overhead reduced to a minimum, and full control for engineers and corporate IT over their HPC cloud environment and corporate assets.
This platform has been implemented on Google Cloud Platform (GCP) for 3DT Holdings for their highly complex Living Heart Project and Machine Learning, with the final result of reducing simulation times from many hours per simulation to just a few seconds for a highly accurate prediction of optimal medical device placement during heart surgery.
The team ran the 1500 simulations needed to train the ML algorithm. The whole simulation process followed a multi-cloud approach, with all computations running on 1500 HPC clusters in Google GCP, and with management, monitoring, and health checks orchestrated from the Azure Cloud and performed through SUSE's Kubernetes management platform Rancher.
Technology used: UberCloud Engineering Simulation Platform, multi-node HPC-enhanced Docker containers, Kubernetes, SUSE Rancher, Dassault Abaqus, TensorFlow, preemptible GCP instances (c2-standard-60), managed Kubernetes clusters (GKE), Google Filestore, Terraform, and DCV remote visualization.
Dynamic Decentralized Workload Scheduling for
Cloud Computing
Vladimir Getov
Distributed and Intelligent Systems Research
Group, School of Computer Science and Engineering, University of Westminster,
London, UNITED KINGDOM
Virtualized frameworks typically form the foundations
of Cloud systems, where Virtual Machine (VM) instances provide execution
environments for a diverse range of applications and services. Modern VMs
support Live Migration (LM) – a feature wherein a VM instance is transferred
to an alternative node dynamically without stopping its execution. This paper
presents a detailed design of a decentralized agent-based scheduler, which
can be used to manage workloads within the computing cells of a Cloud system
using Live Migration. Our proposed solution is based on the concept of service allocation negotiation, whereby all system nodes communicate among themselves and the scheduling logic is decentralized. The presented architecture has been implemented and evaluated through multiple simulation runs using real-world workloads.
The focus of this research is to analyze and evaluate the LM transfer cost, which we define as the total size of data to be transferred to another node for a particular migrated VM instance. Several different virtualization approaches are categorized, with a shortlist of candidate VMs selected for evaluation. The paper highlights the major areas of the LM transfer process (CPU registers, memory, permanent storage, and network switching) and analyzes their impact on the volume of information to be migrated, which includes the VM instance with the required libraries, the application code and any data associated with it.
Then, using several representative applications, we report experimental
results for the transfer cost of LM for respective VM instances. We also
introduce a novel Live Migration Data Transfer (LMDT) formula, which has been
experimentally validated and confirms the exponential nature of the LMDT
process. Our estimation model supports efficient design and development
decisions in the process of analyzing and building
modern Cloud systems based on dynamic decentralized workload scheduling.
Practical Quantum Computing
Victoria Goliber
Senior Technical Analyst, D-Wave Systems
Inc., GERMANY
D-Wave's mission is to unlock the power of quantum
computing for the world. We do this by delivering customer value with
practical quantum applications for a diverse set of problems. Join us to
learn about the tools that D-Wave has available and how they are impacting businesses
around the world. We’ll conclude with a live demo showing how easy it is to
get started and build quantum applications today.
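To give a flavor of what such a getting-started example typically looks like, here is a generic sketch using D-Wave's open-source Ocean SDK; it is not the demo from the talk, and it assumes the Ocean SDK is installed and an API token has been configured.

```python
# Generic Ocean SDK sketch (assumes `pip install dwave-ocean-sdk` and a
# configured API token); not the specific demo shown in the talk.
from dimod import BinaryQuadraticModel
from dwave.system import DWaveSampler, EmbeddingComposite

# Tiny example QUBO: the optimum prefers x0 and x1 to disagree.
Q = {("x0", "x0"): -1, ("x1", "x1"): -1, ("x0", "x1"): 2}
bqm = BinaryQuadraticModel.from_qubo(Q)

sampler = EmbeddingComposite(DWaveSampler())   # routes the problem to a QPU
sampleset = sampler.sample(bqm, num_reads=100)
print(sampleset.first.sample, sampleset.first.energy)
```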
AIOps as a future of Cloud Operations
Odej Kao
Distributed and Operating Systems Research
Group and Einstein Center Digital Future, Berlin
University of Technology, GERMANY
Artificial Intelligence for IT Operations (AIOps) combines big data and machine learning to take over a broad range of IT operations tasks, including availability and performance management and the monitoring of services. By exploiting log, tracing, metric, and network data, AIOps aims at detecting service and system anomalies before these turn into failures. This talk will present methods developed for automated anomaly detection, root cause analysis, remediation, optimization, and the automated initiation of self-stabilizing activities. Extensive experimental measurements and initial results show that AIOps platforms can help to reach the required level of availability, reliability, dependability, and serviceability for future settings where latency and response times are of crucial importance. While the automation is mandatory due to the system complexity and the criticality of a QoS-bounded response, the measures compiled and deployed by the AI-controlled administration are not easily understood or reproduced. Therefore, the explainability of actions taken by the automated system is becoming a regulatory requirement for future IT infrastructures. Finally, we describe logsight.ai, a system we developed and deployed, as an example of the design of the corresponding architecture, tools, and methods.
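As a toy illustration of the kind of metric-based anomaly detection that AIOps pipelines build on, here is a simple rolling z-score check (a generic sketch with synthetic data, far simpler than the log- and trace-based methods discussed in the talk):

```python
# Toy metric anomaly detector using a rolling z-score; real AIOps systems
# combine logs, traces, metrics and learned models.
import numpy as np

def rolling_zscore_anomalies(values, window=30, threshold=3.0):
    """Flag indices that deviate strongly from the recent rolling mean."""
    values = np.asarray(values, dtype=float)
    anomalies = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        mu, sigma = recent.mean(), recent.std()
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Synthetic latency series with an injected spike at index 80.
rng = np.random.default_rng(1)
latency = rng.normal(100, 5, size=120)
latency[80] = 200
print(rolling_zscore_anomalies(latency))  # expected to include 80
```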
CV Odej Kao
Odej Kao is a full professor at Technische Universität Berlin, head of the research group on distributed and operating systems, chairman of the Einstein Center Digital Future with 50 interdisciplinary professors, and chairman of the DFN board. Moreover, he is the CIO of the university and a principal investigator in the national centers on Big Data and on Foundations of Learning and Data. Dr. Kao is a graduate of TU Clausthal (master's degree in computer science in 1995, PhD in 1997, habilitation in 2002). In 2002, Dr. Kao joined Paderborn University as an associate professor for operating systems and director of the center for parallel computing. In 2006, he moved to Berlin and focused his research on AIOps, big data / streaming analytics, cloud computing, and fault tolerance. He has published over 350 papers in peer-reviewed proceedings and journals.
Building
the European EuroHPC Ecosystem
Kimmo Koski
CSC - Finnish IT Center
for Science, Espoo, Finland
LUMI is one of the three pre-exascale systems acquired by the EuroHPC Joint Undertaking (JU); once fully operational, it will provide more than 500 PF of computing power for European research and industry. The system, hosted by CSC, the Finnish IT Center for Science, and run by a consortium of 10 European countries, will be installed in two phases: the first parts during the summer of 2021 and the rest at the end of 2021.
LUMI will be an essential part of European HPC collaboration and one of the main platforms for European research. It will fit together with the other EuroHPC sites, such as the pre-exascale systems in Spain and Italy, five petascale systems and future exascale installations, which together form the European HPC ecosystem.
The talk introduces LUMI and its role in the European HPC ecosystem, and discusses the various aspects motivating the architectural and functional choices made when building an international collaboration with heterogeneous resources located in different countries. It discusses the different needs and priorities of research that drive decisions aimed at optimal performance for the most challenging applications, and addresses the benefits obtained for research and industry.
The talk also covers the eco-efficient, low-carbon-footprint operational environment and its impact on the European Green Deal. In addition, it analyzes the opportunities for developing a European competitive advantage through intensive collaboration in building the European EuroHPC ecosystem.
Exascale Programming Models for Heterogeneous Systems
Stefano Markidis
KTH Royal Institute of Technology, Computer
Science Department / Computational Science and Technology Division, Stockholm, SWEDEN
The first exascale supercomputer is likely to come online soon. A production-quality programming environment, probably based on existing dominant programming interfaces such as MPI, needs to be in place to support application deployment and development on exascale machines. The most striking characteristic of an exascale supercomputer will be the amount of available parallelism required to break the exaFLOPS barrier with the High-Performance LINPACK benchmark. The first exascale machine will provide programmers with between 100 million and a billion threads. The second characteristic of an exascale supercomputer will be the high level of heterogeneity of the compute and memory subsystems. This drastically increases the number of FLOPS per Watt, making it feasible to build an exascale machine within a power budget in the 20-100 MW range. Low-power microprocessors, accelerators, and reconfigurable hardware are the main design choices for an exascale machine. This heterogeneity in compute will also be accompanied by deeper memory hierarchies comprising high-performance and low-power memory technologies.
While it is not yet evident what the best programming approach for developing applications on large-scale heterogeneous supercomputers will be, a consensus in the HPC community is that programmers need an extension of the dominant programming models to ensure the programmability of new architectures. In this talk, I introduce the EPiGRAM-HS project, which addresses the heterogeneity challenge of programming exascale supercomputers. EPiGRAM-HS improves their programmability by extending MPI and GASPI to exploit accelerators, reconfigurable hardware, and heterogeneous memory systems. In addition, EPiGRAM-HS places MPI and GASPI at the core of the software stack and extends programmability and productivity with additional software layers.
Brain-like Machine Learning and HPC
Stefano Markidis
KTH Royal Institute of Technology, Computer
Science Department / Computational Science and Technology Division, Stockholm, SWEDEN
Modern deep learning methods based on backpropagation have surged in popularity and have been used in multiple domains and application areas. At the same time, there are other machine learning algorithms inspired by modern models of how the brain's neocortex functions. Unlike traditional deep learning, these models use a localized (and unsupervised) brain-like rule to determine the neural network's weights and biases. The learning of the graph connection weights complies with Hebb's postulate: learning depends only on the local information available from the activities of the pre- and post-synaptic units. A Hebbian learning rule allows higher scalability and better utilization of HPC systems. In this talk, I introduce brain-like machine learning and describe the Bayesian Confidence Propagation Neural Network (BCPNN), one of the most established brain-inspired machine learning methods. I also discuss the potential for these emerging methods to exploit HPC systems and present an HPC BCPNN implementation, called StreamBrain, for CPUs, GPUs, and FPGAs.
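As a minimal illustration of Hebb's postulate mentioned above, here is a generic Hebbian outer-product weight update (an illustrative sketch, not the BCPNN learning rule used by StreamBrain). Because each synapse is updated using only the activities of the two units it connects, all updates are independent and map naturally onto parallel hardware.

```python
# Generic Hebbian weight update (illustrative; not the BCPNN rule):
# each synapse w[i, j] changes using only local pre- and post-synaptic
# activity, so all synapse updates can be computed in parallel.
import numpy as np

def hebbian_update(weights, pre, post, lr=0.01):
    """weights: (n_post, n_pre); pre: (n_pre,); post: (n_post,)."""
    return weights + lr * np.outer(post, pre)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(4, 3))
pre = np.array([1.0, 0.0, 1.0])        # pre-synaptic activities
post = np.array([0.0, 1.0, 1.0, 0.0])  # post-synaptic activities
w = hebbian_update(w, pre, post)
print(w)
```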
Data Analytics and AI on HPC Systems: About
the impact on Science
Wolfgang Nagel
Center for Information Services and High Performance Computing, Technische Universitaet
Dresden, GERMANY
Methods and techniques of Artificial Intelligence (AI) and Machine Learning (ML) have been investigated for decades in pursuit of a vision where computers can mimic human intelligence. In recent years, these methods have become more mature and, in some specialized applications, have evolved to super-human abilities, e.g. in image recognition or in games such as Chess and Go. Nonetheless, formidable questions remain in the areas of fundamental algorithms, training data usage, and explainability of results, to name a few. The developments in AI, and especially in ML, have been boosted by powerful HPC systems, mainly driven by the GPU architectures built into many if not most HPC systems these days. The talk will explain the challenges of integrating AI and HPC into “monolithic” systems, and it will provide a broad overview of the impact that the availability of such systems will have on the science system.
Cloud Native Supercomputing
Gilad Shainer
NVIDIA, Menlo Park, CA, USA
High performance computing and Artificial Intelligence are the most essential tools fueling the advancement of science. To handle the ever-growing demands for higher computational performance and the increasing complexity of research problems, the world of scientific computing continues to reinvent itself at a fast pace. The session will review the recent development of the cloud-native supercomputing architecture, which aims to bring together bare-metal performance and cloud services.
Towards an Active Memory Architecture for
Time-Varying Graph-based Execution
Thomas Sterling
School of Informatics, Computing and
Engineering and AI Computing Systems Laboratory, Indiana University,
Bloomington, IN, USA
A diversity of new GPUs and special-purpose devices is under development and in production for the significant acceleration of a wide range of Machine Learning and AI applications. For such problems exhibiting high data reuse, these emerging platforms hold great promise in commercial, medical, and defense domains. For workflows heavily dependent upon irregular graph structures with rapidly changing topologies defined by intra-graph metadata such as links, edges, and arcs, a new generation of innovative memory-centric architectures is quite literally being invented, some by entrepreneurs through new start-up companies. Integration and tight coupling of memory with support logic has decades of prior experimentation behind it. The new generation of architectural innovation is being
pursued to address such challenges as latency hiding, global naming, graph
processing idioms, and associated overheads for AI, ML, and AMR. Chief among
these is extreme scalability at the limitations of Moore’s Law and nanoscale
semiconductor fabrication technology. The Active Memory Architecture (AMA) is
one possible new class of graph-driven memory-centric architecture. The AMA
is under development, supported by NASA, to exploit opportunities exposed by
classic von Neumann architecture cores and advanced concepts for graph
processing. This address will present the innovative principles being
explored through the AMA and describe a prototype currently under testing.
All questions from the audience will be welcome throughout the presentation.
Brief Biography
Thomas Sterling is a Full Professor of Intelligent Systems Engineering at Indiana University (IU), serving as Director of the AI Computing Systems Laboratory at IU’s Luddy School of Informatics, Computing, and Engineering. Since receiving his Ph.D. from MIT as a Hertz Fellow, Dr. Sterling has engaged in applied research in parallel computing system structures, semantics, and operation in industry, government labs, and academia. Dr. Sterling is best known as the "father of Beowulf" for his pioneering research in commodity/Linux cluster computing, for which he shared the Gordon Bell Prize in 1997. His current research is associated with innovative extreme-scale computing through memory-centric non-von Neumann architecture concepts to accelerate dynamic graph processing. In 2018, he co-founded the new tech company Simultac and serves as its President and Chief Scientist. Dr. Sterling was the recipient of the 2013 Vanguard Award and is a Fellow of the AAAS. He is the co-author of seven books and holds six patents. Most recently, he co-authored the introductory textbook “High Performance Computing”, published by Morgan Kaufmann in 2017.
Parallel Runtime Systems for Dynamic Resource
Management and Task Scheduling
Thomas Sterling
School of Informatics, Computing and
Engineering and AI Computing Systems Laboratory, Indiana University,
Bloomington, IN, USA
Runtime systems, implemented principally in software, play diverse roles in the management of resources, expanding dynamic control and filling perceived gaps between compilers and operating systems on the one hand and hardware execution on the other. They can add workflow management of distributed processing components or more fine-grained supervision for optimal efficiency and scaling through introspection. MPI, OpenMP and other user programming interfaces (e.g., Java, Python, Lisp) incorporate some runtime functionality, as do SLURM, Charm++ and Cilk++, to mention a few. Legion, Habanero and HPX operate at the intra-application multi-thread level. Detailed experiments with HPX-5 explored the potential advantages but also the limitations of runtime functionality and its sensitivity to application flow-control properties. This presentation describes and discusses the findings and conclusions of this investigation, demonstrating the potential improvements in some cases but also areas in which runtimes may prove a hindrance due to software overheads with little or no gain. The talk concludes by considering future runtimes optimized for objective functions other than conventional ALU/FPU utilization, such as memory bandwidth or latency. Also exposed are possible targets for hardware mechanisms offering greater efficiency and scalability through the reduction of overhead times.
Knowing your quantum computer: benchmarking, verification and
classical simulation at scale
Sergii Strelchuk
Department of Applied Mathematics and
Theoretical Physics and Centre for Quantum Information and Foundations,
University of Cambridge, United Kingdom
To ensure that a quantum device operates as expected, we need to check its functioning on two levels. At a lower level, we need to map all the noise sources and ensure they do not render our device classical. At a higher level, we need a practical method to confirm that the output produced by the quantum computer for the computational problem can be trusted. In this talk, I will explain the core ideas behind these tasks and discuss the unexpected role of classical simulability, which emerges in the above scenarios.
Quantum computing for natural sciences and
machine learning applications
Francesco Tacchino
Quantum Applications Researcher, IBM Quantum,
IBM Research – Zurich, Switzerland
The future of computing is being shaped today around
rapidly growing technologies, such as quantum and neuromorphic systems, in
combination with high performance classical architectures. In the coming
years, these innovative information processing paradigms may radically
transform and accelerate the mechanisms of scientific discovery, potentially
opening new avenues of research.
In particular, quantum computing could offer
scalable and efficient solutions for many classically intractable problems in
different domains including physics, chemistry, biology and medicine, as well
as optimisation, artificial intelligence and finance. In this talk, I will
review the state-of-the-art and recent progress in the field, both in terms
of hardware and software, and present some advanced applications, with a
focus on natural sciences, materials design and machine learning.
Data-Centric Programming for Large-Scale
Parallel Systems -
The DCEx Model
Domenico Talia
Department of Computer Engineering,
Electronics, and Systems and DtoK Lab,
University of Calabria, ITALY
For designing scalable parallel applications, data-oriented programming models are effective solutions based on the exploitation of local data structures and on limiting the amount of data shared among parallel processes. This talk discusses the main features and programming mechanisms of the DCEx programming model, designed for the implementation of data-centric large-scale parallel applications. The basic idea of the DCEx model is to structure programs into data-parallel blocks managed by a large number of parallel threads. Parallel blocks are the units of distributed-memory parallel computation, communication, and migration in the memory/storage hierarchy. Threads execute close to data, using near-data synchronization according to the PGAS model. A machine learning use case is also discussed, showing the DCEx features for exascale programming.
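To give a rough intuition of the data-parallel-block idea, here is a single-node Python analogue (illustrative only, not the DCEx API): the data is partitioned into blocks, each worker computes on its own local block, and only a small shared result per block is exchanged and combined.

```python
# Single-node analogue (illustrative only, not the DCEx API): partition the
# data into blocks, let each worker operate on its own local block, and keep
# the shared state small (here, one partial sum per block).
from concurrent.futures import ThreadPoolExecutor
import numpy as np

data = np.arange(1_000_000, dtype=np.float64)
n_blocks = 8
blocks = np.array_split(data, n_blocks)   # data-parallel blocks

def local_compute(block):
    # Work near the data: only this block is touched here.
    return float(np.sum(block * block))

with ThreadPoolExecutor(max_workers=n_blocks) as pool:
    partial_sums = list(pool.map(local_compute, blocks))

total = sum(partial_sums)   # small shared results combined at the end
print(total)
```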