Abstracts
Bridging the Data Gaps to Democratize AI in
Science, Education and Society
Ilkay Altintas
San Diego Supercomputer Center
and Workflows for Data Science (WorDS) Center of Excellence and WIFIRE Lab University of
California at San Diego, CA USA
The democratization of
Artificial Intelligence (AI) necessitates an ecosystem where data and
research infrastructure are seamlessly integrated and universally accessible.
This talk overviews the imperative of bridging the gaps between these
components through robust services, facilitating an inclusive AI landscape
that empowers diverse research communities and domains. The National Data
Platform (NDP) aims to lower the barriers to entry for AI research and
applications through an integrated services approach to streamline AI
workflows, from data acquisition to model deployment. This approach
underscores the importance of open, extensible, and equitable systems in
driving forward the capabilities of AI, ultimately contributing to the
resolution of grand scientific and societal challenges. Through examining
real case studies leveraging open data platforms and scalable research
infrastructure, the talk will highlight the role of composable systems and services in NDP in catalyzing a platform that empowers users from all backgrounds to engage in meaningful research, learning, and
discovery.
Back to
Session II
|
Towards an
Operational Crisis in HPC System Software: The File System Example
Frank Baetke
EOFS, European Open File System Organization,
Germany
The talk will address key
observations made at two EOFS workshops in 2022 and 2024, an EOFS panel at
ISC 2023 and focus sessions at the EuroHPC Summit
2024. Operational as well as educational aspects will be discussed.
Communication between
users of large HPC centers and the IT staff
responsible for system and storage/file-system management is becoming an
increasing problem as most users are unaware of or uninterested in the
operational aspects of large application programs and the associated
challenges of multiple storage hierarchies, demand scheduling, etc.
This kind of disconnect causes growing problems with load balancing, resource efficiency, and system responsiveness, and leads to frustration on both sides.
On the education side,
operating systems, storage, I/O and file systems are no longer considered
interesting and important topics in computer science/information technology
curricula. Lectures on operating systems, file systems, etc. have been
abandoned in favor of AI, web services and other
areas considered hot. Today, it is possible to earn a university degree in
computer science without ever having attended lectures on operating systems
and related middleware.
Back to Session III
|
What lies beyond the edge?
Pete
Beckman
Argonne National Laboratory, Argonne, IL, USA
AI is on the move —
bigger, smarter, and richer. Larger and more sophisticated models are being pushed to the edge. Smart infrastructure, smart sensors, and intelligent
scientific instruments are being deployed around the world in a new kind of
AI-enabled computing continuum. The
Sage (sagecontinuum.org) infrastructure allows scientists to deploy AI
algorithms to the edge (AI@Edge), to analyze and
autonomously respond to the highest resolution of data. The infrastructure
allows computer scientists to explore AI algorithms such as federated
learning, self-supervised learning, as well as
bi-directional interactions between instruments and computation. But what’s next? What lies beyond the edge?
Back to Session IV
|
Entering A New Frontier of AI Networking
Innovation
Gil
Bloch
NVIDIA, Santa
Clara, CA, USA
NVIDIA Quantum InfiniBand
and Spectrum-X Ethernet have emerged as the de facto network standards for
training and deploying AI at scale. InfiniBand’s in-network computing,
ultra-low latency, and high bandwidth capabilities have facilitated the creation of
larger and more complex foundational models. Spectrum-X is the first Ethernet
platform capable of supporting AI infrastructure, delivering networking
optimized for generative AI to hyperscale AI clouds and enterprises. We’ll
dive deep into the architectural aspects of NVIDIA Quantum-X800 InfiniBand
and Spectrum-X800 Ethernet platforms and their essential roles in
next-generation AI data center designs.
Back to Session III
|
Unlocking the Power
of AI: Leveraging Dense Linear Algebra and Large Language Models on Groq’s LPU
Ernesto Bonomi
GROQ, Mountain View, CA, USA
This presentation explores
the intersection of dense linear algebra and Large Language Models (LLMs), highlighting
Groq’s innovative Language Processing Unit (LPU)
as the foundation for a new generation of AI. By examining the LPU’s static
and deterministic dataflow computing paradigm, we will illustrate the
advantages of its SIMD architecture and scalability, demonstrating how these
enable LLMs to respond at unprecedented speed, with real-time processing
capabilities. We will also delve into the distinction between training and
inference, considering objectives, complexity, and costs. Finally, a live
demo will showcase the remarkable performance of the LPU, highlighting the
transformative potential of this technology.
Back to Session V
|
Future of HPC: Integrating quantum with
massively parallel computing
Antonio D. Corcoles
IBM Quantum, T.J. Watson Research Center, Yorktown Heights, NY, USA
As quantum computing systems
continue to scale in size and quality, and error resilience approaches start
to enable interesting computational regimes in what we call the era of
quantum utility, the integration of quantum with massively parallel computing
becomes critical to unlock the full potential of both technologies in a way
that exceeds the capabilities of either one alone. This integration is poised
to provide a rich environment for experts to experiment and optimize
resources in quantum algorithms and applications. Given the limitations in
efficiently emulating quantum applications to find optimal
implementations, direct interaction with evolving quantum hardware becomes
essential for application development. In this talk I will present the state
of the art of quantum computing and will touch on some architectural ideas
towards the integration of quantum and traditional HPC systems through a use
case that exhibits the interplay of both technologies in a heterogeneous
workflow.
Back to Session VIII
|
An Overview of High
Performance Computing and Responsibly Reckless Algorithms
Jack
Dongarra
Electrical Engineering
and Computer Science Department; Innovative Computing Laboratory, University
of Tennessee, Knoxville, TN, USA; Oak Ridge National Laboratory, USA and
University of Manchester, UK
In this talk we examine how high performance computing has changed over the last 10 years and look toward future trends. These changes have had, and will continue to have, a major impact on our software. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder.
Mixed precision numerical methods turn out to be
paramount for increasing the throughput of traditional and artificial
intelligence (AI) workloads beyond riding the wave of the hardware alone.
Reducing precision comes at the price of trading away some accuracy for
performance (reckless behavior) but in noncritical
segments of the workflow (responsible behavior) so
that the accuracy requirements of the application can still be satisfied.
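As a minimal, self-contained sketch of the mixed-precision idea described above (an illustration of classical iterative refinement in general, not the specific algorithms covered in the talk; matrix size, conditioning, and iteration count are arbitrary choices):

    import numpy as np

    def mixed_precision_solve(A, b, iters=5):
        # "Reckless" step: the expensive O(n^3) solve is done in low precision.
        # (A real code would factorize once and reuse the LU factors.)
        A32, b32 = A.astype(np.float32), b.astype(np.float32)
        x = np.linalg.solve(A32, b32).astype(np.float64)
        for _ in range(iters):
            # "Responsible" step: residual and correction in full fp64 precision.
            r = b - A @ x
            x += np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((500, 500)) + 500 * np.eye(500)  # well-conditioned test matrix
    b = rng.standard_normal(500)
    x = mixed_precision_solve(A, b)
    print("final residual norm:", np.linalg.norm(b - A @ x))

The low-precision solve runs at the much higher throughput of reduced-precision units, while the cheap full-precision residual correction restores the accuracy the application requires.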
Back to Session I
|
Quantum Computing at Leonardo: an industrial
end-user standpoint
Daniele Dragoni
Leonardo S.p.A., High Performance Computing Lab.,
Genova, Italy
Quantum Computing (QC) is an emerging paradigm that
offers the potential to solve complex problems that are considered
intractable within the classical/digital computing domain. While tangible
quantum advantages have yet to manifest in practical scenarios, numerous
industries are actively exploring its potential advantages, striving to
secure competitive edges within their respective sectors.
In my presentation, I will outline Leonardo’s
strategic approach to thoroughly evaluate the capabilities and limitations of
QC within the aerospace, security, and defense domains. I will delve into our
stance on QC from an industrial end-user perspective, illustrating examples
of ongoing initiatives and practical applications that we are pursuing
through integrated HPC and QC methodologies, aligning with national strategic
objectives.
Back to Session IX
|
Embodied agents as scientific assistants
Ian Foster
Argonne National Laboratory, Data Science and
Learning Division, Argonne, IL and Dept. of Computer Science, The University
of Chicago, Chicago, IL, USA
An embodied agent is a
computational entity that can interact with the world through a physical body
or representation and adapt its actions based on learnings from these
interactions. I discuss the potential for such agents to serve as
next-generation scientific assistants, for example by acting as Cognitive
Partners and Laboratory Assistants. In the former case, agents, with their
machine learning and data-processing capabilities, complement the cognitive
processes of human scientists by offering real-time data analysis, hypothesis
generation, and experimental design suggestions; in the latter, they engage
directly with the scientific environment on the scientist’s behalf, for
example by performing experiments in bio-labs or running simulations on
supercomputers. I invite participants to envision a future in which human
scientists and embodied agents collaborate seamlessly, fostering an era of
accelerated scientific discoveries and broader horizons of understanding. I
hope to encourage debate about what technical advances will be required to
achieve this future, how we will ensure the safe and ethical use of such
agents, how and to what end we may seek to preserve human intuition, and the
possible redefinition of scientific discovery if machines are able to
theorize and validate.
Back to Session IV
|
SimOps, a New HPC Community Initiative Focusing on
Simplifying Use and Operation of Scientific and Engineering Simulations
Wolfgang
Gentzsch
The UberCloud,
Regensburg, GERMANY and Sunnyvale, CA, USA
In today’s fast-paced, competitive world,
engineering teams are under intense pressure to design high-quality,
innovative products in record time. The driving force behind this
acceleration is the growing reliance on engineering simulation in the product
design process. With an explosion in simulation software sophistication and
almost unlimited compute power, engineering teams rely on simulation to
create high-quality breakthrough products. Today, simulation is no longer a
luxury, it’s required for survival. From designing brand new products to
improving existing ones, simulation empowers companies to innovate, validate
ideas, and compete in a global economy.
But simulation engineers still face many hurdles
that limit their productivity and the quality of products they create, and
their contribution to their companies’ next generation products. New
applications like digital twins and artificial intelligence come with new
requirements for new software and hardware capabilities that further increase
the complexity of simulations and of the underlying computing infrastructure.
What can we do to master these new challenges and reduce the operational
burden on engineers and IT to manage complex HPC environments?
In this short presentation, we will announce a new
HPC community initiative that aims at reducing the challenges and the
operational burden on engineers and IT to use, operate, and manage complex
simulation environments that come with the ever-evolving
applications, technologies, and infrastructures. We will demonstrate a set of
Best Practices that support engineers and HPC experts in simplifying use and
operation of simulation environments to make them more productive, deliver
higher-quality results, and thus contribute to the success of their company.
Back to Session II
|
Novel Methodology for Application Performance
Modelling and Evaluation
Vladimir
Getov
Distributed and
Intelligent Systems Research Group, School of Computer Science and
Engineering, University of Westminster, London, United Kingdom
Computer simulation of physical real-world phenomena
emerged with the invention of electronic digital computing and has been
increasingly adopted as one of the most successful modern methods for
scientific discovery. Arguably, the main reason for this success has been
the rapid development of novel computer technologies that has led to the
creation of powerful supercomputers, large distributed systems,
high-performance computing frameworks with access to huge data sets, and high
throughput communications. In addition, unique and sophisticated scientific
instruments and facilities, such as giant electronic microscopes, nuclear
physics accelerators, or sophisticated equipment for medical imaging are
becoming integral parts of those complex computing infrastructures.
Subsequently, the term ‘e-science’ was quickly
embraced by the professional community to capture these new revolutionary
methods for scientific discovery via computer simulations of physical
systems. The relevant application codes are typically based on finite-element
algorithms, while the computations constitute heavy workloads that
conventionally are dominated by floating-point arithmetic. Examples include
application areas such as climate modeling, plasma
physics (fusion), medical imaging, fluid flow, and thermo-evolution.
Over the years, most of the relevant benchmarking
projects have covered predominantly dense physical system simulations, in
which high computational intensity carries over when parallel implementations
are built to solve bigger problems faster. Since emphasis was on dense
problems, this approach resulted in systems with increasing computational
performance and was the presumption behind the introduction of the very
popular semi-annual Top 500 rankings of supercomputers. However, in the last
10-15 years many new applications with very high economic potential have
emerged — such as big data analytics, machine learning, real-time feature
recognition, recommendation systems, and even physical simulations — that
feature irregular or dynamic solution grids. These applications spend much
more of their computation in non-floating-point operations such as address
computations and comparisons, with addresses that are no longer very regular
or cache-friendly. The computational intensity of such programs is far less than
for dense kernels, and the result is that for many real codes today, even
those in traditional scientific cases, the efficiency of the floating-point
units that have become the focal point of modern core architectures has
dropped from >90% to <5%. This emergence of applications with
data-intensive characteristics — e.g. with execution times dominated by data
access and data movement — has been recognized recently as the “3rd Locality
Wall” for advances in computer architecture.
To highlight the inefficiencies described above, and
to identify architectures which may be more efficient, a new benchmark called
HPCG (High Performance Conjugate Gradient) was introduced several years ago.
HPCG also solves Ax = b problems, but where A is a very sparse matrix so that,
on evaluated systems, floating-point efficiency mirrors that seen in full
scientific codes. Recent detailed analysis confirms that HPCG performance in
terms of useful floating-point operations is dominated by memory bandwidth to
the extent that the number of cores and their floating-point capabilities are
irrelevant. Therefore, our selected benchmark codes that cover the “Physical
System Simulations” application area of interest are the High-Performance
LINPACK (HPL) and the HPCG. Both are very popular codes with very good
regularity of results in recent years. Our approach is to explore a
3-dimensional space — dense systems performance, sparse systems performance,
and energy efficiency for both cases. With HPL as the representative of dense
system performance and HPCG as the representative for sparse systems
performance, the available benchmarking results provide excellent
opportunities for comparisons and interpretation, as well as lay out a
relatively well-balanced overall picture of the whole application domain for
physical system simulations.
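To make the computational-intensity argument concrete, the back-of-the-envelope sketch below compares flops per byte for a dense matrix-matrix multiply and a CSR sparse matrix-vector product; the counts are simplified estimates chosen for illustration, not measurements of HPL or HPCG themselves.

    # Simplified arithmetic-intensity estimates (double precision, 8-byte values).
    def dense_gemm_intensity(n):
        flops = 2 * n**3                  # n^3 multiply-adds
        bytes_moved = 3 * n**2 * 8        # read A and B, write C (ignoring cache reuse)
        return flops / bytes_moved

    def sparse_spmv_intensity(n, nnz_per_row):
        nnz = n * nnz_per_row
        flops = 2 * nnz                   # one multiply-add per stored nonzero
        # CSR: 8-byte value + 4-byte column index per nonzero, plus vector traffic
        bytes_moved = nnz * (8 + 4) + 2 * n * 8
        return flops / bytes_moved

    print("dense GEMM,  n=10000   :", dense_gemm_intensity(10_000), "flops/byte")
    print("sparse SpMV, 27 nnz/row:", sparse_spmv_intensity(10_000, 27), "flops/byte")

With well under one flop per byte moved, the sparse kernel is limited by memory bandwidth long before the floating-point units are busy, which is exactly the behavior HPCG is designed to expose.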
Back to Session X
|
Distributed Quantum Compiling
Vlad Gheorghiu
Institute for Quantum Computing, University of
Waterloo and SoftwareQ Inc, Waterloo, Ontario,
Canada
Quantum computing’s
potential to solve complex problems hinges on the ability to scale up for the
execution of large-scale quantum algorithms. One promising approach to
scalability is distributed quantum computing, where multiple nodes, each with
a relatively small number of qubits, are interconnected via
Einstein-Podolsky-Rosen (EPR) channels. These channels are generated on
demand and exhibit stochastic behavior, presenting
unique challenges in the distribution of logical circuits across the network.
In this talk, I present a novel distributed compiling strategy tailored for
such an architecture. Our approach effectively partitions quantum circuits
and maps them onto a network of interconnected quantum nodes, optimizing for
both performance and feasibility under the constraints of stochastic EPR
channel generation. I validate the compiling strategy through a series of
benchmark circuits, demonstrating its practical application and potential for
real-world quantum computing tasks. If time permits, I will also provide a
live demonstration of our distributed compiling method in action, showcasing
its effectiveness and operational viability.
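As a toy illustration of the underlying partitioning problem (a deliberately simplified model, not the compiling strategy presented in the talk; the example circuit and the one-EPR-pair-per-remote-gate cost model are assumptions made here for illustration), the sketch below splits a small circuit across two nodes and counts the two-qubit gates that straddle the cut:

    from itertools import combinations

    # Hypothetical 6-qubit circuit: list of two-qubit gates as (control, target).
    gates = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (1, 4)]

    def epr_cost(partition_a, gates):
        """Number of gates whose two qubits sit on different nodes."""
        a = set(partition_a)
        return sum((q0 in a) != (q1 in a) for q0, q1 in gates)

    # Brute-force the best balanced bipartition of the 6 qubits into 3 + 3.
    best = min(combinations(range(6), 3), key=lambda part: epr_cost(part, gates))
    print("node A qubits:", best, "| EPR pairs needed:", epr_cost(best, gates))

A realistic compiler must additionally account for the stochastic, on-demand generation of the EPR channels, gate scheduling, and more than two nodes, which is where the optimization becomes genuinely hard.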
Back to Session VII
|
Neutral-atom quantum computing within the
Munich Quantum Valley
Alexander Glätzle
CEO and Co-Founder planqc, Munich, Germany
Quantum computers
utilizing ultracold atoms confined in optical lattices exhibit exceptional
potential for addressing computationally complex problems. These systems
provide extended qubit coherence times, eliminate manufacturing variations,
and scale to thousands of qubits, all while operating at room temperature. In
this talk, we present the development of digital quantum computers within the
Munich Quantum Valley, a collaborative effort between the Max Planck
Institute of Quantum Optics, the Leibniz Supercomputing Center,
and planqc. Our focus is on integrating a neutral
atom quantum computer into a high-performance computing environment to
achieve quantum-accelerated HPC.
Back to Session X
|
Accelerating Extreme HPC Scale-out and AI
Environments
Frank Herold
ThinkParQ GmbH, Germany
This session will outline how we have driven the development of key features and functions, and tackled key challenges, to remain a disruptive parallel file system that continues to accelerate extreme HPC environments, whilst adapting our technology for nontraditional HPC environments including AI, energy, and M&E.
Originating in 2005 at the Fraunhofer Institute for Industrial Mathematics, BeeGFS has for the past 10 years been developed and delivered globally by ThinkParQ. BeeGFS is developed on a 'Publicly Available Source Code' model with a strong focus on performance and community needs, and it currently holds over 10% market share of parallel file systems in academic/non-profit research. It is also trusted and used to accelerate some of the world's fastest supercomputers, and has worked its way to becoming the parallel file system of choice where performance matters.
Back to Session III
|
Quantum Computing and High-Performance
Computing: Rivals or Allies?
Rajeeb Hazra
QUANTINUUM, Broomfield, Colorado, USA
The rapid advancement of
quantum computing has sparked speculation about its potential to supplant
traditional high-performance computing (HPC) architectures. This keynote
delves into the pivotal question: Will quantum computing usurp HPC, or are
they destined to coexist as complementary technologies?
This keynote navigates the
convergence and divergence of quantum and classical computing paradigms. It
examines scenarios where quantum computing excels, such as cryptography and
optimization, while acknowledging the enduring relevance of HPC in domains like
weather forecasting, drug discovery, and engineering simulations. Moreover,
it explores synergistic possibilities where quantum accelerators enhance HPC
workflows, promising unprecedented computational power for scientific
discovery and technological innovation.
Back to Session I
|
Improving Future Climate Predictions with
Artificial Intelligence
Torsten
Hoefler
ETH Zurich, Full
Professor Department of Computer Science and Director Scalable Parallel
Computing Laboratory, Zurich, Switzerland
Artificial Intelligence and specifically Large
Language Models have had great impact on Science and Society at large. We
will show how those tools can be used in the context of one of humanity’s
hardest prediction challenges: the climate and future state of our planet. We
will discuss several ideas for accelerating weather and climate simulations,
using generative AI models for climate data compression, climate foundation
models, or diffusion-based operators for observation data assimilation. By harnessing these techniques, we aim to
significantly improve our understanding of future climate scenarios,
ultimately informing local and global strategies to mitigate climate change
and adapt to its effects.
Back to Session V
|
Social simulation with HPC and future Quantum
Computing
Nobuyasu Ito
RIKEN Center for
Computational Science, Kobe, Japan
Social phenomena are extremely complex, with a huge number of degrees of freedom, and the control and design of society present both great expectations and great challenges. Massively parallel supercomputers are useful for such purposes. Their performance scalability provides a flexible platform for data analysis and simulation.
Examples include vehicle traffic analysis, evacuation schedules, pandemic
preparedness, and macroeconomic design.
Back to
Session IX
|
Revolutionizing HPC and AI: The Power of
Wafer-Scale Systems
Michael James
CEREBRAS, Sunnyvale, CA, USA
Wafer-scale systems extend
the feasible space for physical simulations by multiple orders of magnitude
in strong scaling. Hundred-fold time-to-solution improvements put wafer-scale
supercomputers into a new class of scientific instruments that can provide
real-time HPC. Moreover, the computational architectures that provide strong
scaling for HPC workloads directly imply techniques for coupling simulations
with artificial intelligence.
In this talk, we will
describe the Cerebras wafer-scale platform, show
examples of hundred-fold accelerations, and introduce research directions for
AI at HPC scale.
Bio: Michael is
Founder and Chief Architect of Advanced Technologies at Cerebras,
the company that created the world’s largest and most powerful computer
processor. Michael leads the effort to reimagine the algorithmic building
blocks for the next generation of AI technologies. Prior to Cerebras, Michael was a Fellow at AMD, where he pioneered
a technique of adaptive and self-healing circuits based on cellular automata
that was applied toward distributed fault tolerant machines. Michael focuses
his career on exploration at the intersection of natural phenomena,
mathematics, and engineered machines. Michael's degree is in Molecular
Neurobiology, Computer Science and Mathematics from UC Berkeley.
Back to Session V
|
WACQT - the Swedish quantum computer effort
and testbed
Göran Johansson
Co-director WACQT and professor of Theoretical and
Applied Quantum Physics at Chalmers University of Technology in Gothenburg,
Sweden
In this talk I will give a
brief overview of the Wallenberg Center for Quantum
Technology (WACQT), which is a twelve-year, 120 M€ effort that started in 2018.
One of the two main goals
of this center is to build a Swedish
superconducting quantum computer and explore potential use-cases together
with our industrial partners.
In 2024 we also started a
testbed, where we let our Swedish researchers and industrial partners test
algorithms both on our own hardware and on IBM quantum computers.
Back to Session VII
|
Challenges of Deploying Emerging Computing
Technologies for U.S. Academic Research
Andrey Kanaev
U.S. National Science
Foundation, Program Director Office of Advanced Cyberinfrastructure Computer
and Information Science and Engineering Directorate, Alexandria, VA, USA
Novel computing paradigms are created in academic
laboratories, but their advent is driven by industry incentives and
investments. As a result, deployment of emerging technologies for scientific
computing at scale poses distinctive challenges: attracting users who are ready to adopt new ways to compute; discovering suitable application domains; allocating investments that are competitive with industry's levels; and estimating the scientific return on investment. Additionally, each novel paradigm, whether it is quantum, brain-inspired, etc., presents its own unique set of issues. In this talk we will share some of the opportunities the U.S. National
Science Foundation offers to academia to address these challenges.
Back to Session III
|
Performance evaluation of vector annealing on
NEC vector processor SX-Aurora TSUBASA
Hiroaki Kobayashi
Architecture Laboratory, Department of Computer
and Mathematical Sciences
Graduate School of Information Sciences, Tohoku
University, Japan
In this talk, I will introduce VE3.0, a vector annealer that is specially designed and implemented on NEC's vector computing platform, SX-Aurora TSUBASA, covering its features and performance evaluation results obtained using the traveling salesperson problem. I will also present the vector-quantum hybrid platform for the development of hybrid simulation and data-analysis applications. As an example, I will show the formulation of optimal rescue-resource deployment after a tsunami disaster and its performance evaluation.
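For background, a textbook way to hand the traveling salesperson problem to an annealer (the standard QUBO/Ising encoding, not necessarily the exact formulation used in this work) introduces binary variables x_{v,j} equal to 1 when city v is visited at step j and minimizes

    H = A \sum_{v}\Bigl(1 - \sum_{j} x_{v,j}\Bigr)^{2}
      + A \sum_{j}\Bigl(1 - \sum_{v} x_{v,j}\Bigr)^{2}
      + B \sum_{(u,v)} d_{uv} \sum_{j} x_{u,j}\, x_{v,j+1}

where d_{uv} are the inter-city distances, the first two terms enforce that every city appears exactly once and every step hosts exactly one city, and the penalty weight A is chosen sufficiently larger than B times the largest distance so that constraint violations are never energetically favorable.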
Back to Session VI
|
LUMI HPC Ecosystem – Today and Tomorrow
Kimmo
Koski
CSC - Finnish IT Center for Science, Espoo, Finland
LUMI – one of the most efficient supercomputers in Europe – is operating in Kajaani, in central Finland, in an old paper mill where a substantial amount of space and renewable energy is available. The system has been in production since early 2022 and is targeted to run at least until the end of 2027. LUMI is a joint effort of the European Union and 11 countries, coordinated by the Finnish IT Center for Science, CSC. The Finnish share of LUMI's total cost of 200 MEUR is 50 MEUR. This spring the Finnish government announced an investment of 250 MEUR for the follow-up supercomputer; gathering the consortium to procure and deploy the next system will thus start now.
The talk will cover the usage of the current LUMI and plans for the next one. It will describe the state of the ecosystem today and developments that will shape the future, including the use of the eco-efficient data center in Kajaani. The roles of traditional HPC, AI, and quantum computing are discussed, as well as European collaboration around these topics. A number of application examples are presented.
Back to Session III
|
Defining the quantum-accelerated
supercomputing at NVIDIA
Elica Kyoseva
Director Quantum Algorithms
Engineering, NVIDIA, Santa Clara, California, USA
Quantum computing has the potential to offer giant
leaps in computational capabilities, impacting a range of industries from
drug discovery to portfolio optimization. Realizing these benefits requires
pushing the boundaries of quantum information science in the development of
algorithms, research into more capable quantum processors, and the creation
of tightly integrated quantum-classical systems and tools. I will review the
challenges facing quantum computing, showcase how GPU computing can help, and reveal exciting developments
in tightly integrated quantum-classical computing.
Back to Session VI
|
The road to Quantum Advantage via Classical
Control and Integration
Lorenzo
Leandro
Quantum Machines inc., Milan, Italy
Key quantum algorithms that are expected to provide super-polynomial speed-ups hold immense strategic and economic potential. However, their full-scale practical implementation is still far away, requiring robust error correction and, with it, advanced control of both quantum and classical systems. In this talk, we delve into the intricacies of running such key
algorithms on an error-corrected quantum computer from a control and
classical integration standpoint. We do this by looking at a simulated end-to-end
example of running Shor’s algorithm to factorize the number 21 within a
quantum error correction code on a superconducting QPU. By analyzing the
algorithms’ resource requirements, gate fidelity, and noise tolerance, we
derive essential criteria for designing an effective quantum control system
that will do the job, and we outline what type of quantum-classical
integration will get us there.
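For orientation, the classical book-keeping around the quantum core of that example looks as follows; this sketch brute-forces the period that the error-corrected QPU would obtain via quantum order finding, so it only illustrates the arithmetic for N = 21, not the control system discussed in the talk.

    from math import gcd

    N, a = 21, 2                     # number to factor and a coprime base
    assert gcd(a, N) == 1

    # Order finding: the step a fault-tolerant QPU performs with quantum
    # phase estimation; here it is brute-forced classically for illustration.
    r = next(r for r in range(1, N) if pow(a, r, N) == 1)   # r = 6

    # Classical post-processing: r is even and a**(r//2) != -1 (mod N), so the
    # gcds below yield the nontrivial factors of 21.
    assert r % 2 == 0 and pow(a, r // 2, N) != N - 1
    y = pow(a, r // 2, N)                                    # 2**3 mod 21 = 8
    print("period:", r, "factors:", sorted((gcd(y - 1, N), gcd(y + 1, N))))  # 3, 7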
Back to Session VIII
|
Application Driven Optimizations in
High-Performance Interconnects for Supercomputing
Yutong Lu
Full Professor, School of Computer Science and
Engineering, Director, National Supercomputer Center in Guangzhou, Sun
Yat-Sen University, Guangzhou Higher education Mega Center, Guangzhou, China
The interconnect network
is a crucial component of large-scale supercomputing systems. As
supercomputing systems continue to progress, networks have been consistently
optimized. It should be noted that the ultimate goal
of all network optimizations is to serve applications. The architecture of
the network must evolve according to the communication characteristics of
applications, simultaneously eliminating redundancies to minimize unnecessary
costs. For communication middleware, it is essential to provide a better
abstraction for the network while retaining excellent performance of
underlying hardware. For applications, the design of communication schemes
should fully exploit the hardware and software features of networks. Thus, we
proposed a Unified Notifiable RMA (UNR) library to address these challenges.
Our evaluation demonstrates the performance improvements of the domain
applications on domestic supercomputers. As large-scale model training
emerges as a pivotal application in today’s supercomputing systems, we are
concentrating on critical network optimization techniques for large-scale
computing.
Back to Session II
|
Quantum Annealing Today: Updates from D-Wave
Irwan Owen
D-Wave Systems Inc., Germany and USA
Over the last few years,
D-Wave Quantum has seen customers moving from research projects in the lab to
production applications that provide business value. Our commercial-scale
hybrid solvers, real-time cloud access, and new features are enabling enterprise
and research organizations to leverage quantum technologies in more and more
ways. Join us in this session to hear about the latest results from our
customers, as well as updates and new features from D-Wave.
Back to Session VII
|
Developing a Quantum Computer with Tunable
Couplers
Riccardo
Manenti
Rigetti
Computing, Berkeley, CA, USA
As the field of quantum
computing advances, the demand for devices with higher performance and
greater qubit counts becomes more pressing. In this talk, I will outline the
evolution of our qubit architecture and elaborate on our strategy for scaling
quantum devices using superconducting qubits. I will introduce our tunable
coupler architecture and explain our implementation of parametric entangling
gates. Additionally, I will discuss the challenges in scaling, particularly
our efforts in integrating on-chip flux and microwave lines,
and present our modular approach.
Back to Session VIII
|
Moving Beyond QPU as an Accelerator:
Embracing Non-Von Neumann Approaches in Quantum
Programming Models
Stefano Markidis
KTH Royal Institute of Technology, Computer
Science Department, Stockholm, Sweden
The design of quantum
programming models has traditionally been grounded in the conceptual
framework of quantum circuits and gates introduced by David Deutsch in the
early 1980s. This framework typically envisions the Quantum Processing Unit
(QPU) as an accelerator within a host-device configuration, where the host
system offloads the program to the QPU for execution. However, quantum
computers predominantly consist of classical systems that stimulate and
measure quantum systems as black boxes, diverging significantly from the
circuit-offloading model. This abstraction is misaligned with the hardware’s
operational reality, hindering optimized implementations and limiting the
scope of operations. In contrast, concepts from non-Von Neumann
architectures—such as neuromorphic hardware and dataflow systems—utilize
abstractions like stimuli, channels, and schedules, which better align with
the nature of quantum computing systems and the physical processes they
embody. As David Deutsch originally conceptualized, computing is
fundamentally a physical process. Thus, advancing quantum programming models
should incorporate this perspective to achieve greater accuracy and physical
fidelity. By adopting physics-based programming models, we can develop
approaches to quantum computing that more accurately reflect the interactions
between classical hardware and quantum systems.
Back to Session VIII
|
Riken TRIP-AGIS and FugakuNEXT
- greatly accelerating next generation AI for Science
Satoshi Matsuoka
RIKEN Director Center
for Computational Science, Kobe and Department of Mathematical and Computing
Sciences Tokyo Institute of Technology, Tokyo, Japan
AI for Science, leveraging
high-performance computing (HPC), is set to transform scientific endeavors, accelerating innovation and societal benefits.
HPC and AI, once niche areas, are now pivotal in computer science, fueled by substantial investments in talent and
resources. This shift is evident in initiatives like Fugaku-LLM
in Japan, which utilizes 14,000 nodes of the Fugaku
supercomputer to train large-scale language models, emphasizing skill in
managing massive training operations. Concurrently, the TRIP-AGIS project at
Riken aims to integrate AI with simulation and automated experiments,
standardizing this approach across sciences in Japan to enhance innovation
cycles. These initiatives not only guide the development of the
next-generation FugakuNEXT supercomputer but also
explore key technical challenges such as optimizing data movement to boost
efficiency and capacity. These efforts are critical for advancing both AI and
traditional simulations in the upcoming post-exascale era.
Back to
Session I
|
Silicon chips made quantum
John Morton
Professor University College London – UCL,
Director of UCL Quantum Science and Technology Institute, and Co-Founder and
CTO of QUANTUM MOTION London, UK
Silicon MOS dominates
today’s information technology industry, having repeatedly replaced the
incumbent technology platform in diverse applications, but what will its role
be in quantum computing? Spins in silicon offer some of the longest quantum
coherence times of any solid-state system while cryogenic CMOS circuits of
increasing complexity have been designed and demonstrated to run at deep
cryogenic temperatures, opening a route to tightly integrating control
electronics with quantum devices. MOS devices fabricated on 300mm wafers, similar to those used in the silicon CMOS transistor
industry today, can be used to form spin qubit arrays capable of implementing
versatile quantum computing architectures. I will discuss recent progress at
Quantum Motion on MOS spin qubit devices fabricated using industrial grade
300mm wafer processing and their integration with cryogenic CMOS electronics,
showing how silicon could play a major role in the future quantum computing
industry. I will show how arrays of up to 1024 Si quantum dots can be
addressed on-chip using digital and analogue electronics and characterised
within 5 minutes, present MOS spin qubit readout fidelities in excess of 99.9% and exchange oscillations which form
the basis of two-qubit entangling gates. I will also discuss prospects for
different QC architectures based on MOS spin qubits covering the NISQ and
FTQC regimes, and requirements for control electronics.
Back to Session VII
|
Breaking the Memory Wall for Generative AI
Systems
Martin
Mueller
SambaNova Systems Inc, Palo Alto, CA, USA
Composition of Experts is an alternative approach to lowering the cost and complexity of training and serving very large AI (language) models, and to overcoming the memory wall caused by the increasing compute-to-memory ratio of modern AI accelerators. This talk describes how composition of experts, streaming dataflow, and a three-tier memory architecture can scale the memory wall.
Back to Session V
|
Toward Utility Scale Quantum Computing
Applications in Physical Science
Kevin
Obenland
Quantum Information and
Integrated Nanosystems, Lincoln Laboratory,
Massachusetts Institute of Technology MIT, Boston, MA, USA
Quantum computing provides a fundamentally new
capability that has the promise of accelerating the development of
applications in physical science. These applications include:
quantum chemistry, condensed matter systems, and high-energy-density physics,
among others. In order to assess the capabilities of
quantum computing for these applications we must identify specific problems
and parameter regimes, develop workflows that leverage quantum computing
algorithms, and assess the resources required by quantum computing
implementations used in the workflows. As part of the DARPA Quantum
Benchmarking program, MIT Lincoln Laboratory is actively developing a tool
called pyLIQTR, which provides implementations of
important quantum kernels used in the workflows of applications
in physical science. With the implementations provided by our tool, one can
measure the quantum resources required for applications at utility scale. In
this talk, I will describe the pyLIQTR tool and
show resource analysis results for problems that include:
local and periodic quantum chemistry, the Fermi-Hubbard model, and plasma
physics.
Back to Session VII
|
Charting Your Path
to Fault Tolerant Quantum Computing with Quantinuum
Nash Palaniswamy
QUANTINUUM, Broomfield, Colorado, USA
In this talk, we will
chart Quantinuum's path to true fault-tolerant
quantum computing, highlighting the critical advancements and milestones in
our fully integrated hardware and software stack. We will delve into the
latest technical progress in our QCCD Architecture, including achieving 99.9%
fidelity, addressing scalability, and introducing the first and only Level 2
resilient quantum computer with Microsoft.
The talk will illustrate
how these innovations support our journey towards fault tolerance through
real-world use cases of commercial importance across various industries, such
as fuel cell catalytic reactions, high-resolution seismic imaging, materials for
carbon capture, ammonia catalysis, quantum natural language processing for
peptide binding analysis, and fraud detection.
We will conclude with a
forward-looking perspective on our roadmap, outlining the steps we are taking
to achieve fault-tolerant quantum computing and the transformative potential
it holds for the future.
Back to Session VIII
|
Harnessing the Edge for Science
Manish
Parashar
Scientific Computing and
Imaging Institute and School of Computing University of Utah, Salt Lake City,
USA
Recent advances in edge devices are enabling data-driven, AI-enabled scientific workflows that integrate distributed data sources. Combined with pervasively available computing resources, spanning HPC to the edge, these workflows can help us understand end-to-end phenomena, drive experimentation, and facilitate important decision making.
However, despite the growth of available digital data sources at the edge,
and the ubiquity of non-trivial computational power for processing this data,
realizing such science workflows remains challenging. This talk will explore
a computing continuum spanning resources at the edge, in HPC centers and clouds, and in between, which provides abstractions that can be harnessed to support science. The talk will also
introduce recent research in programming abstractions that can express what
data should be processed and when and where it should be processed, and
autonomic middleware services that automate the discovery of resources and
the orchestration of computations across these resources.
Back to Session X
|
Advancements in HPC Integration with Quantum
Brilliance’s Room-Temperature Quantum Accelerators
Florian Preis
Quantum Brilliance GmbH, Stuttgart, Germany
In this talk, we will
delve into the latest developments in the field of quantum accelerators by
Quantum Brilliance, based on the use of NV centers in diamond to operate at
room temperature. The centerpiece of Quantum Brilliance's ongoing integration
work is the Quantum Brilliance QDK2.0, the latest version of their quantum
accelerator, which represents a significant leap forward in the practical
integration of quantum and classical computing. We will explore current HPC
integration projects that leverage the unique capabilities of the QDK.
Furthermore, we will discuss the different levels of classical
parallelization of quantum computations, which are crucial for maximizing the
efficiency and scalability of hybrid computing systems. By examining these
advancements, we aim to provide a comprehensive overview of the current
landscape and future directions for practical quantum computing.
Back to Session VII
|
Neutral Atoms at the Kiloqubit
Scale
Kristen Pudenz
Vice President of Research Collaborations, Atom
Computing, Berkeley, California, USA
Atom Computing has
demonstrated 1225 neutral atom qubits loaded in a computational array. We
will explore the technology behind this milestone, other novel technology
developed at Atom Computing, and address future development and opportunities
for collaboration.
Back to Session VI
|
CGRA Architectures for High-Performance
Computing and AI
Kentaro Sano
Team Leader, Processor Research Team, Center for
Computational Science, RIKEN, Japan
At RIKEN Center for
Computational Science (R-CCS), we have been researching future architectures
for HPC and AI. In particular, in the Processor Research Team, we are focusing on
reconfigurable computing architectures such as coarse-grained reconfigurable
array (CGRA), which can be advantageous due to limited data movement, resulting
in lower power consumption. In this talk, we introduce the concept of CGRA
and our research on RIKEN CGRA for HPC and AI with architectural exploration
for more efficient computing.
Bio:
Kentaro Sano has been the
team leader of the processor research team at RIKEN Center for Computational Science
(R-CCS) since 2017, responsible for research and development of future
high-performance processors and systems. He is also a visiting professor with
an advanced computing system laboratory at Tohoku University. He received his
Ph.D. from the graduate school of information sciences, Tohoku University, in
2000. From 2000 until 2018, he was a Research Associate and an Associate
Professor at Tohoku University. He was a visiting researcher at the
Department of Computing, Imperial College, London, and Maxeler
Technology corporation in 2006 and 2007. His research interests include
data-driven and spatial-parallel processor architectures such as a
coarse-grain reconfigurable array (CGRA), FPGA-based high-performance
reconfigurable computing, high-level synthesis compilers and tools for
reconfigurable custom computing machines, and system architectures for
next-generation supercomputing based on the data-flow
computing model.
Back to Session X
|
Drug design on quantum computers
Raffaele
Santagati
Quantum Computing
Scientist, Boehringer Ingelheim, Germany
The promising industrial applications of quantum
computers primarily rely on their anticipated ability to conduct precise and
efficient quantum chemical calculations. In computational drug discovery, the
accurate prediction of drug-protein interactions is paramount [1]. However,
several notable challenges need to be overcome to apply quantum computers to
drug design effectively.
First, efficiently computing expectation values for
observables beyond total energy is a significant challenge in fault-tolerant
quantum computing. Currently, quantum algorithms rely on nested quantum phase
estimation subroutines to calculate the expectation value of observables [2,
3]. Although quantum phase estimation is highly efficient, the frequent need
for nested quantum
phase estimations creates a bottleneck that makes computing observables
prohibitively expensive, even with the latest algorithmic advancements [4,
5]. This limitation presents a significant hurdle for quantum computing
applications in the pharmaceutical industry.
Secondly, molecular simulations at finite
temperatures are key in free energy calculations. These calculations are
crucial for determining thermodynamic quantities such as binding affinities.
However, this process can be pretty complex and
challenging due to the vast number of configurations needed. Millions of
calculations are typically required, each with a quantum computing run time
of several days, making it difficult to compete with the run times of
optimized experiments. Nevertheless, quantum computing has the potential to
provide an alternative solution [6]. For example, by simultaneously modeling
classical nuclei and quantum mechanical electrons on a quantum computer, it
may be possible to calculate thermodynamic quantities more practically and
efficiently. It may even be possible to generate thermal ensembles of
geometries and calculate thermodynamic properties like free energies directly
on a quantum computer. By overcoming these challenges, we could significantly
enhance the efficiency and applicability of molecular simulations at finite
temperatures. This could profoundly impact computational drug discovery in
the pharmaceutical industry.
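As a reminder of why these thermodynamic quantities matter (a standard textbook relation, included here only as background and not a result of the cited work), the binding free energy is tied to the measured dissociation constant K_d by

    \Delta G_{\mathrm{bind}} = R T \ln\!\left(\frac{K_d}{c^{\ominus}}\right)

so that at room temperature an error of only RT ln 10, about 1.4 kcal/mol, in the computed free energy already shifts the predicted affinity by an order of magnitude, which is why so many high-accuracy evaluations are needed.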
This talk will explore some of these challenges and
discuss potential new routes for applying quantum computers to drug design.
[1] R. Santagati, A. Aspuru-Guzik, R. Babbush, M. Degroote, L. González, E. Kyoseva, N. Moll, M. Oppel, R. M. Parrish, N. C. Rubin, M. Streif, C. S. Tautermann, H. Weiss, N. Wiebe, and C. Utschig-Utschig, Nature Physics, 1 (2024).
[2] M. Steudtner, S. Morley-Short, W. Pol, S. Sim, C. L. Cortes, M. Loipersberger, R. M. Parrish, M. Degroote, N. Moll, R. Santagati, and M. Streif, Quantum 7, 1164 (2023).
[3] T. E. O'Brien, M. Streif, N. C. Rubin, R. Santagati, Y. Su, W. J. Huggins, J. J. Goings, N. Moll, E. Kyoseva, M. Degroote, C. S. Tautermann, J. Lee, D. W. Berry, N. Wiebe, and R. Babbush, Phys. Rev. Res. 4, 043210 (2022).
[4] P. J. Ollitrault, C. L. Cortes, J. F. Gonthier, R. M. Parrish, D. Rocca, G.-L. Anselmetti, M. Degroote, N. Moll, R. Santagati, and M. Streif, Enhancing initial state overlap through orbital optimization for faster molecular electronic ground-state energy estimation (2024), arXiv:2404.08565 [quant-ph].
[5] D. Rocca, C. L. Cortes, J. Gonthier, P. J. Ollitrault, R. M. Parrish, G.-L. Anselmetti, M. Degroote, N. Moll, R. Santagati, and M. Streif, Reducing the runtime of fault-tolerant quantum simulations in chemistry through symmetry-compressed double factorization (2024), arXiv:2403.03502 [quant-ph].
[6] S. Simon, R. Santagati, M. Degroote, N. Moll, M. Streif, and N. Wiebe, PRX Quantum 5, 010343 (2024).
Back to Session IX
|
Scaling AI for
Science
Anna Scaife
University of Manchester, Manchester,
UK
The
neural scaling laws that have motivated the current generation of large AI
models (such as GPT-n) suggest that larger models trained on more data will
perform better. But while the data supporting these laws in supervised
computer vision is drawn from experiments with ImageNet or ImageNet-like
datasets, these standard benchmark datasets are highly curated. Here we ask:
Are these scaling laws reliable for practitioners in fields like cell
biology, medical imaging, remote sensing, etc., who work with qualitatively
different data types? Here I will present the first systematic investigation
of supervised scaling laws outside of an ImageNet-like context – on images of
galaxies. We use 840k galaxy images and over 100M annotations by Galaxy Zoo
volunteers, comparable in scale to ImageNet-1K. We find that while adding
annotated galaxy images provides a power law improvement in performance
across all architectures and all tasks, adding trainable parameters is
effective only for some tasks. By comparing the downstream performance of
finetuned models pretrained on either ImageNet-12k alone vs. additionally
pretrained on our galaxy images we show that our finetuned models are more
label-efficient and, unlike their ImageNet-12k-pretrained equivalents, often
achieve linear transfer performance equal to that of end-to-end finetuning.
We find relatively modest additional downstream benefits from scaling model
size, implying that scaling alone is not sufficient to address our domain
gap, and suggest that other scientific fields with qualitatively different
data from ImageNet might benefit more from in-domain adaption followed by
targeted downstream labelling.
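For reference, the neural scaling laws referred to above are usually summarized, in their simplest empirical form (quoted here as background; the exponents and constants below are fitted quantities, not results from this work), as a power-law decay of test loss with dataset size N and parameter count P:

    L(N, P) \approx \left(\frac{N_c}{N}\right)^{\alpha_N} + \left(\frac{P_c}{P}\right)^{\alpha_P} + L_{\infty}

The study described above finds that the data term does follow such a power law for galaxy images, while scaling the parameter count helps only for some tasks.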
Back
to Session IV
|
Launching the Grace Hopper Superchip on the
‘Alps’ Cloud-Native Supercomputer
Thomas Schulthess
CSCS Swiss National Supercomputing Centre, Lugano
and ETH, Zurich,
Switzerland
The 'Alps' cloud-native supercomputing
infrastructure, leveraging HPE’s Cray Shasta EX product line, features
versatile software-defined clusters (vClusters)
configured via partitions of the Slingshot network to accommodate diverse
research needs. These vClusters support various
applications from traditional HPC workloads to high-throughput tasks for the
World LHC Compute Grid and the Materials Cloud. Recently, MeteoSwiss’s
ICON-22 model commenced operations on 'Alps' utilizing a geo-distributed
configuration. This presentation will detail the deployment of NVIDIA’s Grace
Hopper superchip (GH200) within these settings. The GH200-based CG4 nodes,
integral to this recent extension, combine four
'superchips' connected by NVLink, achieving a
balanced memory system and promising performance as demonstrated by initial
applications. Despite these advances, the high energy density of the system
presents significant challenges, primarily due to increased electrical power
consumption. The system is very efficient and most
applications run at peak power.
Back to Session X
|
Progress towards large-scale fault-tolerant
quantum computing with photons
Pete
Shadbolt
Co-Founder PsiQuantum, Palo Alto, California, USA
In this talk we will
describe progress towards large-scale, fault-tolerant quantum computing with
photons. This talk will span materials innovations for high-performance
photonics, improvements in photonic component performance with an emphasis on
improved optical loss, prototype systems of entangled photonic qubits, qubit
networking, and novel high-power cryogenic cooling solutions designed for
future datacenter-scale quantum computers. We will show new prototype systems
designed to progressively overcome the key challenges to scaling up photonic
quantum computers. We will also give an overview of the architecture of
fusion-based photonic quantum computers, describe near-term systems
milestones, and give a view on the long-term roadmap to useful, fault-tolerant
machines.
Back to Session VIII
|
The Decade Ahead:
Building Frontier AI Systems for Science and the Path to Zettascale
Rick Stevens
Argonne National
Laboratory, University of Chicago, USA
The
successful development of transformative applications of AI for science,
medicine and energy research will have a profound impact on the world. The rate of development of AI capabilities
continues to accelerate, and the scientific community is becoming
increasingly agile in using AI, leading us to anticipate significant
changes in how science and engineering goals will be pursued in the future.
Frontier AI (the leading edge of AI systems) enables small teams to conduct
increasingly complex investigations, accelerating some tasks such as
generating hypotheses, writing code, or automating entire scientific
campaigns. However, certain challenges remain resistant to AI acceleration
such as human-to-human communication, large-scale systems integration, and
assessing creative contributions. Taken together these developments signify a
shift toward more capital-intensive science, as productivity gains from AI
will drive resource allocations to groups that can effectively leverage AI
into scientific outputs, while others will lag. In addition, with AI becoming the major driver of innovation in high-performance computing, we also expect major shifts in the computing marketplace over the next decade; we see a growing performance gap between systems designed for traditional scientific computing and those optimized for large-scale AI such as Large Language Models. In part as a response to these trends, but also in recognition of the role of government-supported research in shaping the future research landscape, the U.S. Department of Energy has created the FASST (Frontier AI for Science, Security and Technology) initiative.
FASST is a decadal research and infrastructure development initiative
aimed at accelerating the creation and deployment of frontier AI systems for
science, energy research, and national security.
I will review the goals of FASST and how we imagine it transforming
the research at the national laboratories.
Along with FASST, I’ll discuss the goals of the recently established Trillion Parameter
Consortium
(TPC), whose aim is to foster a community-wide effort to accelerate the creation of large-scale generative AI for science. Additionally, I'll introduce the AuroraGPT project, an international collaboration to build a series of multilingual, multimodal foundation models for science that are pretrained on deep domain knowledge to enable them to
play key roles in future scientific enterprises.
Back to Session I
|
HPC and Machine Learning for Molecular
Biology: ADMIRRAL Project Update
Frederick Streitz
Center for Forecasting and Outbreak Analytics
(CFA/CDC), USA and National AI Research Resource Task Force (NAIRR-TF) USA
and Lawrence Livermore, National Laboratory (LLNL/DOE), Livermore,
California, USA
The joint application of high performance computing (HPC) and Machine Learning (ML)
has enabled advances in a number of scientific disciplines. One of the
most powerful demonstrations has been in the area of
computational biology, where the addition of ML techniques has helped
ameliorate the lack of clear mechanistic models and often poor statistics
that have impeded progress in our understanding. I will discuss progress in
the development of a hybrid ML/HPC approach to investigate the behavior of an
oncogenic protein on cellular membranes in the context of the ADMIRRAL
(AI-Driven Machine-learned Investigation of RAS-RAF Activation Lifecycle)
Project, a collaboration between the US Department of Energy and the National
Cancer Institute.
Back to Session II
|
Provable Advantage in Quantum PAC Learning
Sergii Strelchuk
Department of Applied Mathematics and Theoretical
Physics and Centre for Quantum Information and Foundations University of
Cambridge and University of Warwick, Computer Science Department, Warwick
Quantum Centre, UK
In this talk I will
provide a gentle introduction to PAC learning and revisit the problem of characterising the complexity of Quantum PAC learning, as
introduced by Bshouty and Jackson [SIAM J. Comput. 1998, 28, 1136–1153]. Several quantum advantages
have been demonstrated in this setting; however,
none are generic: they apply to particular concept classes and typically only
work when the distribution that generates the data is known. In the general
case, it was recently shown by Arunachalam and de Wolf [JMLR, 19 (2018) 1-36]
that quantum PAC learners can only achieve constant factor advantages over
classical PAC learners.
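For context, a brief sketch of the sample-complexity bounds behind this "constant factor" statement (standard realizable-case PAC notation; \(d\) is the VC dimension of the concept class, \(\epsilon\) the target error, \(\delta\) the failure probability; the quantum bound is the Arunachalam-de Wolf result for learning from quantum examples):
\[
m_{\mathrm{classical}}(\epsilon,\delta) = \Theta\!\left(\frac{d}{\epsilon} + \frac{\log(1/\delta)}{\epsilon}\right),
\qquad
m_{\mathrm{quantum}}(\epsilon,\delta) = \Theta\!\left(\frac{d}{\epsilon} + \frac{\log(1/\delta)}{\epsilon}\right),
\]
i.e., the two coincide up to constant factors, so any generic quantum advantage must come from extending the learning model itself.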
We show that with a
natural extension of the definition of quantum PAC learning used by
Arunachalam and de Wolf, we can achieve a generic advantage in quantum
learning.
The talk is based on https://eccc.weizmann.ac.il/report/2023/142/
Back to Session VII
|
Modular
Supercomputing, HPC and AI
Estela
Suarez
Juelich Research Center,
Juelich, Germany
For the major technology providers, HPC is a small market with a low return on
investment, for which it does not pay off to develop specific products.
Therefore, over the last 30 years they have designed their products with the
much larger volumes of the server market in mind, and we have taken them
off-the-shelf to build HPC clusters. Now the industry has moved its focus
towards the cloud and, most recently, towards the exploding AI market
dominated by hyperscalers, who build their own devices and dictate the design
of upcoming CPUs and accelerators, which are tailored to the requirements of
AI applications. To benefit from these new products, we in HPC must ask
ourselves how the specific requirements of our traditional HPC applications
differ from those of the most popular AI models, and how we can make them
compatible with each other. We also need to rethink what HPC systems should
look like to serve the needs of both the HPC and AI application domains. In this
talk, we discuss how the Modular Supercomputing Architecture can be a vehicle
to achieve this goal.
Bio
Prof. Dr. Estela Suarez is Joint Lead of the department “Novel
System Architecture Design” at the Jülich Supercomputing Centre, which she
joined in 2010. Since 2022 she has also been Associate Professor of High
Performance Computing at the University of Bonn, and a member of the RIAG
(Research and Innovation Advisory Group of the EuroHPC JU). Her research
focuses on HPC system architecture and codesign. As leader of the DEEP
project series she has driven the development of the
Modular Supercomputing Architecture, including hardware, software and
application implementation and validation. She also leads the codesign
efforts within the European Processor Initiative. She holds a PhD in Physics
from the University of Geneva (Switzerland) and a Master's degree in
Astrophysics from the Complutense University of Madrid (Spain).
Back to Session IV
|
Breaking the HPC Communication Wall with Tightly-coupled Supernodes
Samantika Sury
SAMSUNG Electronics
America, Westford, MA, USA
Today's large-scale HPC
systems generally provide high-performance heterogeneous nodes connected via a
high-performance network fabric such as Ethernet or InfiniBand.
One challenge with such a system architecture is that utilizing the
accelerators inside a node is still difficult due to the costs of
offloading and data movement. Another challenge is
the significant performance cliff once communication leaves the node, due to
bandwidth, latency and software overheads. A more scalable system architecture in
HPC and AI is possible through the aggregation of tightly-coupled
nodes into a “Supernode” augmented with a memory
model for productive programming before accessing a scale-out network
fabric. With industry innovations like NVLink, CXL 3.0 and UALink,
the future of datacenters is also trending in this
direction and joint innovation in this area will be key to future scalable
system architectures. Keeping in mind the theme of the workshop “State of the
Art, Emerging Disruptive Innovations and Future Scenarios in HPC”, this talk
will discuss the value proposition of tightly-coupled
Supernodes to improve communication for HPC and AI,
some industry trends that are driving this direction and point out some
challenges to overcome.
Back to Session III
|
Accelerating Progress in Delivering Clean
Energy Fusion for the World with AI, ML, and Exascale Computing
William
Tang
Princeton University
Dept. of Astrophysical Sciences, Princeton Plasma Physics Laboratory; Center for Statistics and Machine Learning (CSML) and
Princeton Institute for Computational Science & Engineering (PICSciE), Princeton University, USA
The US goal (March 2022)
to deliver a Fusion Pilot Plant [1] has underscored the urgency of accelerating
the fusion energy development timeline. Validated scientific and engineering
advances driven by Exascale Computing together with advanced statistical
methods featuring artificial intelligence/deep learning/machine learning
(AI/DL/ML) must properly embrace Verification, Validation, and Uncertainty
Quantification (VVUQ) to truly establish credibility. Especially time-urgent in the Clean Energy
Fusion grand challenge application domain is the need to predict and avoid
large-scale “major disruptions” in tokamak systems.
Disruption prediction has enjoyed great progress through the use of
high-dimensional signals, modern deep learning methods, and multi-device
training and testing. We expect accelerated progress through additional
architectural improvements such as transformers, as well as multi-time-scale
models (e.g., temporal convolutions) that take advantage of the wide range of
natural temporal scales in the measured diagnostic signals. Integrating
additional multimodal signals (such as frequency-domain signals, ECEi
data, 2D radiation profiles, etc.) into a single model provides further
opportunities for performance improvement.
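As a purely illustrative, hypothetical sketch of the multi-time-scale temporal-convolution idea mentioned above (not the FRNN implementation; PyTorch and all names here are assumptions), a dilated 1D-convolution stack over diagnostic-signal channels that emits a per-time-step disruption score could look like:

# Hypothetical sketch of a multi-scale temporal-convolution disruption
# predictor; illustrative only, not the FRNN code discussed in the talk.
import torch
import torch.nn as nn

class DilatedTCN(nn.Module):
    def __init__(self, n_channels: int, hidden: int = 64, levels: int = 4):
        super().__init__()
        layers, in_ch = [], n_channels
        for i in range(levels):
            dilation = 2 ** i  # doubling dilation covers progressively longer time scales
            layers += [nn.Conv1d(in_ch, hidden, kernel_size=3,
                                 padding=dilation, dilation=dilation),
                       nn.ReLU()]
            in_ch = hidden
        self.tcn = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, 1, kernel_size=1)  # per-time-step score

    def forward(self, x):  # x: (batch, channels, time)
        return torch.sigmoid(self.head(self.tcn(x)))  # "disruption score" in [0, 1]

# Example: 16 diagnostic channels sampled over 1000 time steps.
model = DilatedTCN(n_channels=16)
scores = model(torch.randn(8, 16, 1000))  # -> (8, 1, 1000)

Each dilation level roughly doubles the temporal receptive field, which is one simple way to cover the wide range of natural time scales in the diagnostic signals within a single model.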
Foundation model-type efforts have especially promising potential for
impact. The general framework for enabling major advances can come from the
rapidly evolving LLMs and image recognition models. Associated
international R&D efforts such as the "Trillion Parameter
Consortium" [https://tpc.dev/tpc-european-kick-off-workshop] are already
focusing on the training of multi-billion parameter models on a mix of
experimental and simulation data. With rapidly advancing modern technology,
this can quickly lead to the fine-tuning of huge models multiple times into
several smaller distilled models in the category of “Multitask Learning for
Complex & Diverse Control Needs.”
This presentation will highlight the deployment of
recurrent and convolutional neural networks in Princeton's Deep Learning code
"FRNN", which enabled the first adaptable predictive DL model
for carrying out efficient "transfer learning" while delivering
validated predictions of disruptive events across major international
tokamak devices [2]. Moreover, the
AI/DL capability -- in an "understandable sense" -- can provide not
only a “disruption score,” as an indicator of the probability of an
imminent disruption, but also a “sensitivity score” in real time to indicate
the underlying reasons for the predicted disruption [3]. A real-time prediction and control
capability has recently been significantly advanced with a novel surrogate
model/HPC simulator ("SGTC") [4] -- a first-principles-based
prediction and control surrogate necessary for projections to future
experimental devices (e.g., ITER, FPPs) for which no "ground
truth" observational data exist.
Finally, an exciting and rapidly developing area
that cross-cuts engineering design with advanced
visualization capabilities involves AI-enabled advances in Digital Twins,
with the FES domain providing stimulating exemplars. This has also witnessed
prominent recent illustrations of the increasingly active collaborations with
leading industry partners such as NVIDIA, which enabled productive advances
for tokamak digital twins with dynamic animations of the advanced AI-enabled
surrogate model SGTC [4] and NVIDIA's "Omniverse" visualization tool
[5]. More generally, the scientific
merits of Digital Twins are well analyzed in the
recent US National Academies report on “Foundational Research Gaps and Future
Directions for Digital Twins” [6].
REFERENCES:
[1]
https://www.whitehouse.gov/ostp/news-updates/2022/04/19/readout-of-the-white-house-summit-on-developing-a-bold-decadal-vision-for-commercial-fusion-energy/
[2] Julian Kates-Harbeck, Alexey Svyatkovskiy, and William Tang,
"Predicting Disruptive Instabilities in Controlled Fusion Plasmas
Through Deep Learning," Nature 568, 526 (2019).
[3] William Tang et al., Special Issue on Machine Learning Methods in Plasma
Physics, Contributions to Plasma Physics (CPP), Volume 63, Issue 5-6 (2023).
[4] Ge Dong et al., "Deep Learning-Based Surrogate Model for First-Principles
Global Simulations of Fusion Plasmas," Nuclear Fusion 61, 126061 (2021).
[5] William Tang et al., "AI-Machine Learning-Enabled Tokamak Digital Twin,"
Proceedings of the 2023 IAEA FEC, London, UK (2023).
[6]
https://www.nationalacademies.org/our-work/foundational-research-gaps-and-future-directions-for-digital-twins
(2023).
Back to Session IV
|
The National Science Data Fabric:
Democratizing Data Access for Science and Society
Michela
Taufer
The University of
Tennessee, Electrical Engineering and Computer Science Dept. Knoxville, TN,
USA
The National Science Data Fabric (NSDF) pilot
project is a transformative initiative to democratize data-driven sciences
through a cyberinfrastructure platform that ensures equitable access. By
integrating a programmable Content Delivery Network (CDN), NSDF achieves
interoperability across various computing environments, enabling seamless
computing, storage, and networking integration. This strategy enables the
efficient development of community-driven solutions and domain-specific
advancements. A key element of NSDF’s approach is its dedication to community
education and outreach, especially through collaborations with
minority-serving institutions, to ensure widespread access. Our presentation
will introduce the shared, modular, and containerized NSDF environment,
designed to bridge significant gaps in the national computational
infrastructure and tackle the ‘missing millions’ in STEM talent. We will
highlight NSDF’s commitment to fostering an inclusive, diverse workforce and
its efforts towards collective success in various fields, including material
sciences, astrophysics, and earth sciences. Through testimonials and live
demonstrations, we will showcase the impactful services provided by NSDF to
support global science and engineering goals and to engage the broader
scientific community effectively.
Vita:
Dr. Michela Taufer is an AAAS Fellow and ACM Distinguished Scientist; she
holds the Dongarra Professorship in High-Performance Computing in the
Department of Electrical Engineering and Computer Science at the University
of Tennessee Knoxville (UTK). She earned her undergraduate degree in Computer
Engineering from the University of Padova (Italy) and her doctoral degree in
Computer Science from the Swiss Federal Institute of Technology (ETH) Zurich
(Switzerland). From 2003 to 2004, she was a La Jolla Interfaces in Science
Training Program (LJIS) Postdoctoral Fellow at the University of California
San Diego (UCSD) and The Scripps Research Institute (TSRI), where she worked
on interdisciplinary projects in computer systems and computational
chemistry.
Dr. Taufer’s commitment to interdisciplinary collaboration has been a
constant throughout her career, with a particular passion for connecting
computational and experimental sciences. Her research targets designing and
implementing cyberinfrastructure solutions that leverage high-performance
computing, cloud computing, and volunteer computing. She also focuses on the
application of HPC in artificial intelligence and machine learning, is
dedicated to enhancing algorithms and workflows for scientific applications,
and advocates for reproducibility, replicability, and transparency in
scientific research while pushing the boundaries of in situ and in transit
data analytics.
Dr. Taufer has led several National Science Foundation collaborative
projects and has served in leadership roles at HPC conferences such as ISC
and IEEE/ACM SC. Beyond research and leadership, Dr.
Taufer has been influential on steering committees and editorial boards,
currently serving as the editor-in-chief of the journal Future Generation
Computer Systems. Her commitment to growing a diverse and inclusive community
of scholars is evident in her mentorship of students across a spectrum of
interdisciplinary research.
Back to Session III
|
AI is changing our
world but can it be run sustainably?
Scott Tease
Lenovo, Vice President HPC
and AI, Morrisville, NC, USA
AI is rapidly changing how we see and interact with our world, but the power
required to run it is creating problems for data centers, power delivery
infrastructure and, potentially, the long-term health of the
environment. In this talk we will look
at how AI can be designed and run more sustainably, from server design to data
center operation. The power grid, the
environment and IT budgets all mandate that we rethink the way we operate and
cool these amazing AI systems.
Bio:
Scott is Vice President and General Manager of Lenovo’s Artificial
Intelligence (AI) and High Performance Computing (HPC) businesses; he is also
the lead executive for Lenovo’s data-center-focused environmental and
sustainability efforts. He has been with Lenovo since 2014, following the
acquisition of IBM’s System x team. Prior to this, he spent fourteen years as
a member of the IBM System x executive team.
He and his team are responsible for Lenovo’s end-to-end AI and HPC
strategies, focused on leadership in the mid-market and a strong presence in
the TOP500. Lenovo is focused on
bringing ‘exascale’-level capabilities to users at ‘everyscale’ while doing
so as sustainably as possible.
Back
to Session I
|
Our first move and second step toward
"HPC-Oriented" Quantum-HPC Hybrid platform software
Miwako Tsuji
RIKEN Center for Computational Science, Kobe, Japan
We started the development of the quantum-HPC hybrid platform last year. In
this talk, we present several prototype implementations and preliminary
experimental results. We also present the design of the quantum-HPC hybrid
platform software, which was defined based on the preliminary experiments and
extensive discussions with researchers in quantum hardware, quantum SDKs, and
related areas.
Our "HPC-oriented" design exploits supercomputers’ performance efficiently
and provides flexible solutions to support different kinds of quantum-HPC
hybrid applications.
Back to Session IX
|
Navigating AI’s impact on energy efficiency
and resource consumption
Andrew Wheeler
HPE Fellow & VP, Hewlett Packard Labs, Fort
Collins, CO, USA
The world has recently
witnessed an unprecedented acceleration in the demand for machine learning
and AI applications. This spike in demand has imposed tremendous strain on
today’s technology performance, power, and energy consumption. Future trends
indicate unsustainable spending and a widening technology gap. This talk will
examine promising technologies with the potential to bring orders-of-magnitude
improvements to the growing cost, energy, and performance challenges.
Back to Session I
|
Lessons Learned from Pre-training Large
Language Models
Rio
Yokota
Tokyo Institute of
Technology, Tokyo, Japan
Since the release of ChatGPT, there have been many
efforts to pre-train large language models (LLMs) with capabilities similar
to those of ChatGPT. In Japan, there are efforts to train LLMs with strong
Japanese capabilities and a good understanding of Japanese culture. However,
since English is the dominant language on the internet, it is difficult to
find high-quality Japanese text data in quantities comparable to the English
data commonly used to train LLMs. There are also many challenges with the
training itself, since many types of distributed parallelism need to be
combined to extract the full potential of GPU supercomputers. Since the runs
can take months on thousands of nodes, hardware failure is another problem
that cannot be neglected. In this talk I will summarize the lessons learned
from three different projects in Japan to pre-train LLMs.
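As a hedged, back-of-the-envelope illustration of why these parallelism types must be combined carefully (the numbers and the helper below are hypothetical, not taken from the projects discussed), the data-, tensor-, and pipeline-parallel degrees must multiply to the total GPU count:

# Hypothetical sanity check for a 3D-parallel pre-training configuration;
# all numbers are illustrative, not from the projects discussed in the talk.
def check_parallel_config(total_gpus: int, tensor_parallel: int,
                          pipeline_parallel: int) -> int:
    """Return the implied data-parallel degree, or raise if the split is invalid."""
    model_parallel = tensor_parallel * pipeline_parallel
    if total_gpus % model_parallel != 0:
        raise ValueError("tensor * pipeline degrees must divide the total GPU count")
    return total_gpus // model_parallel  # number of data-parallel replicas

# Example: 1,000 nodes with 8 GPUs each, tensor-parallel 8, pipeline-parallel 10
# -> 100 data-parallel replicas of the model.
print(check_parallel_config(total_gpus=8000, tensor_parallel=8, pipeline_parallel=10))

On runs that last months on thousands of nodes, hardware failures are expected, so checkpoint frequency is typically planned alongside the parallelism layout to bound the work lost per failure.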
Back to Session IV
|