High-performance computing (HPC) is the art and science of using clusters of cutting-edge computer systems to perform complex simulations, computations, and data analysis that are out of reach for standard commercial computing systems.
HPC systems are characterized by their high-speed processors, high-performance networks, and large memory capacity, giving them the ability to perform massive amounts of parallel processing. A supercomputer is a highly advanced HPC system that provides immense computational power and speed, making it a key component of high-performance computing.
In recent years, HPC has evolved from a tool focused on simulation-based scientific investigation into one that plays a dual role, running both simulation and machine learning (ML) workloads. This broadening of scope has gained momentum because combining physics-based simulation with ML has compressed the time to scientific insight in fields such as climate modeling, drug discovery, protein folding, and computational fluid dynamics (CFD).
The basic system architecture of a supercomputer.
One key enabler of this convergence of HPC and ML is the development of graphics processing unit (GPU) technology. GPUs are specialized processors designed to operate on large amounts of data in parallel, making them well suited to many HPC workloads, and they are currently the standard for ML/AI computation. The combination of high-performance GPUs and software optimizations has enabled HPC systems to run complex simulations and computations far faster than traditional computing systems.
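To illustrate the data-parallel model GPUs exploit, here is a minimal CPU-side sketch in Python of "SAXPY," a classic elementwise vector operation. Every element's computation is independent of the others, which is exactly why a GPU can assign one hardware thread per element and run thousands of them at once; the loop below is only a stand-in for that parallelism.

```python
# Data-parallel "SAXPY" (y = a*x + y), a canonical GPU kernel pattern.
# Each element's update is independent, so a GPU can dedicate one
# thread per element and compute all of them concurrently.

def saxpy(a, x, y):
    """Return a*x[i] + y[i] for every element i."""
    # On a GPU this loop disappears: thread i computes element i.
    return [a * xi + yi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(saxpy(2.0, x, y))  # [12.0, 24.0, 36.0, 48.0]
```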
High-performance computing is important for several reasons:
HPC has revolutionized the way research and engineering are conducted and has had a profound impact on many aspects of our lives, from improving the efficiency of industrial processes, to aiding disaster response and mitigation, to furthering our understanding of the world around us.
High-performance computing works by combining the computational power of multiple computers to perform large-scale tasks that would be infeasible on a single machine. Here is how HPC works:
By harnessing the collective power of many computers, HPC enables large-scale simulations, data analysis, and other compute-intensive tasks to be completed in a fraction of the time it would take on a single machine.
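The pattern described above is often called scatter-compute-gather: partition the work, compute the pieces in parallel, and combine the partial results. The sketch below illustrates it with Python's standard library, using local processes as a stand-in for cluster nodes; real HPC applications typically distribute the chunks across nodes with MPI.

```python
# Scatter-compute-gather: split a big task into chunks, process the
# chunks in parallel, then combine partial results. Real HPC codes do
# this across many nodes (e.g., with MPI); here, local worker
# processes on one machine stand in for the nodes.
from concurrent.futures import ProcessPoolExecutor

def partial_sum_of_squares(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def parallel_sum_of_squares(n, workers=4):
    # Scatter: partition the range [0, n) into roughly equal chunks.
    step = (n + workers - 1) // workers
    chunks = [(i, min(i + step, n)) for i in range(0, n, step)]
    # Compute: each worker handles one chunk independently.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_sum_of_squares, chunks)
    # Gather: reduce the partial results to the final answer.
    return sum(partials)

if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```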
A high-performance computing cluster is a collection of tightly interconnected computers that work in parallel as a single system to perform large-scale computational tasks. HPC clusters are designed to provide high performance and scalability, enabling scientists, engineers, and researchers to solve complex problems that would be infeasible with a single computer.
An HPC cluster typically consists of many individual computing nodes, each equipped with one or more processors, accelerators, memory, and storage. These nodes are connected by a high-performance network, allowing them to share information and collaborate on tasks. In addition, the cluster typically includes specialized software and tools for managing resources, such as scheduling jobs, distributing data, and monitoring performance. Application speedups are accomplished by partitioning data and distributing tasks to perform the work in parallel.
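The job-scheduling role mentioned above can be sketched with a toy first-fit scheduler, loosely in the spirit of cluster resource managers such as Slurm. The node names, policy, and data structures here are illustrative only, not the actual algorithm of any real scheduler.

```python
# A toy first-fit job scheduler: each job asks for some cores, and the
# scheduler places it on the first node with enough free capacity.
# Real resource managers (e.g., Slurm) use far richer policies; this
# sketch only shows the basic idea of matching jobs to resources.

def schedule(jobs, nodes):
    """Assign each job to the first node with enough free cores.

    jobs:  list of (job_name, cores_needed)
    nodes: dict of node_name -> free cores (mutated as jobs land)
    Returns {job_name: node_name, or None if nothing fits right now}.
    """
    placement = {}
    for name, cores in jobs:
        placement[name] = None
        for node, free in nodes.items():
            if free >= cores:
                nodes[node] = free - cores   # reserve the cores
                placement[name] = node
                break
    return placement

nodes = {"node01": 8, "node02": 8}
jobs = [("sim-a", 6), ("sim-b", 4), ("sim-c", 8)]
print(schedule(jobs, nodes))
# sim-a lands on node01, sim-b on node02; sim-c must wait in the queue.
```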
Training data speedup using traditional HPC.
Source: Adapted from graph data presented in "Convergence of Artificial Intelligence and High-Performance Computing on NSF-Supported Cyberinfrastructure," Journal of Big Data (springeropen.com).
Climate models are used to simulate the behavior of the Earth's climate, including the atmosphere, oceans, and land surfaces. These simulations can be computationally intensive and require large amounts of data and parallel computing, making them ideal for GPU-accelerated HPC systems. By using GPUs and other parallel processing techniques, climate scientists can run more detailed and accurate simulations, which in turn lead to a better understanding of the Earth's climate and the impacts of human activities. As this use case continues to progress, the predictive capabilities will grow and can be used to design effective mitigation and adaptation strategies.
The discovery and development of new drugs is a complex process that involves the simulation of millions of chemical compounds to identify those that have the potential to treat diseases. Traditional methods of drug discovery have been limited by insufficient computational power, but HPC and GPU technology allow scientists to run more detailed simulations and deploy more effective AI algorithms, resulting in the discovery of new drugs at a faster pace.
Protein folding refers to the process by which proteins fold into three-dimensional structures, which are critical to their function. Understanding protein folding is critical to the development of treatments for diseases such as Alzheimer's and cancer. HPC and GPU technology are enabling scientists to run protein-folding simulations more efficiently, leading to a better understanding of the process and accelerating the development of new treatments.
Computational fluid dynamics (CFD) simulations are used to model the behavior of fluids in real-world systems, such as the flow of air around an aircraft. HPC and GPU technology let engineers run more detailed and accurate CFD simulations, which help improve the designs for systems such as wind turbines, jet engines, and transportation vehicles of all types.
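At the heart of many CFD codes is a stencil update: each grid point is repeatedly recomputed from its neighbors. As a highly simplified illustration, here is an explicit finite-difference solver for the 1D heat equation (a toy model, not a production CFD method); the same neighbor-only communication pattern is what makes such codes straightforward to partition across HPC nodes.

```python
# Explicit finite-difference solver for the 1D heat equation
# u_t = alpha * u_xx -- the stencil-update pattern at the heart of
# many CFD and physics codes. Each point depends only on its
# neighbors, which is what makes the grid easy to split across nodes.

def diffuse(u, alpha, dx, dt, steps):
    """Advance temperatures u by `steps` explicit Euler time steps."""
    r = alpha * dt / dx**2          # must satisfy r <= 0.5 for stability
    for _ in range(steps):
        # Interior points use the classic 3-point stencil;
        # the two boundary temperatures are held fixed.
        u = [u[0]] + [
            u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
            for i in range(1, len(u) - 1)
        ] + [u[-1]]
    return u

# A hot spike in the middle of a cold rod gradually spreads out.
u0 = [0.0] * 5 + [100.0] + [0.0] * 5
print(diffuse(u0, alpha=1.0, dx=1.0, dt=0.25, steps=50))
```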
HPC and ML/AI are having a significant impact on climate modeling, which is used to simulate the behavior of the Earth's climate.
Some of the most widely used high-performance computing applications in science and engineering include:
There are many computer codes used for molecular dynamics (MD) simulations, but some of the most frequently used are:
There are several computer codes used for CFD simulations, but some of the most widely used are:
There are many computer codes used for climate modeling, but some of the most widely used are:
There are several computer codes used for computational chemistry, but some of the most widely used are:
There are many computer codes used for machine learning, but some of the most widely used are:
These codes provide a wide range of ML algorithms, including supervised and unsupervised learning, deep learning, and reinforcement learning. They’re widely used for tasks such as image and speech recognition, natural language processing, and predictive analytics, and they’re essential tools for solving complex problems in areas such as computer vision, robotics, and finance.
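Supervised learning, the most common of these paradigms, can be shown in miniature without any framework at all. The sketch below fits a line by gradient descent on mean squared error; the same loop of predict, measure loss, and follow the gradient underlies training in the deep learning libraries mentioned above (the learning rate and epoch count here are arbitrary illustrative choices).

```python
# Supervised learning in miniature: fit y = w*x + b by gradient
# descent on mean squared error. The predict / measure-loss /
# follow-the-gradient loop is the same one deep learning frameworks
# run, just scaled up to millions of parameters on GPUs.

def fit(xs, ys, lr=0.01, epochs=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2)
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noise-free data generated from y = 3x + 1; the fit should recover it.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3 * x + 1 for x in xs]
w, b = fit(xs, ys)
print(round(w, 2), round(b, 2))  # close to 3.0 and 1.0
```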
Here are some ways you can get started in high-performance computing: