10 Fastest Supercomputers in the World in 2021

Have you ever wondered what makes a computer a supercomputer? Is it the number of processors or the amount of RAM? Does a supercomputer have to occupy a certain amount of space or use a particular amount of energy?

The Control Data Corporation (CDC) 6600, the first supercomputer in the world, only had a single CPU. The CDC 6600, which was released in 1964, was about the size of four file cabinets.

It cost around $8 million and operated at up to 40 MHz, reaching a peak performance of 3 million flops.

Below, we present ten of the fastest supercomputers in the world and the work they do to shape our future.

Fugaku supercomputer


Fujitsu and Riken designed Fugaku, the world’s top-ranked supercomputer, to achieve high efficiency across a wide variety of software applications, with full service intended to start in fiscal year 2021.

The supercomputer, built around the Fujitsu A64FX microprocessor, takes its name from an alternate name for Mount Fuji. Fugaku can perform more than 442 quadrillion computations a second, about three times as many as the U.S.-developed Summit supercomputer.

The Japanese supercomputer, which has been jointly built at Riken’s Kobe facility with Fujitsu Ltd., forms a critical foundation for powerful simulations used in scientific research and industrial and military technology development.

The awards were announced on June 22 at ISC High Performance 2020 Digital, an international high-performance computing conference: for the first time in history, the same supercomputer simultaneously took first place on the Top500, HPCG, and Graph500 lists.

Using 152,064 of its 158,976 nodes, it achieved a LINPACK score of 415.53 petaflops on the Top500, far higher than its nearest rival, Summit in the United States, at 148.6 petaflops.
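To put these LINPACK figures in proportion, here is a minimal Python sketch that uses only the numbers quoted above. The variable names are illustrative, and the per-node value is a simple average rather than a measured quantity.

```python
# Figures quoted in the article for the June 2020 Top500 run.
FUGAKU_PFLOPS = 415.53
SUMMIT_PFLOPS = 148.6
NODES_USED = 152_064
NODES_TOTAL = 158_976

# How many times Summit's score Fugaku achieved.
speedup = FUGAKU_PFLOPS / SUMMIT_PFLOPS

# Fraction of the machine that took part in the benchmark run.
node_share = NODES_USED / NODES_TOTAL

# Average contribution per node, converted from petaflops to teraflops.
per_node_tflops = FUGAKU_PFLOPS * 1_000 / NODES_USED

print(f"Fugaku vs Summit: {speedup:.2f}x")
print(f"Share of nodes used: {node_share:.1%}")
print(f"Average per node: {per_node_tflops:.2f} teraflops")
```

The ratio comes out a little under three, and the benchmark run used roughly 96 percent of the machine.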

On HPCG, using 138,240 nodes, it scored 13,400 teraflops. On HPL-AI, using 126,720 nodes, it scored 1.421 exaflops, the first time a machine has achieved an exascale result on any list.

Fugaku was developed as part of Japan’s initiative to build a next-generation flagship supercomputer. At the RIKEN Center for Computational Science (R-CCS) in Kobe, the machine will be used to run a wide range of applications that address high-priority social and scientific issues.

Sierra supercomputer


The Sierra supercomputer serves the three NNSA nuclear security laboratories, Lawrence Livermore (LLNL), Los Alamos, and Sandia National Laboratories, where it provides high-fidelity simulations to support the core mission of NNSA: ensuring the security, safety, and efficiency of the nation’s nuclear stockpile.

The arrival of Sierra follows years of procurement, development, implementation of codes, and deployment. The work involved hundreds of computer scientists, developers, and operations staff, in close collaboration with IBM, NVIDIA, and Mellanox.

Sierra has a peak performance of 125 petaFLOPS, or 125 quadrillion floating-point operations per second. Depending on the application, it is built to perform at least six to ten times better than Sequoia, the 20-petaFLOP system at LLNL that is currently the eighth-fastest supercomputer in the world.

With a 125 petaFLOP/s peak, the IBM-built Sierra supercomputer offers four to six times the sustained performance and five to seven times the workload performance of Sequoia.

Sierra is also about five times more energy-efficient than Sequoia, drawing about 11 megawatts. It integrates two kinds of processor chips: POWER9 processors from IBM and Volta graphics processing units (GPUs) from NVIDIA.

It is designed to make overall operations more efficient and is a promising architecture for extreme-scale computing.

Sunway TaihuLight


Performing 93 quadrillion operations per second, Sunway TaihuLight topped the list of the 500 most powerful supercomputers in the world, dethroning China’s Tianhe-2.

With 10,649,600 computing cores across 40,960 nodes, Sunway TaihuLight is twice as fast and three times as efficient as Tianhe-2, which performs 33.86 quadrillion calculations per second.

The new system was developed by China’s National Research Center of Parallel Computer Engineering & Technology and installed at the National Supercomputing Center in Wuxi.

It has also proved to be more energy efficient than its predecessor, Tianhe-2, which had held the top spot on the previous six Top500 lists.

One watt of electricity sustains 6 billion calculations per second on Sunway TaihuLight. For the same calculations, it uses just a third of the energy of the Chinese-developed Tianhe-2, which recorded 33.86 PFlops.
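The efficiency figure can be turned into an approximate power draw. The sketch below is a back-of-the-envelope estimate based only on the numbers quoted in this section. It assumes that "6 billion calculations per watt" means 6 billion operations per second per watt, and that "a third of the energy" means Tianhe-2 achieves one third of that efficiency.

```python
# Figures quoted in the article.
TAIHULIGHT_FLOPS = 93e15           # 93 quadrillion operations per second
TAIHULIGHT_FLOPS_PER_WATT = 6e9    # assumed: operations per second per watt
TIANHE2_FLOPS = 33.86e15

# Power needed to sustain the quoted rate at the quoted efficiency.
taihulight_megawatts = TAIHULIGHT_FLOPS / TAIHULIGHT_FLOPS_PER_WATT / 1e6

# Assumption: using a third of the energy for the same work implies
# that Tianhe-2 delivers a third of the operations per watt.
tianhe2_flops_per_watt = TAIHULIGHT_FLOPS_PER_WATT / 3
tianhe2_megawatts = TIANHE2_FLOPS / tianhe2_flops_per_watt / 1e6

print(f"TaihuLight implied power: {taihulight_megawatts:.1f} MW")
print(f"Tianhe-2 implied power:   {tianhe2_megawatts:.1f} MW")
```

Under these assumptions both machines draw a similar amount of power, in the region of 15 to 17 megawatts, while TaihuLight does nearly three times the work.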

China channeled 1.8 billion yuan (273 million U.S. dollars) into the development of Sunway TaihuLight. About one-third came from the central government, and the other two-thirds was shared by the local governments of Jiangsu province and Wuxi.

Centered on a homegrown processor, the Sunway TaihuLight system shows the substantial progress China has made in developing and producing large-scale computer systems. Each compute node in this system is based on the SW26010 processor, a many-core processor chip.

Each processor consists of four Management Processing Elements (MPEs), four clusters of Computing Processing Elements (CPEs) with 64 cores each, four Memory Controllers (MCs), and a Network on Chip connected to the system interface, for 260 cores in total.

Each group of one MPE, one CPE cluster, and one MC has access to 8 GB of DDR3 memory. The entire system contains 40,960 nodes.
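The core counts in this section can be cross-checked with a few lines of Python. This is a sketch under two stated assumptions: each of the four CPE groups holds 64 cores (the breakdown that reproduces the 260-core figure), and each node contains exactly one processor.

```python
# Assumed layout of one SW26010 processor.
GROUPS_PER_PROCESSOR = 4
MPES_PER_GROUP = 1
CPES_PER_GROUP = 64          # assumption: 64 CPE cores per group
MEMORY_PER_GROUP_GB = 8

# System size quoted in the article; one processor per node is assumed.
NODES = 40_960

cores_per_processor = GROUPS_PER_PROCESSOR * (MPES_PER_GROUP + CPES_PER_GROUP)
total_cores = cores_per_processor * NODES
memory_per_node_gb = GROUPS_PER_PROCESSOR * MEMORY_PER_GROUP_GB

print(f"Cores per processor: {cores_per_processor}")
print(f"Total cores: {total_cores:,}")
print(f"Memory per node: {memory_per_node_gb} GB")
```

The total matches the 10,649,600 computing cores quoted earlier for the full machine.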

HPC5 supercomputer


HPC5 is a cluster: a set of many computers working together to increase total output. Its capacity is three times that of HPC4, its predecessor.

Its hybrid architecture maximizes total performance while keeping energy consumption to a minimum. Both HPC5 and HPC4 are equipped with GPUs, which are designed to process large volumes of data simultaneously with much greater efficiency.

Indeed, these GPUs can carry out more than 10,000 million operations using just one watt of electricity. Furthermore, every node comes with two CPU sockets; for graphics accelerators, HPC5 has four sockets per node while HPC4 had two.

In total, it has access to more than 3,400 computing processors and 10,000 graphics cards.

This level of performance means that it can process subsoil data using highly sophisticated in-house algorithms. Geophysical and seismic data obtained from all over the world is sent to HPC5 for processing.

The system creates extremely in-depth subsoil models using this data.

Based on these models, geologists can determine what is hidden several kilometers below the surface. Indeed, that is how Eni, the energy company that operates HPC5, found Zohr, the largest gas field ever discovered in the Mediterranean.

Tianhe-2

Tianhe-2, developed in central China’s Changsha city by the National University of Defense Technology, can sustain 33.86 petaflops, which is the same as performing 33,860 trillion operations per second. It takes a total of 3,120,000 processor cores, in Intel’s Ivy Bridge Xeon and Xeon Phi chips, to reach this figure.
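The unit conversion here, and the average throughput per core, can be checked with a short Python sketch. It uses only the two figures quoted above; the per-core value is a plain average across all cores, which glosses over the difference between the Xeon and Xeon Phi chips.

```python
# Figures quoted in the article.
SUSTAINED_PETAFLOPS = 33.86
CORES = 3_120_000

# 1 petaflops = 1,000 teraflops, and "tera" means trillion.
teraflops = SUSTAINED_PETAFLOPS * 1_000

# Average sustained rate per core, in gigaflops.
flops = SUSTAINED_PETAFLOPS * 1e15
gigaflops_per_core = flops / CORES / 1e9

print(f"{teraflops:,.0f} trillion operations per second")
print(f"Average per core: {gigaflops_per_core:.2f} gigaflops")
```

Averaged this way, each core contributes a little under 11 billion operations per second.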

It was funded by the Chinese government’s State Hi-tech Plan, also known as the 863 Hi-tech Programme. The goal was to make the country less dependent on overseas rivals and to make its high-tech industry more competitive.

Many of its features were developed in China and are unique to the system. They include:

  1. A custom-designed interconnect network that routes data across the system.
  2. 4,096 Galaxy FT-1500 CPUs (central processing units), developed by the National University of Defense Technology and installed to handle particular weather forecasting and national defense applications.
  3. The Kylin operating system, named after a mythical beast known as the “Chinese unicorn” and developed by the university as a high-security alternative for users in government, defense, electricity, aerospace, and other critical industries.

On paper, the output of Tianhe-2 is nearly double that of the next machine on the list. According to the National University of Defense Technology, Tianhe-2 will be used for simulation, research, and government security applications.

But with a $385 million price tag and a footprint of 7,750 square feet, some people think it is a bit of overkill for what it is going to be used for.

Marconi 100


The new accelerated cluster MARCONI 100, acquired by Cineca under the European PPI4HPC initiative, is based on the IBM POWER9 architecture and NVIDIA Volta GPUs.

This system paves the way for the pre-exascale Leonardo supercomputer, scheduled to be completed in 2021. It has been open to Italian public and industrial researchers since April 2020. Its computing capacity is around 32 PFlops.

According to Cineca, Marconi 100 provides a theoretical peak of almost 32 petaFLOPS, or 32 quadrillion calculations per second.

Marconi 100 will serve European researchers through the PRACE project and Italian researchers through the Italian SuperComputing Resource Allocation initiative (ISCRA).

Summit Supercomputer


Summit, billed at its launch as the smartest, most powerful supercomputer in the world, was unveiled by Oak Ridge National Laboratory.

Summit delivers 200 million billion floating-point operations per second (200 petaFLOPS) for high-performance computing (HPC) applications and 3 billion billion (3 × 10^18) operations per second (3 exaops) for AI applications, using approximately 28,000 NVIDIA Volta Tensor Core GPUs.
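Phrases such as "million billion" are easy to misread, so it helps to convert them to SI prefixes mechanically. The helper below is an illustrative sketch; `with_prefix` and the `PREFIXES` table are names made up for this example, not part of any library.

```python
# SI prefixes from largest to smallest, as powers of ten.
PREFIXES = [
    (10**18, "exa"),
    (10**15, "peta"),
    (10**12, "tera"),
    (10**9, "giga"),
    (10**6, "mega"),
]


def with_prefix(value: int, unit: str = "FLOPS") -> str:
    """Format a rate using the largest SI prefix that fits."""
    for factor, name in PREFIXES:
        if value >= factor:
            return f"{value / factor:g} {name}{unit}"
    return f"{value:g} {unit}"


# "200 million billion" and "3 billion billion", spelled out.
hpc_rate = 200 * 10**6 * 10**9
ai_rate = 3 * 10**9 * 10**9

print(with_prefix(hpc_rate))        # HPC workloads
print(with_prefix(ai_rate, "ops"))  # AI workloads
```

Written out this way, 200 million billion is 2 × 10^17, which falls in the peta range, while 3 billion billion is 3 × 10^18, which falls in the exa range.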

Summit’s ability to combine HPC and AI techniques will provide researchers with the ability to automate, accelerate, and drive developments in areas such as health, electricity, and engineering.

Summit runs eight times faster than its predecessor, the Titan supercomputer. It will change the exploration of AI and help scientists and researchers continue to explore new technologies.

With Summit, the use of techniques such as machine learning and deep learning on a large scale can improve many aspects of our economy, including health care and the development of electricity.

Piz Daint supercomputer


The computational power of Piz Daint, a supercomputer at the Swiss National Supercomputing Centre, is 7.8 petaFlops, which means 7.8 quadrillion math operations per second.

In one day, this supercomputer can compute more than a typical laptop could calculate in 900 years.

With a total of 5,272 computing nodes, Piz Daint is a 28-cabinet Cray XC30 system. Each compute node is fitted with a 64-bit, 8-core Intel Sandy Bridge CPU (Intel® Xeon® E5-2670), an NVIDIA® Tesla® K20X with 6 GB of GDDR5 memory, and 32 GB of host memory.

The nodes are interconnected with a dragonfly network topology through Cray’s proprietary “Aries” interconnect.

Trinity supercomputer


The Trinity supercomputer is designed to provide the NNSA Nuclear Security Enterprise with enhanced computing capability to support ever-demanding workloads, e.g., to increase geometric and physics fidelity while maintaining total solution time requirements.

Trinity’s capabilities are needed to support the certification and evaluation work of the NNSA Stockpile Stewardship program, which ensures that the nation’s nuclear stockpile is safe, stable, and secure.

The Trinity project is managed and run by Los Alamos National Laboratory and Sandia National Laboratories under the Alliance for Computing at Extreme Scale (ACES) collaboration.

The computer is located in Los Alamos at the Nicholas Metropolis Center for Modeling and Simulation.

Trinity was constructed in two phases. The Intel Xeon Haswell processor was integrated into the first stage, while the Intel Xeon Phi Knights Landing Processor added a significant performance boost in the second stage.

The combined system contains 301,952 Haswell and 678,912 Knights Landing processor cores, providing a cumulative peak performance of over 40 PF/s (petaflops).

Frontera supercomputer


In 2018, the National Science Foundation (NSF) awarded the Texas Advanced Computing Center (TACC) a $60 million grant to deploy Frontera, a new petascale computing system.

In science and engineering, Frontera opens up new possibilities by offering computational capability that enables researchers to tackle even larger and more challenging research problems across various domains.

Frontera comprises two computing subsystems: a primary computing system focused on double-precision performance, and a second subsystem focused on single-precision streaming-memory computing.

Frontera also has several storage systems, as well as interfaces to cloud and archive systems and a set of application nodes for hosting virtual servers.

On the High-Performance LINPACK benchmark, a test of the machine’s floating-point computing capacity, Frontera achieved 23.5 PetaFLOPS.

A person would have to perform one calculation every second for about a billion years to equal what Frontera can compute in just one second. The estimated peak performance of the main system is 38.7 PetaFLOPS.
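The "billion years" comparison is easy to verify. The sketch below uses the two Frontera figures quoted in this section and assumes a year of 365.25 days; the names are illustrative.

```python
# Figures quoted in the article, in operations per second.
LINPACK_FLOPS = 23.5e15
PEAK_FLOPS = 38.7e15

# Assumption: one year is 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# At one calculation per second, matching one second of Frontera
# takes as many seconds as Frontera performs operations.
years_linpack = LINPACK_FLOPS / SECONDS_PER_YEAR
years_peak = PEAK_FLOPS / SECONDS_PER_YEAR

print(f"At the LINPACK rate: {years_linpack / 1e6:,.0f} million years")
print(f"At the peak rate:    {years_peak / 1e6:,.0f} million years")
```

The measured LINPACK rate works out to roughly three quarters of a billion years and the peak rate to a little over a billion, so "about a billion years" is a fair round figure.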

