What is supercomputing? [part 3]
In this blog, we go into the history of HPC and supercomputing: how it all started and how it developed into the HPC we see today.
This blog is part of a series of blogs where we will introduce you to the world of HPC.
- What is High-performance computing? – Introduction to the concept of HPC
- High-performance computing clusters anywhere – Introduction to HPC cluster hosting
- High-performance computing cluster architectures – An overview of HPC cluster architecture
- Open source in HPC – An overview of how open source has influenced and driven HPC
- High-performance computing (HPC) technologies – what does the future hold?
What is supercomputing?
These days, supercomputing has become a synonym for high-performance computing. However, the two terms are not exactly interchangeable: supercomputers and supercomputing generally refer to the larger cluster deployments and the computation that takes place there, while HPC refers to computation performed on extremely fast computers, on clusters that range from small-scale HPC clusters to large supercomputers. HPC clusters and supercomputers often share most of the same architecture, and both are built out of commodity servers. Of course, some systems are still built in a way that more closely resembles supercomputing as it was known in the past.
Historically, supercomputing was a type of high-performance computing that relied on a special set of systems. Similar to the HPC clusters of today, they worked on massively complex or data-heavy problems, although comparing the two is a little bit like comparing apples to pears when it comes to computing power. Even a mere mobile phone today is more powerful than the first supercomputers. For example, some mobile phones can reach a few gigaflops, whereas the CDC 6600 was estimated to deliver about 3 megaflops, a difference of roughly a thousandfold.
However, at the time, supercomputers were more powerful than anything else on the market, very expensive to build and develop, and far superior in architecture to the personal computers then available. That is why they were called supercomputers. They were the original HPC systems and were generally reserved for governments and research universities. Their architecture was also quite different from that of current HPC clusters: they were huge multi-processor systems with very specialised functionality.
Where did supercomputers start?
The first supercomputers were built for research institutions. One of the first was the UNIVAC Livermore Atomic Research Computer (LARC) at the Lawrence Livermore National Laboratory, a federal research facility in Livermore, California. The second LARC system was delivered to the US Navy. Both worked on computational problems commonly solved by HPC systems today, such as computational fluid dynamics, that is, the simulation of how fluids (liquids and gases) flow and how they interact with things around them. Similar systems followed, such as the IBM 7030, also known as Stretch, which was initially delivered to the Los Alamos National Laboratory. But the success of the IBM 7030 was mostly eclipsed by the release of the CDC 6600, designed by Seymour Cray at the Control Data Corporation. It delivered about three times the performance of the IBM 7030, and Seymour Cray would go on to become a big name in the world of supercomputers.
How did supercomputing develop?
Following the success of the CDC 6600, Seymour Cray designed another popular supercomputer, the CDC 7600, which had ten times the computing power of its predecessor. However, the success did not last: the next iteration, the CDC 8600, ran into serious development problems and never made it to market. This made it difficult to cover the research costs of the systems and ended up putting CDC in a difficult place financially. After that failure, Seymour Cray wanted to start again from scratch, something that CDC was unwilling to do. It led to an amicable split, with him founding Cray Research with investment from the CEO of CDC. Cray went on to develop the Cray-1, which was the first successful system to implement vector processors. He also added registers, which allowed the processors to fetch and store data closer to where the computation took place, alleviating the need to continually fetch data from memory. This, along with an overall improvement in architecture, gave the Cray-1 the performance it needed to succeed in the market.
Further developments from Cray, IBM, NEC, Fujitsu, Intel, and others then paved the way for more generalised architectures better suited for scale. Ever-increasing memory requirements encouraged the use of distributed memory, where each processor had its own dedicated memory space. For this to be useful, the processes needed a way to communicate with each other, which required messages to be passed between them. With this began the era of distributed memory computers and message-passing.
This paved the way for how we deploy and build supercomputers today and was vital to the development of Beowulf clusters. The first known Beowulf cluster was built out of commodity PC hardware by Thomas Sterling and Donald Becker at NASA. The system ran Linux and made NASA one of the first adopters of Linux. These Beowulf clusters depended on multiple machines working together as one, each with its own memory; this was enabled by the development of MPI, the Message Passing Interface, a standard for portable message passing from the memory of one system to another on parallel computers.
Thanks to message-passing, computational workloads could now be run across multiple commodity servers connected via a high-speed networking link. This was vital to the development of HPC, as it allowed an ever greater number of organisations to solve their computational problems at a lower cost and at a greater scale than ever before: they were no longer limited to the computational ability of a single system.
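To make the idea of distributed memory and message-passing a little more concrete, here is a minimal sketch in Python. It is not MPI and not how any particular cluster is programmed: threads stand in for nodes, a queue stands in for the network, and the names `node` and `sum_of_squares` are made up for this illustration. The point is only that each "node" works on its own slice of the data and shares its result by sending an explicit message.

```python
import queue
import threading


def node(rank, local_data, to_root):
    """One simulated node: it sees only its own data and a message channel."""
    partial = sum(x * x for x in local_data)  # work on private data only
    to_root.put((rank, partial))              # explicit message to the root


def sum_of_squares(values, n_nodes=4):
    """Scatter values over n_nodes workers, then gather their partial sums."""
    to_root = queue.Queue()  # stands in for the interconnect (an assumption)
    # Scatter: each node gets its own copy of a slice, standing in for local memory.
    chunks = [list(values[rank::n_nodes]) for rank in range(n_nodes)]
    workers = [
        threading.Thread(target=node, args=(rank, chunks[rank], to_root))
        for rank in range(n_nodes)
    ]
    for w in workers:
        w.start()
    # Gather: the root blocks until it has received one message per node.
    partials = dict(to_root.get() for _ in range(n_nodes))
    for w in workers:
        w.join()
    return sum(partials.values())
```

Calling `sum_of_squares(list(range(10)), n_nodes=3)` splits ten numbers across three simulated nodes and combines their three messages into a single total. On a real cluster the same scatter, compute, and gather pattern would typically be expressed through an MPI library, with the nodes being separate servers.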
This was a key development in terms of democratising computational clusters, and it is the way we build high-performance computing clusters today: multiple servers, often referred to as nodes, working together as a cluster over a high-speed network.
This has brought high-performance computing to an ever-growing number of users and use cases. High-performance computing is now available to anyone, from individuals to organisations of all sizes.
Today’s clusters are still built on the same fundamental principles, using a large number of commodity servers, a high-speed interconnect and a Linux operating system. The fastest of these systems are ranked on the TOP500 list, where they actively compete for the top spots. Some of the top systems are Summit from 2018, a GPU-based cluster developed by IBM for the Department of Energy's Oak Ridge National Laboratory in the USA, and Fugaku from 2020, an ARM-based cluster developed by Fujitsu for RIKEN in Japan. Summit has been benchmarked at 148,600 TFlop/s and Fugaku at 442,010 TFlop/s, so the processing ability of these clusters is developing at a rapid pace. This year we might see the first exaflop clusters, with Aurora, developed by Intel and Cray for Argonne National Laboratory, predicted to reach 2 EFlop/s.
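To put these figures side by side, the short Python sketch below converts the numbers quoted in this blog into plain floating-point operations per second and compares them. The dictionary keys and the `speedup` helper are names chosen for this illustration only, and the Aurora value is the prediction mentioned above, not a measured result.

```python
# Unit prefixes, in floating-point operations per second.
MEGA, TERA, EXA = 1e6, 1e12, 1e18

# Figures as quoted in this blog (Aurora is a prediction, not a benchmark).
systems = {
    "CDC 6600": 3 * MEGA,
    "Summit": 148_600 * TERA,
    "Fugaku": 442_010 * TERA,
    "Aurora (predicted)": 2 * EXA,
}


def speedup(faster, slower):
    """Return how many times faster one listed system is than another."""
    return systems[faster] / systems[slower]


if __name__ == "__main__":
    print(f"Fugaku vs Summit:   {speedup('Fugaku', 'Summit'):.2f}x")
    print(f"Aurora vs Fugaku:   {speedup('Aurora (predicted)', 'Fugaku'):.2f}x")
    print(f"Fugaku vs CDC 6600: {speedup('Fugaku', 'CDC 6600'):.2e}x")
```

On these numbers, Fugaku comes out at just under three times Summit, and the predicted Aurora figure at roughly four and a half times Fugaku.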
This article has covered the history of supercomputers: how they started and how they evolved into the HPC and supercomputing systems we see today.
If you are interested in more information, take a look at the previous blog in the series, “High-performance computing (HPC) anywhere”, or at this video of how Scania is Mastering multi cloud for HPC systems with Juju, or dive into some of our other HPC content.
In the next blog, we will give an insightful overview of HPC clusters, their architecture, components, and structure.