Supercomputer operating system

A supercomputer operating system is an operating system intended for supercomputers. Since the end of the 20th century, supercomputer operating systems have undergone major transformations, as fundamental changes have occurred in supercomputer architecture.[1] While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been away from in-house operating systems and toward some form of Linux,[2] which ran on every supercomputer on the TOP500 list in November 2017. As of 2021, the top 10 systems run, for instance, Red Hat Enterprise Linux (RHEL) or a variant of it, or another Linux distribution such as Ubuntu.

Given that modern massively parallel supercomputers typically separate computations from other services by using multiple types of nodes, they usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as Compute Node Kernel (CNK) or Compute Node Linux (CNL) on compute nodes, but a larger system such as a Linux distribution on server and input/output (I/O) nodes.[3][4]

In a traditional multi-user computer system, job scheduling is in effect a tasking problem for processing and peripheral resources. In a massively parallel system, by contrast, the job management system must manage the allocation of both computational and communication resources, and must also deal gracefully with the hardware failures that are inevitable when tens of thousands of processors are present.[5]
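
The two duties described above, allocating nodes with communication in mind and recovering from node failures, can be illustrated with a minimal sketch. It assumes a toy machine whose compute nodes sit in a line, with adjacency standing in for communication locality. The names `Machine`, `Job`, `allocate`, and `fail` are hypothetical and are not taken from any real job management system, including the one described in [5].

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    nodes_needed: int


@dataclass
class Machine:
    """Toy model (an assumption): nodes laid out in a line, indexed 0..size-1."""
    size: int
    failed: set = field(default_factory=set)   # nodes marked as broken
    owner: dict = field(default_factory=dict)  # node index -> job name

    def allocate(self, job):
        """Give the job the first run of adjacent, healthy, idle nodes.

        Returns the list of node indices, or None when no such run exists.
        """
        run = []
        for node in range(self.size):
            if node in self.failed or node in self.owner:
                run = []  # the run of usable nodes is broken; start over
                continue
            run.append(node)
            if len(run) == job.nodes_needed:
                for n in run:
                    self.owner[n] = job.name
                return run
        return None

    def fail(self, node):
        """Mark a node as failed; return the name of the job it ran, if any."""
        self.failed.add(node)
        victim = self.owner.get(node)
        if victim is not None:
            # Release every node of the affected job so it can be rescheduled.
            self.owner = {n: j for n, j in self.owner.items() if j != victim}
        return victim


machine = Machine(size=10)
first = machine.allocate(Job("a", 3))   # nodes 0, 1, 2
second = machine.allocate(Job("b", 3))  # nodes 3, 4, 5
victim = machine.fail(1)                # job "a" loses its allocation
retry = machine.allocate(Job("a", 3))   # placed on nodes 6, 7, 8
```

A production system would also track the interconnect topology, job priorities, and checkpoints; the sketch only shows why allocation and failure handling have to share one view of the machine's state.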

Although most modern supercomputers use the Linux operating system,[6] each manufacturer has made its own specific changes to the Linux distribution it uses, and no industry standard exists, partly because differences in hardware architecture require changes that optimize the operating system for each hardware design.[1][7]

Figure: Operating systems used on TOP500 supercomputers

References
  1. Padua426 (named reference; its definition is missing from the source).
  2. MacKenzie (named reference; its definition is missing from the source).
  3. EuroPar2004 (named reference; its definition is missing from the source).
  4. Alam, Sadaf R., et al. "An Evaluation of the Oak Ridge National Laboratory Cray XT3". International Journal of High Performance Computing Applications, February 2008, vol. 22, no. 1, pp. 52–80.
  5. Aridor, Yariv, et al. "Open Job Management Architecture for the Blue Gene/L Supercomputer". In Job Scheduling Strategies for Parallel Processing, by Dror G. Feitelson, 2005, ISBN 978-3-540-31024-2, pp. 95–101.
  6. Vaughan-Nichols, Steven J. "Linux continues to rule supercomputers". ZDNet, June 18, 2013. Retrieved June 20, 2013.
  7. "Top500 OS chart". Top500.org. Archived from the original on 2012-03-05. Retrieved 2010-10-31.
