Developments in HPC
What is to come in the computing arena
After the heyday of vector computing in the 1980s, the success of computer workstations, and the rapid increases in the performance of personal computers (PCs), the 1990s saw the emergence of massively parallel processing (MPP) computers. Initially, parallel computers were simple agglomerations of commonly available processors, typically a few hundred but as many as a few thousand by the end of the millennium. With the continued acceleration of the commodity processors used in PCs, many MPP machines became what today are known as clusters. These systems require only modest initial capital investments and therefore seem ideal for many applications in scientific computing. However, studies as early as the late 1990s showed that many applications run with very poor efficiency on clusters built from commodity components. Efficiency in this context means the fraction of all available operations on a computer that the application program actually uses to generate the result of the computation. Low efficiency, in many cases well below 10%, means that the effective cost per computed result is much higher than the initial capital investment might suggest. Furthermore, low efficiency today directly implies a great deal of wasted energy, since large clusters consume hundreds of kilowatts or even megawatts of electrical power to operate.
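To make the cost argument concrete, the following short Python sketch (using made-up, purely illustrative numbers rather than figures for any actual system) computes the effective capital cost per sustained teraflop/s as a function of efficiency:

def efficiency(sustained_flops, peak_flops):
    # Fraction of the machine's peak operations the application actually uses.
    return sustained_flops / peak_flops

def cost_per_sustained_teraflop(system_cost, peak_teraflops, eff):
    # Effective capital cost per teraflop/s actually delivered to the application.
    return system_cost / (peak_teraflops * eff)

# Illustrative numbers: a cluster with 100 teraflop/s peak that sustains 5 teraflop/s
# on the application, at a capital cost of 2 million dollars.
peak_tf, sustained_tf, system_cost = 100.0, 5.0, 2_000_000
eff = efficiency(sustained_tf, peak_tf)                        # 0.05
print(cost_per_sustained_teraflop(system_cost, peak_tf, eff))  # 400000.0 per sustained teraflop/s
print(cost_per_sustained_teraflop(system_cost, peak_tf, 0.50)) # 40000.0 if efficiency were 50%

At 5% efficiency the nominally cheap cluster delivers each sustained teraflop/s at ten times the cost implied by its peak performance, which is the sense in which the sticker price understates the cost per computed result.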
The “Computenik” shock
The computational science community in the US received the equivalent of the Sputnik shock when, in 2002, the Japanese Earth Simulator (ES), a massively parallel vector computer, became operational. Jack Dongarra of the University of Tennessee, a member of the US National Academy of Engineering and co-founder of the Top500 list of supercomputers, made this comparison and called it the “Computenik” shock.
Not only was this machine a factor of five faster than any other computer in the world, it also ran typical climate simulation codes at over 60% efficiency, that is, an order of magnitude more efficiently than any system built from commodity components. Effectively, from a scientific application point of view, the ES was at least one order of magnitude more productive than any of the other large MPP systems available at the time. The ES was also the most expensive supercomputer ever built and therefore remained a singular event. However, it did revitalize architectural development in the US.
Leading systems today and tomorrow
The leading hardware systems today, in 2009, are the IBM Blue Gene (BG) and the Cray XT lines of supercomputers. The largest machines of this kind consist of on the order of 100,000 tightly coupled scalar processors, and several scientific application codes that have managed to keep up with the architectural developments can make full use of these machines at efficiencies of around 50% or more. In November 2008, two scientific applications sustained a petaflop/s under production conditions at the National Center for Computational Sciences, on a supercomputer with 150,000 processors.
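As a rough consistency check on those figures, the following back-of-the-envelope sketch shows what a sustained petaflop/s on 150,000 processors implies per processor; the per-processor peak used here is an assumed, purely illustrative value, not the specification of that machine:

# Back-of-the-envelope check of the sustained petaflop/s figure quoted above.
sustained_flops = 1.0e15   # 1 petaflop/s sustained by the application
processors = 150_000       # processor count quoted for the system
per_processor = sustained_flops / processors
print(f"{per_processor / 1e9:.1f} gigaflop/s per processor")      # ~6.7 gigaflop/s

# Assumed per-processor peak of 10 gigaflop/s -- an illustrative value only.
assumed_peak = 10.0e9
print(f"implied efficiency: {per_processor / assumed_peak:.0%}")  # ~67% under this assumption

Under that assumption the implied efficiency lands comfortably in the 50%-plus range quoted above; a lower per-processor peak would push the figure higher still.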
Besides the development of the IBM BG and Cray XT lines, which was funded mainly by the US Department of Energy, the Defense Advanced Research Projects Agency (DARPA) launched the High Productivity Computing Systems (HPCS) program with the goal of building revolutionary new systems that would run real applications at high efficiency by 2011. The agency, formerly known as ARPA, is well known for bringing us technological advances such as the laser and the ARPANET, now known as the Internet. After an initial design and evaluation phase that lasted from 2002 to 2006, DARPA selected two vendors for the final prototype development phase, funding IBM and Cray to build the next generation of computer systems by 2011. From everything we know today, these systems will be very different from the IBM BG and Cray XT lines of supercomputers and will likely lead to a revolutionary shift in computing architectures in the years after 2011.
A key aspect that will distinguish the DARPA/HPCS systems from today’s IBM BG and Cray XT lines of computers is that DARPA has invested concomitantly in the development of new programming models and performance benchmarks that are more diverse and better suited to real application efficiency than today’s programming models and the LINPACK benchmark. However, to take advantage of these new developments and maximize productivity and efficiency, scientific and engineering applications will have to adapt to the new programming models. This implies that domain scientists will have to do more than simply “port” and tune their application codes to the new machines. It is important that computational scientists begin to invest now in algorithm and software development targeted at the new computing architectures and programming paradigms that will be available after 2011.