Thomas Sterling Talks Exascale, Chinese HPC, Machine Learning, and Non-von Neumann Architectures

June 23, 2018

By: Michael Feldman

On Wednesday evening at the ISC High Performance conference, HPC luminary Dr. Thomas Sterling will deliver his customary keynote address on the state of high performance computing. To get something of a preview of that talk, we caught up with Sterling and asked him about some of the more pressing topics in the space.

What follows is pretty much the unedited text of our email exchange.

TOP500 News: What do you think the achievement of exascale computing will mean to the HPC user community?

Thomas Sterling: As a particular point in capacity and capability, exascale is as arbitrary in the continuum of performance as any other. But symbolically it is a milestone in the advancement of one of mankind’s most important technologies, marking unprecedented promise in modeling and information management.

Of a subtler nature, it is a beachhead on the forefront of nanoscale enabling technologies, marking the end of Moore’s Law, the flatlining of clock rates due to power considerations, and the limitations of clock rate. The achievement of exascale computing will serve as an inflection point at which change from conventional means is not only inevitable but essential. It also implies the need to replace the venerable von Neumann architecture, of which almost all commercial computing systems of the last seven decades are derivatives.

Many will correctly argue the specific metrics by which this point is measured but at any dimension, it reflects progress, even if not as much as the community would like to think. This last consideration is a reflection of the Olympian heights at which almost all computing is excluded. The reality is that almost all systems operate at about two orders of magnitude lower capability. But then, most of us do not drive a Rolls Royce, while still admiring it.

TOP500 News: Do you think it’s important which nation reaches that milestone first?

Sterling: It would be easy to dismiss the importance of the exact order in which nations realize exascale capability, in particular, based on High Performance Linpack (HPL). Perhaps a far more important metric is a nation’s per capita number of systems deployed on the TOP500 list, suggesting the degree of access for high-end computing; this suggests that the number 500 system is the more important line on any such curve.

Further, thoughtful practitioners correctly observe that the accomplishment is the actual amount of science and engineering achieved as well as other important tasks, not an artificial test that has meaningful consequences. Finally, it’s not even clear that we are looking at all of the world’s big machines with industry deploying and operating enormous conglomerates of processing components and not even participating in the Linpack marathon.

Intellectually I agree with all of these cogent viewpoints. But there is an emotional aspect of this milestone and we are a species driven more by emotions than we are by predicate calculus. A nation is one delineation of a society and people – even HPC people – are atomic elements of societies. If a nation and therefore a societal identity is measured as competitive, then we as individuals inherit that property, sense of satisfaction – yes, even pride – and the tools of future achievement to which HPC contributes. If we fail significantly, then we accept our lesser stature. It does not so much matter who is at the front of the line around the race track at any instant. But it does matter if we are part of the race.

TOP500 News: Potential exascale achievements aside, how much do you worry about the ascendance of China in HPC?

Sterling: The recent dominance of China is important and of a concern, not that it is a Chinese accomplishment, but rather that it demonstrates a potential diminishing of US will, means, and ability of delivering the best that enabling technology can offer. I applaud the Chinese advances as well as those of Europe and Japan. The K machine has been at the top of the Graph 500 list and Europe is exploring alternative hardware structures for future HPC.

More, the Chinese have demonstrated significant innovation and are also competitive in terms of number of deployed HPC systems. They are learning a lot about how to apply these machines to real world applications rather than just paying for them. US funding has stalled and research in HPC has declined precipitously. Even with a recent increase in HPC budget by the US Senate, this has not been refactored into US HPC research but rather in system deployment. While Summit is a meaningful and long-awaited demonstration of American engagement in HPC progress, what does the failure of the Aurora project of similar scale portend for the future? It is not the Chinese success I worry about, it is the US stagnation I fear.

TOP500 News: What’s your perspective on the impact of machine learning and the broader area of data analytics on supercomputing – from both its effect on how its driving hardware – processors, memory, networking, etc. – and on how HPC practitioners are incorporating machine learning into their traditional workflows?

Sterling: The techniques at the core of machine learning go back to the 1980s – neural-nets – and while creditable improvements have been inaugurated, the foundations are clearly similar. The important advance, even leap-frogging, is the explosive applications to unprecedented scale of data to which HPC is now being put to use. With little fear of contradiction, machine learning and data analytics are a major extension and market of HPC.

Further, it gives us a window into disorganized data sources in part made available through the internet and from many disparate origins from giant science experiments like the Large Hadron Collider, to existing and of criticality, for example, personalized medicine. I hope that these emerging data-intensive applications will stimulate innovative concepts in the memory side of supercomputer architecture with less focus on the FPU and more on the memory semantics, latency, bandwidth, and parallelism. It is long overdue.

But it should be noted that the term “machine learning” should not be misconstrued to mean machine intelligence. It is people who learn from the data processing of this paradigm, gaining human understanding and knowledge, not the machines. Machines do not know how to process the knowledge gained by human practitioners and they certainly don’t achieve anything like “understanding”. For this, we need yet new breakthroughs to reach Machine Intelligence (MI).

TOP500 News: There seem to be three computing technologies on the horizon that could potentially disrupt the market: quantum computing, neuromorphic computing, and optical computing. Can you give us your perspective on the potential of each of these for HPC kinds of applications?

Sterling: The three identified technologies are certainly part of the exploratory road map “on the horizon” as you say. But they are neither alone nor necessarily the top three. The enormous funding being poured into quantum computing by industry and government is very positive and will ultimately lead to a new form of computing far different from conventional practices, and as distinct as Vannevar Bush machines were from the succeeding von Neumann generation. Quantum computing will be important but always serving domain-specific purposes where their advantages can be exploited.

Neuromorphic or “brain-inspired” computing is intriguing as it is uncertain. The diversity of approaches being explored is constructive as ideas and insights emerge through a human relaxation process. I personally don’t think it’s going to work the way many people think. For example, I don’t think we have to mimic the structural elements of the brain to achieve machine intelligence. But I do expect that the associative methods hardwired into the brain if duplicated in some analogous fashion will greatly enhance certain idioms of processing that are very slow with today’s conventional methods. Right now, the inspiration is catalyzing new ideas and resulting methods that are worth exploring. Who knows what we will find.

Optical computing, in the sense of adopting optical technologies for digital computing, has been heavily pursued, but in active data storage and logical data transformation it has not proved successful. I love the idea but do not have faith in its promise. However, optical in the analogous sense employing non-linear functionality in the analog versus digital means may be very promising in the long term – or not. A form of extreme heterogeneity mixing the strengths of optical in narrow operations with parallel digital systems may serve in a manner similar to structures integrating GPUs today.

But I’ll close here by mentioning two other possibilities that, while not widely considered currently, are nonetheless worthy of research. The first is superconducting supercomputing and the second is non-von Neumann architectures. Interestingly, the two at least in some forms can serve each other making both viable and highly competitive with respect to future post-exascale computing designs. Niobium Josephson Junction-based technologies cooled to four Kelvins can operate beyond 100 and 200 GHz and have slowly evolved over two or more decades. Where once such cold temperatures were considered a show stopper, now quantum computing – or at least quantum annealing – typically is performed at 40 milli-Kelvins or lower, where four Kelvins would appear like a balmy day on the beach. But latencies measured in cycles grow proportionally with clock rate and superconducting supercomputing must take a very distinct form from typical von Neumann cores; this is a controversial view, by the way.

Possible alternative non-von Neumann architectures that would address this challenge are cellular automata and data flow, both with their own problems, of course – nothing is easy. I introduce this thought not to necessarily advocate for a pet project – it is a pet project of mine – but to suggest that the view of the future possibilities as we enter the post-exascale era is a wide and exciting field at a time where we may cross a singularity before relaxing once again on a path of incremental optimizations.

I once said in public and in writing that I predicted we would never get to zettaflops computing. Here, I retract this prediction and contribute a contradicting assertion: zettaflops can be achieved in less than 10 years if we adopt innovations in non-von Neumann architecture. With a change to cryogenic technologies, we can reach yottaflops by 2030.

Thomas Sterling’s ISC keynote address will take place at the Messe Frankfurt on Wednesday, June 27, at 5:30 - 6:15 pm CEST.