The Intersection of AI, HPC and HPDA: How Next-Generation Workflows Will Drive Tomorrow’s Breakthroughs
November 3, 2018 23:48 CET
Workflows that unify artificial intelligence, data analytics, and modeling/simulation will make high-performance computing more essential and ubiquitous than ever.
By Patricia A. Damkroger, Vice President and General Manager, Extreme Computing, Intel Datacenter Group
High-performance computing (HPC) has long been critical to running the large-scale simulation and analytic workloads that advance scientific progress, product innovation, and national competitiveness. With increasing numbers of artificial intelligence (AI), high-performance data analytics (HPDA), and modeling/simulation workflows, there is a need to expand high-performance compute infrastructure to tackle the challenges of these workflows. This convergence is expanding HPC’s scope and making HPC infrastructure more essential than ever.
Two major factors are driving this expansion of HPC’s scope. One is the data deluge. Annual digital data output is predicted to reach or exceed 163 zettabytes by 2025 [1]. This data must be analyzed for meaningful, actionable insights, increasing the need for high-performance infrastructure.
Big Science projects are key contributors to the deluge. CERN’s Large Hadron Collider will generate 50 petabytes of data in 2018, and a planned high-luminosity upgrade in 2026 will increase the number of events detected per second tenfold [2]. The world’s largest radio telescope, the Square Kilometer Array (SKA), will analyze 10 billion incoming visibility data streams, scrutinizing over 1.5 million objects every ten minutes to detect astrophysical phenomena such as pulsar signals or flashes of radio light [3]. Next-generation use cases are also increasing the production of data. An autonomous vehicle is estimated to generate 4 TB per day [4]. A connected airplane will produce ten times that amount, and an automated factory will generate approximately one petabyte daily [5]. A smart hospital will generate 3 TB per day [6].
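To put the daily volumes quoted above in infrastructure terms, the short sketch below converts them into sustained data rates. It is only a back-of-the-envelope illustration: it assumes decimal units (1 TB = 10^12 bytes) and an even spread of data over 24 hours, neither of which the cited estimates specify, and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: decimal units. The cited estimates do not say which convention they use.
TB = 10**12
PB = 10**15

# Daily volumes as quoted in the text; the airplane figure is ten times the vehicle figure.
daily_bytes = {
    "autonomous vehicle": 4 * TB,
    "connected airplane": 10 * 4 * TB,
    "automated factory": 1 * PB,
    "smart hospital": 3 * TB,
}


def sustained_mb_per_s(bytes_per_day: int) -> float:
    """Average rate in MB/s if the daily volume were produced evenly over 24 hours."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


rates_mb_s = {source: sustained_mb_per_s(volume) for source, volume in daily_bytes.items()}

for source, rate in rates_mb_s.items():
    print(f"{source}: about {rate:,.0f} MB/s sustained")
```

Under these assumptions a single vehicle averages tens of megabytes per second, while an automated factory averages more than ten gigabytes per second. Real sources are bursty, so peak rates would be higher than these averages.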
The other important vector driving the broadening of HPC’s scope is the increase in compute power at affordable costs—that is, the ability to extract knowledge from the growing data volumes and support new data-fueled use cases. This has stimulated increased research in the AI algorithms that continue to drive the momentum in this space. In addition to the traditional analysis needed to gain insights from simulations, HPC resources are powering a new generation of converged workloads that integrate AI and HPDA with HPC’s model creation and simulation workloads. HPC platforms deliver the computational performance and throughput to train deep learning models, ingest and process real-time data streams, and scale resource-intensive AI workloads. In many converged workflows, AI and HPDA also allow for creating and refining more powerful HPC models.
The importance of AI and converged workloads is reflected in HPC’s growth projections. The overall HPC market is expected to enjoy a healthy 7 percent compound annual growth rate (CAGR) through 2021 [7]. The AI-HPC segment is projected to grow at 30 percent CAGR for the same period [7]. The deep learning HPC segment is rising even faster, with an 81 percent CAGR [8]. Intel’s own analysis suggests that AI compute cycles will grow 12x by 2020.
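Because these rates compound, the gap between segments widens quickly. The sketch below shows the arithmetic for the three growth rates quoted above. The three-year horizon is an assumption made purely for illustration, since the text gives only the end year of the forecast, and the function name is hypothetical.

```python
def compound_growth(cagr: float, years: int) -> float:
    """Total growth multiple after `years` of compounding at annual rate `cagr`."""
    return (1.0 + cagr) ** years


# Assumption: a three-year horizon. The text states only that forecasts run through 2021.
ASSUMED_YEARS = 3

# Annual growth rates as quoted in the text.
cagr_by_segment = {
    "overall HPC": 0.07,
    "AI-HPC": 0.30,
    "deep learning HPC": 0.81,
}

multiples = {
    segment: compound_growth(rate, ASSUMED_YEARS)
    for segment, rate in cagr_by_segment.items()
}

for segment, multiple in multiples.items():
    print(f"{segment}: {multiple:.2f}x after {ASSUMED_YEARS} years")
```

Over the assumed three years, 7 percent compounds to roughly 1.2x, 30 percent to roughly 2.2x, and 81 percent to nearly 6x.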
What are some of the use cases for the intersection of these AI/HPC/HPDA workflows? And what do we as HPC leaders need to do to accelerate this convergence?
Converged Use Cases
AI is still maturing, and the convergence is in its early stages. But these game-changing innovations are already altering industries and delivering new research capabilities. Let me share a few of my favorite examples.
Autonomous driving is a well-known AI use case and one most of us can relate to. American drivers spent an average of 283 hours behind the wheel in 2016 [9], and UK motorists spent an average of 30 hours in 2017 sitting in traffic jams [10]. While AI-based navigation systems and safety features already augment the driver’s awareness and responses, autonomous vehicles will generate massive amounts of data that will require HPC infrastructure to analyze and predict. After developers have used HPC systems and neural networks to develop and train deep learning models, on-board HPC systems will ingest and analyze multiple sources of streamed data, including radar, sonar, GPS, camera, and laser-based lidar surveying data. These systems will execute complex algorithms, draw inferences from data, and make real-time decisions in response to changes in road conditions, traffic situations, pedestrian activity, and other aspects of the driving environment. These systems will bring breakthroughs in fuel usage, air quality, efficiency, human productivity, entertainment, urban design, infrastructure planning, and more.
The biological world is increasingly digital and measurable, generating vast streams of data from cryo-electron microscopes, genome sequencers, bedside patient monitors, diagnostic medical equipment, wearable health devices, and a host of other sources. Running AI-based analytics on HPC systems, life science researchers are using this data to construct and refine robust models that simulate the function and processes of biological structures at cellular and sub-cellular levels. By combining the resulting insights with AI analysis of massive data sets drawn from clinical, epidemiological, and other sources, converged workloads will help transform the way we understand, treat, and prevent disease. Practical AI solutions are already being incorporated into clinical workflows, augmenting human judgment and helping clinicians quickly and accurately identify conditions ranging from cancerous tumors to incipient diabetic blindness.
A modern hospital is a data-generation engine, primarily from day-to-day healthcare delivery, but also from research and analytical processes such as clinical trials. The data are heterogeneous, and include structured data (such as test results and diagnostic codes), unstructured text (such as physician notes), medical images, genomic and other “omics” data, and other sources. Hospitals, especially those with a research mission, also require large amounts of computation to handle their diverse computational processes—everything from basic statistical analysis of patient medical records, to image processing, HPC, and AI. However, the goal of delivering actionable information at the point of care remains elusive. Data are often siloed, and medical informatics teams often work in isolation from care providers. Their analyses are beneficial to hospital operations, but the results are often difficult to translate into better patient care. Large research hospitals are working hard to integrate HPC, HPDA, and AI to seamlessly deliver actionable information to physicians at the point of care. Imagine, for example, an AI agent that accesses not just the individual patient’s health data but data from multiple sources (including other patients) to dynamically predict, for example, how the patient will respond to a particular treatment. Given enough data and computing power, such AI agents have the potential to drastically improve patient outcomes.
HPC-enabled AI techniques are crucial to living on a changing planet. Converged workloads are helping climate change researchers build more sophisticated and accurate models as well as more thoroughly explore the output of their simulations. This strengthens scientists’ ability to predict the effects of environmental change and deliver evidence-based inputs to planning and management strategies. Applying HPC and AI to acoustic sensor data and sophisticated reservoir models helps increase natural resource recovery while minimizing environmental impacts. Converged AI/HPC applications shorten time-to-success for innovative energy sources and energy-efficient materials.
In many of these converged workflows, scientists and researchers use AI and HPDA to perform sophisticated evaluation of real-time instrument data as it is gathered from connected, sensor-based equipment. With advanced analysis and aggregation of data in flight, research teams can interactively steer their simulations, refining their models on the fly. This can help improve resource utilization and speed time-to-insight. Similar use cases are emerging in sectors such as manufacturing, where machine learning and computer vision technologies identify and correct product defects in real time, and financial services, where in-flight analysis of data streams can detect and thwart security threats and respond to other anomalous activities.
Advancing the Convergence
Currently, HPC and AI/HPDA are developed in widely divergent hardware and software environments. However, converged use cases will require converged platforms that unite heterogeneous platform technologies, and efforts are underway to enable developing both of these on the same infrastructure via converged software stacks and compute infrastructure. One recent example is Intel® Deep Learning Boost, a new AI extension to Intel® Xeon® processors that accelerates AI applications while delivering leadership HPC performance via features like Intel® AVX-512.
These converged systems must cost-effectively execute the full range of scientific and technical computing applications—from equation-driven, batch-processing simulations to the automated, data-fueled analysis needed to support human decision-making in real time. Figure 1 shows the framework that Intel and our ecosystem partners are using as we work to enable a new era of converged, exascale computing.
Figure 1. The Intersection of HPC-HPDA-AI
The convergence of HPC, HPDA, and AI requires a unified software stack and resource management capabilities, along with robust platform technologies and a consistent architecture.
In contrast to traditional HPC platforms that emphasized processor and interconnect performance, these converged platforms must balance processor, memory, interconnect, and I/O performance while providing the scalability, resilience, density, and power-efficiency for next-generation computing. They must also offer flexible, unified capabilities for resource management and provisioning. Innovative approaches to storage will be needed to advance data curation and archiving and enable organizations to maximize the value of captured data throughout its lifespan.
Cloud-based infrastructure, whether on-premises or in the public cloud, will play an important role in providing access to HPC resources. A majority of HPC sites now run more than 10 percent of their HPC workloads on cloud infrastructure, and one-fourth run their most important HPC workload in a private cloud. Machine learning and AI are already the fastest growing cloud-based workloads, and workload convergence will no doubt drive further expansion.
AI is rapidly moving into the mainstream. It is integrated into a growing range of enterprise applications, and we interact with it in our daily lives. As we advance toward exascale and the convergence of AI, HPDA, and modeling/simulation, HPC resources will become mission-critical well beyond the traditional realms of science and technology. Converged HPC platforms and solution stacks will provide capabilities needed across the economic and societal spectrum, offering more opportunities than ever for researchers to make progress on their toughest challenges and enterprises to deliver deeply innovative products and services. There’s plenty of work to be done, but the vision is clear, and the rewards will be enormous.
Patricia Damkroger is responsible for developing and executing Intel’s strategy, building customer relationships, and defining the company’s product portfolio for technical computing workloads, including emerging areas such as high performance analytics, HPC in the cloud, and artificial intelligence. Prior to joining Intel in 2017, Damkroger served as Acting Associate Director for Computation at Lawrence Livermore National Laboratory (LLNL).