June 6, 2017
By: Michael Feldman
The Defense Advanced Research Projects Agency (DARPA) has selected Intel to develop a graph analytics processor that will be a thousand times faster than anything available today.
The work is being done under DARPA’s Hierarchical Identify Verify Exploit (HIVE) program, a four-and-a-half-year effort whose goal is to develop and integrate new graph hardware and software technologies for accelerating DoD analytics workloads. Along with Intel, DARPA has also brought in Pacific Northwest National Laboratory, Georgia Tech, Northrop Grumman, and Qualcomm Intelligent Solutions to help principally with the system software and application effort.
Graph analytics is applied to problems where causal relationships need to be derived from large datasets. This applies to a wide array of applications such as transportation routing, genomics processing, financial transaction optimization, and consumer purchasing analysis, just to name a few. In the case of DARPA and the DoD, the more relevant applications are in areas like communications, intelligence, surveillance, and reconnaissance.
The problem is that the average computer cluster is not very adept at these types of problems. Generally, graph processing requires large amounts of high-bandwidth memory to operate with any efficiency. And as the problem size gets larger, the cluster network becomes a secondary bottleneck.
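A toy breadth-first traversal illustrates why. The graph and its layout below are purely illustrative, not anything from the HIVE program, but the access pattern is the general one: each neighbor lookup jumps to an unpredictable memory location, which is exactly what conventional cache hierarchies and prefetchers handle poorly at scale.

```python
from collections import deque

# Toy graph in adjacency-list form. Real HIVE-scale graphs have billions
# of edges; the point here is the access pattern, not the scale.
graph = {
    0: [1, 2],
    1: [3],
    2: [3, 4],
    3: [5],
    4: [5],
    5: [],
}

def bfs_order(graph, source):
    """Breadth-first traversal. Each neighbor lookup is effectively a
    random access: on a large graph, successive reads land far apart in
    memory, so hardware tuned for sequential streams gains little."""
    visited = {source}
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nbr in graph[node]:  # one irregular memory access per edge
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
    return order

print(bfs_order(graph, 0))  # → [0, 1, 2, 3, 4, 5]
```

On a cluster, the same pattern turns into fine-grained remote lookups, which is why the network becomes the next bottleneck as the graph outgrows a single node.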
In a write-up posted on DARPA’s news site, Trung Tran, a program manager in the agency’s Microsystems Technology Office (MTO), outlined the case for the project. “Today’s hardware is ill-suited to handle such data challenges, and these challenges are only going to get harder as the amount of data continues to grow exponentially,” explained Tran. The HIVE effort adds the additional demand of real-time support for cases where streaming data needs to be analyzed on the fly.
The challenge is that correlation across a graph tends to be computationally expensive, requiring a processor that is highly parallel in nature and has access to high-performance memory. The closest commercial architectures we currently have for this computing model are Intel’s Xeon Phi and GPUs. It’s noteworthy that DARPA’s original HIVE description made a specific reference to graphics processors, saying “the goal is to see a 1000x improvement in power and performance on the HIVE chip compared to a GPU.”
That doesn’t mean the graph analytics processor will be some variant of the Xeon Phi. The DARPA document specifies the HIVE effort will be to “research and design a new chip architecture from scratch.” Much of the work will actually focus on componentry outside the processor cores themselves, especially in the development of a memory architecture that supports a multi-node NUMA model.
More specifically, DARPA has outlined the HIVE architectural goals as follows:
- Create an accelerator architecture and processor pipeline which supports the processing of identified graph primitives in a native sparse matrix format.
- Develop a chip architecture that supports the rapid and efficient movement of data from memory or I/Os to the accelerators based on an identified data flow model. Emphasis should be on redefining cache-based architectures so that they address both sparse and dense data sets.
- Develop an external memory controller designed to ensure efficient use of the identified data mapping tools. The controller should be able to efficiently handle random and sequential memory accesses on memory transfers as small as 8 to 32 bytes.
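To make the first and third goals concrete, here is a minimal sketch of a graph primitive expressed as a sparse matrix-vector multiply over a CSR (compressed sparse row) adjacency matrix, one common "native sparse matrix format." The graph and values are invented for illustration and imply nothing about the actual HIVE design.

```python
# 4-node directed graph: edges 0->1, 0->2, 1->2, 2->3, 3->0,
# stored in CSR (compressed sparse row) form.
indptr  = [0, 2, 3, 4, 5]    # row i's edges sit in indices[indptr[i]:indptr[i+1]]
indices = [1, 2, 2, 3, 0]    # destination node of each edge
data    = [1.0, 1.0, 1.0, 1.0, 1.0]  # edge weights

def spmv(indptr, indices, data, x):
    """y = A @ x over a CSR matrix. Note the access granularity: each
    inner step reads one index and one value -- a handful of bytes --
    from an address the hardware cannot predict. This is the kind of
    8-to-32-byte random transfer the goals above single out."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

print(spmv(indptr, indices, data, [1.0, 2.0, 3.0, 4.0]))  # → [5.0, 3.0, 4.0, 1.0]
```

Many graph algorithms (BFS levels, PageRank iterations, label propagation) reduce to repeated applications of this primitive, which is why an accelerator pipeline built around it covers such a wide class of analytics workloads.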
Presumably most of the hardware effort will fall to Intel, which will tap its Data Center Group, Platform Engineering Group, and Intel Labs to develop the graph analytics processor. The company stands to collect more than $100 million from DARPA over the four-and-a-half-year project.
According to Dhiraj Mallick, vice president of the Data Center Group and general manager of the Innovation Pathfinding and Architecture Group at Intel, by the middle of 2021, they and their HIVE contract partners will deliver “a 16-node demonstration platform showcasing 1,000x performance-per-watt improvement over today’s best-in-class hardware and software for graph analytics workloads.”
Mallick says commercial graph analytics products from this effort may arrive even sooner.