Cray Brings Graph Analytics, Deep Learning Capabilities to XC Supercomputers

June 20, 2017

By: Michael Feldman

Cray has launched the Urika-XC, a software suite that brings the capabilities of the Urika-GX big data appliance to the XC supercomputer platform. The idea is to offer customers a single system that can host both simulations and advanced data analytics.

From the GX appliance, the Urika-XC inherits the Cray Graph Engine and the Apache Spark environment. The former is Cray’s custom graph analytics package, while the latter is an open source framework for cluster-based data analytics.

The software also comes with the Python-based Dask libraries for distributed parallel computing, as well as BigDL, an Intel deep learning framework targeted at Xeon CPUs. For the time being, this is the only built-in hook to deep learning for the Urika-XC. Support is also provided for the most popular programming languages for analytics applications, including Python, Scala, Java, and R.

“What we’re really trying to do with Urika-XC is make it easy for additional environments to run on the XC in concert with simulation workloads,” said Paul Hahn, group product marketing manager at Cray.

By porting this software to the XC hardware, Cray provides a much more scalable platform for the kinds of data analysis work that require something bigger and more powerful than a cluster, including Cray’s own Urika-GX system. For all practical purposes, the GX tops out at 48 nodes, while XC systems can scale to thousands of nodes.

With the Urika-XC, Cray is challenging the notion that HPC users will have to buy separate systems for simulation and analytics workloads. Certainly, customers would prefer to buy a single machine for both purposes, and as analytics becomes a larger part of HPC workflows, the decision to try to fit both types of workload into one machine becomes more pressing. Domains in which this mixed workflow model is already common include weather forecasting and climate science, seismic imaging, materials science, and CAE, among others.

Cray has already found one early customer that seems open to the idea of doing supercomputer-scale analytics. The Swiss National Supercomputing Centre (CSCS) is currently using the Urika-XC software on its 25-petaflop Piz Daint machine, which is composed of XC50 and XC40 nodes.

“CSCS has been responding to the increased needs for data analytics tools and services,” said Prof. Dr. Thomas C. Schulthess, director of CSCS. “We were very fortunate to participate with our Cray supercomputer Piz Daint in the early evaluation phase of the Cray Urika-XC environment. Initial performance results and scaling experiments using a subset of applications including Apache Spark and Python have been very promising.  We look forward to exploring future extensions of the Cray Urika-XC analytics software suite.”

Some of those future extensions are likely to include support for GPUs and Xeon Phi processors, which will be especially useful for accelerating deep learning and graph analytics applications.

The Urika-XC software suite will be generally available in the third quarter of 2017, and should run on any XC machine running Cray Linux Environment 6.0 UP02 or later.