On Tuesday, November 14, attendees at this year’s Supercomputing Conference (SC17) keynote address will get an in-depth look at the Square Kilometre Array (SKA) Project, one of the most ambitious international research efforts ever undertaken. We talked with SKA principals Professor Phil Diamond and Dr. Rosie Bolton about what the project entails and what kinds of supercomputing resources will be needed to drive the science.
Diamond, who is Director General of the project, characterizes the SKA as the next-generation radio telescope. It will also be the largest such observatory in the world and, true to its name, will have a total collecting area of approximately one square kilometre. The radio telescopes will be located in South Africa and Australia, each of which will maintain its own array. For the first phase of the project, the Australian array will comprise over one hundred thousand antennas, growing to as many as a million antennas in phase two. Meanwhile, almost 200 dishes, 64 of which are already under construction, will be deployed in South Africa. Phase 1 of the SKA project is scheduled to be completed in 2023, with phase 2 slated for 2030.
According to Diamond, even the phase one array will be much more sensitive than any existing radio telescope system, enabling it to detect much weaker emissions than is currently possible. It will also be able to cover a larger area of the sky more quickly, an attribute known as “survey speed.” The SKA website claims the Australian array will be eight times as sensitive as LOFAR, currently the best instrument of its type in the world. It will also be able to scan the sky about 135 times faster. These two attributes set the SKA apart from anything currently deployed, and will enable the array to “tackle science goals that the current generation of radio telescopes can’t tackle,” Diamond told TOP500 News.
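Those two quoted figures are consistent with the standard radio-astronomy scaling in which survey speed goes as the square of sensitivity times the instantaneous field of view. A quick sanity check (the field-of-view ratio here is back-solved from the quoted numbers, not an SKA specification):

```python
def survey_speed_ratio(sensitivity_ratio: float, fov_ratio: float) -> float:
    """Survey speed scales as sensitivity squared times the field of view."""
    return sensitivity_ratio ** 2 * fov_ratio

# An 8x sensitivity gain alone buys a 64x survey-speed gain; reaching the
# quoted 135x implies a field of view a bit over twice as large as LOFAR's.
print(survey_speed_ratio(8.0, 135 / 64))  # -> 135.0
```

The quadratic dependence on sensitivity is why even modest gains in collecting area translate into dramatic reductions in the time needed to map the sky.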
That sensitivity will allow the array to observe hydrogen all the way back to the “cosmic dawn,” a few hundred million years after the Big Bang. That, in turn, will enable scientists to determine how the first stars were born, how they coalesced into the first galaxies, and ultimately how these galaxies evolved into what we see today. The array will also be able to use pulsars as natural clocks to study the passage of gravitational waves across our solar system. In addition, the SKA will be able to look for large organic molecules, like the basic constituents of amino acids and nucleic acids, which could offer some profound insights into the origin of life. “When people want a sound bite, I say we’re going to study the history of the entire universe,” said Diamond.
That level of capability, though, creates some real challenges when it comes to data processing. According to Rosie Bolton, SKA Regional Centre Project Scientist, each of the instruments will produce over a petabyte of data per day, and that will drive the design and placement of the supercomputers that need to ingest and process that data.
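To put that figure in perspective, a petabyte per day translates into a sustained ingest rate on the order of 100 gigabits per second, around the clock. A back-of-the-envelope conversion (decimal units assumed):

```python
PETABYTE = 10 ** 15          # decimal petabyte, in bytes
SECONDS_PER_DAY = 86_400

def sustained_rate_gbps(petabytes_per_day: float) -> float:
    """Convert a daily data volume into a sustained rate in gigabits per second."""
    bytes_per_second = petabytes_per_day * PETABYTE / SECONDS_PER_DAY
    return bytes_per_second * 8 / 10 ** 9

# One petabyte per day works out to roughly 93 Gb/s, sustained 24/7.
print(f"{sustained_rate_gbps(1.0):.1f} Gb/s")
```

A rate like that leaves no slack for buffering days of backlog, which is why the processing systems described below have to keep pace with the instruments in near real time.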
With regard to the latter, the two supercomputers that will get first crack at the raw data streaming from the radio telescopes will be located in close proximity to the instruments themselves. In this case, that means one of the HPC facilities will be in Cape Town, in southwestern South Africa, and the other in Perth, Western Australia. Bolton notes that this will be the first time that radio telescopes have been directly connected to supercomputers on the back end. That kind of setup has a commercial analogy in the HPC machinery co-located in Wall Street datacenters, which is fed real-time market data so it can execute split-second trading decisions.
And like those Wall Street clusters, the SKA systems will be optimized for bandwidth – memory, I/O and network – and will need to be relatively deterministic due to the time constraints imposed by the data rates. “We need to make sure that we don’t lag behind in the way that we can process the data that the antennas collect,” explained Bolton. “That’s quite special.”
The precise design of these systems is still to be determined, but Bolton did offer that these will essentially be dataflow machines built with exascale technology. She estimates that the first iteration of the hardware will be in the neighborhood of a quarter of an exaflop in computational power, but the data volumes to be accommodated will easily run into multiple exabytes.
The devil will be in the details. Both systems will be processing rather large and complex graphs – a typical graph for an SKA process might contain 400 million points. Because of that complexity and the performance required for near real-time data ingestion and processing, they need to determine, to within a few percentage points, how long it will take for the data to flow from one graph point to another. That problem has to be solved in software, but the hardware itself must be fast enough to make that feasible.
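At its simplest, predicting how long data takes to flow through such a graph is a critical-path calculation over a DAG of processing stages. A toy sketch of the idea (the stage names and millisecond costs below are purely illustrative, not SKA’s actual pipeline):

```python
# Toy dataflow graph: node -> (estimated cost in ms, upstream dependencies).
# All names and costs are hypothetical, for illustration only.
GRAPH = {
    "ingest":    (5.0,  []),
    "calibrate": (12.0, ["ingest"]),
    "grid":      (20.0, ["calibrate"]),
    "fft":       (8.0,  ["grid"]),
    "flag":      (6.0,  ["ingest"]),
    "image":     (15.0, ["fft", "flag"]),
}

def finish_time(node: str) -> float:
    """Earliest finish time of a node: its own cost plus the slowest upstream path."""
    cost, deps = GRAPH[node]
    return cost + max((finish_time(d) for d in deps), default=0.0)

# The end-to-end estimate is set by the critical path:
# ingest -> calibrate -> grid -> fft -> image.
print(finish_time("image"))  # -> 60.0
```

At SKA scale, with hundreds of millions of graph points, a naive recursion like this would need memoization or a topological sweep, and the per-stage cost estimates themselves must be accurate to a few percent for the end-to-end prediction to hold.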
They are also envisioning a network of SKA regional centers spread around the world, which will also house large supercomputers. They will be located at universities and other research institutions, and will be tasked with analyzing the post-processed data from the backend machines in Cape Town and Perth. Those workflows will look like more typical HPC applications, in which virtual models are created from the massaged data.
The effort will not stop with these initial HPC systems. The SKA instruments are expected to operate for at least 50 years, so the computational hardware will get many refreshes over this time frame as budgets allow. Software too will improve over that extended lifetime, as researchers figure out new ways to re-interrogate the data. As Bolton explained: “We’re putting in place the kind of metal that we’ll continue to exploit as the decades go by, and we’ll just keep getting better and better at extracting the science.”
As far as what kinds of discoveries will be revealed, no one really knows. Often the biggest breakthroughs are uncovered accidentally when looking for something else entirely. The revelation of the existence of dark energy by the Hubble telescope when it was studying the expansion of the universe is one prominent example. “That’s what discovery and science is all about,” said Diamond.