Julia has joined the rarefied ranks of computing languages that have achieved peak performance exceeding one petaflop (one quadrillion floating-point operations per second) – the so-called ‘Petaflop Club.’
The Julia application that achieved this milestone is called Celeste. It was developed by a team of astronomers, physicists, computer engineers and statisticians from UC Berkeley, Lawrence Berkeley National Laboratory, the National Energy Research Scientific Computing Center (NERSC), Intel, Julia Computing and the Julia Lab at MIT.
Celeste uses the Sloan Digital Sky Survey (SDSS), a dataset of astronomical images from the Apache Point Observatory in New Mexico that includes every visible object from over 35% of the sky – hundreds of millions of stars and galaxies. Light from the most distant of these galaxies has been traveling for billions of years and lets us see how the universe appeared in the distant past.
Cataloging terabytes of stars and galaxies
Since SDSS data collection began in 1998, the process of cataloging these stars and galaxies has been painstaking and laborious.
To speed it up, the Celeste team developed a new parallel computing method to process the entire SDSS dataset. Celeste is written entirely in Julia. The team loaded an aggregate of 178 terabytes of image data and, in just 14.6 minutes, produced the most accurate catalog of 188 million astronomical objects, with state-of-the-art point and uncertainty estimates.
The Celeste team achieved peak performance of 1.54 petaflops using 1.3 million threads on 9,300 Knights Landing (KNL) nodes of the Cori supercomputer at NERSC. This result combined a sophisticated parallel scheduling algorithm with optimizations to the single-core version that yielded a 1,000x single-core improvement over the previously published version.
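To put these figures in perspective, the short Python sketch below derives a few per-node and per-second rates from the numbers quoted above. It is back-of-the-envelope arithmetic only, not part of Celeste. It assumes that 1 terabyte means 10**12 bytes and that the 14.6-minute catalog run and the 1.54-petaflop peak refer to the same run; the function name `derived_rates` is purely illustrative. The data and object rates are averages over the run, while the floating-point rates are peak values.

```python
# Figures quoted in the article.
PEAK_FLOPS = 1.54e15      # 1.54 petaflops (peak)
THREADS = 1.3e6           # 1.3 million threads
NODES = 9_300             # Knights Landing nodes on Cori
DATA_BYTES = 178e12       # 178 TB, assuming 1 TB = 10**12 bytes
OBJECTS = 188e6           # cataloged astronomical objects
RUNTIME_S = 14.6 * 60     # 14.6 minutes, in seconds


def derived_rates():
    """Return rough rates implied by the quoted figures."""
    return {
        "threads_per_node": THREADS / NODES,
        "peak_gflops_per_node": PEAK_FLOPS / NODES / 1e9,
        "peak_gflops_per_thread": PEAK_FLOPS / THREADS / 1e9,
        "avg_data_gb_per_s": DATA_BYTES / RUNTIME_S / 1e9,
        "avg_objects_per_s": OBJECTS / RUNTIME_S,
    }


if __name__ == "__main__":
    for name, value in derived_rates().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, the run works out to roughly 140 threads per node, about 166 gigaflops of peak performance per node, an average of about 203 gigabytes of image data loaded per second, and roughly 215,000 objects cataloged per second.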
New challenges managing data from space telescopes
The Celeste research team is already looking to new challenges. For example, the Large Synoptic Survey Telescope (LSST), scheduled to begin operation in 2019, is 14 times larger than the Apache Point telescope and will produce 15 terabytes of images every night. This means that every few days, the LSST will produce more visual data than the Apache Point telescope has produced in 20 years. With Julia and the Cori supercomputer, the Celeste team can analyze and catalog every object in those nightly images in as little as 5 minutes.
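As a rough plausibility check on the 5-minute figure, the Python sketch below compares the data rate needed to get through one night of LSST images in 5 minutes with the average rate implied by the SDSS run. It is illustrative arithmetic, not a performance model. It assumes that 1 terabyte means 10**12 bytes and that run time scales linearly with data volume, which the article does not state; the function names are hypothetical.

```python
# Figures quoted in the article; 1 TB is assumed to mean 10**12 bytes.
LSST_BYTES_PER_NIGHT = 15e12   # 15 TB of images per night
TARGET_SECONDS = 5 * 60        # "as little as 5 minutes"
CELESTE_BYTES = 178e12         # aggregate image data loaded in the SDSS run
CELESTE_SECONDS = 14.6 * 60    # 14.6-minute SDSS run


def required_gb_per_s():
    """Average rate needed to process one LSST night in the target time."""
    return LSST_BYTES_PER_NIGHT / TARGET_SECONDS / 1e9


def demonstrated_gb_per_s():
    """Average rate implied by the SDSS run figures."""
    return CELESTE_BYTES / CELESTE_SECONDS / 1e9


def minutes_per_night_at_demonstrated_rate():
    """Time for one LSST night, assuming run time scales linearly with data volume."""
    return LSST_BYTES_PER_NIGHT / (CELESTE_BYTES / CELESTE_SECONDS) / 60


if __name__ == "__main__":
    print(f"required:     {required_gb_per_s():.1f} GB/s")
    print(f"demonstrated: {demonstrated_gb_per_s():.1f} GB/s")
    print(f"minutes per night at demonstrated rate: "
          f"{minutes_per_night_at_demonstrated_rate():.2f}")
```

Under these assumptions, a 5-minute turnaround needs an average of about 50 gigabytes per second, compared with roughly 203 gigabytes per second averaged over the SDSS run, so the quoted figure leaves some headroom.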
The Celeste team is also working to:
- Further increase the precision of point and uncertainty estimates
- Identify ever-fainter points of light near the detection limit
- Improve the quality of native code for high-performance computing
The Celeste project is a shining example of:
- High-performance computing applied to real-world problems
- Cross-institutional collaboration including researchers from UC Berkeley, Lawrence Berkeley National Laboratory, the National Energy Research Scientific Computing Center (NERSC), Intel, Julia Computing and the Julia Lab at MIT
- Cross-departmental collaboration including astronomy, physics, computer science, engineering and mathematics
- Julia, the fastest modern open-source, high-performance programming language for scientific computing
- Parallel and multithreaded supercomputing capabilities
- Public support for basic and applied scientific research