Leading-edge AI Computing System now at Home with Brookhaven Lab’s Computational Science Initiative

The Computational Science Initiative (CSI) at the U.S. Department of Energy’s Brookhaven National Laboratory now hosts one of the newest computing systems designed to increase the speed and scale of diverse scientific research: the NVIDIA® DGX-2™ artificial intelligence supercomputer.

NVIDIA’s latest deep learning high-performance computing system, the DGX-2, now is part of Brookhaven’s Computational Science Initiative. Photo courtesy of NVIDIA.

Designed to “take on the world’s most complex artificial intelligence challenges,” the NVIDIA DGX-2 at Brookhaven is one of the first available worldwide. At the Lab, the NVIDIA DGX-2, nicknamed “Minerva,” will serve as a user-accessible multipurpose machine focused on computer science research, machine learning, and data-intensive workloads.

According to Adolfy Hoisie, who directs Brookhaven’s Computing for National Security Department, the NVIDIA DGX-2’s compute power will afford opportunities for diverse research pursuits with impact across the laboratory. That power includes 2 petaflops of graphics processing unit (GPU) accelerated performance, made possible by a scalable architecture built on the NVIDIA NVSwitch™ AI network fabric.

In the area of systems architecture research, Hoisie expects that the NVIDIA DGX-2 will provide insights in evaluating the performance, power, and reliability of state-of-the-art computing technologies for various workloads.

Because the NVIDIA DGX-2 specifically was designed to tackle the largest data sets and most computationally intensive and complex models, it also will play an important role in the Lab’s machine learning efforts. One such beneficiary will be the ExaLearn collaboration, an Exascale Computing Project co-design center featuring eight DOE national laboratories and led by CSI’s Deputy Director, Francis J. Alexander. The ExaLearn team primarily is developing machine learning software for exascale applications.

The NVIDIA DGX-2 also will be used in CSI’s ongoing management, development, and discovery efforts associated with the analysis and interpretation of high-volume, high-velocity heterogeneous scientific data.

“We will expose the NVIDIA DGX-2 to data-intensive workloads for many programs, such as those of import to DOE science programs at the Lab’s Office of Science User Facilities—including the Relativistic Heavy Ion Collider, National Synchrotron Light Source II, and Center for Functional Nanomaterials—and to Department of Defense (DoD) data-intensive workloads of interest,” Hoisie explained. “Given significant bandwidth in and out of the system, we can pursue data analyses in multiple paradigms, for example, streaming data or fast access to vast amounts of data from Brookhaven Lab’s massive scientific databases. Such improvements will afford tremendous strides in data analyses within the Lab’s core high energy physics, nuclear physics, biological, atmospheric, and energy systems science areas and cryogenic technologies, as well as for specific research areas in computing sciences of interest to DOE and DoD.”

CSI’s DGX-2 also will be a resource for NVIDIA as part of a collaboration. As research involving the system advances, findings on its impact on applications, its speed to solution, and even markers of its own overall performance will be shared between Brookhaven Lab and NVIDIA developers.

The DGX-2 is the newest addition to NVIDIA’s portfolio of AI supercomputers, which began with the DGX-1, introduced in 2016. The DGX-2 brings several innovations to AI, including the integration of 16 fully interconnected NVIDIA Tesla® Tensor Core V100 graphics processing units with a combined 512 gigabytes of GPU memory.
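
To put the system totals in per-GPU terms, the short Python sketch below divides the figures quoted in this article (16 GPUs, 512 gigabytes of GPU memory, 2 petaflops) evenly across the GPUs. The even split is an assumption made for illustration, the variable names are hypothetical, and the results are simple arithmetic, not measured performance.

```python
# Figures quoted in the article for the NVIDIA DGX-2.
NUM_GPUS = 16
TOTAL_GPU_MEMORY_GB = 512
TOTAL_PETAFLOPS = 2

# Assumption: memory and peak throughput are divided evenly across the GPUs.
memory_per_gpu_gb = TOTAL_GPU_MEMORY_GB / NUM_GPUS
teraflops_per_gpu = TOTAL_PETAFLOPS * 1000 / NUM_GPUS  # 1 petaflops = 1,000 teraflops

print(f"Memory per GPU: {memory_per_gpu_gb:.0f} GB")
print(f"Peak throughput per GPU: {teraflops_per_gpu:.0f} teraflops")
```

Under that assumption, each GPU accounts for 32 gigabytes of memory and 125 teraflops of the quoted peak.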

“We built the NVIDIA DGX-2 to solve the world’s most complex AI challenges, so we’re delighted that Brookhaven National Laboratory will put its innovations to use to further real-world science,” said Charlie Boyle, senior director of DGX Systems at NVIDIA. “The Lab’s researchers will be able to tap into the system’s 16 NVIDIA Tesla V100 Tensor Core GPUs—delivering two petaflops of computational performance—to help address opportunities of national importance.”

Source: BNL

