Hungry for AI? New supercomputer comprises 16 dinner-plate-size chips


The Cerebras Andromeda, a 13.5 million core AI supercomputer.

On Monday, Cerebras Systems unveiled its 13.5 million core Andromeda AI supercomputer for deep learning, reports Reuters. According to Cerebras, Andromeda delivers over 1 exaflop (1 quintillion operations per second) of AI computational power at 16-bit half precision.

The Andromeda is itself a cluster of 16 Cerebras CS-2 computers linked together. Each CS-2 contains one Wafer Scale Engine chip (often called “WSE-2”), which is currently the largest silicon chip ever made, at about 8.5 inches square and packed with 2.6 trillion transistors organized into 850,000 cores.

Cerebras built Andromeda at a data center in Santa Clara, California, for $35 million. It’s tuned for applications like large language models and has already been in use for academic and commercial work. “Andromeda delivers near-perfect scaling via simple data parallelism across GPT-class large language models, including GPT-3, GPT-J and GPT-NeoX,” writes Cerebras in a press release.

The Cerebras WSE-2 chip is roughly 8.5 inches square and packs 2.6 trillion transistors.

Cerebras

The phrase “near-perfect scaling” means that as Cerebras adds more CS-2 computer units to Andromeda, training time on neural networks is reduced in “near perfect proportion,” according to Cerebras. Typically, when scaling up a deep-learning model by adding more compute power using GPU-based systems, one might see diminishing returns as hardware costs rise. Further, Cerebras claims that its supercomputer can perform tasks that GPU-based systems can’t:

“GPU impossible work was demonstrated by one of Andromeda’s first users, who achieved near perfect scaling on GPT-J at 2.5 billion and 25 billion parameters with long sequence lengths—MSL of 10,240. The users attempted to do the same work on Polaris, a 2,000 Nvidia A100 cluster, and the GPUs were unable to do the work because of GPU memory and memory bandwidth limitations.”
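
To make the idea of near-perfect scaling concrete, here is a minimal Python sketch that compares observed speedup with ideal linear speedup. The function name and the timing figures are hypothetical, chosen only for illustration; they are not Cerebras measurements.

```python
def scaling_efficiency(base_time, base_units, time, units):
    """Return observed speedup divided by ideal (linear) speedup.

    A value of 1.0 means perfect scaling: doubling the number of
    systems halves the training time.
    """
    speedup = base_time / time    # how much faster than the baseline run
    ideal = units / base_units    # speedup expected from linear scaling
    return speedup / ideal


# Hypothetical training times in hours, keyed by number of CS-2 systems.
# These values are invented for illustration, not published results.
timings = {1: 160.0, 2: 80.5, 4: 40.6, 8: 20.6, 16: 10.5}

for units, hours in timings.items():
    eff = scaling_efficiency(timings[1], 1, hours, units)
    print(f"{units:2d} systems: {hours:6.1f} h, efficiency {eff:.1%}")
```

An efficiency that stays close to 100 percent as systems are added is what “near-perfect scaling” would look like; diminishing returns would show up as a steadily falling percentage.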

Whether those claims hold up to external scrutiny is yet to be seen, but in an era where companies often train deep-learning models on increasingly large clusters of Nvidia GPUs, Cerebras appears to offer an alternative approach.

How does Andromeda stack up against other supercomputers? Currently, the world’s fastest, Frontier, resides at Oak Ridge National Laboratory and can perform at 1.103 exaflops at 64-bit double precision. That computer cost $600 million to build.

Access to Andromeda is available now for use by multiple users remotely. It’s already being used by commercial writing assistant JasperAI, Argonne National Laboratory, and the University of Cambridge for research.
