
TERATEC 2025 Forum
The European meeting for experts in High-Performance Digital Simulation · HPC/HPDA · Artificial Intelligence · Quantum Computing

Thursday May 22
Workshop 06 - 9:30 am to 11:30 am

Components and increasing performance of HPC systems: effervescence, divergence, convergence
Chaired by Marc Duranton, Research Fellow, CEA, and Denis Dutoit, Program Manager, Advanced Computing, CEA

Leveraging Wafer-Scale Innovation for Next-Gen AI Acceleration: The Cerebras Revolution
By Alexander Mikoyan, Technology Sales Leadership & Transformation, Cerebras

California-based Cerebras Systems has solved the manufacturing and system challenges of designing a wafer-scale processor and building production-grade computing systems around it. Now in its third hardware generation, with a mature accompanying software stack, the technology is being deployed at scale for HPC acceleration and for generative AI training and inference. Today the Wafer-Scale Engine is, by some margin, the most cited AI processor outside commodity computing technologies.

Thanks to its high-bandwidth, low-latency fabric, this processor architecture delivers excellent parallel efficiency for non-linear and highly communicative codes. With a very large amount of SRAM located on the silicon, only one clock cycle away from the compute cores, users gain significant acceleration for stencil-based PDE solvers, linear algebra solvers, signal processing, sparse tensor math, and big data analysis. Papers exploiting these capabilities have been Gordon Bell Prize finalists for four years in a row.
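
To make the stencil point concrete, here is a minimal sketch in plain NumPy (not Cerebras's SDK; the grid size and coefficient are arbitrary illustrations) of the kind of memory-bound kernel this architecture targets: each output point touches only a few neighbours, so speed is set by how quickly operands can be fetched, which is exactly where single-cycle on-chip SRAM pays off.

    import numpy as np

    def jacobi_step(u, alpha=0.1):
        # One explicit step of the 2D heat equation (5-point stencil).
        # Arithmetic per point is tiny; the kernel is memory-bound,
        # so keeping the whole grid in near-compute SRAM dominates speed.
        v = u.copy()
        v[1:-1, 1:-1] = u[1:-1, 1:-1] + alpha * (
            u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
            - 4.0 * u[1:-1, 1:-1]
        )
        return v

    grid = np.zeros((512, 512))
    grid[256, 256] = 1.0            # point heat source
    for _ in range(100):
        grid = jacobi_step(grid)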

For AI training, clusters of Cerebras systems allow data-parallel-only LLM training: no parallel programming is required. These clusters demonstrate almost ideal linear scaling of performance with additional compute, and model memory and compute can be scaled completely independently, something no other commonly used technology offers. The result is more efficient training with less of everything: floorspace, power, effort, and time from idea to value. In AI inference, users benefit from very high memory bandwidth, which takes autoregressive LLM inference performance well beyond current implementations built on older approaches. Regardless of the model, Cerebras inference clusters demonstrate a 20x-70x speed gain over the computing architectures used by hyperscalers, for example.
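
A back-of-the-envelope sketch (our illustrative numbers, not vendor benchmarks) of why memory bandwidth sets the ceiling on autoregressive decoding: generating each token streams every model weight through the memory system once, so peak single-stream tokens per second is roughly bandwidth divided by model size in bytes.

    # Upper bound on single-stream autoregressive decode speed:
    #   tokens/s <= memory_bandwidth / bytes_read_per_token
    # where each generated token reads all weights once.
    def max_tokens_per_s(params_billion, bytes_per_param, bandwidth_gb_s):
        bytes_per_token = params_billion * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / bytes_per_token

    # Hypothetical 70B-parameter model stored in 16-bit weights:
    print(max_tokens_per_s(70, 2, 3_300))       # ~24 tok/s at HBM-class ~3.3 TB/s
    print(max_tokens_per_s(70, 2, 21_000_000))  # ~150,000 tok/s at wafer-SRAM-class ~21 PB/s

In this memory-bound regime, the ratio of the two bandwidths, not peak FLOPS, is what drives the speed gap.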

Cerebras Systems offers a mature technology that delivers two to three orders of magnitude performance gains in certain physical modeling workloads, to the point where it can be regarded as a new kind of scientific instrument. The same technology accelerates AI and expands the art of the possible for enterprise and research use cases.

Biography: Since 1996, Alexander Mikoyan has held general management and commercial leadership positions at a number of technology companies, including Alcatel, Thales, HP and HPE, Noventiq, and now California-based Cerebras Systems. Alexander has led teams that brought some of the most advanced products and solutions into challenging emerging markets. He has experience across the space, aerospace, defence, telecom, and information technology sectors, deploying mission- and business-critical solutions in multiple vertical industries.

Register now and get your badge here >>>

© Ter@tec - All rights reserved - Legal notice