Oct 7, 2020: We achieve 0.86 PFLOPS on a single wafer-scale system for the solution by BiCGStab of a linear system arising from a 7-point finite difference stencil on a 600...
Nov 19, 2020: Here we demonstrate the potential for wafer-scale systems to achieve breakthrough performance on regular mesh finite difference (stencil) problems that can fit...
We explain the system, its architecture and programming, and its performance on this problem and related problems. We discuss issues of memory capacity and...
Aug 9, 2024: StencilPy, a portable, high-performance optimized code generator for stencil computations on current CPU, GPU, and wafer-scale solutions.
Sep 12, 2024: Iterative solvers are limited by data movement, both between caches and memory and between nodes. Here we describe the solution of such systems...
Large, sparse, and often structured systems of linear equations are solved on the Cerebras Systems CS-1, a wafer-scale processor that...
In this paper a new fast algorithm for the computation of the distance of a matrix to a nearby defective matrix is presented. The problem is formulated...
Fast Stencil-Code Computation on a Wafer-Scale Processor. The performance of CPU-based and GPU-based systems is often low for PDE codes, where large, sparse...