To combine the benefits of the basic algorithms, we propose hybrid CR+PCR and CR+RD algorithms, which improve the performance of PCR, RD and CR by 21%, 31% and 61% respectively. Our GPU solvers achieve up to a 28x speedup over a sequential LAPACK solver, and a 12x speedup over a multi-threaded CPU solver.
Jan 1, 2010
Jan 1, 2021: The paper proposes new computational algorithms for solving tridiagonal systems using higher-order system reductions.
We study the performance of three parallel algorithms and their hybrid variants for solving tridiagonal linear systems on a GPU.
Feb 19, 2019: This paper presents several implementations for the parallel solving of large tridiagonal systems on multi-core architectures, using the OmpSs ...
Aug 17, 2007: To solve a tridiagonal system you need a ... We will see that the transpose method is faster until M ~4,000 [under single precision].
The solution of tridiagonal systems of equations using graphics processing units (GPUs) is assessed. The parallel-Thomas-algorithm (PTA) is developed.
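For orientation, the sequential Thomas algorithm that parallel variants start from can be sketched as below. This is a minimal pure-Python sketch, not the PTA from the snippet above; the function name `thomas_solve` and the array layout (`a` sub-diagonal, `b` main diagonal, `c` super-diagonal, `d` right-hand side) are assumptions made for illustration, and the code assumes no pivoting is needed (for example, a diagonally dominant matrix).

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system A x = d (hypothetical helper, illustration only).

    a: sub-diagonal   (a[0] is ignored)
    b: main diagonal
    c: super-diagonal (c[-1] is ignored)
    d: right-hand side
    Assumes no zero pivots arise, e.g. A is diagonally dominant.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: recover x from the last unknown upwards.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 3x3 system with diagonal 2 and off-diagonals -1.
    print(thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                       [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0]))
```

Both loops carry a dependency from one row to the next, which is why a single system does not parallelize directly and why GPU work typically turns to reduction-based methods or to solving many independent systems at once.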
The method requires solving a modified non-cyclic version of the system for both the input and a sparse corrective vector, and then combining the solutions. ...
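The description above matches the usual Sherman-Morrison treatment of a cyclic (periodic) tridiagonal system, and a sketch under that assumption follows. The names `thomas_solve` and `cyclic_solve`, the choice `gamma = -b[0]`, and the convention that `a[0]` and `c[-1]` hold the two corner entries are all illustrative assumptions, not details taken from the snippet.

```python
def thomas_solve(a, b, c, d):
    """Non-cyclic tridiagonal solve; a[0] and c[-1] are ignored."""
    n = len(b)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def cyclic_solve(a, b, c, d):
    """Solve a cyclic tridiagonal system (n >= 3) via Sherman-Morrison.

    Assumed layout: a[0] is the top-right corner entry and c[-1] is the
    bottom-left corner entry; otherwise a, b, c, d are as in thomas_solve.
    """
    n = len(b)
    gamma = -b[0]  # any nonzero value works; this choice is a common one

    # Modified non-cyclic matrix B = A - u v^T, which has no corner entries.
    bb = list(b)
    bb[0] = b[0] - gamma
    bb[-1] = b[-1] - c[-1] * a[0] / gamma

    # Sparse corrective vector u = (gamma, 0, ..., 0, c[-1]).
    u = [0.0] * n
    u[0] = gamma
    u[-1] = c[-1]

    y = thomas_solve(a, bb, c, d)  # B y = d (the input)
    q = thomas_solve(a, bb, c, u)  # B q = u (the corrective vector)

    # v = (1, 0, ..., 0, a[0] / gamma); combine the two solutions.
    vy = y[0] + a[0] * y[-1] / gamma
    vq = q[0] + a[0] * q[-1] / gamma
    factor = vy / (1.0 + vq)
    return [yi - factor * qi for yi, qi in zip(y, q)]


if __name__ == "__main__":
    # 4x4 cyclic system: diagonal 4, off-diagonals and corners 1.
    print(cyclic_solve([1.0] * 4, [4.0] * 4, [1.0] * 4,
                       [10.0, 12.0, 18.0, 20.0]))
```

The two non-cyclic solves share the same matrix, so an implementation could reuse the forward-elimination factors for both right-hand sides; the sketch keeps them separate for readability.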
Nov 9, 2020: In this paper, we consider the solution of a tridiagonal Toeplitz system, where A is subdiagonally dominant, superdiagonally dominant, or weakly diagonally ...
Our GPU solvers achieve up to a 28x speedup over a sequential LAPACK solver, and a 12x speedup over a multi-threaded CPU solver. Categories and Subject ...
In this article, we improve our previous implementation in order to accelerate the tridiagonal solvers on GPU using efficient memory techniques, such as pinned ...