Access Restriction

Author Strazdins, Peter
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword High Speed Cell Computation ♦ Various Form ♦ Optimal Performance ♦ Simple Interface ♦ Recent Development ♦ Tunable Architecture-dependent Parameter ♦ Fujitsu AP1000/AP+ ♦ High Performance ♦ Cell BLAS Interface ♦ Various Cell Architecture ♦ Portable Distributed BLAS Implementation ♦ DBLAS Algorithm Term ♦ Distributed BLAS ♦ Matrix Communication Operation ♦ Powerful Distributed Matrix Representation ♦ DBLAS Algorithm ♦ Extreme Technique ♦ DBLAS Code ♦ Cell BLAS Algorithm
Description In this paper, we report on recent developments in the Distributed BLAS (DBLAS) project. These include a powerful distributed matrix representation, which yields a simple interface to the DBLAS, and the redesign of the DBLAS algorithms in terms of powerful `spread' and `reduce' matrix communication operations for reasons of programmability. The DBLAS codes achieve portability by supporting BLACS and various forms of ApLib, including a locally developed `stride' ApLib (for the Fujitsu AP1000/AP+), which is optimal for the `spread' and `reduce' operations. Ensuring portable high-speed cell computation across various cell architectures also involved designing a cell BLAS interface and expressing the DBLAS algorithms in terms of a set of tunable architecture-dependent parameters. Cell BLAS algorithms have also been extended to SPARC 10 platforms, which required deeper software pipelining for optimal performance. More extreme techniques were required for the UltraSparc, with a...
Educational Role Student ♦ Teacher
Age Range Above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 1996-01-01
Publisher Institution In Sixth Parallel Computing Workshop