
ClearSpeed High Performance Linpack (HPL)

HPL is the standard benchmark in HPC and is the basis of the Top500 supercomputer ranking. HPL can be run on a cluster of servers. Those server nodes can be accelerated with ClearSpeed Advance cards and use the CSXL library to accelerate calls to DGEMM made by HPL.
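For context, DGEMM is the BLAS double-precision general matrix multiply, which computes C ← αAB + βC (transpose options aside). The sketch below is a plain Python reference for that operation only. The function name `dgemm` and its argument order are chosen here for illustration; this is not the CSXL interface and does not show how CSXL offloads the work to an Advance card.

```python
def dgemm(alpha, a, b, beta, c):
    """Reference DGEMM: return alpha * (a @ b) + beta * c.

    Matrices are lists of rows. a is m x k, b is k x n, c is m x n.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
print(dgemm(2.0, a, b, -1.0, c))
```

Most of HPL's floating-point work is spent in calls of this shape, which is why accelerating DGEMM alone can raise the overall benchmark result.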

The standard implementation of HPL will work with the CSXL library; however, ClearSpeed’s modified HPL achieves extra performance and greater scalability.

ClearSpeed’s version of HPL is modified to more efficiently overlap MPI communication with computation. It has also been instrumented to allow the ClearSpeed visual profiler to be used for performance visualization and tuning.
To support this version of HPL, a modified version of the MPI library (MPICH) is provided to handle larger message sizes.

Download

Platform: AMD64 / EM64T (x86-64)

Operating Systems: Red Hat Enterprise Linux 4 64-bit and SUSE® LINUX Enterprise Server (SLES) 9 64-bit

Source Code: ClearSpeed HPL (Download, Details); MPICH (Download, Details)

Performance

When used with a single ClearSpeed Advance card and sufficient memory in every node, a performance increase of at least 25 GFLOPS per node, over the base performance of the cluster, can be achieved. See below for system requirements to achieve the best performance.
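As a rough illustration of the arithmetic, the sketch below turns the "at least 25 GFLOPS per node" figure into a lower bound for a whole cluster. It assumes one Advance card and sufficient memory in every node, and that the per-node gain simply adds up across nodes. The function name and the 640 GFLOPS base figure are hypothetical, not measured results.

```python
GAIN_PER_NODE_GFLOPS = 25.0  # "at least 25 GFLOPS per node", from the text above


def projected_min_gflops(base_cluster_gflops, nodes):
    """Lower bound on accelerated HPL performance for the whole cluster.

    Assumes one Advance card per node, sufficient memory in every node,
    and that the per-node gain adds up linearly across nodes.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return base_cluster_gflops + nodes * GAIN_PER_NODE_GFLOPS


# Hypothetical example: a 16-node cluster that reaches 640 GFLOPS unaccelerated.
print(projected_min_gflops(640.0, 16))
```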

Suitable Systems

The HPL software will run and scale well on a cluster of x86 nodes, where a “cluster” consists of two or more nodes. The performance and scalability of the current implementation have been verified on a number of systems with up to and including eight cores per node. It may not scale well with a larger number of cores per node.

The performance achieved will depend on the amount of memory in each node. For the best performance, we recommend that each node have at least 8 GB of memory for each Advance card in that node. For example, for two Advance cards in a single server we recommend a minimum of 16 GB. ClearSpeed accelerated HPL will run with less system memory, but the performance will be limited.
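The memory recommendation above can be sketched as a small calculation. The second function is general HPL background rather than ClearSpeed guidance: an N x N matrix of doubles occupies 8N² bytes, so the memory available bounds the problem size. The 80% fill fraction is a commonly quoted rule of thumb and an assumption here, as are the function names.

```python
import math

GB_PER_CARD = 8  # recommended minimum per Advance card, from the text above


def recommended_node_memory_gb(cards_per_node):
    """Minimum recommended memory (GB) for a node with the given number of cards."""
    if cards_per_node < 1:
        raise ValueError("expected at least one Advance card")
    return GB_PER_CARD * cards_per_node


def hpl_problem_size(total_memory_gb, fraction=0.8):
    """Largest N such that an N x N matrix of doubles fits in `fraction` of memory.

    `fraction` is an assumed rule of thumb, not a ClearSpeed figure.
    """
    usable_bytes = fraction * total_memory_gb * 2**30
    return math.isqrt(int(usable_bytes // 8))  # 8 bytes per double


print(recommended_node_memory_gb(2))
# Hypothetical cluster: 4 nodes with 16 GB each.
print(hpl_problem_size(4 * 16))
```

In practice N is usually rounded down to a multiple of the HPL block size; that step is left out to keep the sketch short.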

To scale well, the system must be homogeneous: every Linpack process must be the same, with the same number of x86 cores and Advance cards available to each. The simplest case is that all nodes are identical (with the same number of Advance cards in every node).
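A check for the simplest case described above, identical nodes, might look like the following sketch. The `Node` record and the node names are hypothetical; a real inventory would come from the cluster's own management tools.

```python
from collections import Counter
from typing import NamedTuple


class Node(NamedTuple):
    name: str
    cores: int  # x86 cores in the node
    cards: int  # Advance cards in the node


def is_homogeneous(nodes):
    """True when every node has the same (cores, cards) configuration."""
    return len({(n.cores, n.cards) for n in nodes}) <= 1


def odd_nodes(nodes):
    """Names of nodes that differ from the most common configuration."""
    if not nodes:
        return []
    common, _ = Counter((n.cores, n.cards) for n in nodes).most_common(1)[0]
    return [n.name for n in nodes if (n.cores, n.cards) != common]


cluster = [
    Node("node01", 8, 2),
    Node("node02", 8, 2),
    Node("node03", 8, 1),  # one card short
    Node("node04", 8, 2),
]
print(is_homogeneous(cluster))
print(odd_nodes(cluster))
```

This only covers the identical-node case. A cluster can also be homogeneous per process with differing nodes, which this sketch does not attempt to detect.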

In addition, the cluster needs a high-performance interconnect such as InfiniBand. Large clusters of fast nodes will typically not scale as well over Gigabit Ethernet.

Software requirements

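ClearSpeed’s HPL will work with Advance cards running either the 2.5 or 3.x releases of the runtime and CSXL library. The supported operating systems are those listed under Download above: Red Hat Enterprise Linux 4 64-bit and SUSE® LINUX Enterprise Server (SLES) 9 64-bit.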

Associated Documentation

Linpack Getting Started Guide (PDF)

For Linpack benchmark information please click here.