Contents

1   Hardware and Performance Characteristics of the Leo3e System

2   Known Problems

3   Leo3e Images

   3.1   Hot Aisle Infrastructure
   3.2   Front View Closed
   3.3   Front View Open
   3.4   Worker Nodes
   3.5   Inside View of Worker Node
   3.6   Worker Nodes Chassis Rear View
   3.7   File Server and Login Nodes
   3.8   Storage System
   3.9   File Server and Login Node Rear View
   3.10   File Server FC Cabling


1 Hardware and Performance Characteristics of the Leo3e System

  • Leo3e consists of a login node and 45 distributed-memory worker nodes with 20 CPU cores and 64GB of memory each, totalling 900 cores. The two highest-numbered nodes have 512GB of memory, yielding a total of approximately 3.7TB of memory. The 56Gb/s FDR Infiniband network (used for MPI and GPFS) consists of two islands with an inter-island blocking factor of 2:1. It has a measured latency of 1.34µs and a point-to-point bandwidth of 6300MB/s per direction (including MPI overhead).

  • Each node of Leo3e has two sockets, each holding a ten-core Intel Xeon E5-2650 v3 processor (Haswell-EP microarchitecture). The processors are set to run at a constant frequency of 2.6GHz to enable optimal performance for statically and dynamically balanced workloads.

  • Scratch storage capacity is 54TB. Quotas are 2.5GB on $HOME and 1TB on $SCRATCH.

  • The SGE parallel environments on Leo3e support up to 20 processes per node. Please use qconf -spl to get a list of defined PEs.

  • To get good performance from Leo3e, it is strongly recommended to re-compile your CPU-intensive programs. Depending on the libraries that you are using, this may even be necessary.

  • For the HPL benchmark running on the entire 900-core machine (n=531840), we have measured a performance of 28.5 TFlop/s at 2.6GHz.
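
The figures quoted in the list above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official calculation: it takes the node counts, memory sizes, clock frequency, and HPL numbers from the text, and it assumes 16 double-precision floating point operations per cycle per core (the commonly quoted value for Haswell with two AVX2 FMA units) and an HPL matrix of 8-byte values spread evenly over all nodes. All function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the Leo3e figures quoted in the text.
# Values marked "assumption" are NOT stated in the original description.

NODES = 45                  # worker nodes
CORES_PER_NODE = 20         # two sockets with ten cores each
STANDARD_MEM_GB = 64        # memory of a regular worker node
LARGE_MEM_NODES = 2         # the two highest-numbered nodes
LARGE_MEM_GB = 512          # memory of a large-memory node
CLOCK_GHZ = 2.6             # constant core frequency
HPL_N = 531840              # HPL problem size
HPL_MEASURED_TFLOPS = 28.5  # measured HPL performance

# Assumption: 2 FMA units x 4 doubles x 2 operations = 16 flops per cycle.
FLOPS_PER_CYCLE = 16


def total_cores():
    """Number of CPU cores in all worker nodes."""
    return NODES * CORES_PER_NODE


def total_memory_gb():
    """Aggregate memory of all worker nodes in GB."""
    standard_nodes = NODES - LARGE_MEM_NODES
    return standard_nodes * STANDARD_MEM_GB + LARGE_MEM_NODES * LARGE_MEM_GB


def peak_tflops():
    """Theoretical peak: cores * GHz * flops/cycle gives GFlop/s."""
    return total_cores() * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0


def hpl_efficiency():
    """Measured HPL performance as a fraction of the theoretical peak."""
    return HPL_MEASURED_TFLOPS / peak_tflops()


def hpl_matrix_gb_per_node():
    """Assumption: dense n x n matrix of 8-byte doubles, evenly distributed."""
    return HPL_N ** 2 * 8 / NODES / 1e9


if __name__ == "__main__":
    print(f"cores:              {total_cores()}")
    print(f"memory:             {total_memory_gb()} GB")
    print(f"theoretical peak:   {peak_tflops():.2f} TFlop/s")
    print(f"HPL efficiency:     {hpl_efficiency():.1%}")
    print(f"HPL matrix / node:  {hpl_matrix_gb_per_node():.1f} GB")
```

Under these assumptions the sketch reproduces the 900 cores and 3776GB (approximately 3.7TB) of memory, gives a theoretical peak of 37.44 TFlop/s, which puts the measured HPL result at roughly 76% of peak, and shows that the HPL matrix occupies about 50GB per node, which fits into the 64GB of a regular worker node.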

2 Known Problems

  • Intermittently, highly parallel, communication-intensive MPI test jobs crash in the setup phase or during phases of massive data interchange. Since these failures are not easy to pinpoint, we would like to know whether they affect your applications. Reports are welcome.

3 Leo3e Images

3.1 Hot Aisle Infrastructure

Leo3e is housed in an APC "Hot Aisle" computing center infrastructure together with other servers. Two rows of four racks and two cooling units enclose a central "hot" (30-40°C) aisle separated from the environment by a door and top panels. Air intake and exhaust are physically separated, preventing mixture of cool and warm air. This increases the efficiency of the heat exchangers used for cooling.

3.2 Front View Closed

Leo3e occupies the two middle racks of the rack row seen above.

3.3 Front View Open

At the top, one can see the disk arrays of the four IBM DS-3400 storage subsystems, the Fibre Channel switches connecting the storage arrays to the two redundant file servers (light blue cables), and a management and login node. Below the two spacer panels, four IBM NextScale chassis, the Ethernet switches (red cables) and the high speed Infiniband switches (black cables) can be seen. All cabling except power is on the front side. MPI data traffic between worker nodes goes over the Infiniband fabric, ensuring low latencies and high bandwidth.

3.4 Worker Nodes

Shown are one dummy node (top) and five worker nodes. On each node, from left to right, one can see the Infiniband (black+blue) and Ethernet (red) cables and a tiny front panel for power control.

3.5 Inside View of Worker Node

In this inside view from front to back, one can see the main board with the Infiniband mezzanine card (blue, left), the CPU assemblies and memory DIMMs, and the boot disks and power connector of a worker node. Fans and power supplies are not part of the node, but are located in the chassis.

3.6 Worker Nodes Chassis Rear View

Two chassis are visible. In each chassis, six power supplies and ten fans are shared by all twelve nodes. Large fan diameters yield high cooling air throughput at low RPM, contributing to low noise and low energy consumption.

3.7 File Server and Login Nodes

Storage and two file servers are connected by two separate Fibre Channel (FC) switched fabrics. The entire storage configuration (disk arrays, RAID controllers, FC fabric, and file servers) is redundant and has no single point of failure. The file system is IBM's GPFS. The Infiniband connections are also used for GPFS traffic.

3.8 Storage System

To maximize Leo3e's compute capacity for the available budget, existing storage subsystems from Leo2 and another file server have been reused for this system. A total of 192 SAS disks spinning at 15kRPM ensure high I/O performance. Data is secured against individual disk failures by RAID5.

3.9 File Server and Login Node Rear View

These are conventional 2U systems cabled from the back.

3.10 File Server FC Cabling

Each file server has dual connections to each of the FC fabrics.