The ITP Opteron Cluster TEAZER

Current workload  (ITP only)
Ganglia Cluster Report  (ITP only)
System Features
Use of the ITP Opteron Cluster
User Accounts
Installed Software
Status of the ITP Cluster
Contact

Adapted from the cluster pages at http://www.uibk.ac.at/zid/systeme/hpc-systeme by courtesy of the HPC Team of the Central Information Technology Services (ZID).

System Features

The ITP Opteron Cluster consists of one master node (teazer.uibk.ac.at) and 10 compute nodes (Quad-Core and Six-Core Opteron machines with a total of 184 cores) and offers 2.5 TB (gross) of attached storage.

The master node is a transtec CALLEO 332 server with 2 Quad-Core Opteron 2350 (2.0 GHz) processors (Barcelona) and 32 GB DDR2-667 Reg. ECC RAM. The attached storage consists of 2 SATA-2 disks (160 GB each) configured as RAID1 for the OS file systems and 6 SATA-2 disks (500 GB each) configured as RAID5 for the /home and /scratch file systems.

The compute nodes are:

  • node1 - node3 & node5 - node7: 6 transtec CALLEO 531 computers, each with 4 Quad-Core Opteron 8356 (2.3 GHz) processors (Barcelona) and 64 GB DDR2-667 Reg. ECC RAM, except node1, which has 128 GB DDR2-667 Reg. ECC RAM. The attached 80 GB SATA-2 disk provides the local swap and /tmp file system.
  • node4: 1 transtec CALLEO 531 computer with 4 Quad-Core Opteron 8380 (2.5 GHz) processors (Shanghai), a 160 GB SATA-2 disk for the local swap and /tmp file system and 64 GB DDR2-667 Reg. ECC RAM.
  • node8 - node10: 3 transtec CALLEO 531 computers, each with 4 Six-Core Opteron 8431 (2.4 GHz) processors (Istanbul), a 160 GB SATA-2 disk for the local swap and /tmp file system and 64 GB DDR2-667 Reg. ECC RAM, except node9, which has a 250 GB SATA-2 disk for the local swap and /tmp file system.
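
The total of 184 cores quoted above can be checked against the node list. The following is a small arithmetic sketch; the variable names are chosen here for illustration. Note that the 8 cores of the master node are not part of this total.

```shell
# cores = number of nodes x processors per node x cores per processor
barcelona=$(( 6 * 4 * 4 ))   # node1 - node3 and node5 - node7 (Quad-Core)
shanghai=$(( 1 * 4 * 4 ))    # node4 (Quad-Core)
istanbul=$(( 3 * 4 * 6 ))    # node8 - node10 (Six-Core)

total=$(( barcelona + shanghai + istanbul ))
echo "$total compute cores"
```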

More details and photos of the hardware are available here.

The hardware was purchased from transtec AG (public relations).

Use of the ITP Opteron Cluster

How can I use the cluster?

To get a quick overview, see the Short Tutorial or have a look at the following specific topics:

Where can I store my data?

Three different types of storage are available on the cluster:

Directory    Storage integration     Description

/home        shared to all nodes
  • High-quality RAID5 storage with a quota limit of 1.9 GByte.
  • All data in the home directories is backed up daily by the Tivoli Storage Manager (TSM).

/scratch     shared to all nodes
  • This storage should be used for large input and/or output files that your applications need.
  • The RAID5 configuration prevents data loss caused by a single disk failure. Please note that there are no backups of /scratch. Users are responsible for safeguarding their data according to their needs.
  • The hard quota limit is 50 GB.

/tmp         local on every node
  • This data area may be used for temporary files during job execution.
  • Limits: 30 GB on nodes 1-3 and 5-7, 89 GB on nodes 4, 8 and 10, 171 GB on node9.
    At the moment there are no quota limitations.
  • The HPC Cluster administrators may delete files in this directory after any job run if there is not enough space left.
  • Note: Files are deleted automatically by tmpwatch once they have not been accessed for 240 hours (atime)!

Please note that the directories /home and /scratch are available on every cluster node. These directories are shared, whereas the /tmp directory is local to each node.
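
One common way to combine these storage areas is to let a job work in the node-local /tmp and copy its results to the shared /scratch at the end. The following is a minimal sketch of that pattern, not a tool installed on the cluster: the function name run_staged and all paths in the example call are assumptions made for illustration.

```shell
#!/bin/bash
# run_staged: hypothetical helper, not part of the cluster software.
# Usage: run_staged <scratch_dir> <tmp_base> <command> [args...]
run_staged() {
    local scratch_dir=$1 tmp_base=$2
    shift 2

    # Create a private working directory on the node-local disk.
    local work_dir
    work_dir=$(mktemp -d "$tmp_base/job.XXXXXX") || return 1

    # Run the command inside the local directory and remember its exit status.
    local status=0
    ( cd "$work_dir" && "$@" ) || status=$?

    # Copy everything back to shared storage; clean up only if the copy worked.
    if mkdir -p "$scratch_dir" && cp -r "$work_dir"/. "$scratch_dir"/; then
        rm -rf "$work_dir"
    else
        echo "copy failed; results left in $work_dir" >&2
        status=1
    fi
    return "$status"
}

# Example call inside a job script (program and paths are placeholders):
# run_staged "/scratch/$USER/run1" /tmp "$HOME/bin/my_program" "$HOME/input.dat"
```

Because the command runs inside the temporary directory, the program and its input files should be given as absolute paths. Removing the working directory at the end keeps the local disk free for other jobs and does not rely on the automatic tmpwatch cleanup.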

If the above-mentioned quota limits are too strict for your needs, please contact the system administrator (system-admin[at]mcavity.uibk.ac.at). The limits can be increased if there is a good reason to do so.

User Accounts

The user accounts on the cluster are managed via the Network Information Service (NIS). Ordinary users can run the command yppasswd to change their NIS password. It will prompt them for their old NIS password and then ask them for their new password twice, to ensure that the new password was typed correctly.

$ yppasswd
Changing NIS account information for "user ID" on teazer.uibk.ac.at.
Please enter old password: <type old value here>
Changing NIS password for "user ID" on teazer.uibk.ac.at.
Please enter new password: <type new value>
Please retype new password: <re-type new value>

The NIS password has been changed on teazer.uibk.ac.at.

If the old value does not match the current value stored for that user, or the two new values do not match each other, then the password will not be changed.

Backup of Home Directories

The files in the "home" directories of the HPC Cluster are automatically backed up daily by centrally scheduled ADSM/TSM tasks. Thus, if you have deleted files or directories in your home directory by mistake, you can restore them yourself by running the command /usr/bin/dsm at the command line on the master node, where your home directory is physically located (see Restore deleted files/directories).

Note: Files in directories named tmp or cache will NOT be backed up!

For more information about ADSM/TSM, see Tivoli Storage Manager or ADSM/TSM-Server (FABS: HSM, Backup) and TSM V5: Sichern und Wiederherstellen von Daten at the ZID home page.

User Quota

In order to get a better grip on some of the more pesky users of the ITP Opteron Cluster, we have activated quotas on the central file server teazer. To check your disk quota, type quota at the command line. The quota(1) command displays the current disk usage along with your personal limits for disk space (blocks) and number of inodes (files). The soft limit (quota) can be temporarily exceeded (for a grace period of 7 days), whereas the hard limit (limit) is an absolute upper bound.

If you run out of quota, you might first choose to tar and gzip directories. This is convenient, as the kfm-window manager allows you to view and manipulate tgz files just like ordinary directories. Other simple strategies are:

  • remove unused dvi, aux, log files;
  • clear your firefox cache;
  • avoid keeping many huge matlab data files if possible.
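
The tar and gzip step mentioned above can be sketched as a small shell function. The name archive_dir and the directory in the example call are assumptions made for illustration; the function deletes the original directory only after the archive has been read back successfully.

```shell
#!/bin/bash
# archive_dir: hypothetical helper that replaces a directory by a .tgz archive.
# Usage: archive_dir <directory>
archive_dir() {
    local dir=${1%/}
    local archive="$dir.tgz"

    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }

    # Create the archive next to the directory, storing relative paths.
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")" || return 1

    # Remove the original only if the whole archive can be listed again.
    if tar -tzf "$archive" > /dev/null; then
        rm -rf "$dir"
    else
        echo "archive check failed; keeping $dir" >&2
        return 1
    fi
}

# Check the usage first, then archive a directory (the name is a placeholder):
# quota
# du -sh "$HOME/old_project"
# archive_dir "$HOME/old_project"
```

The archive is written before the directory is removed, so some free space below the hard limit is still needed while the function runs.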

Note: If your (hard) quota is exceeded and/or you do not have any grace time for your soft quota, your submitted jobs will be aborted (due to write errors)!

Installed Software

  1. Operating System
    The master node is running CentOS 5.11, a compatible rebuild of Red Hat Enterprise Linux; the compute nodes are running CentOS 5.2.
  2. Job-Management-System
    Submit your jobs via Sun Grid Engine 6.1 U4.
  3. Software environment
    Set up your software environment by using Modules environment 3.2.6.
  4. Software packages
    Have a look at the available packages like compilers, parallel environments, numerical libraries, scientific applications, etc.
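
A batch job for the Grid Engine is an ordinary shell script whose lines starting with #$ carry options for qsub. The following sketch writes a minimal job script to a file and checks its syntax. The file name and job name are placeholders, and the commented module load line has to be filled in with a name that module avail actually lists on the cluster; queue names and parallel environments are site specific and are left out here.

```shell
#!/bin/bash
# Write a minimal job script; the file name and job name are placeholders.
job_script="${JOB_SCRIPT:-hello_job.sh}"

cat > "$job_script" <<'EOF'
#!/bin/bash
#$ -N hello_job
#$ -S /bin/bash
#$ -cwd
#$ -j y

# Set up the software environment (placeholder, see `module avail`):
# module load <name>

echo "Running on $(hostname)"
EOF

# Check the script for syntax errors before submitting it.
bash -n "$job_script" && echo "syntax ok: $job_script"

# On the master node the script would then be submitted with:
# qsub hello_job.sh
```

Here -N sets the job name, -S selects the shell, -cwd runs the job in the directory it was submitted from, and -j y merges the error output into the standard output file.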

Contact

If you need additional information about the ITP Opteron Cluster or an account on it, or if you have problems with applications, please contact your system administrators:

E-mail address: system-admin[at]mcavity.uibk.ac.at
Phone number: 52212