fock.haverford.edu

Introduction

fock.haverford.edu is the new Beowulf high-performance computing cluster that came online in December 2012. It can carry out most of the electronic structure calculations and MD simulations performed by Team Schrier. Basic information about this type of system can be found under the Scyld ClusterWare 6 heading of the Penguin Computing website.
Currently, fock is only accessible from computers on the Haverford network, and is accessed by ssh-ing into the master node
(term$: ssh <username>@fock.haverford.edu). To gain access to this system, you will need to request login information from Josh or Joe.
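For example, from a computer on the Haverford network you can log in, or copy a file onto the cluster, as follows (the username and file name are placeholders):

$: ssh <username>@fock.haverford.edu
$: scp filename.files <username>@fock.haverford.edu:~/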

Basic Information

As mentioned above, fock is a Beowulf cluster purchased from Penguin Computing. It contains one master node and sixteen slave nodes, with 8 processors per node. fock runs a Unix-like OS with (I believe) a Red Hat Linux distribution. It uses the MPICH and OpenMPI frameworks to allow for efficient parallelization of programs such as Abinit or LAMMPS. The nodes are constructed from normally identical, commercially available computers. The cluster is designed to present a unified view of the independent machines, so that it can be managed and used like a single machine. Thus, there is a single point of monitoring: all logs, statistics, and status files are automatically forwarded to the master node. To check the status of all the nodes, use:

$: beostatus -c
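As a quick sanity check that MPI is working on the master node (this assumes mpirun is already on your PATH), you can launch a few trivial processes by hand:

$: mpirun -np 4 hostname

Each of the four processes prints the name of the machine it runs on; real calculations should instead be submitted through the queue, as described in the next section.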

Using fock

fock uses the Torque resource manager and job scheduler, similar to the Torque/Moab setup on NERSC, to submit jobs to the queue. Therefore, some type of submission file will be needed (typically named input.sub). An excellent introduction to and documentation for the Portable Batch System (PBS) can be found here. A typical submission script for an Abinit job on fock will look something like this:

# Job name, resource request (1 node, 1 processor per node), joined stdout/stderr,
# and export of the current environment to the job
#PBS -N jobname
#PBS -l nodes=1:ppn=1
#PBS -j oe
#PBS -V
# Email address and events (abort/begin/end) for notifications, and account to charge
#PBS -M mdsmith@haverford.edu
#PBS -m abe
#PBS -A jsstudent01

# Run from the home directory; feed the .files file to abinit and capture all output in "log"
cd ~/
mpirun -np 1 /opt/abinit/bin/abinit < filename.files >& log

In the last line of the above script, note how the full pathname for abinit is specified. This is because the binaries need to be available when running jobs on the compute (slave) nodes. fock does not currently feature a "module" system like NERSC/XSEDE.
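For comparison, a multi-node job follows the same pattern. The sketch below requests two full nodes (16 processors) for a LAMMPS run; the install location /opt/lammps/bin/lmp_mpi and the input file in.melt are assumptions/placeholders, so check the actual paths on fock before using them:

#PBS -N lammps_test
#PBS -l nodes=2:ppn=8
#PBS -j oe
#PBS -V

cd ~/
# 2 nodes x 8 processors per node = 16 MPI processes
mpirun -np 16 /opt/lammps/bin/lmp_mpi -in in.melt >& log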

To submit a job from the command line using the submission file:

$: qsub <submission_filename.sub>

Other PBS commands can also be used on the command line, such as qstat (Queue STATus), qdel <job_number> (DELete the job), etc.
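A typical round trip at the prompt looks something like this (the submission file name, username, and job number are placeholders):

$: qsub input.sub
$: qstat -u <username>
$: qdel <job_number>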

Important Numbers

  • The size of the scratch directory for each node is 1 TB (you can confirm the free space as shown below).
  • Each node has 64 GB of memory; each core can use up to 5.3 GB (compare to the 2.5 GB typical load on NERSC's Carver processors).
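To see how much scratch space is free on a node, you can check the mount point directly; the /scratch path below is an assumption, so substitute the actual scratch location on fock:

$: df -h /scratch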