Primer for Working on the AVIDD Linux Cluster

(Part of HPC @ IU Workshop, Indiana University)


Table of Contents

  1. Establish Your Own AVIDD Account

  2. Login to AVIDD using SSH

  3. Submit Simple Jobs on AVIDD (Using PBS Job Manager)

  4. Low-down on AVIDD - Technical Details

  5. Working with Portable Batch Scheduler (PBS) on AVIDD

  6. Softenv to Setup Your Working Environment

  7. AVIDD's File Storage Options


Requesting An Account (or Software)


Account Requests

If you wish to apply for your own account on any of the UITS-Research Computing (RC) division's supercomputing resources (Big Red, AVIDD, Libra, RDC), visit the Research and Technical Services (RATS) application web page and submit a request.

Software Requests

If you already have an account on one or more of the UITS-RC supercomputing resources (Big Red, AVIDD, Libra, RDC) and would like to request that software be installed, visit the application web page and submit a request.

For the purpose of this workshop...

Workshop attendees - STOP! For the purpose of this workshop, we have already created a training account for you on Big Red. For convenience, please use the training account provided to you throughout this workshop.



Login to AVIDD using SSH


  • You can use the SSH client on Microsoft Windows or plain old ssh on a Unix-based workstation/laptop to log in to the AVIDD cluster. For example, from a Unix workstation you can log in as follows:

     $ ssh your_username@avidd-b.iu.edu 
    Example:
     [ag@peart agopu]$ ssh hpctrn01@avidd-b.iu.edu
    <<---- Do this!
    The authenticity of host 'avidd-b.uits.indiana.edu (129.79.228.232)' can't be established.
     RSA key fingerprint is 9c:63:4f:a0:90:95:5b:e3:76:c3:17:eb:96:9b:f3:87.
     Are you sure you want to continue connecting (yes/no)? yes
     hpctrn01@avidd-b.uits.indiana.edu's password:
     [hpctrn01@bh2 hpctrn01]$
          


  • Bash is AVIDD's default shell:

    • The default shell when you get an AVIDD account is bash; this tutorial assumes you'll continue to use bash as your shell. If you use another shell, just type bash to switch to the bash shell. You can also continue using your favorite shell, but you'll have to adapt the shell commands used in this tutorial to work for that shell.
  • Useful notes:

    1. Head nodes versus compute nodes: You can log in only to the head nodes bhX/ihX on AVIDD. Thus, by logging on to avidd-b.iu.edu as shown above, you will actually be logging on to bh2.uits.indiana.edu (or bh1/bh3); the exact head node you are assigned is decided on a round-robin basis. AVIDD's compute nodes are named bcXX; they are only accessible from the head nodes or from each other, not from the outside world; thus you cannot ssh into bcXX nodes directly.

    2. Note to Microsoft Windows users: You cannot open TotalView/Intel Trace Analyzer if you use the SSH client on MS Windows. You'll need some sort of X Window emulation software like Cygwin. We recommend using XLiveCD, created by IU folks Dick Repasky et al. For more information, look at the Using Windows and XLiveCD section.
    3. Plan to use X applications from compute nodes? Then disable X11 forwarding: To use the TotalView Debugger / Intel Trace Analyzer or any other graphical application from AVIDD compute nodes, you must disable default X forwarding (by specifying the -x flag):

       $ ssh -x your_username@avidd-b.iu.edu 

      Warning: To use other common graphical applications like emacs, etc. from the head node, you must not use the -x option, i.e. you must keep default X forwarding enabled.

    4. Intra cluster logins: When your AVIDD account is created, passphrase-less SSH keys should have been created for you automatically. But if you see an error message:

      Permission denied (publickey,password,keyboard-interactive)
      when you try to log in between nodes within the cluster, it indicates that the intra-cluster RSA keypair in your home directory is either missing or corrupted. Please run:
      $ gensshkeys
      from any login node (bhX/ihX). That will generate a keypair with a null passphrase, which will allow you to ssh into any node in the cluster interactively as well as non-interactively.
    5. About forwarding email address for job-related messages: The Big Red and AVIDD clusters send email about your jobs to the address specified in the ~/.forward file (note the "." preceding the filename) in your home directory. By default, this is set up at account creation time with the email address you provided when you requested your account.

      However, if you'd like to change the email address to which job emails are sent, you can do so as shown below:

       hpctrn01@BigRed:~> echo "myemailid@hotmail.com" > ~/.forward 

      Warning: You should use a valid email address! Otherwise you will not receive email notifications about the status of your jobs (and the bounced emails will annoy the sys-admins; you don't want to earn their wrath ;-))

    6. More AVIDD info? If you are interested in learning more about the AVIDD cluster, you might find the AVIDD system's homepage useful.

    7. SSH KB article: For more information on using SSH from Unix/Windows platforms, check out the Using SSH for More Secure Connections webpage. Though that webpage is geared towards using SSH to log on to IU's IBM SP, it applies in our context as well.
      For general information on SSH see: http://kb.indiana.edu/data/aelc.html.



Submit Simple Jobs on AVIDD (Using PBS Job Manager)


  • If you'd like to read about the PBS job manager first, see the Working with Portable Batch Scheduler (PBS) on AVIDD section.

  • Copy the directory of example programs and scripts from hpc's home directory.
    [ag@bh2 agopu]$  cp -r ~hpc/simple_avidd_jobs ~/.
    [ag@bh2 agopu]$ cd ~/simple_avidd_jobs
    <<---- Do this!

  • The two simple (example) C and Fortran programs are listed below for your convenience:

    [ag@bh2 simple_avidd_jobs]$ cat sine.c
    /*
     *      Copyright 2005, The Trustees of Indiana University.
     *      Original author:  Arvind Gopu (UITS-RAC-HPC group)...
     *      . . . [ snip ] . . .
     */
    #include <stdio.h>
    #include <math.h>   /* declares sin() */
    
    int main () {
      double PI2=3.141592654/2.0, theta, sintheta;
      int i, N=4;
     
      for (i=0; i<=N; i++) {
        theta    = i * (PI2/N);
        sintheta = sin (theta);
        printf (" Sin (%8.6lf) = %8.6lf \n", theta, sintheta);
      }
      return 0;
    }
    [ag@bh2 simple_avidd_jobs]$ cat sine.f
    C
    C       Copyright 2005, The Trustees of Indiana University.
    C       Original author:  Don Berry   (UITS-RAC-HPC group); 
    C      . . . [ snip ] . . .
    C
            program sine
    
            real, parameter :: PI2=3.141592654/2.0
            integer, parameter :: N=4
            real   x,s
            integer  i
    
            do i=0,N
                    x=i*(PI2/N)
                    s=sin(x)
                    write(6,"(f11.6, f11.6)")  x,s
            end do
            end

  • Compile Program(s) on Head Nodes (bhX/ihX): Compile your code. Below, we've shown the use of Intel compilers (icc and ifort for C and Fortran programs respectively):
    [ag@bh2 simple_avidd_jobs]$ icc -o sine_c sine.c -lm
    [ag@bh2 simple_avidd_jobs]$ ifort -o sine_f sine.f
    <<---- Do this!

  • Submit job(s) to PBS: The next step is to submit jobs to PBS that will run your program(s). One of the example PBS job submission scripts is listed for your convenience (it runs your sine_c program):

    [ag@bh2 simple_avidd_jobs]$ cat submit_sine_c.sh
    #PBS -l nodes=1:ppn=1,walltime=5:00
    #PBS -m ae
    #PBS -N job_sine_c
     
    ${HOME}/simple_avidd_jobs/sine_c

    Go ahead and use the script to submit your job to PBS:
    [ag@bh2 simple_avidd_jobs]$ qsub submit_sine_c.sh
    <<---- Do this!

     348921.aviss.avidd.iu.edu
    [ag@bh2 simple_avidd_jobs]$ qsub submit_sine_f.sh
     348922.aviss.avidd.iu.edu
  • Check Job Status: Use the qstat command to check job status (based on your username):

    [ag@bh2 agopu]$ qstat | grep <username>
    [ag@bh2 simple_avidd_jobs]$ qstat -u agopu
    <<---- Try doing this!
    Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
    --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
    348921.aviss.av agopu    bg       job_sine_c    --    1   1    --  00:05 R   --
    348922.aviss.av agopu    bg       job_sine_f    --    1   1    --  00:05 Q   --
  • Output and Error Files: Assuming your job runs to completion, you can find the messages it tried to print on the console in an output file. The locations of the output file and the error file can be specified using the -o and -e flags respectively. By default, these files are named job_name.osequence_number and job_name.esequence_number, where job_name is the name of the job (see the -N option) and sequence_number is the job number assigned when the job is submitted.

    So, for example, if PBS assigned job id 348921.aviss.avidd.iu.edu to your job, with jobname job_sine_c, then the output file would be named job_sine_c.o348921 and the error file would be named job_sine_c.e348921. (Also see the -j PBS directive, explained in the Important PBS options section.)

    [ag@bh2 simple_avidd_jobs]$ ls job_sine_*
    <<---- Try doing this!
    job_sine_c.e348921   job_sine_c.o348921   job_sine_f.e348922   job_sine_f.o348922

    Go ahead and check those output files out:
    [ag@bh2 simple_avidd_jobs]$ cat job_sine_c.o348921
     Sin (0.000000) = 0.000000
     Sin (0.392699) = 0.382683
     Sin (0.785398) = 0.707107
     Sin (1.178097) = 0.923880
     Sin (1.570796) = 1.000000
    <<---- Try doing this!
    [ag@bh2 simple_avidd_jobs]$ cat job_sine_f.o348922
       0.000000   0.000000
       0.392699   0.382683
       0.785398   0.707107
       1.178097   0.923880
       1.570796   1.000000
    <<---- Try doing this!
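
The default naming rule described above can be sketched in bash. The helper name pbs_output_files is hypothetical (it is not a PBS command); it only derives the two default file names from a job name and the job id that qsub prints.

```shell
# Hypothetical helper (not part of PBS): derive the default output and
# error file names from a job name and the job id printed by qsub.
pbs_output_files() {
  local job_name="$1"
  local job_id="$2"
  local seq="${job_id%%.*}"   # keep only the sequence number before the first "."
  echo "${job_name}.o${seq}"
  echo "${job_name}.e${seq}"
}

# Example with the job name and job id used in the text above:
pbs_output_files job_sine_c 348921.aviss.avidd.iu.edu
```

This is handy in wrapper scripts that submit a job and later want to read its output without running ls first.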

  • Parallel Job for kicks! If you're getting curious about how you could use multiple processors (nodes) at the same time, try compiling the simple (example) parallel program helloWorlds and submitting a job to run it, as shown below:

    [ag@bh2 simple_avidd_jobs]$ mpicc -o helloWorlds helloWorlds.c
    [ag@bh2 simple_avidd_jobs]$ qsub submit_parallel.sh
     348923.aviss.avidd.iu.edu
    
    [ag@bh2 simple_avidd_jobs]$ cat job_helloWorlds.o348923
    LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
     
    Hello, parallel worlds! This is processor bc27 and my rank is 1!
    Hello, parallel worlds! This is processor bc25 and my rank is 2!
    Hello, parallel worlds! This is processor bc27 and my rank is 0!
    Hello, parallel worlds! This is processor bc25 and my rank is 3!
     
    LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
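
The contents of submit_parallel.sh are not reproduced in this primer. The sketch below shows what such a LAM/MPI job script might look like; every line of the generated script is an assumption, guided only by the output above (four ranks on two nodes, with a LAM banner at the start and at the end) and by the mpirun C form used later in this primer. The function name write_parallel_job is hypothetical.

```shell
# Write a sketch of a LAM/MPI job script to the given file.
# The real submit_parallel.sh is not shown in this primer, so the
# directives and commands below are assumptions, not the actual script.
write_parallel_job() {
  local script="$1"
  cat > "$script" <<'EOF'
#!/bin/bash
#PBS -l nodes=2:ppn=2,walltime=5:00
#PBS -N job_helloWorlds
#PBS -V

# Start the LAM run-time on the nodes PBS allocated to this job.
lamboot "$PBS_NODEFILE"
# Run one process on every CPU that LAM knows about.
mpirun C "$HOME"/simple_avidd_jobs/helloWorlds
# Shut the LAM run-time down again.
lamhalt
EOF
}

# Usage on a head node:  write_parallel_job my_parallel.sh && qsub my_parallel.sh
```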



Low-down on AVIDD - Technical Details


  • The AVIDD cluster consists of a huge number of compute nodes as well as storage space. Follow the links given below to understand how the AVIDD cluster is organized internally. Many of the things that you might do in subsequent sections of this workshop will make a lot more sense, if you understand the cluster's construction.



Working with Portable Batch Scheduler (PBS) on AVIDD


  • What is PBS?
    The AVIDD cluster uses the Portable Batch System (PBS) for job submission. The scheduling is done by the Maui scheduler. In the following sections, you'll see how PBS can be used interactively as well as non-interactively. Also, some useful PBS/Maui commands are listed towards the end of this page.

  • Why PBS?
    As with any other supercomputer or cluster, a fair-share mechanism is incorporated within the AVIDD system. This mechanism does not allow users to run jobs on the head node or on the compute node outside of the PBS job submission system. Any job you submit outside of PBS is killed off if it runs for more than 20 minutes.

  • What if I want to debug code?
    Even if you want to debug code or do something interactively on a compute node, use the qsub -I mechanism to grab a couple of nodes through PBS. The following section, Interactive: Using qsub -I on the command line, shows how.


Interactive: Using qsub -I on the command line

  • You can grab a couple of interactive nodes from PBS. The options used in this section can also be applied in a non-interactive scenario. PBS options are explained below in the Important PBS options section.

    For example, to get 2 nodes, 2 processors per node, for 2 hours, run:

    [ag@bh2 agopu]$ qsub -I -l nodes=2:ppn=2 -l walltime=2:00:00
    <<---- Try doing this!
    You can expect to see a message like this:

    qsub: waiting for job 281988.aviss.avidd.iu.edu to start
    qsub: job 281988.aviss.avidd.iu.edu ready
    Following that, PBS will try to find and assign free compute nodes. Notice the change in the Unix prompt from bhX to bcXX (in the example shown, from bh2 to bc56). This indicates that you are interactively logged in to a compute node; the session starts in your home directory.
    [ag@bc56 agopu]$ 
    As explained in the Login to AVIDD using SSH section, compute nodes on AVIDD are named bcXX or icXX where XX is a number between 01 and 92.
  • If you want to know what nodes have been allocated to your job, then you can find that information in a file pointed to by environment variable $PBS_NODEFILE.
    [ag@bc56 agopu]$ cat $PBS_NODEFILE
    <<---- Try doing this!

    bc56
    bc53
    bc56
    bc53 

    Note: If you attend the Parallel Programming workshop and run example MPI code, you'll realize that each process' rank is determined by the order of the nodes in $PBS_NODEFILE.

  • Once you get a bunch of nodes to work on, you can run your jobs on the new command prompt (that's on a compute node).
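
The node file lists one host name per allocated processor, so a host appears once for every processor you were given on it. A minimal bash sketch for summarizing it follows; the function name summarize_nodefile is hypothetical, and the file format is assumed to be the one shown in the example above.

```shell
# Summarize a PBS node file. Each non-empty line is one processor slot;
# the function name is hypothetical and not part of PBS.
summarize_nodefile() {
  local nodefile="$1"
  local slots nodes
  # Count non-empty lines: one per allocated processor.
  slots=$(awk 'NF { n++ } END { print n + 0 }' "$nodefile")
  # Count distinct host names (first field, so stray spaces are ignored).
  nodes=$(awk 'NF { print $1 }' "$nodefile" | sort -u | wc -l)
  echo "nodes=$((nodes)) slots=${slots}"
}

# Inside a PBS job you would call:  summarize_nodefile "$PBS_NODEFILE"
```

For the node file shown above (bc56 and bc53, each listed twice), this reports 2 nodes and 4 slots.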


Non-Interactive: Using a PBS job submission script

  • Apart from running jobs interactively as explained in the previous sub-section, you can run jobs on compute nodes non-interactively; this is the more common way for production code. If you recall, you already tried doing that in the Submit job(s) to PBS step of the Submit Simple Jobs on AVIDD (Using PBS Job Manager) section.

    The idea is that once you are done with (initial) interactive debugging activities, you can use a non-interactive mechanism to submit your jobs. PBS, along with the Maui scheduler, will schedule your job and assign free nodes to it.

  • One thing to remember about non-interactive job submission is that console output is written to an output file and an error file, as explained in the Output and Error Files sub-section of the Submit Simple Jobs on AVIDD (Using PBS Job Manager) section.


Important PBS options

  • We will look at some useful directives (flags) that can be passed on to PBS. All directives in PBS job scripts (non-interactive job submission) begin with #PBS.

    Note: All PBS flags can be used at the command prompt as well without the #PBS. For example, to export the current working environment to your job, you can use -V at the command prompt when you do qsub -I.

  • The -l flag to PBS can be used to specify resource requirements and the like. For example, we requested 2 nodes, 2 processors on each of those nodes, and a wall clock time of 2 hours for our interactive job (and 5 minutes in the case of our non-interactive job script).

  • The -q flag is used to specify the queue you want the job to run on; PBS allows the use of multiple queues -- for instance, sys-admins may have multiple queues to segregate nodes. The default queue is bg if you submit jobs after logging on to avidd-b.iu.edu, while the default queue is iq if you submit jobs after logging on to avidd-i.iu.edu. Thus, if you want to run jobs on compute nodes in the AVIDD-I cluster (i.e. nodes in Indianapolis) and you're logged on to avidd-b.iu.edu, then you will need to use -q iq to indicate that.

    #PBS -q iq 

    Note: There is also a fastq queue available from time to time for short interactive jobs (10-30 mins, 2 nodes).

  • The -M flag is used to specify an email address to which messages will be sent when a job begins/ends/aborts. Whatever you specify in the PBS script will override the address in your ~/.forward file.

    • The -m flag can be used to specify when an email should be sent.

      Possible values include:

      Character   Job status for which email is sent
      a           Abort
      b           Begin
      e           End

    • For example, #PBS -m ab would send an email when the job begins or if it gets aborted due to some error.

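  • The -j flag can be used to direct PBS to join the output and error files into one (for example, -j oe merges the error stream into the output file). [Otherwise, once a job executes, separate output and error files are created, named job_name.osequence_number and job_name.esequence_number respectively.]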

  • The -N flag can be used to specify a name for our job.

  • The -V flag is used to export your login environment ($PATH, etc.) into your PBS job.

    Note: If you decide not to export your shell environment in your PBS script (using #PBS -V), then you'll need to use absolute paths for all executables and data files used in the script. For example, in a script without #PBS -V, you would need to specify something like this to execute mpirun and run a parallel MPI job:
    /N/soft/linux-rhel3_AS-ia32/lam-gm-7.1.1-intel/bin/mpirun \
    		C ~agopu/MPI_Tutorial/helloWorld

  • Some "default" information (these may change at any time):

    • AVIDD's PBS assigns jobs to the bg queue.

    • Default number of nodes: 1

    • Default processors per node: 2

    • Default walltime: 2 hours

    Visit the AVIDD policy page for the latest information.
  • For more information, try doing man qsub to see all possible options to qsub.
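
The directives described above can be combined in a single job script. The sketch below writes such a script to a file so you can inspect it before submitting; the function name, job name, and email address are placeholders, and the bg queue is simply the Bloomington default mentioned above. PBS_O_WORKDIR is the variable PBS sets to the directory from which qsub was run.

```shell
# Write a sample PBS job script that combines the directives described above.
# Function name, job name, and email address are placeholders.
write_sample_job() {
  local script="$1"
  cat > "$script" <<'EOF'
#!/bin/bash
# Resources: 1 node, 1 processor, 5 minutes of wall clock time.
#PBS -l nodes=1:ppn=1,walltime=5:00
# Queue, job name, and where/when to send email (abort and end).
#PBS -q bg
#PBS -N job_demo
#PBS -M you@example.edu
#PBS -m ae
# Join the error stream into the output file and export the login environment.
#PBS -j oe
#PBS -V

# Run from the directory where qsub was invoked.
cd "$PBS_O_WORKDIR"
./sine_c
EOF
}

# Usage on a head node:  write_sample_job submit_demo.sh && qsub submit_demo.sh
```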


Useful Commands to check status of jobs, check nodes' status, delete a job, etc.

  • As explained in the Check Job Status sub-section of the Submit Simple Jobs on AVIDD (Using PBS Job Manager) section, you can use the qstat command to check the status of a non-interactive job.

    Based on job id:

    [ag@bh2 agopu]$ qstat <job id>
    Based on your username:
    [ag@bh2 agopu]$ qstat | grep <username>
    [ag@bh2 agopu]$ qstat -u  <username> 
    For example:
    [ag@bh2 agopu]$ qstat 281988.aviss.avidd.iu.edu
     Job id           Name             User             Time Use S Queue
     ---------------- ---------------- ---------------- -------- - -----
     281988.aviss       STDIN            agopu            00:01:57 R bg
    [ag@bh2 agopu]$ qstat | grep agopu
     281988.aviss       STDIN            agopu            00:01:57 R bg 

    Status codes you might see include:
    Status Code   What does it mean?
    Q             Queued, waiting for free nodes
    R             Running
    C             Complete
    CA            Cancelled

  • To check how busy the queue is or how various nodes are being utilized, use the showq command.

    [ag@bh2 agopu]$ showq | less
     ACTIVE JOBS--------------------
     JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME
    
     282460                 user    Running    16     1:47:06  Wed Dec 15 16:33:20 
     . . . 
     204 Active Jobs     307 of  376 Processors Active (81.65%)
                         165 of  188 Nodes Active      (87.77%)
     . . .
     Total Jobs: 693   Active Jobs: 204   Idle Jobs: 301   Blocked Jobs: 188 
  • To delete a job, use the qdel command along with the job id. You can find the job id using the qstat command as explained above.

    [ag@bh2 agopu]$ qdel job_id 
    Example:
    [ag@bh2 agopu]$ qdel 281957.aviss.avidd.iu.edu
  • To check when your job might start (approximately), try the showstart command with your job id.

    [ag@bh2 agopu]$ showstart job_id 
  • To check why your job is not starting, try using the checkjob command with your job id.

    [ag@bh2 agopu]$ checkjob job_id 
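
If you want to use the job state in a script, you can pick it out of the qstat listing. The sketch below assumes the default six-column layout shown above (job id, name, user, time used, state, queue) and a job name without spaces; the function name job_state is hypothetical.

```shell
# Read default "qstat" output on stdin and print the state column (S)
# for one job. Assumes the six-column layout shown in the example above.
job_state() {
  local job_seq="${1%%.*}"   # sequence number part of the job id
  # Match lines whose first field starts with "<sequence number>."
  awk -v id="$job_seq" 'index($1, id ".") == 1 { print $5 }'
}

# Usage on a head node:  qstat | job_state 281988.aviss.avidd.iu.edu
```

The header and separator lines never start with a job number, so they are skipped without special handling.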



Softenv to setup your working environment


  • The AVIDD cluster uses SoftEnv, an environment management system, to permit users to customize their environment through the use of symbolic keywords.
  • Default settings in ~/.soft file:

    At your first login, a .soft file (under your home directory) will be created for you with a set of defaults. These defaults are defined by the @avidd line and can be modified.

  • Temporary changes to environment:

    To temporarily add a keyword (software) to your environment, type:

    $ soft add +keyword
    To temporarily remove a keyword (software) from your environment, type:
    $ soft delete +keyword
    For example, if you want to use gcc-4.0 instead of the default gcc (version 3.2.3) and also if you do not want Adobe Acrobat in your environment, then you could do:
    [hpctrn01@bh2 hpctrn01]$ soft add +gcc-4.0
    [hpctrn01@bh2 hpctrn01]$ soft delete +acrobat
    <<---- Try doing this!

    To revert to the default settings, as defined by your ~/.soft file, do:

    [hpctrn01@bh2 hpctrn01]$ resoft
  • Permanent changes to environment:

    If you want to make permanent changes to your environment, edit your ~/.soft file. For example, if you never plan to use Adobe Acrobat and always expect to use Clustalw, then your .soft file would look something like this:

    @remove +acrobat
    @avidd
    +clustalw

    Note: @remove lines should be entered before the default @avidd line. You will need to do resoft or log out and log back in for the changes to take effect.

  • What software is available?

    To see all the keywords that can be used on the system (i.e. software that is installed), use the softenv command. Keywords are listed with a preceding '+' while macros (pre-defined lists of keywords) are listed with a preceding '@'.
    [hpctrn01@bh2 hpctrn01]$ softenv
    <<---- Try doing this!

    SoftEnv version 1.4.2
          . . .
    These are the macros available:
        @avidd                         Default Environment for AVIDD users
          . . .
    These are the keywords explicitly available:
        +acrobat                       Adobe Acrobat Reader 7.0
        +gcc-4.0                       gcc-4.0-20050402
          . . .
  • Don't want to use Softenv?

    In the event that you would prefer not to use softenv, place an empty .nosoft file in your home directory.

  • More information:

    To learn more about softenv, type:

    $ man softenv-intro



AVIDD's File Storage Options


  • Referring to the cluster layout of AVIDD, you should be able to see that home directories are stored on a low-performance NFS-mounted file system, and that there is also a GPFS file system (work bench).

    You have a few options for where to store each kind of data.

  • Where do I store data?

    • Datasets on GPFS: A good (blind) rule of thumb is "ALL data (sets) lives on GPFS", i.e. on /N/gpfs/some_directory/.

      AVIDD GPFS Technical Information

      • GPFS is a large file system with 1.6 TB of space in each of the clusters (B'town and Indy).
      • It is accessible from all compute nodes as well as the head node.
      • It uses RAID 5 (see system layout); data transfer is striped across 4 storage nodes. GPFS performance has been benchmarked at 64MB/s for writes and 170MB/s for reads, using 32 processes, MPI-IO, and a 32GB file!
      • Is the data backed up? NO. (See "Things to Remember" below for more information.)
    • Home directory only for programs (source and executable): Store stuff like source code, batch scripts (PBS related or otherwise), executables and possibly documents in your home directory /N/u/${USER} . In other words, do not use your home directory to read in/write out datasets. It's a low-performance NFS system not designed to take that kind of load.

      • Is the data backed up? YES.
      • Note: Always use ~/ to reference your home directory unless use of absolute path names (i.e. /N/u/username) is necessary.
        Why? Because your home directory might be moved from /N/u/username to a different location, but ~/ will always point to your current home directory.

    • Local scratch: Apart from GPFS and your NFS-mounted home directory, you can also use local scratch on each compute node if your program does not require processes to share data across nodes. Local scratch is available as /scr on all compute nodes and is usually smallish in size (currently 10 GB).

      • Is the data backed up? NO. (See "Things to Remember" below for more information.)
      • Also, note that data on /scr is cleared out much more frequently than data on GPFS, anywhere from every 1-2 days to every 15 days; unless your program is actively using the data, there is a risk it will be cleared out.
    • For more information on your diskspace options, see the AVIDD Usage Policies webpage.

  • Things to remember

    • Create your own sub-directory within GPFS: You should create a directory for yourself on /N/gpfs if you have not done so yet. Under that directory, you can create other sub-directories for multiple datasets or different projects.

    • GPFS-B and GPFS-I are different: GPFS at Bloomington and Indianapolis are not the same. So, if you log in to avidd-b, then /N/gpfs will point to /N/gpfsb while on avidd-i, it will point to /N/gpfsi .

    • You need to back up your data: It is also important to remember that GPFS is not backed up and is also cleaned up regularly (every 60 days or so, but possibly much sooner, beware!). So any data that you deem important (or that you need in the long term) should be backed up onto the HPSS mass store system.

      Once again, for up-to-date information on your diskspace options, see the AVIDD Usage Policies webpage.

    • More information on getting a Mass Store account and using HPSS from AVIDD/SP is available on the Using HPSS from IU Research Systems webpage, linked from the AVIDD homepage. Gustav Meglicki also has information on using HPSS along with GPFS; he also has an IU MDSS and CFS Tutorial off his homepage.

    • Beware /tmp users: If you write parallel code, then remember you cannot use /tmp to store data that'll be used across nodes. /tmp is usually mounted off a local disk -- each compute node will have its own /tmp (just like /scr ).
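
The advice above (create your own sub-directory on GPFS and keep datasets there rather than in your home directory) can be sketched in bash. The function name make_project_dir and the DATA_ROOT variable are hypothetical, and the layout below the root (user name, then project name) is only an assumption; the default root /N/gpfs is the path given above.

```shell
# Create a per-user project directory under a data root and print its path.
# DATA_ROOT is a hypothetical override; it defaults to /N/gpfs as described
# above. The user/project layout below the root is an assumption.
make_project_dir() {
  local project="$1"
  local root="${DATA_ROOT:-/N/gpfs}"
  local dir="${root}/${USER:-$(id -un)}/${project}"
  mkdir -p "$dir" && echo "$dir"
}

# Usage inside a job script:
#   workdir=$(make_project_dir sine_runs)
#   cd "$workdir" && "$HOME"/simple_avidd_jobs/sine_c > sine.out
```

Remember that anything written there is not backed up, so copy results you need in the long term to HPSS.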