Job submission using Torque

Job submission and queuing on the hydra cluster are handled by the Torque resource manager and the Maui scheduler. Before submitting, MPI users may need to select an MPI compiler, while others can pick their favorite compiler from the list of available software. Compile your code on the master node.
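For example, assuming an MPI compiler wrapper such as mpicc is available on the master node and your source file is called myprog.c (both are placeholders, not site-specific names), a typical compilation might look like:

mpicc -O2 -o myprog myprog.c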
The most efficient way to utilize the cluster is to submit your job through the scheduler. This is essentially a two-step process:

  1. Prepare a job script - the PBS command file
  2. Submit it to the scheduler using the command "qsub"

Preparing a job script

A job script may consist of PBS directives and executable statements. A PBS directive specifies a job attribute; inside a job script file, directives appear as lines that begin with "#PBS". Many of these directives can also be passed to qsub on the command line. To start, you can try a script along the following lines:
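The script below is a minimal sketch: the job name (myjob), the resource request, and the executable (./a.out) are placeholders to replace with your own, and the exact resource syntax may vary with your installation.

#!/bin/bash
#PBS -N myjob
#PBS -l nodes=1,ncpus=4
#PBS -l walltime=04:00:00
#PBS -q batch

# Run from the directory in which qsub was issued
cd $PBS_O_WORKDIR
# Replace with your own executable
./a.out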

IMPORTANT: The default time limit (wall time) for any job is one hour, which
means your job will be killed if it does not complete within an hour. Please
specify a wall time limit in your PBS script if your job is expected to take longer.


Submitting the job through qsub

Once you have the script ready (along the lines of the example given above), you should submit it through the scheduler. The basic syntax is:

qsub jobscript.sh

where jobscript.sh is the file containing the shell commands and PBS directives to be executed.
You can create this file using the job script generation tool. You may also specify PBS directives as command-line options to qsub; command-line arguments override the directives specified in the script. Batch output (your job's stdout and stderr) is returned to the directory from which you issued the qsub command when your job finishes.
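For example, assuming jobscript.sh is the script above, the following would override its wall time request with a four-hour limit at submission time (the value is purely illustrative):

qsub -l walltime=04:00:00 jobscript.sh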

Interactive Access

Interactive access is done via the qsub -I command. This may be useful for debugging or test runs.
An example is given below:

qsub -I -l nodes=N,ncpus=n,walltime=hh:mm:ss jobscript.sh

You should use this only for short runs, and ensure that processors are available on the cluster before you issue the command. The default wall clock time for jobs is one hour, and this includes interactive access. When you use qsub -I you hold your processors whether you compute or not, so hit ^D as soon as you are done with your commands to end your interactive job.
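For instance, the following (with illustrative values only) requests one node with four processors for two hours of interactive use:

qsub -I -l nodes=1,ncpus=4,walltime=02:00:00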

Monitoring and Killing Jobs

To see the status of the queues and the jobs from all users, use the Maui command "showq". For example:

[romanov@hydra ~]$ showq

ACTIVE JOBS--------------------
JOBNAME    USERNAME    STATE     PROC   REMAINING            STARTTIME

428        smirnoff    Running      1    02:15:30  Thu Oct 13 15:13:39

     1 Active Job       1 of 456 Processors Active (0.22%)
                        1 of  19 Nodes Active      (5.26%)

IDLE JOBS----------------------
JOBNAME    USERNAME    STATE     PROC   WCLIMIT              QUEUETIME

0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME    USERNAME    STATE     PROC   WCLIMIT              QUEUETIME

Total Jobs: 1   Active Jobs: 1   Idle Jobs: 0   Blocked Jobs: 0

[romanov@hydra ~]$

To check the status of your job, you can use the command qstat:

[smirnoff@hydra ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
428.hydra                 submit.sh        smirnoff        03:30:00 R batch

[smirnoff@hydra ~]$

Most of the information is self-explanatory except for the column "S", which stands for Status. In the example given above, Status=R, which means the job is running; Status=C indicates a completed job. The last field is the name of the queue. The default queue is called batch.

To view more information about your running job, use the Maui command checkjob:

checkjob -v job-number
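For example, to inspect the running job shown in the showq output above:

checkjob -v 428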

To delete a job, use the command qdel:

qdel job-number
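For example, to delete that same job:

qdel 428
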
Last updated on: February 20, 2024