From HEP at Tennessee
Using Torque Batch System
Steps to configure your workstation to submit batch jobs to the CMS cluster:
- Install the torque, torque-client, and torque-docs RPM packages from http://hep.phys.utk.edu/~gragghia/CMS/torque/ for the CPU architecture of your workstation.
- Set the Torque server name in the file /var/spool/torque/server_name to "cms254.phys.utk.edu".
- Ensure that your SSH keys in ~/.ssh/ are properly configured for automatic authentication between cluster nodes and your desktop:
    $ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/nfs/home/gragghia/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /nfs/home/gragghia/.ssh/id_rsa.
    Your public key has been saved in /nfs/home/gragghia/.ssh/id_rsa.pub.
    The key fingerprint is:
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx gragghia@babar1.phys.utk.edu
    $ cd ~/.ssh
    $ cat id_rsa.pub >> authorized_keys
- Ensure that the file ~/.ssh/known_hosts does not contain invalid or out-of-date entries for your workstation. You should be able to ssh from your workstation to a compute node and back again with no errors or prompts. The command "pbsnodes" will list available nodes for this test.
- Test a job submission using the "qsub" command ("qsub ./test.sh").
- Monitor the job with the "qstat" command.
- To run an interactive job, you can either use the "-I" option for qsub or you can ssh directly to a compute node. The command "pbsnodes" will list all available compute nodes.
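
The page does not show what goes in test.sh, so the following is only a minimal sketch of a job script that could be used for the test submission. The job name (test_job), the resource request, and the walltime are illustrative assumptions, not settings required by the CMS cluster. The script also runs by hand outside the batch system, which makes it easy to check before submitting.

```shell
#!/bin/sh
# Write a minimal job script named test.sh.
# Its contents are an illustrative assumption, not the cluster's official test job.
cat > test.sh <<'EOF'
#!/bin/sh
#PBS -N test_job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:05:00
#PBS -j oe

# Torque sets PBS_O_WORKDIR to the directory qsub was run from;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# PBS_JOBID is set by Torque inside a batch job only.
echo "Job ${PBS_JOBID:-not-in-batch} running on $(uname -n)"
EOF
chmod +x test.sh

# On a configured workstation the job could then be submitted and monitored:
#   qsub ./test.sh
#   qstat -u "$USER"
# An interactive session on a compute node could be requested with:
#   qsub -I
```

With these directives, the job's standard output and standard error should be merged into a single file, which Torque by default names after the job (for example test_job.o followed by the job number) and places in the directory from which qsub was run.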