
Getting Started

This document describes how to get started using your shiny new computing account. If you don't yet have an account please see this FAQ.

Keeping Your Secrets Private

We use SSL/encryption wherever possible. Usually this means you have a private key/certificate that you use to access our computing resources. A private key is just that: private. Don't share it with anyone, make it readable by other users, send it over unencrypted email, post it to reddit, etc. We encourage our users to install their private key only on machines they sit at and trust. Don't use it at an internet cafe or on a machine that may have been compromised.

For more information on how to keep an ssh key safe please see this help document.
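As a minimal sketch of the "not readable by anyone" advice, the helper below tightens a key file to owner-only permissions. The function name lock_down_key and the path in the usage line are assumptions for illustration; substitute the key file you were actually issued.

```shell
# Restrict a private key file so that only its owner can read or write it.
lock_down_key() {
    local key_file="$1"
    if [ ! -f "$key_file" ]; then
        echo "no such key file: $key_file" >&2
        return 1
    fi
    chmod 600 "$key_file"
}

# Usage (the path is an assumption):
#   lock_down_key "$HOME/.ssh/id_rsa"
```

Mode 600 means read/write for the owner and no access for group or others, which is also what ssh expects of a private key file.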

Getting Help

The easiest and fastest way to get help is to Contact Us through our ticket tracker. If it is an emergency you can call us. For less urgent or less specific help you can search this wiki, where all of our documentation is stored.

Guidelines

We try to provide a useful research environment with minimal limitations. However:

  • Do not bog down the cluster by running long-running and/or CPU-intensive jobs on the head node.
  • qlogin allows interactive logins, but they are not guaranteed. If 100% of the CPUs are in use by batch jobs, interactive jobs won't start. We could reserve capacity for interactive logins, but so far reducing the resources available to batch jobs hasn't been justified.
  • NEVER ssh to a compute node that hasn't been allocated to you by the batch queue.
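One way to check the last rule before connecting to a node is to ask Slurm which nodes your running jobs currently hold. This is a sketch only: the function names my_allocated_nodes and node_is_mine are hypothetical, and it assumes the standard Slurm squeue and scontrol commands are available on the machine you are logged in to.

```shell
# Print one hostname per line for every node held by your running jobs.
my_allocated_nodes() {
    # -h: no header, -t RUNNING: running jobs only, -o "%N": node list only
    squeue -u "${USER:-$(id -un)}" -h -t RUNNING -o "%N" |
    while read -r nodelist; do
        if [ -n "$nodelist" ]; then
            # Expand compressed lists such as node[01-04] into single names
            scontrol show hostnames "$nodelist"
        fi
    done | sort -u
}

# Succeed only if the given hostname is in that list.
node_is_mine() {
    my_allocated_nodes | grep -qxF -- "$1"
}

# Usage (the hostname is a placeholder):
#   node_is_mine compute-0-1 && ssh compute-0-1
```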

Using a Batch Queue

All of our computing resources use a Batch Queue. There are many benefits to using a batch queue on a compute cluster. We currently use Slurm for batch queue management. We no longer support Sun Grid Engine or Condor.

The general idea with a batch queue is that you don't have to babysit your jobs. You submit a job, and it runs until it finishes or hits a problem. You can configure the queue to notify you via email when that happens. This allows very efficient use of the cluster. You can still babysit/debug your jobs if you wish by using an interactive session (i.e. qlogin).
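A submit script with email notification might look like the sketch below. The job name, output file name, and email address are placeholders, not site defaults; the #SBATCH options shown are standard sbatch options, but check our Slurm page for the settings this cluster expects.

```shell
#!/bin/bash
# Hypothetical job name
#SBATCH --job-name=example
# Write output to a per-job file; %j expands to the Slurm job ID
#SBATCH --output=example-%j.out
# Send email when the job ends or fails (address is a placeholder)
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=you@example.com

# SLURM_JOB_ID is set by Slurm inside a job; fall back to a label elsewhere.
job_label="job-${SLURM_JOB_ID:-interactive}"
echo "${job_label} started on $(uname -n)"

# ... your actual work goes here ...
```

If this were saved as example.sh (a hypothetical file name), it would be submitted with: sbatch example.sh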

Our main concern is that all jobs go through the batch queuing system. Do not bypass the batch queue. We don't lock anything down but that doesn't mean we can't or won't. If you need to retrieve files from a compute node feel free to ssh directly to it and get them, but don't impact other jobs that have gone through the queue.

Please read our Slurm page for more information about using the queue system.

Don't Hammer your Home

All of our clusters have a local disk on the compute nodes. If your job is I/O intensive please don't hammer your /home directory. Instead you can use the scratch space (either /scratch or /state/partition1).

You should also make sure each job has its own unique directory. If you run multiple jobs at the same time, you don't want them sharing the same scratch space. We recommend something like: /scratch/username/jobid
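The recommended layout can be sketched as a small helper for your submit script. The function name make_job_scratch and the SCRATCH_ROOT override are assumptions for illustration; SLURM_JOB_ID is the variable Slurm sets inside a running job.

```shell
# Build and create a per-job scratch directory: <root>/<user>/<jobid>.
# SCRATCH_ROOT is a hypothetical override; it defaults to /scratch.
# Outside of a Slurm job the shell PID ($$) stands in for the job ID,
# so the sketch also works in an interactive shell.
make_job_scratch() {
    local root="${SCRATCH_ROOT:-/scratch}"
    local user="${USER:-$(id -un)}"
    local job_id="${SLURM_JOB_ID:-$$}"
    local dir="${root}/${user}/${job_id}"
    mkdir -p "$dir" && echo "$dir"
}

# Usage inside a submit script:
#   JOB_SCRATCH="$(make_job_scratch)" && cd "$JOB_SCRATCH"
```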

Clean up After Yourselves

In your submit scripts (or interactive sessions) please remember to clean up any temporary files from the compute node scratch space. If everyone fails at this there will be no more scratch space.
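One way to make cleanup automatic is a shell trap that removes the scratch directory when the job body exits, whether it succeeded or not. This is a sketch under assumptions: the function name run_with_scratch and the JOB_SCRATCH variable are hypothetical, and the program path in the usage line is a placeholder.

```shell
# Run a command with a private scratch directory and always remove that
# directory afterwards. The function body is a subshell, so the EXIT trap
# fires when the command finishes, fails, or the subshell is terminated.
run_with_scratch() (
    scratch_dir="$1"
    shift
    mkdir -p "$scratch_dir" || exit 1
    trap 'rm -rf "$scratch_dir"' EXIT
    # The command can find its scratch space through this variable.
    export JOB_SCRATCH="$scratch_dir"
    "$@"
)

# Usage inside a submit script:
#   run_with_scratch "/scratch/$USER/$SLURM_JOB_ID" "$HOME/bin/my_program"
```

The exit status of the command is passed through, so Slurm still sees a failed job as failed.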

Compiling on the Frontend

Feel free to compile code on the frontend. You don't need to log into a compute node interactively to compile. Before configuring and compiling any source, make sure to module load the compilers and libraries it needs.

Software

We support a bunch of different open source and commercial software products. We might already have what you need installed. You can use the module command to load/unload/browse those packages. Here are a few common commands:

  • module avail (show available software modules)
  • module list (show modules currently loaded)
  • module load foo (load the foo module)
  • module unload foo (unload the foo module)
  • module purge (unload all modules)

Check out the modules man page (run: man module) for more info. If you want to compile and/or run a serial code, "module load gcc" is a good place to start. If you want to compile and/or run a parallel code, "module load gcc openmpi" is a good place to start.
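The commands above can be combined into a small helper that starts from a clean environment before loading what a build needs. The function name load_toolchain and the file names in the usage lines are hypothetical; mpicc is the compiler wrapper that Open MPI normally provides, which is assumed here to come with the openmpi module.

```shell
# Unload everything, load the requested modules, then show what is loaded.
# "module" is the command described above; it is normally provided as a
# shell function by the login environment.
load_toolchain() {
    module purge || return 1
    module load "$@" || return 1
    module list
}

# Usage for a parallel code (hello.c and the output name are placeholders):
#   load_toolchain gcc openmpi
#   mpicc -O2 -o hello hello.c
```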

For application-specific module commands, please see our documentation.

If you have any additional software packages that you feel would be of use to others on your cluster please let us know.

support/faq/getting_started.txt · Last modified: 2019/03/15 10:07 by tdthatch