Note: this guide is a work in progress.
Farm is accessible using SSH in a terminal emulator.
An SSH key is required to log in. SSH keys are generated as a matched pair of a private key and a public key. Keep your private key safe and use a strong, memorable passphrase.
We support one key per user. If you need to access the cluster from multiple computers, such as a desktop and a laptop, copy your private key to each computer. Directions are on the ssh key page.
Note that if you forget your passphrase or lose your private key, we cannot reset it. You'll need to generate a new key pair, following the same directions as when you first created it.
Visit the ssh key page for much more information on generating and using an ssh key on a PC, Mac, or Linux computer.
You will need terminal emulator software to log into Farm and run jobs. The software you choose must be able to use ssh keys to connect. This information is typically available in the documentation for the software. Common software choices include:
MobaXterm is an all-in-one terminal emulator for Windows that gives a very Linux-like terminal environment along with the ability to edit files in a local editor and have changes automatically uploaded back to Farm.
Windows Subsystem for Linux can provide a Linux terminal within Windows 10. Once you have it installed, you can follow the Linux-based directions to generate a key pair and use ssh at the command-line to connect.
Once you have an SSH key and your account has been created, you can connect to Farm. In most text-based terminal emulators (Linux and macOS), this is how you will connect:
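A minimal sketch of such a login command, assuming the OpenSSH ssh client. The login string username@farm-hostname and the key path are placeholders rather than values from this guide, so substitute the ones from your account creation email and your own key file.

```shell
# Placeholders (assumptions): replace both with your own values.
FARM_LOGIN="username@farm-hostname"   # your Farm username and the cluster address
FARM_KEY="$HOME/.ssh/id_rsa"          # path to your private key

# -i selects the private key; -p 22 is the login node port.
connect_cmd="ssh -i $FARM_KEY -p 22 $FARM_LOGIN"

# Print the command for review; run it yourself once the values are right.
echo "$connect_cmd"
```

ssh will ask for the passphrase of your private key, not for an account password.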
Farm has a dedicated node for file transfers. When transferring large amounts of data to or from the cluster, you can specify port 2022 with your transfer software to connect to the transfer node. For all other work and to submit jobs to the cluster, connect to the login node on port 22.
The Farm transfer node is being phased out. You can transfer through the login node on port 22 for now.
Farm uses SSH key pairs ONLY, so you need to point any local scp/sftp client at the same private key that you use to SSH to Farm.
FileZilla is a multi-platform client commonly used to transfer data to and from the cluster.
Cyberduck is another popular file transfer client for Mac or Windows computers that has the ability to edit files in a local editor and have changes automatically uploaded back to Farm.
WinSCP is Windows-only file transfer software.
Globus is another common solution, especially for larger transfers.
rsync and scp are command-line tools to transfer data to and from the cluster.
These commands should be run on your computer, not on Farm.
To transfer something to Farm from your local computer:
scp -r local-directory firstname.lastname@example.org:~/destination/
Note: outbound scp initiated from Farm is disabled. Please initiate an inbound scp using the above method.
To transfer something from Farm to your local computer:
scp -r email@example.com:~/farmdata local-directory
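Because Farm accepts only SSH key pairs, scp may need to be told which private key to use. The sketch below wraps scp so that every transfer uses the same key and the login node port. The helper name farm_scp, the PREVIEW switch, the key path, and the login string are all assumptions made for illustration.

```shell
# Placeholders (assumptions, not values from this guide).
FARM_KEY="$HOME/.ssh/id_rsa"
FARM_LOGIN="username@farm-hostname"

# farm_scp ARGS...: scp with the Farm key (-i) and login node port (-P 22) filled in.
farm_scp() {
    # PREVIEW is a hypothetical switch for this sketch: when it is set,
    # the command is printed instead of run.
    ${PREVIEW:+echo} scp -i "$FARM_KEY" -P 22 -r "$@"
}

# Preview an upload and a download without touching the network.
PREVIEW=1 farm_scp local-directory "${FARM_LOGIN}:~/destination/"
PREVIEW=1 farm_scp "${FARM_LOGIN}:~/farmdata" local-directory
```

Leave PREVIEW unset to run the transfer for real. Note that scp takes the port as -P (uppercase), unlike ssh, which uses -p.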
To use rsync to transfer a file or directory from Farm to your local computer:
rsync -aP -e ssh firstname.lastname@example.org:~/farmdata .
rsync has the advantage that if the connection is interrupted for any reason you can just up-arrow and run the exact same command again and it will resume where it stopped.
See man scp and man rsync for more information.
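Since an interrupted rsync can simply be run again, the re-running can be automated. This is a minimal sketch of a retry loop; the helper name retry and the RETRY_DELAY variable are hypothetical, and the login string in the usage comment is a placeholder.

```shell
# retry ATTEMPTS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or ATTEMPTS tries have been used up.
retry() {
    local attempts="$1" n=1
    shift
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1                    # give up after the last attempt
        fi
        n=$((n + 1))
        sleep "${RETRY_DELAY:-5}"       # pause before trying again
    done
}

# Usage on your own computer (placeholder login string):
#   retry 5 rsync -aP -e ssh local-directory username@farm-hostname:~/destination/
```

Each attempt opens a new SSH connection, so with a passphrase-protected key you may be prompted again unless the key is loaded in ssh-agent.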
Farm has many software packages available for a wide range of needs. Most installed packages are available as environment modules, which you can list with the module avail command. Use module load <module/version> to load a module, and module unload <module/version> when done.
Generally, use as few modules as possible at a time. Once you're done using a particular piece of software, unload the module before you load another one, to avoid incompatibilities.
Many of the most up-to-date Python-based software packages may be found under the bio3 module. Load the module with module load bio3 and run conda list to see a complete and up-to-date list.
Many additional Python 2 packages may be found under the bio module. Note that the bio and bio3 modules are mutually incompatible with one another, so do not load both at the same time.
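The load, use, unload pattern described above can be sketched as a small shell function. The name with_module is a hypothetical helper, not something installed on Farm; it relies only on the module load and module unload commands mentioned in this guide.

```shell
# with_module MODULE COMMAND [ARGS...]
# Load one module, run a command with it, then unload the module again.
with_module() {
    local mod="$1" rc=0
    shift
    if ! type module >/dev/null 2>&1; then
        echo "module command not found; run this on Farm" >&2
        return 127
    fi
    module load "$mod" || return 1
    "$@" || rc=$?            # remember the command's exit status
    module unload "$mod"     # always unload, even if the command failed
    return "$rc"
}

# Example on Farm: list the packages provided by bio3, then unload it.
#   with_module bio3 conda list
```

Unloading right after each use keeps incompatible modules such as bio and bio3 from being loaded at the same time.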
Visit the Environments page for much more information on getting started with software and the module command on the cluster.
If you can't find a piece of software on the cluster, you can request an installation for cluster-wide use. Contact the helpdesk with the name of the cluster, your username, the name of the software, and a link to the software's website, documentation, or installation directions, if applicable.
Disk I/O (input/output) happens when reading from or writing to a file on disk. Please avoid heavy I/O in your home directory, as this degrades file server performance for everyone. If you know that your software is I/O intensive (for example, it rapidly reads from or writes to many files, or performs many small reads and writes), consider copying your data out of your home directory and onto the compute node as part of your batch job. Otherwise the network file system (NFS) can bottleneck, slowing down both your job and everyone else's.
To prevent NFS bottlenecking, Farm supports the use of the /scratch/ directory on the compute nodes when you have I/O-intensive code that needs temporary file space. Each compute node has its own independent scratch directory of about 1TB.
Please create a unique directory for each job when you use scratch space, such as /scratch/your-username/job-id/, to avoid collisions with other users or yourself. For example, in your sbatch script, you can use /scratch/$USER/$SLURM_JOBID/$SLURM_ARRAY_TASK_ID (for array jobs).
When your job is finished, copy out any results/output that you wrote to your /scratch subdirectory (if any) and remove ALL of your files from it.
/scratch/ is a shared space between everyone who runs jobs on a node, and is a limited resource. It is your responsibility to clean up your scratch space when your job is done or the space will fill up and be unusable by anyone.
/scratch/ is local to each node, and is not shared between nodes and the login node, so you will need to perform setup and cleanup tasks at the start and end of every job run. If you do not clean up at the end of every run, you will leave remnants behind that will eventually fill the shared space.
The /scratch/ directory is subject to frequent purges, so do not attempt to store anything there longer than it takes your job to run.
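The setup and cleanup steps described above can be sketched as a shell function for use inside an sbatch script. The name run_in_scratch and its arguments are hypothetical; the directory layout follows the /scratch/$USER/$SLURM_JOBID/$SLURM_ARRAY_TASK_ID pattern suggested in this guide.

```shell
# run_in_scratch SCRATCH_ROOT RESULTS_DIR COMMAND [ARGS...]
# Create a per-job scratch directory, run COMMAND inside it, copy everything
# it wrote to RESULTS_DIR, then delete the scratch directory.
run_in_scratch() {
    local scratch_root="$1" results_dir="$2" rc=0
    shift 2
    # Slurm sets SLURM_JOBID inside a job; the shell PID is a stand-in elsewhere.
    local job_root="${scratch_root}/${USER:-unknown}/${SLURM_JOBID:-$$}"
    local work_dir="$job_root"
    if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
        work_dir="${job_root}/${SLURM_ARRAY_TASK_ID}"   # one directory per array task
    fi

    # Setup: create the scratch and results directories.
    mkdir -p "$work_dir" "$results_dir" || return 1

    # Run the command with the scratch directory as its working directory.
    ( cd "$work_dir" && "$@" ) || rc=$?

    # Cleanup: copy results out, then remove ALL files from scratch.
    cp -R "$work_dir"/. "$results_dir"/ || rc=$?
    rm -rf "$work_dir"
    rmdir "$job_root" 2>/dev/null || true   # array jobs: drop the parent once empty
    return "$rc"
}

# Inside an sbatch script (the program path and file names are hypothetical):
#   run_in_scratch /scratch "$HOME/results/$SLURM_JOBID" "$HOME/bin/my_program" "$HOME/input.dat"
```

This sketch does not handle a job that is killed mid-run, for example when it is preempted, so files can still be left behind in that case.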
If you would like to purchase additional scratch space for yourself or your lab group, contact the helpdesk for more information.
Job scheduling with SLURM is a key feature of computing on the cluster.
A job in the context of the cluster is a running piece of software performing some kind of function, such as computation, analysis, simulation, modeling, comparing, sorting, and other research-related tasks.
The job scheduler or batch queue system allows for the fair provisioning of limited resources (nodes, CPUs, memory, and time) on a shared system.
Farm uses the SLURM job scheduler to manage user jobs, passing the work to the compute nodes for execution, primarily through the use of srun commands. Jobs are placed in a queue and executed according to a priority system.
Do not skip the batch queue by running your compute tasks directly on the head/login node.
Running jobs on the login node degrades performance for all users and can damage the cluster. Jobs found running outside of the job queue will be terminated and your account may be temporarily suspended until you contact the helpdesk, so that the admins can work with you to help you run your job most effectively without damaging the cluster.
The batch queue is divided into job priority queues called partitions. Access to a particular partition is determined by your college, department, lab, or sponsor's contribution to the cluster by buying nodes. You will be informed what partitions you have access to when you receive your account creation email.
Farm's primary partitions include:
When you purchase a node, it will typically be added to the pool of nodes in the latest generation of Farm (Farm III, as of 2019) unless special arrangements are made.
Low priority - jobs in this queue may be killed at any time when superseded by jobs in the medium or high partitions, with the possibility of being restarted at a later time when there are resources available again. The low queue is useful for soaking up unused cycles with short jobs, and is a particularly good fit for large array jobs with short run times. Low priority jobs can use more resources than your group paid for, if there are no other higher-priority jobs.
Medium priority - jobs in this queue may be temporarily suspended when superseded by jobs in the high partition, but will resume when the higher-priority job finishes. Medium jobs can also use more resources than your group paid for, if there are no higher-priority jobs. It is NOT recommended to run MPI jobs in medium.
High priority - jobs in this queue will kill/suspend lower-priority jobs. A job in high will keep its allocated hardware until it is done (or there is a system or power failure). Jobs in the high partition are limited to using the number of CPUs that your group contributed to the cluster. This partition is recommended for MPI jobs.
For more information about submitting your job to the batch queue with the sbatch and srun commands, visit our SLURM page.
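As a minimal sketch of a batch submission, the commands below write a small job script and hand it to sbatch. The partition name low and all resource values are illustrative assumptions; use the partitions named in your account creation email.

```shell
# Write a minimal batch script. The #SBATCH lines are read by sbatch as options.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=low
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --output=example-%j.out

# The real work goes here; it runs on a compute node, not on the login node.
echo "Running on $(uname -n)"
EOF

# Submit the script to the queue (only possible where Slurm is installed).
if command -v sbatch >/dev/null 2>&1; then
    sbatch myjob.sh
else
    echo "sbatch not found; submit from the Farm login node instead"
fi
```

In the --output file name, %j is replaced by the job ID, so each run writes its own log file.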
For additional help with logging in, software or job-related problems, to request the installation of a software package for cluster-wide use, or other issues not listed here, contact the helpdesk to open a trouble ticket.
When contacting help for job-related issues, please ALWAYS include the complete prompt and command that you tried including the cluster, directory, username, command, arguments, job number, and any output/results that you received so that we can quickly begin troubleshooting your issue. For example:
user@cluster:~$ sbatch myjob.sh
Submitted batch job 12345678
For software requests or other issues where a command prompt is not applicable, please include your cluster username and the name of the cluster in your message.