Peloton Documentation

Peloton is a research and teaching cluster for the College of Letters and Science. This page documents the hardware, software, and policies surrounding this resource. The announcement archives are available online.

Announcements

Announcement notifications are sent to an internally maintained mailing list. If you are a user of this cluster, you will be added to the mailing list automatically.

Access Policy

All researchers in the College of Letters and Science are entitled to free access to the cluster. Their share of available resources depends on their sponsor. Those with only an MPS affiliation share 4 compute nodes (128 CPUs) plus a share of any other free resources.

Those who contribute get immediate (within 1 minute) access to the resources they contribute and a larger share of any free resources. The minimum contribution is approximately $16,000.

Default storage is 1TB; extra storage can be purchased in 22TB chunks for approximately $3,000 each. These 22TB chunks do NOT include backups.

Monitoring

Operating System

The Peloton cluster runs Ubuntu 18.04 and uses the Slurm batch queue manager. System configuration and management are handled via Cobbler and Puppet. (Last updated: 03/2019)

Software

Requests for any centrally installed software should go to help@cse.ucdavis.edu. Any software available in Ubuntu is either already installed on this cluster or can be installed on request. In many cases we compile and install our own software packages. These custom packages include compilers, MPI layers, open-source packages, commercial packages, HDF, NetCDF, WRF, and others. We use Environment Modules to manage the environment. A quick intro:

  • To get a list of available applications and libraries - module avail
  • To set up your command-line or script-based environment - module load <directory/application>

Documentation on some of the custom-installed software is at HPC Software Documentation. An (outdated) list is at Custom Software. It is best to use the “module avail” command for the current list of installed software.
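
For example, a typical session that loads a compiler and an MPI layer might look like the sketch below. The module names shown are illustrative only; run “module avail” to see what is actually installed on Peloton.

  # List everything that is currently installed
  module avail

  # Load a compiler and an MPI layer (example names; the versions on Peloton may differ)
  module load gcc
  module load openmpi

  # Show what is loaded in the current shell
  module list

  # Unload a single module, or clear the environment entirely
  module unload openmpi
  module purge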

Cluster Hardware

Rack/Networking layout:

Peloton-I Hardware:

  • Four 36-port FDR (56Gbps) InfiniBand switches
  • 54 nodes with 64GB RAM, 16 cores/32 threads (Xeon E5-2630 v3)
  • 12 nodes with 128GB RAM, 16 cores/32 threads (Xeon E5-2630 v3) with 8 GeForce GTX 980 Ti GPUs
  • 2 file servers with 36x8TB disks
  • 1 head node with 256GB RAM connected to the campus network over 10G

Peloton-II Hardware:

  • One 36-port EDR (100Gbps) InfiniBand switch
  • 32 nodes with 256GB RAM, 32 cores/64 threads (AMD EPYC 7351)
  • 1 file server with 36x10TB disks

Batch Partitions

Low priority means that your job might be killed at any time. This partition is great for soaking up unused cycles with short jobs, and is a particularly good fit for large array jobs with short run times.
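
As a sketch, a large array job submitted to the low-priority partition might look like the following. The partition name “low”, the limits, and the script contents are assumptions; check “sinfo” for the partition names actually configured on Peloton.

  #!/bin/bash
  #SBATCH --partition=low        # assumed partition name; verify with sinfo
  #SBATCH --array=1-500          # 500 independent short tasks
  #SBATCH --time=00:30:00        # short run times suit a preemptable partition
  #SBATCH --cpus-per-task=1

  # Each array task processes one input file (hypothetical naming scheme)
  ./process data/input_${SLURM_ARRAY_TASK_ID}.dat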

Medium priority means your job might be suspended, but it will resume when the high-priority job that preempted it finishes. *NOT* recommended for MPI jobs. Up to 100% of idle resources can be used.

High priority - your job will kill or suspend lower-priority jobs, and it will keep the allocated hardware until it is done or there is a system or power failure. Usage is limited to the number of CPUs your group contributed. Recommended for MPI jobs.
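
A minimal sketch of an MPI job on the high-priority partition, assuming the partition is named “high” and that an OpenMPI module is available (adjust to what “sinfo” and “module avail” actually report):

  #!/bin/bash
  #SBATCH --partition=high       # assumed partition name
  #SBATCH --nodes=2              # two whole nodes
  #SBATCH --ntasks-per-node=32   # adjust to the core/thread count of the nodes used
  #SBATCH --time=1-00:00:00      # one day

  module load openmpi            # module name is an example
  srun ./my_mpi_application      # srun launches the MPI ranks under Slurm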

GPU - If you contributed to the GPU nodes, you have access to the GPU partition to run CUDA jobs that take advantage of the GPUs.
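
A sketch of a GPU job, assuming the partition is named “gpu” and that the GPUs are exposed through Slurm's generic resource (GRES) mechanism; the exact partition and GRES names on Peloton may differ.

  #!/bin/bash
  #SBATCH --partition=gpu        # assumed partition name
  #SBATCH --gres=gpu:1           # request one GPU on the node
  #SBATCH --time=12:00:00

  module load cuda               # module name is an example
  ./my_cuda_program              # hypothetical CUDA binary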

High2 - For contributors; allows running on twice your contribution for a week.

Med2 - For contributors; allows running on additional resources, but jobs may be suspended by higher-priority jobs.
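
To see which partitions are actually configured and what state their nodes are in, the standard Slurm query commands apply (these are generic Slurm commands, not Peloton-specific tools):

  # List partitions, their time limits, and node states
  sinfo

  # Show your own queued and running jobs
  squeue -u $USER

  # Show the detailed limits of a specific partition (name is an example)
  scontrol show partition high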
