
T3_US_UCD Documentation

T3_US_UCD is a Beowulf cluster owned by John Conway and used by CERN researchers on the CMS project. This page documents the hardware, software, and policies surrounding this resource. The announcement archives are available online.

Terminology

The terminology and technology supporting the CMS project can be hard to navigate, so we maintain a terminology page that should provide some help.

Fundamentals

Three types of storage:

  1. User home space on shell (NFS mounted to the worker nodes)
  2. 84 GB of scratch space on the worker nodes at /scratch
  3. Hadoop: store your files in /mnt/hadoop/cms/store/user/YourName (see the example below)
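
For example, to copy a job's output into your Hadoop area (a minimal sketch; jdoe and myoutput.root are placeholders, and this assumes the FUSE mount at /mnt/hadoop is writable from the node you are on):

$ mkdir -p /mnt/hadoop/cms/store/user/jdoe
$ cp /scratch/myoutput.root /mnt/hadoop/cms/store/user/jdoe/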

First, set up the environment and get your grid proxy:

$ source /share/apps/osg-1.2-wn/setup.sh
$ grid-proxy-init
$ source /opt/glite-UI/etc/profile.d/grid-env.sh
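
You can confirm that a proxy was created, and check its remaining lifetime, with grid-proxy-info:

$ grid-proxy-info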

Check your setup: the following command should succeed (confirming GRAM authentication works) and print your mapped username.

$ globus-job-run cms.tier3.ucdavis.edu:2119/jobmanager-condor /usr/bin/whoami
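
As a further sanity check, running hostname the same way should print the name of a worker node rather than the head node (the exact node names will vary):

$ globus-job-run cms.tier3.ucdavis.edu:2119/jobmanager-condor /bin/hostname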

Set up your CMSSW environment:

$ source /share/apps/cmssoft/cmsset_default.sh
$ export SCRAM_ARCH=slc5_amd64_gcc462

To list available CMSSW versions:

$ scramv1 list
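
Once you have chosen a release, create a project area and enter its runtime environment (CMSSW_5_3_2 below is only an example; pick a version from the list that matches your SCRAM_ARCH):

$ scramv1 project CMSSW CMSSW_5_3_2
$ cd CMSSW_5_3_2/src
$ eval `scramv1 runtime -sh`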

Monitoring

Ganglia

You can also monitor the cluster using the Ganglia interface, which lets you view just the nodes associated with your jobs: click the “Job Queue” link and look for your job id. We collect a lot of data, so you can look at things like memory usage, load average, disk activity, and network activity.
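
From a shell on the cluster you can also inspect the queue and the nodes directly with Condor's standard tools (assuming they are in your PATH on the interactive node):

$ condor_q
$ condor_status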

RSV

RSV probes also run periodically to test the various pieces of the OSG infrastructure here at UC Davis. These probes run on the primary node (cms.tier3) and are reported to the OSG servers in Indiana. You can see a local status page as well as the status page in MyOSG.
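
If you have a shell on the CE, you can list the configured probes with the rsv-control utility (assuming the RSV client tools are installed there):

$ rsv-control --list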

GIP

The Generic Information Provider (GIP) publishes information about what resources are installed and how they can be reached. This information is configured on the primary node (cms.tier3) and is periodically sent to OSG, then on to the WLCG. If this process stops or the connection breaks, the WLCG will not know where our resources are and CRAB jobs will begin to fail. The GIP status can be checked on the MyOSG GIP status page for UCD.

PhEDEx

PhEDEx is a collection of Perl scripts that transfer data to and from other sites. It runs on agent.tier3. You can see the status of any pending downloads for the Prod, Dev, and Debug instances; typically only load tests should be active in the Debug instance. You can also see historical information on the cmsweb PhEDEx site.

FroNTier/Squid

Squid provides a caching layer for non-local data, such as the conditions data served through FroNTier. You can see the Squid activity here. We are “T3_US_UCD”.

Gratia Accounting

Gratia provides accounting information on which users and VOs are using the cluster's resources, which can be useful in determining why a resource is busy. You can get useful information from the Gratia page for UCD in MyOSG.

Policies

The policies governing use of this resource are outlined on this page.

Software

Any software that is available in Scientific Linux CERN 5 is either already installed on this cluster or available for installation; this accounts for the majority of the installed software. Custom packages, including compilers, MPI layers, open-source packages, and commercial packages, are installed in /share/apps and are available on all nodes in the cluster.
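
For example, to browse the custom packages available cluster-wide (the contents will vary over time):

$ ls /share/apps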

OS, Provisioning, Configuration Management

This cluster currently runs Scientific Linux CERN 5 (SLC5), which CERN maintains. SLC5 is based on Scientific Linux, which in turn is based on Red Hat Enterprise Linux. For provisioning we use Cobbler, and for configuration management we use Puppet.

Hardware

The hardware for this cluster is made up of the following:
