T3_US_UCD Documentation

T3_US_UCD is a Beowulf cluster owned by John Conway and used by researchers working on the CMS experiment at CERN. This page documents the hardware, software, and policies surrounding this resource. The announcement archives are available online.

Terminology

The terminology and technology supporting the CMS project can be hard to navigate. We maintain a terminology page that should provide some help.

Work Space

Three types of storage are available:

  1. User home space on shell.tier3.ucdavis.edu (NFS mounted to the worker nodes)
  2. 84 GB of scratch space local to each worker node at /scratch
  3. Hadoop storage: store your files in /mnt/hadoop/store/user/<username> (see the example below)
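
For example, to stage output in local scratch during a job and then copy the final file into your Hadoop area (a sketch: the file name is a placeholder, and it assumes your directory under /mnt/hadoop/store/user matches your Unix username):

$ mkdir -p /scratch/$USER                               # per-user directory on the worker node's local disk
$ cp bigjob_output.root /scratch/$USER/                 # placeholder file name
$ cp /scratch/$USER/bigjob_output.root /mnt/hadoop/store/user/$USER/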

Getting Started

First, set up the environment and get your grid proxy:

$ source /share/apps/osg-1.2-wn/setup.sh
$ grid-proxy-init
$ source /opt/glite-UI/etc/profile.d/grid-env.sh

Check that GRAM authentication and job submission work; the following test job should complete and print your mapped username.

$ globus-job-run cms.tier3.ucdavis.edu:2119/jobmanager-condor /usr/bin/whoami
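
To confirm the proxy was created and see how long it remains valid, use grid-proxy-info from the same Globus tools sourced above:

$ grid-proxy-info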

Set up your CMSSW environment:

$ source /share/apps/cmssoft/cmsset_default.sh
$ export SCRAM_ARCH=slc5_amd64_gcc462

To list available CMSSW versions:

$ scramv1 list
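
For example, to create a working area for one of the listed releases and activate its runtime environment (the release name below is only a placeholder; pick one from the scramv1 list output):

$ scramv1 project CMSSW CMSSW_5_3_11        # placeholder release name
$ cd CMSSW_5_3_11/src
$ eval `scramv1 runtime -sh`                # equivalent to the cmsenv alias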

DAS
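
DAS (the CMS Data Aggregation Service on cmsweb) can be used to look up datasets and their locations. A minimal sketch of a command-line query, assuming the das_client.py script distributed with CMSSW is in your path (the dataset pattern is only an example):

$ das_client.py --query="dataset=/SingleMu/Run2012A*/AOD" --limit=10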

Monitoring the Status of UCD Tier3

Ganglia

You can monitor the cluster using the Ganglia web interface.

RSV

There are also RSV probes that run periodically and test the various pieces of the OSG infrastructure running here at UC Davis. The probes run on the primary node (cms.tier3) and their results are reported to the OSG servers in Indiana. You can see a local status page as well as the status page in MyOSG.
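
A sketch of checking the probes by hand, assuming the standard OSG rsv-control utility is installed on cms.tier3 (exact options may differ between RSV versions):

$ rsv-control --list                    # list the configured RSV metrics and whether they are enabled
$ rsv-control --run --all-enabled       # trigger a one-off run of all enabled probes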

Useful commands
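
A few commands that come up frequently (a sketch: the Condor and grid tools below are standard, but their output depends on the local configuration):

$ condor_q                     # your queued and running Condor jobs
$ condor_status                # state of the worker-node slots
$ grid-proxy-info              # remaining lifetime of your grid proxy
$ df -h /mnt/hadoop            # free space in the Hadoop storage (FUSE mount)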

GIP

The Generic Information Provider (GIP) publishes information about which resources are installed here and how they can be reached. This information is configured on the primary node (cms.tier3) and is periodically sent to OSG and then on to the WLCG. If this process stops or the connection breaks, the WLCG will not know where our resources are and CRAB jobs will begin to fail. The GIP status can be checked on the MyOSG GIP status page for UCD.

PhEDEx

PhEDEx is a collection of Perl agents that transfer data from other sites. It runs on agent.tier3. You can see the status of any pending downloads for the Prod, Dev, and Debug instances. Typically only load tests should be active in the Debug instance. You can also see historical information on the cmsweb PhEDEx site.
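
To quickly confirm that the transfer agents are alive on agent.tier3, a simple process check is usually enough (a sketch; the exact process names depend on the local PhEDEx installation):

$ ssh agent.tier3 'ps -ef | grep -i [p]hedex'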

FronTier/Squid

Squid provides a caching layer for some non-local data. You can see the Squid activity here. We are “T3_US_UCD”.
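
To check that Squid is up and serving requests, the cache manager can be queried with the squidclient utility that ships with Squid (a sketch: run it on or near the Squid host, since cache manager access is normally restricted by ACLs, and 3128 is assumed to be the listening port):

$ squidclient -p 3128 mgr:info | head -20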

Gratia Accounting

Gratia provides information on which users and VOs are using your resources. This can be useful in determining why a resource is so busy. You can get useful information from the Gratia page for UCD in MyOSG.

Policies

The policies surrounding using this resource are outlined on this page.

Software

Any software available in Scientific Linux CERN 5 is either already installed or available for installation on this cluster, and this covers the majority of the installed software. Custom packages, including compilers, MPI layers, open-source packages, and commercial packages, are installed in /share/apps and are available to all nodes in the cluster.

Condor, Hadoop, Globus, CRAB
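
For example, to see what custom software is shared from /share/apps or to check whether a standard SLC5 package is present (the package names are only examples):

$ ls /share/apps               # custom packages shared across the cluster
$ rpm -q gcc                   # check whether a particular SLC5 RPM is installed
$ yum list available gsl\*     # search the SLC5 repositories (installs require root)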

OS, Provisioning, Configuration Management

This cluster currently runs Scientific Linux CERN 5 (SLC5). SLC5 is maintained by CERN and is based on Scientific Linux, which is in turn based on Red Hat Enterprise Linux. For provisioning we use Cobbler, and for configuration management we use Puppet.
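
These are administrative tools and only matter on the management nodes; a sketch assuming standard Cobbler and Puppet installations, run as root:

# cobbler list                 # show the distros, profiles, and systems Cobbler manages
# puppet agent --test          # on a node: run Puppet once in the foreground against the master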

Hardware

The hardware for this cluster is made up of the following:

  • 40 quad-core, dual-socket AMD compute nodes with 8 GB of RAM and 2x2 TB of disk
  • 4 12-core, dual-socket AMD Opteron 6344 compute nodes with 64 GB of RAM and 3x4 TB of disk
  • 4 storage nodes
  • 1 KVM console
  • 2 APC 48U racks
  • 2 APC UPSes

The Storage Element

SE
