This section describes the custom software that we support on many of our clusters. This software is usually installed in /share/apps,
but we ask that users access these libraries and binaries through the modules interface.
This list is not comprehensive and may vary based on the cluster that you use.
To view the most current list of the modules available while logged into a cluster, run the command:
module avail
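As a minimal sketch of the usual workflow after listing modules, the snippet below loads one module and confirms it is active. The name "gcc" is only a placeholder assumption, not a statement about what is installed; substitute a name that appears in the module avail output on your cluster.

```shell
# MODULE_NAME is a placeholder; pick a real name from the `module avail` output.
MODULE_NAME="gcc"

if ! command -v module >/dev/null 2>&1; then
    # The modules interface is normally set up by the cluster's login scripts.
    echo "module command not found; run this on a cluster login node" >&2
    MODULES_STATUS="unavailable"
elif module load "$MODULE_NAME"; then
    # Show everything currently loaded in this shell.
    module list
    MODULES_STATUS="loaded"
else
    echo "could not load $MODULE_NAME; check 'module avail'" >&2
    MODULES_STATUS="failed"
fi
```

Loading through the modules interface, rather than hard-coding paths under /share/apps, keeps scripts portable if the install location differs between clusters.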
This section describes the compilers that are available on many of the clusters we manage.
We support several GNU compilers.
We support multiple Python interpreter versions.
Note that the developers of Python ended support for Python 2.x on January 1, 2020, in favor of Python 3.x, so please plan accordingly.
We support multiple versions of the R interpreter for statistical computing.
This section lists the batch schedulers that we currently support.
We primarily support SLURM for batch scheduling and resource management.
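A batch job under SLURM is typically described by a short shell script with #SBATCH directives. The following is a sketch only: the job name, resource values, and output file name are illustrative assumptions, and some clusters may also require a partition or account option that is not shown here.

```shell
#!/bin/bash
# Hypothetical job name, used only for illustration.
#SBATCH --job-name=example
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit in HH:MM:SS.
#SBATCH --time=00:10:00
# %j expands to the job ID assigned by the scheduler.
#SBATCH --output=example-%j.out

# SLURM_JOB_ID is set by the scheduler; fall back to a marker when run by hand.
JOB_ID="${SLURM_JOB_ID:-not-running-under-slurm}"
echo "Job ${JOB_ID} running on $(hostname)"
```

Assuming the script is saved as example.sh (a hypothetical file name), it would be submitted with `sbatch example.sh` and monitored with `squeue -u $USER`.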
We support OSG for the USCMS project.
We support a couple of parallel environments. Open MPI is our default and preferred MPI layer.
We support Open MPI as our primary MPI layer.
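To illustrate how Open MPI is commonly used, the sketch below writes a small MPI test program, compiles it with the mpicc wrapper, and launches two ranks with mpirun. The file name hello_mpi.c is hypothetical, and the snippet assumes an Open MPI module has already been loaded so that mpicc and mpirun are on the PATH.

```shell
# Sketch only: hello_mpi.c is a hypothetical file created here for illustration.
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

if command -v mpicc >/dev/null 2>&1 && command -v mpirun >/dev/null 2>&1; then
    # mpicc is the C compiler wrapper shipped with Open MPI.
    mpicc -o hello_mpi hello_mpi.c
    # Launch two ranks; each prints its rank and the total count.
    mpirun -np 2 ./hello_mpi
else
    echo "mpicc/mpirun not found; load an Open MPI module first" >&2
fi
```

On a production cluster the mpirun line would normally go inside a batch script rather than being run on a login node.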
We support Linda for use with AMBER.
The following are general purpose libraries that are used in many other software packages.
We support ACML for the gfortran, PGI, and PathScale compilers.
LAPACK stands for Linear Algebra PACKage and is a library that solves common numerical linear algebra problems.
We support the latest stable ATLAS, which stands for Automatically Tuned Linear Algebra Software.
We support the latest stable Goto BLAS.
FFTW stands for Fast Fourier Transform in the West and is a common open source FFT implementation. We support the latest stable v3 as well as the older v2.
These are the tools and libraries installed that assist in generating a common data set format.
HDF stands for Hierarchical Data Format and is a file format for storing raster data. It was developed by the National Center for Supercomputing Applications (NCSA). We support both v4 and v5.
netCDF stands for Network Common Data Form and is an I/O library that stores and retrieves data in self-describing, machine-independent datasets. Versions 4.0 and 3.6.3 are both supported.
We support the latest PETSc library and toolset (mainly for Gale).
We support a few geodynamics applications.
CitcomS is a finite element code designed to solve thermo-chemical convection problems relevant to Earth's mantle.
Gale is a 2D/3D code for the long-term tectonics community. The code solves problems related to orogenesis, rifting, and subduction with coupling to surface erosion models.
ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads.
ALLPATHS-LG is a short read genome assembler from the Computational Research and Development group at the Broad Institute.
The BAM format is an efficient method for storing and sharing data from modern, highly parallel sequencers. While primarily used for storing alignment information, BAMs can (and frequently do) store unaligned reads as well.
BioPerl is a collection of Perl libraries that ease common bioinformatics tasks and integrate with common file formats.
BLAST+ (Basic Local Alignment Search Tool) is the most frequently used sequence homology search tool.
BLAST (Basic Local Alignment Search Tool) is the most frequently used sequence homology search tool.
Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW.
Celera Assembler (CA) is a whole-genome shotgun (WGS) assembler for the reconstruction of genomic DNA sequence from WGS sequencing data.
DSSP is a database of secondary structure assignments for all protein entries in the Protein Data Bank (PDB). DSSP is also the program that calculates DSSP entries from PDB entries.
FastQC is a quality control tool for high-throughput sequence data.
FASTX-Toolkit is a collection of command-line tools for preprocessing short-read FASTA/FASTQ files.
HH-suite 2.0.6 is an open-source software package for sensitive sequence searching based on the pairwise alignment of hidden Markov models (HMMs). It contains HHsearch and HHblits, among other programs and utilities.
MIRA is a whole-genome shotgun and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent, and PacBio data (the latter at the moment only CCS and error-corrected CLR reads).
Picard-Tools comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
PSIPRED is a protein secondary structure prediction tool based on position-specific scoring matrices.
“A two-stage neural network is used to predict protein secondary structure based on the position specific scoring matrices generated by PSI-BLAST. Despite the simplicity and convenience of the approach used, the results are found to be superior to those produced by other methods, including the popular PHD methods…”
QIIME (pronounced “chime”) stands for Quantitative Insights Into Microbial Ecology.
SAM stands for Sequence Alignment/Map. SAMtools provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing, and generating alignments in a per-position format.
Trinity is a method for de novo assembly of full-length transcripts.
USEARCH is an algorithm designed to enable high-throughput, sensitive search of very large sequence databases.
Velvet is a de novo genomic assembler specially designed for short read sequencing technologies.
MATLAB is commercial software and you must have a license to use it.
We support the following packages used in MD simulations.
ABINIT is an open source package that helps find the total energy, charge density and electronic structure of systems.
AMBER stands for Assisted Model Building with Energy Refinement and is a molecular dynamics package developed by groups at various institutions, including the Scripps Research Institute and the University of California, Irvine, among many others. We currently support both AMBER 8 and AMBER 9.
Gaussian predicts the energies, molecular structures, and vibrational frequencies of molecular systems, along with numerous molecular properties derived from these basic computation types. It can be used to study molecules and reactions under a wide range of conditions, including both stable species and compounds which are difficult or impossible to observe experimentally such as short-lived intermediates and transition structures.
GROMACS is a very versatile Molecular Dynamics application.
LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator and is a molecular dynamics simulator. We only support the later (C++) versions of this code.
NWChem is a computational chemistry package that is designed to run on high-performance parallel supercomputers as well as conventional workstation clusters. It aims to be scalable both in its ability to treat large problems efficiently, and in its usage of available parallel computing resources. NWChem has been developed by the Molecular Sciences Software group of the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL). Most of the implementation has been funded by the EMSL Construction Project.
VASP stands for Vienna Ab-initio Simulation Package and is a package for performing ab-initio quantum-mechanical molecular dynamics (MD) using pseudopotentials and a plane wave basis set. This is a commercial package and is currently only available for those with a license.
WIEN2k is a commercial package and is currently only available for those with a license.
We also support the following simulation packages from other fields.
GEANT4 is a toolkit for the simulation of the passage of particles through matter.
The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. It features multiple dynamical cores, a 3-dimensional variational (3DVAR) data assimilation system, and a software architecture allowing for computational parallelism and system extensibility. WRF is suitable for a broad spectrum of applications across scales ranging from meters to thousands of kilometers.