ceuadmin

The CEU software repository is located at /usr/local/Cluster-Apps/ceuadmin/.

More detailed diagrams on recently added genetics/proteomics and generic software are as follows,

[Diagrams: Genetics, Generic]

noting that the importance assigned to each software entry is purely random, generated as \(N\) draws from a Poisson distribution with \(\lambda=3\), where \(N\) is the number of entries.

Entries

The current list is as follows,

  [1] "ABCtoolbox"             "akt"                    "allegro"                "alpine"
  [5] "Anaconda3"              "annovar"                "aria2"                  "augeas"
  [9] "autoconf"               "automake"               "axel"                   "bazel"
 [13] "bcftools"               "Beagle"                 "bedops"                 "bedtools2"
 [17] "bgen"                   "biobank"                "blat"                   "boltlmm"
 [21] "boost"                  "brotli"                 "busybox"                "caddy"
 [25] "CaVEMaN"                "CAVIAR"                 "CAVIARBF"               "ccal"
 [29] "chromium"               "circos"                 "citeproc"               "cmake"
 [33] "comet"                  "cppunit"                "crossmap"               "crux"
 [37] "cryptopp"               "cryptsetup"             "curl"                   "Cytoscape"
 [41] "deno"                   "DEPICT"                 "device-mapper"          "diann"
 [45] "DjVuLibre"              "docbook2X"              "docker"                 "DosageConverter"
 [49] "dotnet"                 "Eagle"                  "edge"                   "enchant"
 [53] "ensembl-vep"            "exiv2"                  "exomeplus"              "expat"
 [57] "FastQTL"                "fcGENE"                 "ffmpeg"                 "fgwas"
 [61] "findlib"                "finemap"                "firefox"                "FlashLFQ"
 [65] "fossil"                 "fpc"                    "FragPipe"               "fraposa_pgsc"
 [69] "freesurfer"             "fribidi"                "GARFIELD"               "gatk"
 [73] "gcta"                   "gdal"                   "gdc"                    "geany"
 [77] "GEM"                    "GEMMA"                  "Genotype-Harmonizer"    "geos"
 [81] "gettext"                "gh"                     "ghc"                    "ghostscript"
 [85] "git"                    "git-extras"             "GitKraken"              "glib"
 [89] "glibc"                  "globusconnectpersonal"  "glpk"                   "gmp"
 [93] "gnutls"                 "go"                     "googletest"             "graphene"
 [97] "GraphicsMagick"         "GreenAlgorithms4HPC"    "gsl"                    "gtk+"
[101] "gtksourceview"          "gtool"                  "hivex"                  "hpg"
[105] "htslib"                 "hunspell"               "icu"                    "ImageJ"
[109] "ImageMagick"            "impute"                 "inetutils"              "IonQuant"
[113] "JabRef"                 "JAGS"                   "jasper"                 "jq"
[117] "json-c"                 "KentUtils"              "KING"                   "kojak"
[121] "krb5"                   "lapack"                 "ldc2"                   "ldsc"
[125] "LDstore"                "LEMMA"                  "libarchive"             "libcares"
[129] "libgeotiff"             "libgit2"                "libglvnd"               "libiconv"
[133] "libidn2"                "libjpeg-turbo"          "libntlm"                "libpng"
[137] "libseccomp"             "libsodium"              "libssh"                 "libssh2"
[141] "libuv"                  "libxml2"                "libxslt"                "linux"
[145] "locuszoom"              "LVM2"                   "MAGENTA"                "magma"
[149] "Mango"                  "MaxQuant"               "Mega2"                  "metal"
[153] "MetaMorpheus"           "Miniconda3"             "MONSTER"                "MORGAN"
[157] "MR-MEGA"                "msamanda"               "MsCAVIAR"               "MSFragger"
[161] "MS-GF+"                 "msms"                   "nano"                   "ncbi-vdb"
[165] "ncurses"                "netbeans"               "nettle"                 "nextflow"
[169] "nginx"                  "NLopt"                  "node"                   "nspr"
[173] "ntlm"                   "ocaml"                  "oniguruma"              "opam"
[177] "openjdk"                "OpenMS"                 "openssh"                "openssl"
[181] "osca"                   "p7zip-zstd"             "PAINTOR"                "pandoc"
[185] "pandoc-citeproc"        "pango"                  "parallel"               "Pascal"
[189] "patchelf"               "pcre2"                  "pdf2djvu"               "pdfjam"
[193] "peer"                   "Perseus"                "pgsc_calc"              "phenoscanner"
[197] "PhySO"                  "picard"                 "pigz"                   "plink"
[201] "plink-bgi"              "plinkseq"               "podman"                 "PoGo"
[205] "polyphen"               "poppler"                "popt"                   "proj"
[209] "PRSice"                 "pspp"                   "pulsar"                 "PWCoCo"
[213] "pwiz"                   "qctool"                 "qemu"                   "qpdf"
[217] "qt"                     "qtcreator"              "QTLtools"               "quarto"
[221] "quicktest"              "R"                      "raremetal"              "rclone"
[225] "readline"               "regenie"                "regtools"               "RHHsoftware"
[229] "rst2pdf"                "rstudio"                "rtmpdump"               "ruby"
[233] "rust"                   "sage"                   "samtools"               "Scala"
[237] "seqkit"                 "shapeit"                "singularity"            "SMR"
[241] "snakemake"              "SNP2HLA"                "snptest"                "spread-sheet-widget"
[245] "spyder"                 "sqlite"                 "sra-tools"              "sshpass"
[249] "ssw"                    "STAR"                   "stata"                  "SurvivalAnalysis"
[253] "SurvivalKit"            "Swift"                  "SYMPHONY"               "tabix"
[257] "tandem"                 "tatami"                 "ThermoRawFileParser"    "ThermoRawFileParserGUI"
[261] "thunderbird"            "tidy"                   "tiff"                   "trinculo"
[265] "trousers"               "Typora"                 "unbound"                "vala"
[269] "VarScan"                "vcftools"               "VEGAS2"                 "verifyBamID"
[273] "VSCode"                 "VSCodium"               "vte"                    "wine"
[277] "wrk"                    "xpdf"                   "yaml-cpp"               "Zotero"
[281] "zstd"

These are wrapped up as modules.

The original list prior to mid-November 2022 is given in footnote 1.

Usage

We illustrate with pspp. A brief description of a module is available with

module help ceuadmin/pspp

and the module is loaded and the graphical user interface (GUI; see footnote 2) started with

module load ceuadmin/pspp
psppire

for version 2.0.1. Once the job is done, one can restore the previous environment with

module unload ceuadmin/pspp

Note that module add/rm is equivalent to module load/unload.

Some modules are based on compiled Java (.jar) which can be called directly but it is handy to use preset environment variables, e.g.,

module load ceuadmin/picard
java -jar ${PICARD_HOME}/picard.jar --help
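
Since the variable is only defined once the module has been loaded, a script may want to check it before calling java. The following is a minimal sketch: the helper name picard_jar is hypothetical and not part of the module, and only PICARD_HOME is taken from the example above.

```bash
#!/bin/bash

# picard_jar (hypothetical helper): print the jar path, or fail with a hint
# when PICARD_HOME has not been set by `module load ceuadmin/picard`.
picard_jar() {
  if [ -z "${PICARD_HOME:-}" ]; then
    echo "PICARD_HOME is not set; run 'module load ceuadmin/picard' first" >&2
    return 1
  fi
  printf '%s\n' "${PICARD_HOME}/picard.jar"
}

# Call java only if the lookup succeeded and java is on the PATH.
if jar=$(picard_jar) && command -v java > /dev/null; then
  java -jar "${jar}" --help
fi
```

The same pattern should carry over to other modules that export a *_HOME variable, such as XZ_HOME in the module creation example.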

A full list of module subcommands is available with module help, as detailed here for 3.2.9; cclake uses version 3.2.10 (2012-12-21) while icelake uses 4.5.2 (2020-07-30). In particular, module whatis ceuadmin/ensembl-vep indicates usage regarding the build37/build38 setup for the loftee plugin used in loss-of-function (LoF) annotation.

Most software is available to all CSD3 users; the exceptions are packages with excessive size or reference data, which ideally would be available from /rds/project/jmmh2/software but are now kept at /rds/project/jmmh2/rds-jmmh2-public_databases/software as a trade-off. These can largely be seen as the sources used to build the repository given above.

CEU users will be able to use ANNOVAR, ensembl-vep, OpenMS, phenoscanner, polyphen, KentUtils/MAGMA/Pascal/VEGAS2/fgwas/locuszoom linking internal projects/personal space (additional requests need to be made). A large collection of R packages (1,705 as of 1/12/2024, esp. with availability of major machine learning packages) is linked with the latest R distribution, 4.4.2; there are also 3 packages (DescTools, Rfast, Rfast2) under R-gcc11. Note that limitations on CSD3 mean that sf and terra cannot be updated, owing to an incomplete build of gdal/proj.

For CEU users, it is easy to point to them, e.g.,

export HPC_WORK=/rds/user/$USER/hpc-work/
export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export R_LIBS=${RDS}/R:${RDS}/R-4.4.2/library

It is also possible to have your own installations based on these, e.g., by creating a modified Makefile with an altered prefix and then running make install -f <modified Makefile>.
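
The Makefile route can be sketched as follows, assuming the package's Makefile sets its destination through a single PREFIX line. The toy Makefile, the variable name and the destination directory are all illustrative assumptions; real packages may use a different variable or a configure script instead.

```bash
#!/bin/bash

work=$(mktemp -d)                 # scratch build directory
dest="${work}/install"            # stands in for a directory you can write to
cd "${work}"

# Toy Makefile standing in for a package's own (recipe tabs written via printf).
printf 'PREFIX = /usr/local/Cluster-Apps/demo/1.0\n' > Makefile
printf 'install:\n\tmkdir -p $(PREFIX)/bin\n\ttouch $(PREFIX)/bin/demo\n' >> Makefile

# Write a modified copy with the altered prefix, leaving the original intact.
sed "s|^PREFIX *=.*|PREFIX = ${dest}|" Makefile > Makefile.local
grep '^PREFIX' Makefile.local

# Install from the modified Makefile, if make is available.
if command -v make > /dev/null; then
  make install -f Makefile.local
fi
```

Keeping the original Makefile untouched makes it easy to compare the two files later or to pick up upstream changes.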

The following script tests for loading of dplyr:

export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export PATH=${PATH}:${RDS}/R-4.4.2/bin
export R_LIBS=${RDS}/R-4.4.2/library:${RDS}/R
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'

Doing this every time is clumsy, so these settings are also provided as a module, namely

module load ceuadmin/R/latest
which R
echo $R_LIBS
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'

For non-CEU users, please drop an email to jhz22@medschl.cam.ac.uk for access.

Module creation

The following example shows how to set up a module,

#!/bin/bash

# Download and unpack the source in a scratch directory
mkdir tmp-xz
cd tmp-xz
wget http://tukaani.org/xz/xz-5.2.2.tar.gz
tar zxvf xz-5.2.2.tar.gz
cd xz-5.2.2

# Configure, build and test against the installation prefix
export PREFIX=/usr/local/Cluster-Apps/xz/5.2.2
mkdir -p $PREFIX
./configure --prefix=$PREFIX
make
make check

# Install under the swinst group
sg swinst 'make install'

cat << 'EOL' > /usr/local/Cluster-Config/modulefiles/xz/5.2.2
#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {

  puts stderr "\tXZ Utils is free general-purpose data compression software with a high compression ratio.\n"
  puts stderr "\tInstalled under: /usr/local/Cluster-Apps/xz/5.2.2"
  puts stderr "\tHomepage: http://tukaani.org/xz/"

}

module-whatis "xz free general-purpose data compression"

conflict xz
set               root                  /usr/local/Cluster-Apps/xz/5.2.2
prepend-path      PATH                  $root/bin
prepend-path      MANPATH               $root/man
prepend-path      LD_LIBRARY_PATH       $root/lib
prepend-path      LIBRARY_PATH          $root/lib
prepend-path      FPATH                 $root/include
prepend-path      CPATH                 $root/include
prepend-path      INCLUDE               $root/include
setenv            XZ_HOME               $root
EOL

The module is made visible through the environment variable MODULEPATH. Note, however, that an ordinary user will not have permission to make changes to /usr/local/Cluster-Apps.
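
One way around the permission issue is a private module tree. The sketch below is an assumption rather than the site's documented procedure: the directory name is hypothetical, and the code only shows how MODULEPATH can be extended without duplicating entries.

```bash
#!/bin/bash

# Hypothetical private module tree; any directory you can write to will do.
MY_MODULES="${MY_MODULES:-${HOME}/modulefiles}"
mkdir -p "${MY_MODULES}"

# Prepend the directory unless it is already listed, so that sourcing this
# file repeatedly does not grow MODULEPATH.
case ":${MODULEPATH:-}:" in
  *":${MY_MODULES}:"*) ;;
  *) export MODULEPATH="${MY_MODULES}${MODULEPATH:+:${MODULEPATH}}" ;;
esac

echo "MODULEPATH=${MODULEPATH}"
```

A modulefile like the xz example, with root pointing at a user-writable prefix, could then be placed under that directory. Environment Modules also provides module use <directory> for the same purpose.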

The module files are defined at /usr/local/Cluster-Config/modulefiles/ceuadmin. Most software stays with gcc/6 because of the many dependencies of built modules; when required, it can be enabled with module load gcc/6. However, packages may also require libgfortran.so.5 as in gcc/9; as a compromise, one can amend .bashrc to include lines such as export LD_LIBRARY_PATH=/usr/local/software/master/gcc/9/lib64:$LD_LIBRARY_PATH.
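
Before amending .bashrc, it can help to check whether the library is already found on LD_LIBRARY_PATH. A minimal sketch follows; find_lib is a hypothetical helper, and it inspects only LD_LIBRARY_PATH, not the system linker cache.

```bash
#!/bin/bash

# find_lib NAME (hypothetical helper): print the first LD_LIBRARY_PATH
# directory that contains NAME; return non-zero if none does.
find_lib() {
  local name="$1" dir dirs
  IFS=':' read -r -a dirs <<< "${LD_LIBRARY_PATH:-}"
  for dir in "${dirs[@]}"; do
    if [ -n "${dir}" ] && [ -e "${dir}/${name}" ]; then
      printf '%s\n' "${dir}"
      return 0
    fi
  done
  return 1
}

if dir=$(find_lib libgfortran.so.5); then
  echo "libgfortran.so.5 found in ${dir}"
else
  echo "libgfortran.so.5 is not on LD_LIBRARY_PATH"
fi
```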

Footnotes

Further information is available from /usr/local/Cluster-Apps/ceuadmin/doc/ceuadmin.md, ceuadmin.html.


  1. The original list was a mixture of modules and directories as follows,

    bgenix/               impute_v2.3.2_x86_64_static/  plink/                        R/                 Raremetal_linux_executables/        snptest_new/
    biobank/              interval/                     plink_1.90_beta/              raremetal_4.13/    Raremetal_linux_executables.tgz     source/
    boltlmm/              JAGS/                         plink_bgi_Dev/                raremetal_4.13.3/  raremetal.log                       stata/
    boltlmm_2.2/          LDstore/                      plink-bgi_linux_x86_64_may/   raremetal_4.13.4/  regenie/                            tabix/
    crossmap/             locuszoom/                    plink_linux_x86_64_beta2a/    raremetal_4.13.5/  samtools-1.10.tar.bz2               temp/
    exomeplus/            magma/                        plink_linux_x86_64_beta3.32/  raremetal_4.13.7/  samtools_1.2/                       vcftools/
    gcta/                 MAGMA_Celltyping/             plinkseq-0.08-x86_64/         raremetal_4.13.8/  shapeit.v2.r790.RHELS_5.4.dynamic/  vcftools_ps629/
    gtool_v0.7.5_x86_64/  metabolomics/                 plinkseq-0.10/                raremetal_4.14.0/  snptest/
    hpg/                  metal/                        pspp/                         raremetal_4.14.1/  snptest_2.5.2/
    htslib/               metal_updated/                qctool_v1.4-linux-x86_64/     raremetal_BPGen/   snptest_2.5.4_beta3/
    

    A grep of recent add-ons in the Genetics/Proteomics category is as follows, where an empty date ("") repeats the date of the row above,

    Date Add.ons Category
    2022-10-22 snptest/2.5.6 Genetics
    "" qctool/2.0.8 Genetics
    "" gcta/1.94.1 Genetics
    "" KING/2.1.6 Genetics
    "" LDstore/2.0 Genetics
    "" shapeit/3 Genetics
    "" vcftools/0.1.16 Genetics
    "" finemap/1.4 Genetics
    2022-10-23 quicktest/1.1 Genetics
    "" samtools/1.11 Genetics
    "" bcftools/1.12 Genetics
    "" MORGAN/3.4 Genetics
    "" METAL/2020-05-05r Genetics
    "" regenie/3.2.1 Genetics
    "" GEMMA/0.98.5 Genetics
    "" htslib/1.12 Genetics
    "" fcGENE/1.0.7 Genetics
    "" SMR/1.0.3 Genetics
    "" FastQTL/2.165 Genetics
    2022-10-26 circos/0.69-9 Genetics
    "" bgen/1.1.7 Genetics
    "" DosageConverter/1.0.0 Genetics
    "" QTLtools/1.3.1-25 Genetics
    "" blat/37x1 Genetics
    "" bedtools2/2.29.2 Genetics
    "" bedops/2.4.41 Genetics
    2022-11-03 Beagle/3.0.4 Genetics
    2022-11-08 CrossMap/0.6.4 Genetics
    "" SurvivalKit/6.12 Genetics
    "" PRSice/2.3.3 Genetics
    2022-11-09 qctool/2.2.0 Genetics
    2022-11-10 CaVEMaN/1.01-c1815a0 Genetics
    "" akt/0.3.3 Genetics
    "" MsCAVIAR/0.6.4 Genetics
    "" CAVIAR/2.2 Genetics
    "" MONSTER/1.3 Genetics
    "" osca/0.46 Genetics
    "" LEMMA/1.0.4 Genetics
    "" CAVIARBF/0.2.1 Genetics
    2022-11-11 PAINTOR/3.0 Genetics
    2022-11-14 MR-MEGA/0.2 Genetics
    2022-11-16 SNP2HLA/1.0.3 Genetics
    "" STAR/2.7.10b Genetics
    "" Mega2/6.0.0 Genetics
    2022-11-19 ensembl-vep/104 Genetics*
    "" OpenMS/3.0.0 Genetics*
    "" polyphen/2.2.2 Genetics*
    "" ANNOVAR/24Oct2019 Genetics*
    "" MAGENTA/vs2_July2011 Genetics*
    "" GARFIELD/v2 Genetics*
    "" KentUtils/2022-11-14 Genetics*
    2022-11-20 Genotype-Harmonizer/1.4.25 Genetics
    2022-11-21 locuszoom/1.4 Genetics*
    "" DEPICT/v1_rel194 Genetics*
    "" MAGMA/1.10 Genetics*
    "" Pascal/v_debut Genetics*
    "" VEGAS2/2.01.17 Genetics*
    "" fgwas/0.3.6 Genetics*
    2022-12-04 phenoscanner/v2 Genetics*
    2022-12-07 SurvivalAnalysis/2016-05-09 Genetics
    2023-01-03 Eagle/2.4.1 Genetics
    2023-01-05 GEM/1.4.5 Genetics
    2023-02-01 GENEHUNTER/2.1_r6 Genetics
    2023-03-14 regenie/3.2.5 Genetics
    2023-03-24 PoGo/1.0.0 Genetics
    2023-03-31 PWCoCo/2023-03-31 Genetics
    2023-04-02 regenie/3.2.5.3 Genetics
    2023-04-04 PWCoCo/1.0 Genetics
    2023-06-02 regenie/3.2.7 Genetics
    2023-06-06 allegro/2.0f Genetics
    2023-06-19 plink-ng/2.00a3.3 Genetics
    2023-06-26 RHHsoftware/0.1 Genetics
    2023-07-28 PWCoCo/1.1 Genetics
    2023-08-02 regenie/3.2.9 Genetics
    2023-08-06 finemap/1.4.2 Genetics
    2023-09-27 ncbi-vdb/3.0.8 Genetics
    "" sra-tools/3.0.8 Genetics
    "" gatk/4.4.0.0 Genetics
    2023-11-24 ldsc/1.0.1 Genetics
    2023-11-30 gdc/1.6.1-1.0.0 Genetics
    2023-12-20 verifyBamID/1.1.3 Genetics
    2023-12-21 verifyBamID/2.0.1 Genetics
    2023-12-27 regtools/1.0.0 Genetics
    "" VarScan/2.4.6 Genetics
    2024-01-08 picard/3.1.1 Genetics
    "" plink/2.0_20240105 Genetics
    2024-01-19 htslib/1.19 Genetics
    2024-01-24 fraposa_pgsc/0.1.0 Genetics
    "" pgsc_calc/2.0.0-alpha.4 Genetics
    2024-04-22 peer/1.3 Genetics
    2024-06-04 pwiz/3_0_24156_80747de Proteomics
    2024-06-09 crux/4.2 Proteomics
    "" DIA-NN/1.8.1 Proteomics
    2024-06-11 crux/4.1 Proteomics
    "" pwiz/3_0_24163_9bfa69a-wine Proteomics
    2024-06-11 seqkit/2.8.2 Proteomics
    "" FlashLFQ/1.2.6 Proteomics
    "" MetaMorpheus/1.0.5 Proteomics
    2024-06-25 msms/3.2rc-b163 Genetics
    2024-07-13 msamanda/3.0.21.532 Proteomics
    2024-07-31 tandem/2017.2.1.4 Proteomics
    2024-08-11 comet/2024.01.1 Proteomics
    "" kojak/2.1.0 Proteomics
    "" kojak/1.5.5 Proteomics
    "" kojak/2.0.0a22 Proteomics
    2024-08-12 MS-GF+/2024.03.26 Proteomics
    2024-08-14 ThermoRawFileParser/1.4.4 Proteomics
    "" ThermoRawFileParserGUI/1.7.4 Proteomics
    "" FragPipe/22.0 Proteomics
    2024-08-15 MSFragger/4.1 Proteomics
    "" IonQuant/1.10.27 Proteomics
    2024-08-20 htslib/1.20 Genetics
    "" bcftools/1.20 Genetics
    "" samtools/1.20 Genetics
    2024-08-23 qpdf/11.9.1 Generic
    2024-09-01 MaxQuant/2.6.4.0 Proteomics
    "" Perseus/2.1.2.0 Proteomics
    2024-10-13 sage/0.14.7 Proteomics

    * CEU or approved users only.


  2. GUI

    As GUI-based programs claim more computing resources, it is recommended that they be used only occasionally, e.g., when calling back GitHub sessions.