ceuadmin

The CEU software repository is located at /usr/local/Cluster-Apps/ceuadmin/.

Figure: ceuadmin, noting that the importance of each software entry is purely random according to \(Poisson(N,\lambda)\) where \(N=222\), \(\lambda=3\).

Entries

The current list is as follows,

  [1] "ABCtoolbox"            "akt"                   "allegro"               "alpine"
  [5] "Anaconda3"             "annovar"               "aria2"                 "autoconf"
  [9] "automake"              "axel"                  "bazel"                 "bcftools"
 [13] "Beagle"                "bedops"                "bedtools2"             "bgen"
 [17] "biobank"               "blat"                  "boltlmm"               "brotli"
 [21] "busybox"               "CaVEMaN"               "CAVIAR"                "CAVIARBF"
 [25] "ccal"                  "circos"                "citeproc"              "cmake"
 [29] "cppunit"               "crossmap"              "cryptsetup"            "Cytoscape"
 [33] "deno"                  "DEPICT"                "device-mapper"         "DjVuLibre"
 [37] "docbook2X"             "DosageConverter"       "Eagle"                 "enchant"
 [41] "ensembl-vep"           "exiv2"                 "exomeplus"             "expat"
 [45] "FastQTL"               "fcGENE"                "ffmpeg"                "fgwas"
 [49] "finemap"               "fossil"                "fpc"                   "fraposa_pgsc"
 [53] "fribidi"               "GARFIELD"              "gatk"                  "gcta"
 [57] "gdal"                  "gdc"                   "geany"                 "GEM"
 [61] "GEMMA"                 "Genotype-Harmonizer"   "gettext"               "gh"
 [65] "ghc"                   "ghostscript"           "git"                   "git-extras"
 [69] "GitKraken"             "glib"                  "glibc"                 "globusconnectpersonal"
 [73] "glpk"                  "gmp"                   "gnutls"                "go"
 [77] "googletest"            "graphene"              "GraphicsMagick"        "GreenAlgorithms4HPC"
 [81] "gsl"                   "gtk+"                  "gtksourceview"         "gtool"
 [85] "hpg"                   "htslib"                "hunspell"              "icu"
 [89] "ImageJ"                "impute"                "JabRef"                "JAGS"
 [93] "jq"                    "json-c"                "KentUtils"             "KING"
 [97] "krb5"                  "lapack"                "ldc2"                  "ldsc"
[101] "LDstore"               "LEMMA"                 "libcares"              "libgit2"
[105] "libglvnd"              "libiconv"              "libidn2"               "libntlm"
[109] "libpng"                "libseccomp"            "libsodium"             "libssh"
[113] "libssh2"               "libuv"                 "libxml2"               "libxslt"
[117] "locuszoom"             "LVM2"                  "MAGENTA"               "magma"
[121] "Mango"                 "Mega2"                 "metal"                 "MONSTER"
[125] "MORGAN"                "MR-MEGA"               "MsCAVIAR"              "nano"
[129] "ncbi-vdb"              "ncurses"               "netbeans"              "nettle"
[133] "nextflow"              "NLopt"                 "node"                  "nspr"
[137] "oniguruma"             "openjdk"               "OpenMS"                "openssh"
[141] "openssl"               "osca"                  "PAINTOR"               "pandoc"
[145] "pandoc-citeproc"       "pango"                 "parallel"              "Pascal"
[149] "pcre2"                 "pdf2djvu"              "pdfjam"                "pgsc_calc"
[153] "phenoscanner"          "PhySO"                 "picard"                "plink"
[157] "plink-bgi"             "plinkseq"              "PoGo"                  "polyphen"
[161] "poppler"               "popt"                  "proj"                  "PRSice"
[165] "pspp"                  "pulsar"                "PWCoCo"                "qctool"
[169] "qpdf"                  "qt"                    "qtcreator"             "QTLtools"
[173] "quarto"                "quicktest"             "R"                     "raremetal"
[177] "rclone"                "readline"              "regenie"               "regtools"
[181] "RHHsoftware"           "rst2pdf"               "rstudio"               "ruby"
[185] "rust"                  "samtools"              "Scala"                 "shapeit"
[189] "singularity"           "SMR"                   "snakemake"             "SNP2HLA"
[193] "snptest"               "spread-sheet-widget"   "sqlite"                "sra-tools"
[197] "ssw"                   "STAR"                  "stata"                 "SurvivalAnalysis"
[201] "SurvivalKit"           "Swift"                 "tabix"                 "tatami"
[205] "thunderbird"           "tidy"                  "trinculo"              "trousers"
[209] "Typora"                "unbound"               "vala"                  "VarScan"
[213] "vcftools"              "VEGAS2"                "verifyBamID"           "VSCode"
[217] "VSCodium"              "vte"                   "xpdf"                  "yaml-cpp"
[221] "Zotero"                "zstd"

These are wrapped up as ⭐⭐⭐ modules ⭐⭐⭐.

The original list prior to mid-November 2022 is given below (footnote 1).

Usage

We illustrate with pspp. A brief description of a module is available with

module help ceuadmin/pspp

and the module is loaded and its graphical user interface (GUI; footnote 2) started with

module load ceuadmin/pspp
psppire

for version 2.0.0-pre1. Once the job is done, one can restore the previous environment with

module unload ceuadmin/pspp

Note that module add/rm is equivalent to module load/unload.

Some modules are based on compiled Java (.jar) files, which can be called directly, but it is handy to use the preset environment variables, e.g.,

module load ceuadmin/picard
java -jar ${PICARD_HOME}/picard.jar --help
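
As a minimal sketch of relying on such preset variables, the helper below resolves a jar under a module's home directory and fails early when the variable is empty, for instance because the module has not been loaded. The function name jar_path is hypothetical and not part of ceuadmin; only PICARD_HOME and picard.jar come from the example above.

```shell
# Hypothetical helper (not part of ceuadmin): print the path of a jar file
# under a module's home directory, or fail with a message on stderr.
jar_path() {
  home_dir="$1"  # value of a preset variable, e.g. "$PICARD_HOME"
  jar_name="$2"  # e.g. picard.jar
  if [ -z "$home_dir" ]; then
    echo "error: home directory is empty; has the module been loaded?" >&2
    return 1
  fi
  if [ ! -f "$home_dir/$jar_name" ]; then
    echo "error: $home_dir/$jar_name not found" >&2
    return 1
  fi
  printf '%s\n' "$home_dir/$jar_name"
}

# Usage after `module load ceuadmin/picard`:
#   jar=$(jar_path "${PICARD_HOME:-}" picard.jar) && java -jar "$jar" --help
```

Checking the variable first gives a clearer message than the Java error produced by an empty or wrong path.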

A full list of module subcommands is available with module help as detailed here for 3.2.9 – CSD3 uses version 3.2.10 dated 2012-12-21. In particular, module whatis ceuadmin/ensembl-vep indicates usage regarding build37/build38 setup for the loftee plugin used in loss of function (LoF) annotation.

Most software is available to all CSD3 users, the only exceptions being software with excessive size or reference data, which ideally would be available from /rds/project/jmmh2/software but is now under /rds/project/jmmh2/rds-jmmh2-public_databases/software as a trade-off. These can largely be seen as sources which are used to build the repository given above.

CEU users will be able to use ANNOVAR, ensembl-vep, OpenMS, phenoscanner, polyphen, KentUtils/MAGMA/Pascal/VEGAS2/fgwas/locuszoom, which link to internal projects/personal space (additional requests need to be made). A large collection of R packages (1,414 as of 14/3/2024) is linked with the latest R distribution, 4.3.3; there are also packages under 4.3.3-gcc11 as well as 4.3.3-icelake.

For CEU users, it is easy to point to them, e.g.,

export HPC_WORK=/rds/user/$USER/hpc-work/
export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export R_LIBS=${RDS}/R:${RDS}/R-4.3.3/library

or possible to have your own installations based on these, e.g., through creation of a modified Makefile with altered prefix followed by make install -f <modified Makefile>.
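
One way to produce such a modified Makefile is sketched below, assuming the install location is set on a line beginning with prefix (some packages use PREFIX or DESTDIR instead, so check the Makefile first). The function name make_local_makefile and the file name Makefile.local are hypothetical.

```shell
# Hypothetical helper: copy a Makefile while rewriting its "prefix" assignment.
# Assumes the new prefix contains no "|" or "&" characters (special to sed here).
make_local_makefile() {
  src="$1"         # original Makefile
  dest="$2"        # modified copy to write
  new_prefix="$3"  # e.g. "$HOME/local"
  # Match "prefix =", "prefix :=" or "prefix ?=" at line start; replace the value.
  sed "s|^\(prefix[[:space:]]*[?:]\{0,1\}=\).*|\1 $new_prefix|" "$src" > "$dest"
}

# Usage (inside the unpacked source directory):
#   make_local_makefile Makefile Makefile.local "$HOME/local"
#   make -f Makefile.local install
```

Writing to a separate file leaves the original Makefile untouched, so the same source tree can still be used for the shared installation.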

The following script tests for loading of dplyr:

export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export PATH=${PATH}:${RDS}/R-4.3.3/bin
export R_LIBS=${RDS}/R-4.3.3/library:${RDS}/R
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'

It appears clumsy to do these every time, so an attempt is made to have them in a module, namely

module load ceuadmin/R/latest
which R
echo $R_LIBS
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'

For non-CEU users, please drop an email to jhz22@medschl.cam.ac.uk for access.

Module creation

The following example shows how to set up a module,

#!/bin/bash

mkdir tmp-xz
cd tmp-xz
wget http://tukaani.org/xz/xz-5.2.2.tar.gz
tar zxvf xz-5.2.2.tar.gz
cd xz-5.2.2
mkdir -p /usr/local/Cluster-Apps/xz/5.2.2
export PREFIX=/usr/local/Cluster-Apps/xz/5.2.2
./configure --prefix=$PREFIX
make
make check
sg swinst 'make install'

cat << 'EOL' > /usr/local/Cluster-Config/modulefiles/xz/5.2.2
#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {

  puts stderr "\tXZ Utils is free general-purpose data compression software with a high compression ratio.\n"
  puts stderr "\tInstalled under: /usr/local/Cluster-Apps/xz/5.2.2
     Homepage: http://tukaani.org/xz/"

}

module-whatis "xz free general-purpose data compression"

conflict xz
set               root                  /usr/local/Cluster-Apps/xz/5.2.2
prepend-path      PATH                  $root/bin
prepend-path      MANPATH               $root/man
prepend-path      LD_LIBRARY_PATH       $root/lib
prepend-path      LIBRARY_PATH          $root/lib
prepend-path      FPATH                 $root/include
prepend-path      CPATH                 $root/include
prepend-path      INCLUDE               $root/include
setenv            XZ_HOME               $root
EOL

The module is made visible through the environment variable MODULEPATH. Note, however, that an ordinary user will not have permission to make changes to /usr/local/Cluster-Apps.
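
Where write access is lacking, one option is to keep personal modulefiles in a directory you own and add it to MODULEPATH. The sketch below does this in plain shell; the function name add_modulepath and the directory $HOME/modulefiles are assumptions for illustration. The module use <directory> subcommand has the same effect for the current session.

```shell
# Hypothetical helper: prepend a directory to MODULEPATH unless already listed.
add_modulepath() {
  dir="$1"
  case ":${MODULEPATH:-}:" in
    *":$dir:"*) ;;  # already present, nothing to do
    *) MODULEPATH="$dir${MODULEPATH:+:$MODULEPATH}" ;;
  esac
  export MODULEPATH
}

# Assumed location for personal modulefiles, e.g. $HOME/modulefiles/xz/5.2.2
add_modulepath "$HOME/modulefiles"
echo "$MODULEPATH"
```

A modulefile placed there would need its root set to your own installation prefix rather than /usr/local/Cluster-Apps.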

The module files are defined at /usr/local/Cluster-Config/modulefiles/ceuadmin. Most software uses gcc/6, which can be enabled with module load gcc/6 when required; however, some packages may also require libgfortran.so.5 as in gcc/9. As a compromise, one can amend .bashrc to include lines such as export LD_LIBRARY_PATH=/usr/local/software/master/gcc/9/lib64:$LD_LIBRARY_PATH.

Footnotes

Further information is available from /usr/local/Cluster-Apps/ceuadmin/doc/ceuadmin.md, ceuadmin.html.


  1. The original list was a mixture of modules and directories as follows,

    bgenix/               impute_v2.3.2_x86_64_static/  plink/                        R/                 Raremetal_linux_executables/        snptest_new/
    biobank/              interval/                     plink_1.90_beta/              raremetal_4.13/    Raremetal_linux_executables.tgz     source/
    boltlmm/              JAGS/                         plink_bgi_Dev/                raremetal_4.13.3/  raremetal.log                       stata/
    boltlmm_2.2/          LDstore/                      plink-bgi_linux_x86_64_may/   raremetal_4.13.4/  regenie/                            tabix/
    crossmap/             locuszoom/                    plink_linux_x86_64_beta2a/    raremetal_4.13.5/  samtools-1.10.tar.bz2               temp/
    exomeplus/            magma/                        plink_linux_x86_64_beta3.32/  raremetal_4.13.7/  samtools_1.2/                       vcftools/
    gcta/                 MAGMA_Celltyping/             plinkseq-0.08-x86_64/         raremetal_4.13.8/  shapeit.v2.r790.RHELS_5.4.dynamic/  vcftools_ps629/
    gtool_v0.7.5_x86_64/  metabolomics/                 plinkseq-0.10/                raremetal_4.14.0/  snptest/
    hpg/                  metal/                        pspp/                         raremetal_4.14.1/  snptest_2.5.2/
    htslib/               metal_updated/                qctool_v1.4-linux-x86_64/     raremetal_BPGen/   snptest_2.5.4_beta3/
    

    A grep of recent add-ons in the Genetics category is as follows,

    Date Add.ons Category
    2022-10-22 snptest/2.5.6 Genetics
    "" qctool/2.0.8 Genetics
    "" gcta/1.94.1 Genetics
    "" KING/2.1.6 Genetics
    "" LDstore/2.0 Genetics
    "" shapeit/3 Genetics
    "" vcftools/0.1.16 Genetics
    "" finemap/1.4 Genetics
    2022-10-23 quicktest/1.1 Genetics
    "" samtools/1.11 Genetics
    "" bcftools/1.12 Genetics
    "" MORGAN/3.4 Genetics
    "" METAL/2020-05-05r Genetics (footnote 3)
    "" regenie/3.2.1 Genetics
    "" GEMMA/0.98.5 Genetics (footnote 4)
    "" htslib/1.12 Genetics
    "" fcGENE/1.0.7 Genetics (footnote 5)
    "" SMR/1.0.3 Genetics
    "" FastQTL/2.165 Genetics
    2022-10-26 circos/0.69-9 Genetics
    "" bgen/1.1.7 Genetics
    "" DosageConverter/1.0.0 Genetics
    "" QTLtools/1.3.1-25 Genetics (footnote 6)
    "" blat/37x1 Genetics
    "" bedtools2/2.29.2 Genetics
    "" bedops/2.4.41 Genetics
    2022-11-03 Beagle/3.0.4 Genetics
    2022-11-08 CrossMap/0.6.4 Genetics
    "" SurvivalKit/6.12 Genetics
    "" PRSice/2.3.3 Genetics
    2022-11-09 qctool/2.2.0 Genetics
    2022-11-10 CaVEMaN/1.01-c1815a0 Genetics
    "" akt/0.3.3 Genetics
    "" MsCAVIAR/0.6.4 Genetics
    "" CAVIAR/2.2 Genetics
    "" MONSTER/1.3 Genetics
    "" osca/0.46 Genetics
    "" LEMMA/1.0.4 Genetics (footnote 7)
    "" CAVIARBF/0.2.1 Genetics
    2022-11-11 PAINTOR/3.0 Genetics
    2022-11-14 MR-MEGA/0.2 Genetics
    2022-11-16 SNP2HLA/1.0.3 Genetics
    "" STAR/2.7.10b Genetics
    "" Mega2/6.0.0 Genetics
    2022-11-19 ensembl-vep/104 Genetics*
    "" OpenMS/3.0.0 Genetics* (footnote 8)
    "" polyphen/2.2.2 Genetics*
    "" ANNOVAR/24Oct2019 Genetics*
    "" MAGENTA/vs2_July2011 Genetics*
    "" GARFIELD/v2 Genetics*
    "" KentUtils/2022-11-14 Genetics*
    2022-11-20 Genotype-Harmonizer/1.4.25 Genetics
    2022-11-21 locuszoom/1.4 Genetics* (footnote 9)
    "" DEPICT/v1_rel194 Genetics*
    "" MAGMA/1.10 Genetics*
    "" Pascal/v_debut Genetics*
    "" VEGAS2/2.01.17 Genetics*
    "" fgwas/0.3.6 Genetics*
    2022-12-04 phenoscanner/v2 Genetics*
    2022-12-07 SurvivalAnalysis/2016-05-09 Genetics
    2023-01-03 Eagle/2.4.1 Genetics
    2023-01-05 GEM/1.4.5 Genetics
    2023-02-01 GENEHUNTER/2.1_r6 Genetics
    2023-03-14 regenie/3.2.5 Genetics
    2023-03-24 PoGo/1.0.0 Genetics
    2023-03-31 PWCoCo/2023-03-31 Genetics (footnote 10)
    2023-04-02 regenie/3.2.5.3 Genetics
    2023-04-04 PWCoCo/1.0 Genetics
    2023-06-02 regenie/3.2.7 Genetics (footnote 11)
    2023-06-06 allegro/2.0f Genetics
    2023-06-19 plink-ng/2.00a3.3 Genetics
    2023-06-26 RHHsoftware/0.1 Genetics
    2023-07-28 PWCoCo/1.1 Genetics
    2023-08-02 regenie/3.2.9 Genetics
    2023-08-06 finemap/1.4.2 Genetics
    2023-09-27 ncbi-vdb/3.0.8 Genetics
    "" sra-tools/3.0.8 Genetics
    "" gatk/4.4.0.0 Genetics
    2023-11-24 ldsc/1.0.1 Genetics
    2023-11-30 gdc/1.6.1-1.0.0 Genetics (footnote 12)
    2023-12-20 verifyBamID/1.1.3 Genetics
    2023-12-21 verifyBamID/2.0.1 Genetics
    2023-12-27 regtools/1.0.0 Genetics (footnote 13)
    "" VarScan/2.4.6 Genetics (footnote 14)
    2024-01-08 picard/3.1.1 Genetics
    "" plink/2.0_20240105 Genetics
    2024-01-19 htslib/1.19 Genetics
    2024-01-24 fraposa_pgsc/0.1.0 Genetics (footnote 15)
    "" pgsc_calc/2.0.0-alpha.4 Genetics (footnote 16)

    * CEU or approved users only.

    More detailed diagrams on recently added genetics (89) and generic (137) software are as follows,

    [Diagrams: Genetics, Generic]

  2. GUI

    As GUI-based programs claim more computing resources, it is recommended that they are only used occasionally, e.g., calling back GitHub sessions. 

  3. metal

    Notes on METAL 2020-05-05r

    This version has options EFFECT_PRINT_PRECISION and STDERR_PRINT_PRECISION (both with default 4) to enable many decimal places.

    The letter r as in 2020-05-05r indicates a replacement of functions in libsrc/MathStats.cpp to ensure generality; details have also been posted to the GitHub page, https://github.com/statgen/METAL/issues/24. The replacement addresses the following error,

    FATAL ERROR -
    a too large, ITMAX too small in gamma countinued fraction (gcf)
    
    so the -1.info file could not be generated.
    

  4. gemma

    Note on compiling from source

    A considerably smaller (1,097,256 vs 22,721,624) executable, /usr/local/Cluster-Apps/ceuadmin/GEMMA/0.98.5/bin, is generated under CSD3 but the original distribution is used by default.

    module load openblas/0.2.15
    make
    

  5. fcgene

    Alternative site

    See https://github.com/dr-roshyara/fcgene 

  6. qtltools

    The long version number is 1.3.1-25-g6e49f85f20. 

  7. lemma

    The documentation indicates a requirement of gcc/9.4, boost/1.78, OpenMP/3.1 and/or Intel MKL Library 2019 Update 1 but it is possible to proceed with gcc/11, cmake-3.19.7-gcc-5.4-5gbsejo, boost-1.66.0-gcc-5.4.0-slpq3un, ceuadmin/bgen/1.1.7. 

  8. openms

    When the OpenMS module is loaded, pyopenms and alphapept also become available. 

  9. locuszoom

    The version adds chromosome X data and will have options using INTERVAL data. 

  10. pwcoco

    It compiles under gcc/9. Upon release of 1.1, this snapshot is removed. 

  11. regenie

    Building regenie 3.2.7

    cd ~/rds/public_databases/software/
    wget -qO- https://github.com/rgcgithub/regenie/archive/refs/tags/v3.2.7.tar.gz | \
    tar xvfz -
    cd regenie-3.2.7/
    export BGEN_PATH=~/rds/public_databases/software/bgen
    module load zlib/1.2.11
    export ZLIB_LIBRARY=/usr/local/Cluster-Apps/zlib/1.2.11
    module load gcc/6
    module load cmake-3.19.7-gcc-5.4-5gbsejo
    module load intel/mkl/mic/2018.4
    mkdir build
    cd build
    cmake ..
    make
    

  12. gdc

    It also includes gdc_dtt-ui 1.0.0 

  13. regtools

    gcc/6 is required for C++11. 

  14. varscan

    Simply call java -jar $VARSCAN_HOME/VarScan.v2.4.6.jar after module load ceuadmin/VarScan/2.4.6

  15. fraposa

    Several packages, including poetry, poetry-plugin-export and fraposa_pgsc, will be installed as follows,

    module load ceuadmin/Anaconda3/2023.09-0
    pip install poetry
    pip3 install poetry-plugin-export
    pip install --use-feature=fast-deps .
    scripts/run_example.sh
    

    This is necessary since by default poetry install will use the user's home directory. As indicated from poetry install --help:

    The install command reads the poetry.lock file from the current directory, processes it, and downloads and installs all the libraries and dependencies outlined in that file. If the file does not exist it will look for pyproject.toml and do the same. 

  16. pgsc_calc

    Application, https://pgsc-calc.readthedocs.io/en/latest/index.html

    nextflow run pgscatalog/pgsc_calc -profile test,singularity
    

    It appears that quarto is called, so it presumably needs to run under icelake.