ceuadmin
The CEU software repository is at /usr/local/Cluster-Apps/ceuadmin/.
More detailed diagrams of recently added genetics/proteomics and generic software follow.
Note that the importance assigned to each entry is purely random, drawn as \(Poisson(N,\lambda)\), where \(N\) is the number of entries and \(\lambda=3\).
Entries
The current list is as follows,
[1] "ABCtoolbox" "akt" "allegro" "alpine"
[5] "Anaconda3" "annovar" "aria2" "augeas"
[9] "autoconf" "automake" "axel" "bazel"
[13] "bcftools" "Beagle" "bedops" "bedtools2"
[17] "bgen" "biobank" "blat" "boltlmm"
[21] "boost" "brotli" "busybox" "caddy"
[25] "CaVEMaN" "CAVIAR" "CAVIARBF" "ccal"
[29] "chromium" "circos" "citeproc" "cmake"
[33] "comet" "cppunit" "crossmap" "crux"
[37] "cryptopp" "cryptsetup" "curl" "Cytoscape"
[41] "deno" "DEPICT" "device-mapper" "diann"
[45] "DjVuLibre" "docbook2X" "docker" "DosageConverter"
[49] "dotnet" "Eagle" "edge" "enchant"
[53] "ensembl-vep" "exiv2" "exomeplus" "expat"
[57] "FastQTL" "fcGENE" "ffmpeg" "fgwas"
[61] "findlib" "finemap" "firefox" "FlashLFQ"
[65] "fossil" "fpc" "FragPipe" "fraposa_pgsc"
[69] "freesurfer" "fribidi" "GARFIELD" "gatk"
[73] "gcta" "gdal" "gdc" "geany"
[77] "GEM" "GEMMA" "Genotype-Harmonizer" "geos"
[81] "gettext" "gh" "ghc" "ghostscript"
[85] "git" "git-extras" "GitKraken" "glib"
[89] "glibc" "globusconnectpersonal" "glpk" "gmp"
[93] "gnutls" "go" "googletest" "graphene"
[97] "GraphicsMagick" "GreenAlgorithms4HPC" "gsl" "gtk+"
[101] "gtksourceview" "gtool" "hivex" "hpg"
[105] "htslib" "hunspell" "icu" "ImageJ"
[109] "ImageMagick" "impute" "inetutils" "IonQuant"
[113] "JabRef" "JAGS" "jasper" "jq"
[117] "json-c" "KentUtils" "KING" "kojak"
[121] "krb5" "lapack" "ldc2" "ldsc"
[125] "LDstore" "LEMMA" "libarchive" "libcares"
[129] "libgeotiff" "libgit2" "libglvnd" "libiconv"
[133] "libidn2" "libjpeg-turbo" "libntlm" "libpng"
[137] "libseccomp" "libsodium" "libssh" "libssh2"
[141] "libuv" "libxml2" "libxslt" "linux"
[145] "locuszoom" "LVM2" "MAGENTA" "magma"
[149] "Mango" "MaxQuant" "Mega2" "metal"
[153] "MetaMorpheus" "Miniconda3" "MONSTER" "MORGAN"
[157] "MR-MEGA" "msamanda" "MsCAVIAR" "MSFragger"
[161] "MS-GF+" "msms" "nano" "ncbi-vdb"
[165] "ncurses" "netbeans" "nettle" "nextflow"
[169] "nginx" "NLopt" "node" "nspr"
[173] "ntlm" "ocaml" "oniguruma" "opam"
[177] "openjdk" "OpenMS" "openssh" "openssl"
[181] "osca" "p7zip-zstd" "PAINTOR" "pandoc"
[185] "pandoc-citeproc" "pango" "parallel" "Pascal"
[189] "patchelf" "pcre2" "pdf2djvu" "pdfjam"
[193] "peer" "Perseus" "pgsc_calc" "phenoscanner"
[197] "PhySO" "picard" "pigz" "plink"
[201] "plink-bgi" "plinkseq" "podman" "PoGo"
[205] "polyphen" "poppler" "popt" "proj"
[209] "PRSice" "pspp" "pulsar" "PWCoCo"
[213] "pwiz" "qctool" "qemu" "qpdf"
[217] "qt" "qtcreator" "QTLtools" "quarto"
[221] "quicktest" "R" "raremetal" "rclone"
[225] "readline" "regenie" "regtools" "RHHsoftware"
[229] "rst2pdf" "rstudio" "rtmpdump" "ruby"
[233] "rust" "sage" "samtools" "Scala"
[237] "seqkit" "shapeit" "singularity" "SMR"
[241] "snakemake" "SNP2HLA" "snptest" "spread-sheet-widget"
[245] "spyder" "sqlite" "sra-tools" "sshpass"
[249] "ssw" "STAR" "stata" "SurvivalAnalysis"
[253] "SurvivalKit" "Swift" "SYMPHONY" "tabix"
[257] "tandem" "tatami" "ThermoRawFileParser" "ThermoRawFileParserGUI"
[261] "thunderbird" "tidy" "tiff" "trinculo"
[265] "trousers" "Typora" "unbound" "vala"
[269] "VarScan" "vcftools" "VEGAS2" "verifyBamID"
[273] "VSCode" "VSCodium" "vte" "wine"
[277] "wrk" "xpdf" "yaml-cpp" "Zotero"
[281] "zstd"
These are wrapped up as modules.
The original list prior to mid-November 2022 is given below (see footnote 1).
Usage
We illustrate with pspp. A brief description of a module is available with
module help ceuadmin/pspp
and the module is loaded and the graphical user interface (GUI; see footnote 2) started with
module load ceuadmin/pspp
psppire
for version 2.0.1. Once the job is done, one can restore the previous environment with
module unload ceuadmin/pspp
Note that module add/rm is equivalent to module load/unload.
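The load, run, unload cycle above can be wrapped in a small helper. This is a minimal sketch, assuming bash; the function name run_with_module is hypothetical and not part of ceuadmin.

```shell
# Hypothetical helper: load a module, run a command, then unload the module again.
run_with_module() {
  local mod="$1" status
  shift
  # The `module` command is only defined where Environment Modules is set up.
  if ! command -v module >/dev/null 2>&1; then
    echo "module command not found; cannot load ${mod}" >&2
    return 1
  fi
  module load "$mod" || return 1
  "$@"                      # run the requested command with its arguments
  status=$?
  module unload "$mod"      # restore the previous environment
  return "$status"
}
```

For instance, run_with_module ceuadmin/pspp psppire would start the GUI and unload the module once it exits.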
Some modules are based on compiled Java (.jar), which can be called directly, but it is handy to use the preset environment variables, e.g.,
module load ceuadmin/picard
java -jar ${PICARD_HOME}/picard.jar --help
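Because ${PICARD_HOME} is only set once the module is loaded, a guard can give a clearer error than a missing-file message from java. A minimal sketch follows; the function name picard_jar is hypothetical.

```shell
# Hypothetical helper: print the path to picard.jar, or fail if the module is not loaded.
picard_jar() {
  if [ -z "${PICARD_HOME:-}" ]; then
    echo "PICARD_HOME is not set; run 'module load ceuadmin/picard' first" >&2
    return 1
  fi
  # Strip any trailing slash so the path has exactly one separator.
  printf '%s\n' "${PICARD_HOME%/}/picard.jar"
}
```

After loading the module, the call becomes java -jar "$(picard_jar)" --help.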
A full list of module subcommands is available with module help, as detailed here for 3.2.9; cclake uses version 3.2.10 (2012-12-21) while icelake uses 4.5.2 (2020-07-30). In particular, module whatis ceuadmin/ensembl-vep indicates usage regarding the build37/build38 setup for the loftee plugin used in loss-of-function (LoF) annotation.
Most software is available to all CSD3 users; the exceptions are packages with excessive size or reference data, which ideally would be available from /rds/project/jmmh2/software but are now under /rds/project/jmmh2/rds-jmmh2-public_databases/software as a trade-off. These can largely be seen as the sources used to build the repository given above.
CEU users will be able to use ANNOVAR, ensembl-vep, OpenMS, phenoscanner, polyphen, KentUtils/MAGMA/Pascal/VEGAS2/fgwas/locuszoom, linking internal projects/personal space (additional requests need to be made). A large collection of R packages (1,705 as of 1/12/2024, esp. with availability of major machine learning packages) is linked with the latest R distribution, 4.4.2; there are also 3 packages (DescTools, Rfast, Rfast2) under R-gcc11. Note that there are limitations with CSD3, so that sf and terra cannot be updated due to an incomplete build of gdal/proj.
For CEU users, it is easy to point to them, e.g.,
export HPC_WORK=/rds/user/$USER/hpc-work/
export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export R_LIBS=${RDS}/R:${RDS}/R-4.4.2/library
Alternatively, you can build your own installations based on these, e.g., by creating a modified Makefile with an altered prefix and then running make install -f <modified Makefile>.
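The modified Makefile can be produced with sed rather than by hand. This is a minimal sketch, assuming the Makefile sets its install location on a line beginning with prefix (many do not, so check first); the function name retarget_prefix and the file name Makefile.local are hypothetical.

```shell
# Hypothetical helper: copy a Makefile, pointing its `prefix` assignment at a new directory.
# Assumes a line of the form `prefix = ...`, `prefix := ...` or `prefix ?= ...`,
# and that the new prefix contains neither `|` nor `&` (both are special to sed here).
retarget_prefix() {
  local src="$1" dest="$2" new_prefix="$3"
  sed -E "s|^prefix[[:space:]]*[?:]?=.*|prefix = ${new_prefix}|" "$src" > "$dest"
}
```

For example, retarget_prefix Makefile Makefile.local "$HOME/.local" followed by make install -f Makefile.local installs under your home directory.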
The following script tests for loading of dplyr:
export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export PATH=${PATH}:${RDS}/R-4.4.2/bin
export R_LIBS=${RDS}/R-4.4.2/library:${RDS}/R
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'
It is clumsy to repeat these settings every time, so an attempt has been made to put them in a module, namely
module load ceuadmin/R/latest
which R
echo $R_LIBS
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'
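The same check can be repeated for other packages or R versions with two small functions. This is a minimal sketch; the names r_libs_for and check_r_package are hypothetical, and the library order follows the test script above.

```shell
# Hypothetical helper: build the R_LIBS value for a software root and R version.
r_libs_for() {
  local root="${1%/}" version="$2"
  printf '%s\n' "${root}/R-${version}/library:${root}/R"
}

# Hypothetical helper: report whether a package loads with the Rscript found on PATH.
check_r_package() {
  local pkg="$1"
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 2
  fi
  Rscript -e "suppressMessages(library(${pkg})); cat('OK!\n')"
}
```

With RDS set as above, export R_LIBS="$(r_libs_for "$RDS" 4.4.2)" followed by check_r_package dplyr reproduces the test.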
For non-CEU users, please drop an email to jhz22@medschl.cam.ac.uk for access.
Module creation
The following example shows how to set up a module,
#!/bin/bash
mkdir tmp-xz
cd tmp-xz
wget http://tukaani.org/xz/xz-5.2.2.tar.gz
tar zxvf xz-5.2.2.tar.gz
cd xz-5.2.2
mkdir -p /usr/local/Cluster-Apps/xz/5.2.2
export PREFIX=/usr/local/Cluster-Apps/xz/5.2.2
./configure --prefix=$PREFIX
make
make check
sg swinst 'make install'
cat << 'EOL' > /usr/local/Cluster-Config/modulefiles/xz/5.2.2
#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {
puts stderr "\tXZ Utils is free general-purpose data compression software with a high compression ratio.\n"
puts stderr "\tInstalled under: /usr/local/Cluster-Apps/xz/5.2.2\n"
puts stderr "\tHomepage: http://tukaani.org/xz/"
}
module-whatis "xz free general-purpose data compression"
conflict xz
set root /usr/local/Cluster-Apps/xz/5.2.2
prepend-path PATH $root/bin
prepend-path MANPATH $root/man
prepend-path LD_LIBRARY_PATH $root/lib
prepend-path LIBRARY_PATH $root/lib
prepend-path FPATH $root/include
prepend-path CPATH $root/include
prepend-path INCLUDE $root/include
setenv XZ_HOME $root
EOL
The module is made visible through the environment variable MODULEPATH. Note, however, that an ordinary user will not have permission to make changes to /usr/local/Cluster-Apps.
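Users without write access can keep modulefiles in their own directory and add it to MODULEPATH with module use. This is a minimal sketch; the function name write_private_modulefile and the default directory $HOME/modulefiles are assumptions, and the modulefile is a cut-down version of the example above.

```shell
# Hypothetical helper: write a minimal modulefile into a user-writable directory.
# Arguments: name, version, installation root, modulefile directory (default $HOME/modulefiles).
write_private_modulefile() {
  local name="$1" version="$2" root="$3" moddir="${4:-$HOME/modulefiles}"
  mkdir -p "${moddir}/${name}"
  # Unquoted heredoc: shell variables expand now, \$root is left for Tcl to expand.
  cat > "${moddir}/${name}/${version}" << EOL
#%Module -*- tcl -*-
module-whatis "${name} ${version} (private installation)"
conflict ${name}
set root ${root}
prepend-path PATH \$root/bin
prepend-path LD_LIBRARY_PATH \$root/lib
EOL
}
```

After write_private_modulefile xz 5.2.2 "$HOME/sw/xz/5.2.2" (an illustrative path), module use "$HOME/modulefiles" followed by module load xz/5.2.2 makes the private build available.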
The module files are defined at /usr/local/Cluster-Config/modulefiles/ceuadmin. Most software stays with gcc/6 due to the many dependencies of built modules; when required it can be enabled with module load gcc/6. However, packages could also require libgfortran.so.5 as in gcc/9; as a compromise one can amend .bashrc to include lines such as export LD_LIBRARY_PATH=/usr/local/software/master/gcc/9/lib64:$LD_LIBRARY_PATH.
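Since .bashrc is sourced by every new shell, a plain export of this kind can add the same directory repeatedly in nested shells. A minimal sketch of an idempotent alternative follows; the function name prepend_once is hypothetical.

```shell
# Hypothetical helper: prepend a directory to a colon-separated path list
# unless it is already present; prints the resulting list.
prepend_once() {
  local dir="$1" current="${2:-}"
  case ":${current}:" in
    *":${dir}:"*) printf '%s\n' "$current" ;;                 # already present
    *) printf '%s\n' "${dir}${current:+:${current}}" ;;       # prepend, no trailing colon
  esac
}
```

In .bashrc this would read export LD_LIBRARY_PATH="$(prepend_once /usr/local/software/master/gcc/9/lib64 "${LD_LIBRARY_PATH:-}")".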
Footnotes
Further information is available from /usr/local/Cluster-Apps/ceuadmin/doc/ceuadmin.md, ceuadmin.html.
1. The original list was a mixture of modules and directories as follows,
bgenix/ impute_v2.3.2_x86_64_static/ plink/ R/ Raremetal_linux_executables/ snptest_new/
biobank/ interval/ plink_1.90_beta/ raremetal_4.13/ Raremetal_linux_executables.tgz source/
boltlmm/ JAGS/ plink_bgi_Dev/ raremetal_4.13.3/ raremetal.log stata/
boltlmm_2.2/ LDstore/ plink-bgi_linux_x86_64_may/ raremetal_4.13.4/ regenie/ tabix/
crossmap/ locuszoom/ plink_linux_x86_64_beta2a/ raremetal_4.13.5/ samtools-1.10.tar.bz2 temp/
exomeplus/ magma/ plink_linux_x86_64_beta3.32/ raremetal_4.13.7/ samtools_1.2/ vcftools/
gcta/ MAGMA_Celltyping/ plinkseq-0.08-x86_64/ raremetal_4.13.8/ shapeit.v2.r790.RHELS_5.4.dynamic/ vcftools_ps629/
gtool_v0.7.5_x86_64/ metabolomics/ plinkseq-0.10/ raremetal_4.14.0/ snptest/
hpg/ metal/ pspp/ raremetal_4.14.1/ snptest_2.5.2/
htslib/ metal_updated/ qctool_v1.4-linux-x86_64/ raremetal_BPGen/ snptest_2.5.4_beta3/
A grep of recent add-ons in the Genetics/Proteomics category is as follows,
Date        Add.ons                       Category
2022-10-22  snptest/2.5.6                 Genetics
""          qctool/2.0.8                  Genetics
""          gcta/1.94.1                   Genetics
""          KING/2.1.6                    Genetics
""          LDstore/2.0                   Genetics
""          shapeit/3                     Genetics
""          vcftools/0.1.16               Genetics
""          finemap/1.4                   Genetics
2022-10-23  quicktest/1.1                 Genetics
""          samtools/1.11                 Genetics
""          bcftools/1.12                 Genetics
""          MORGAN/3.4                    Genetics
""          METAL/2020-05-05r             Genetics
""          regenie/3.2.1                 Genetics
""          GEMMA/0.98.5                  Genetics
""          htslib/1.12                   Genetics
""          fcGENE/1.0.7                  Genetics
""          SMR/1.0.3                     Genetics
""          FastQTL/2.165                 Genetics
2022-10-26  circos/0.69-9                 Genetics
""          bgen/1.1.7                    Genetics
""          DosageConverter/1.0.0         Genetics
""          QTLtools/1.3.1-25             Genetics
""          blat/37x1                     Genetics
""          bedtools2/2.29.2              Genetics
""          bedops/2.4.41                 Genetics
2022-11-03  Beagle/3.0.4                  Genetics
2022-11-08  CrossMap/0.6.4                Genetics
""          SurvivalKit/6.12              Genetics
""          PRSice/2.3.3                  Genetics
2022-11-09  qctool/2.2.0                  Genetics
2022-11-10  CaVEMaN/1.01-c1815a0          Genetics
""          akt/0.3.3                     Genetics
""          MsCAVIAR/0.6.4                Genetics
""          CAVIAR/2.2                    Genetics
""          MONSTER/1.3                   Genetics
""          osca/0.46                     Genetics
""          LEMMA/1.0.4                   Genetics
""          CAVIARBF/0.2.1                Genetics
2022-11-11  PAINTOR/3.0                   Genetics
2022-11-14  MR-MEGA/0.2                   Genetics
2022-11-16  SNP2HLA/1.0.3                 Genetics
""          STAR/2.7.10b                  Genetics
""          Mega2/6.0.0                   Genetics
2022-11-19  ensembl-vep/104               Genetics*
""          OpenMS/3.0.0                  Genetics*
""          polyphen/2.2.2                Genetics*
""          ANNOVAR/24Oct2019             Genetics*
""          MAGENTA/vs2_July2011          Genetics*
""          GARFIELD/v2                   Genetics*
""          KentUtils/2022-11-14          Genetics*
2022-11-20  Genotype-Harmonizer/1.4.25    Genetics
2022-11-21  locuszoom/1.4                 Genetics*
""          DEPICT/v1_rel194              Genetics*
""          MAGMA/1.10                    Genetics*
""          Pascal/v_debut                Genetics*
""          VEGAS2/2.01.17                Genetics*
""          fgwas/0.3.6                   Genetics*
2022-12-04  phenoscanner/v2               Genetics*
2022-12-07  SurvivalAnalysis/2016-05-09   Genetics
2023-01-03  Eagle/2.4.1                   Genetics
2023-01-05  GEM/1.4.5                     Genetics
2023-02-01  GENEHUNTER/2.1_r6             Genetics
2023-03-14  regenie/3.2.5                 Genetics
2023-03-24  PoGo/1.0.0                    Genetics
2023-03-31  PWCoCo/2023-03-31             Genetics
2023-04-02  regenie/3.2.5.3               Genetics
2023-04-04  PWCoCo/1.0                    Genetics
2023-06-02  regenie/3.2.7                 Genetics
2023-06-06  allegro/2.0f                  Genetics
2023-06-19  plink-ng/2.00a3.3             Genetics
2023-06-26  RHHsoftware/0.1               Genetics
2023-07-28  PWCoCo/1.1                    Genetics
2023-08-02  regenie/3.2.9                 Genetics
2023-08-06  finemap/1.4.2                 Genetics
2023-09-27  ncbi-vdb/3.0.8                Genetics
""          sra-tools/3.0.8               Genetics
""          gatk/4.4.0.0                  Genetics
2023-11-24  ldsc/1.0.1                    Genetics
2023-11-30  gdc/1.6.1-1.0.0               Genetics
2023-12-20  verifyBamID/1.1.3             Genetics
2023-12-21  verifyBamID/2.0.1             Genetics
2023-12-27  regtools/1.0.0                Genetics
""          VarScan/2.4.6                 Genetics
2024-01-08  picard/3.1.1                  Genetics
""          plink/2.0_20240105            Genetics
2024-01-19  htslib/1.19                   Genetics
2024-01-24  fraposa_pgsc/0.1.0            Genetics
""          pgsc_calc/2.0.0-alpha.4       Genetics
2024-04-22  peer/1.3                      Genetics
2024-06-04  pwiz/3_0_24156_80747de        Proteomics
2024-06-09  crux/4.2                      Proteomics
""          DIA-NN/1.8.1                  Proteomics
2024-06-11  crux/4.1                      Proteomics
""          pwiz/3_0_24163_9bfa69a-wine   Proteomics
2024-06-11  seqkit/2.8.2                  Proteomics
""          FlashLFQ/1.2.6                Proteomics
""          MetaMorpheus/1.0.5            Proteomics
2024-06-25  msms/3.2rc-b163               Genetics
2024-07-13  msamanda/3.0.21.532           Proteomics
2024-07-31  tandem/2017.2.1.4             Proteomics
2024-08-11  comet/2024.01.1               Proteomics
""          kojak/2.1.0                   Proteomics
""          kojak/1.5.5                   Proteomics
""          kojak/2.0.0a22                Proteomics
2024-08-12  MS-GF+/2024.03.26             Proteomics
2024-08-14  ThermoRawFileParser/1.4.4     Proteomics
""          ThermoRawFileParserGUI/1.7.4  Proteomics
""          FragPipe/22.0                 Proteomics
2024-08-15  MSFragger/4.1                 Proteomics
""          IonQuant/1.10.27              Proteomics
2024-08-20  htslib/1.20                   Genetics
""          bcftools/1.20                 Genetics
""          samtools/1.20                 Genetics
2024-08-23  qpdf/11.9.1                   Generic
2024-09-01  MaxQuant/2.6.4.0              Proteomics
""          Perseus/2.1.2.0               Proteomics
2024-10-13  sage/0.14.7                   Proteomics
* CEU or approved users only.
2. GUI: As GUI-based programs claim more computing resources, it is recommended that they be used only occasionally, e.g., calling back GitHub sessions.