ceuadmin
The CEU software repository is here, /usr/local/Cluster-Apps/ceuadmin/.
noting that the importance of software is purely random according to \(Poisson(N,\lambda)\) where \(N=222\), \(\lambda=3\).
Entries
The current list is as follows,
[1] "ABCtoolbox" "akt" "allegro" "alpine"
[5] "Anaconda3" "annovar" "aria2" "autoconf"
[9] "automake" "axel" "bazel" "bcftools"
[13] "Beagle" "bedops" "bedtools2" "bgen"
[17] "biobank" "blat" "boltlmm" "brotli"
[21] "busybox" "CaVEMaN" "CAVIAR" "CAVIARBF"
[25] "ccal" "circos" "citeproc" "cmake"
[29] "cppunit" "crossmap" "cryptsetup" "Cytoscape"
[33] "deno" "DEPICT" "device-mapper" "DjVuLibre"
[37] "docbook2X" "DosageConverter" "Eagle" "enchant"
[41] "ensembl-vep" "exiv2" "exomeplus" "expat"
[45] "FastQTL" "fcGENE" "ffmpeg" "fgwas"
[49] "finemap" "fossil" "fpc" "fraposa_pgsc"
[53] "fribidi" "GARFIELD" "gatk" "gcta"
[57] "gdal" "gdc" "geany" "GEM"
[61] "GEMMA" "Genotype-Harmonizer" "gettext" "gh"
[65] "ghc" "ghostscript" "git" "git-extras"
[69] "GitKraken" "glib" "glibc" "globusconnectpersonal"
[73] "glpk" "gmp" "gnutls" "go"
[77] "googletest" "graphene" "GraphicsMagick" "GreenAlgorithms4HPC"
[81] "gsl" "gtk+" "gtksourceview" "gtool"
[85] "hpg" "htslib" "hunspell" "icu"
[89] "ImageJ" "impute" "JabRef" "JAGS"
[93] "jq" "json-c" "KentUtils" "KING"
[97] "krb5" "lapack" "ldc2" "ldsc"
[101] "LDstore" "LEMMA" "libcares" "libgit2"
[105] "libglvnd" "libiconv" "libidn2" "libntlm"
[109] "libpng" "libseccomp" "libsodium" "libssh"
[113] "libssh2" "libuv" "libxml2" "libxslt"
[117] "locuszoom" "LVM2" "MAGENTA" "magma"
[121] "Mango" "Mega2" "metal" "MONSTER"
[125] "MORGAN" "MR-MEGA" "MsCAVIAR" "nano"
[129] "ncbi-vdb" "ncurses" "netbeans" "nettle"
[133] "nextflow" "NLopt" "node" "nspr"
[137] "oniguruma" "openjdk" "OpenMS" "openssh"
[141] "openssl" "osca" "PAINTOR" "pandoc"
[145] "pandoc-citeproc" "pango" "parallel" "Pascal"
[149] "pcre2" "pdf2djvu" "pdfjam" "pgsc_calc"
[153] "phenoscanner" "PhySO" "picard" "plink"
[157] "plink-bgi" "plinkseq" "PoGo" "polyphen"
[161] "poppler" "popt" "proj" "PRSice"
[165] "pspp" "pulsar" "PWCoCo" "qctool"
[169] "qpdf" "qt" "qtcreator" "QTLtools"
[173] "quarto" "quicktest" "R" "raremetal"
[177] "rclone" "readline" "regenie" "regtools"
[181] "RHHsoftware" "rst2pdf" "rstudio" "ruby"
[185] "rust" "samtools" "Scala" "shapeit"
[189] "singularity" "SMR" "snakemake" "SNP2HLA"
[193] "snptest" "spread-sheet-widget" "sqlite" "sra-tools"
[197] "ssw" "STAR" "stata" "SurvivalAnalysis"
[201] "SurvivalKit" "Swift" "tabix" "tatami"
[205] "thunderbird" "tidy" "trinculo" "trousers"
[209] "Typora" "unbound" "vala" "VarScan"
[213] "vcftools" "VEGAS2" "verifyBamID" "VSCode"
[217] "VSCodium" "vte" "xpdf" "yaml-cpp"
[221] "Zotero" "zstd"
These are wrapped up as modules .
The original list prior to mid-November 2022 is given below1.
Usage
We illustrate with pspp
. A brief description of a module is available with
module help ceuadmin/pspp
and the module is loaded and graphical user interface (GUI)2 started with
module load ceuadmin/pspp
psppire
for version 2.0.0-pre1. Once the job is done, one can restore the previous environment with
module unload ceuadmin/pspp
Note that module add/rm
is equivalent to module load/unload
.
Some modules are based on compiled Java (.jar) which can be called directly but it is handy to use preset environment variables, e.g.,
module load ceuadmin/picard
java -jar ${PICARD_HOME}/picard.jar --help
A full list of module subcommands is available with module help
as detailed here for
3.2.9 – CSD3 uses version 3.2.10 dated 2012-12-21. In particular, module whatis ceuadmin/ensembl-vep
indicates usage regarding build37/build38 setup for the loftee
plugin used in loss of function (LoF)
annotation.
Most software are available for all CSD3 users, only limited by software with excessive size / reference data – which ideally will be
available from /rds/project/jmmh2/software
but now /rds/project/jmmh2/rds-jmmh2-public_databases/software
as a trade-off. These can
largely be seen as sources which are used to build the reoository given above.
CEU users will be able to use ANNOVAR
, ensembl-vep
, OpenMS
, phenoscanner
, polyphen
, KentUtils
/MAGMA
/Pascal
/VEGASV2
/fgwas
/locuszoom
linking internal projects/personal space (additional requests need to be made). A large collection of R packages (1,414 as of 14/3/2024)
is linked with the latest R distribution, 4.3.3; there are also packages under 4.3.3-gcc11 as well as 4.3.3-icelake.
For CEU users, it is easy to point to them, e.g.,
export HPC_WORK=/rds/user/$USER/hpc-work/
export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export R_LIBS=${RDS}/R:${RDS}/R-4.3.3/library
or possible to have your own installations based on these, e.g., through creation of a modified Makefile
with altered prefix followed
by make install -f <modified Makefile>
.
The following script tests for loading of dplyr
:
export RDS=/rds/project/jmmh2/rds-jmmh2-public_databases/software
export PATH=${PATH}:${RDS}/R-4.3.3/bin
export R_LIBS=${RDS}/R-4.3.3/library:${RDS}/R
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'
It appears clumsy to do these every time, so an attempt is made to have them in a module, namely
module load ceuadmin/R/latest
which R
echo $R_LIBS
Rscript -e 'suppressMessages(library(dplyr));cat("OK!\n")'
For non-CEU users, please drop an email to jhz22@medschl.cam.ac.uk for access.
Module creation
The following example shows how to set up a module,
#!/bin/bash
mkdir tmp-xz
cd tmp-xz
wget http://tukaani.org/xz/xz-5.2.2.tar.gz
tar zxvf xz-5.2.2.tar.gz
cd xz-5.2.2
mkdir -p /usr/local/Cluster-Apps/xz/5.2.2
export PREFIX=/usr/local/Cluster-Apps/xz/5.2.2
./configure --prefix=$PREFIX
make
make check
sg swinst 'make install'
cat << 'EOL' > /usr/local/Cluster-Config/modulefiles/xz/5.2.2
#%Module -*- tcl -*-
##
## modulefile
##
proc ModulesHelp { } {
puts stderr "\tXZ Utils is free general-purpose data compression software with a high compression ratio.\n"
puts stderr "\tInstalled under: /usr/local/Cluster-Apps/xz/5.2.2
Hompage:http://tukaani.org/xz/"
}
module-whatis "xz free general-purpose data compression"
conflict xz
set root /usr/local/Cluster-Apps/xz/5.2.2
prepend-path PATH $root/bin
prepend-path MANPATH $root/man
prepend-path LD_LIBRARY_PATH $root/lib
prepend-path LIBRARY_PATH $root/lib
prepend-path FPATH $root/include
prepend-path CPATH $root/include
prepend-path INCLUDE $root/include
setenv XZ_HOME $root
EOL
The module is made visible through environment variable MODULEPATH. Note that there will be permission issue for a user, however, to make changes to /usr/local/Cluster-Apps
.
The module files are defined at /usr/local/Cluster-Config/modulefiles/ceuadmin. Most software use gcc/6; when required it can be enabled with module load gcc/6
; however packages could also require libgfortran.so.5
as in gcc/9
– as a compromise one can amend .bashrc
to include lines such as export LD_LIBRARY_PATH=/usr/local/software/master/gcc/9/lib64:$LD_LIBRARY_PATH
.
Footnotes
Further information is avaiiable from /usr/local/Cluster-Apps/ceuadmin/doc/ceuadmin.md, ceuadmin.html.
-
The original list was a mixture of modules and directories as follows,
bgenix/ impute_v2.3.2_x86_64_static/ plink/ R/ Raremetal_linux_executables/ snptest_new/ biobank/ interval/ plink_1.90_beta/ raremetal_4.13/ Raremetal_linux_executables.tgz source/ boltlmm/ JAGS/ plink_bgi_Dev/ raremetal_4.13.3/ raremetal.log stata/ boltlmm_2.2/ LDstore/ plink-bgi_linux_x86_64_may/ raremetal_4.13.4/ regenie/ tabix/ crossmap/ locuszoom/ plink_linux_x86_64_beta2a/ raremetal_4.13.5/ samtools-1.10.tar.bz2 temp/ exomeplus/ magma/ plink_linux_x86_64_beta3.32/ raremetal_4.13.7/ samtools_1.2/ vcftools/ gcta/ MAGMA_Celltyping/ plinkseq-0.08-x86_64/ raremetal_4.13.8/ shapeit.v2.r790.RHELS_5.4.dynamic/ vcftools_ps629/ gtool_v0.7.5_x86_64/ metabolomics/ plinkseq-0.10/ raremetal_4.14.0/ snptest/ hpg/ metal/ pspp/ raremetal_4.14.1/ snptest_2.5.2/ htslib/ metal_updated/ qctool_v1.4-linux-x86_64/ raremetal_BPGen/ snptest_2.5.4_beta3/
A grep of recent add-ons in the Genetics category is as follows,
Date Add.ons Category 2022-10-22 snptest/2.5.6 Genetics "" qctool/2.0.8 Genetics "" gcta/1.94.1 Genetics "" KING/2.1.6 Genetics "" LDstore/2.0 Genetics "" shapeit/3 Genetics "" vcftools/0.1.16 Genetics "" finemap/1.4 Genetics 2022-10-23 quicktest/1.1 Genetics "" samtools/1.11 Genetics "" bcftools/1.12 Genetics "" MORGAN/3.4 Genetics "" METAL/2020-05-05r Genetics3 "" regenie/3.2.1 Genetics "" GEMMA/0.98.5 Genetics4 "" htslib/1.12 Genetics "" fcGENE/1.0.7 Genetics5 "" SMR/1.0.3 Genetics "" FastQTL/2.165 Genetics 2022-10-26 circos/0.69-9 Genetics "" bgen/1.1.7 Genetics "" DosageConverter/1.0.0 Genetics "" QTLtools/1.3.1-25 Genetics6 "" blat/37x1 Genetics "" bedtools2/2.29.2 Genetics "" bedops/2.4.41 Genetics 2022-11-03 Beagle/3.0.4 Genetics 2022-11-08 CrossMap/0.6.4 Genetics "" SurvivalKit/6.12 Genetics "" PRSice/2.3.3 Genetics 2022-11-09 qctool/2.2.0 Genetics 2022-11-10 CaVEMaN/1.01-c1815a0 Genetics "" akt/0.3.3 Genetics "" MsCAVIAR/0.6.4 Genetics "" CAVIAR/2.2 Genetics "" MONSTER/1.3 Genetics "" osca/0.46 Genetics "" LEMMA/1.0.4 Genetics7 "" CAVIARBF/0.2.1 Genetics 2022-11-11 PAINTOR/3.0 Genetics 2022-11-14 MR-MEGA/0.2 Genetics 2022-11-16 SNP2HLA/1.0.3 Genetics "" STAR/2.7.10b Genetics "" Mega2/6.0.0 Genetics 2022-11-19 ensembl-vep/104 Genetics* "" OpenMS/3.0.0 Genetics*8 "" polyphen/2.2.2 Genetics* "" ANNOVAR/24Oct2019 Genetics* "" MAGENTA/vs2_July2011 Genetics* "" GARFIELD/v2 Genetics* "" KentUtils/2022-11-14 Genetics* 2022-11-20 Genotype-Harmonizer/1.4.25 Genetics 2022-11-21 locuszoom/1.4 Genetics*9 "" DEPICT/v1_rel194 Genetics* "" MAGMA/1.10 Genetics* "" Pascal/v_debut Genetics* "" VEGAS2/2.01.17 Genetics* "" fgwas/0.3.6 Genetics* 2022-12-04 phenoscanner/v2 Genetics* 2022-12-07 SurvivalAnalysis/2016-05-09 Genetics 2023-01-03 Eagle/2.4.1 Genetics 2023-01-05 GEM/1.4.5 Genetics 2023-02-01 GENEHUNTER/2.1_r6 Genetics 2023-03-14 regenie/3.2.5 Genetics 2023-03-24 PoGo/1.0.0 Genetics 2023-03-31 PWCoCo/2023-03-31 Genetics10 2023-04-02 regenie/3.2.5.3 Genetics 2023-04-04 PWCoCo/1.0 Genetics 2023-06-02 regenie/3.2.7 Genetics11 2023-06-06 allegro/2.0f Genetics 2023-06-19 plink-ng/2.00a3.3 Genetics 2023-06-26 RHHsoftware/0.1 Genetics 2023-07-28 PWCoCo/1.1 Genetics 2023-08-02 regenie/3.2.9 Genetics 2023-08-06 finemap/1.4.2 Genetics 2023-09-27 ncbi-vdb/3.0.8 Genetics "" sra-tools/3.0.8 Genetics "" gatk/4.4.0.0 Genetics 2023-11-24 ldsc/1.0.1 Genetics 2023-11-30 gdc/1.6.1-1.0.0 Genetics12 2023-12-20 verifyBamID/1.1.3 Genetics 2023-12-21 verifyBamID/2.0.1 Genetics 2023-12-27 regtools/1.0.0 Genetics13 "" VarScan/2.4.6 Genetics14 2024-01-08 picard/3.1.1 Genetics "" plink/2.0_20240105 Genetics 2024-01-19 htslib/1.19 Genetics 2024-01-24 fraposa_pgsc/0.1.0 Genetics15 "" pgsc_calc/2.0.0-alpha.4 Genetics16 * CEU or approved users only.
More detailed diagrams on recently added genetics (89) and generic (137) software are as follows,
-
GUI
As GUI-based programs claim more computing resources, it is recommended that they are only used occasionally, e.g., calling back GitHub sessions. ↩
-
metal
Notes on METAL 2020-05-05r
This version has options EFFECT_PRINT_PRECISION and STDERR_PRINT_PRECISION (both with default 4) to enable many decimal places.
The letter
r
as in2020-05-05r
indicates a replacement of functions inlibsrc/MathStats.cpp
to ensure generality – details have also been posted to the GitHub page, https://github.com/statgen/METAL/issues/24.FATAL ERROR - a too large, ITMAX too small in gamma countinued fraction (gcf) so the -1.info file could not be generated.
-
gemma
Note on compiling from source
A considerably smaller (1,097,256 vs 22,721,624) executable, /usr/local/Cluster-Apps/ceuadmin/GEMMA/0.98.5/bin, is generated under CSD3 but the original distribution is used by default.
module load openblas/0.2.15 make
-
fcgene
Alternative site
-
qtltools
The long version number is 1.3.1-25-g6e49f85f20. ↩
-
lemma
The documentation indicates a requirement of gcc/9.4, boost/1.78, OpenMP/3.1 and/or Intel MKL Library 2019 Update 1 but it is possible to proceed with gcc/11, cmake-3.19.7-gcc-5.4-5gbsejo, boost-1.66.0-gcc-5.4.0-slpq3un, ceuadmin/bgen/1.1.7. ↩
-
openms
When the OpenMS module is loaded, pyopenms and alphapept also become available. ↩
-
locuszoom
The version adds chromosome X data and will have options using INTERVAL data. ↩
-
pwcoco
It compiles under gcc/9. Upon release of 1.1, this snapshot is removed. ↩
-
regenie
Building regenie 3.2.7
cd ~/rds/public_databases/software/ wget -qO- https://github.com/rgcgithub/regenie/archive/refs/tags/v3.2.7.tar.gz | \ tar xvfz - cd regenie-3.2.7/ export BGEN_PATH=~/rds/public_databases/software/bgen module load zlib/1.2.11 export ZLIB_LIBRARY=/usr/local/Cluster-Apps/zlib/1.2.11 module load gcc/6 module load cmake-3.19.7-gcc-5.4-5gbsejo module load intel/mkl/mic/2018.4 mkdir build cd build cmake .. make
-
gdc
It also includes gdc_dtt-ui 1.0.0 ↩
-
regtools
gcc/6 is required for C++11. ↩
-
varscan
Simply call
java -jar $VARSCAN_HOME/VarScan.v2.4.6.jar
aftermodule load ceuadmin/VarScan/2.4.6
. ↩ -
fraposa
Several packages, including poetry, poetry-plugin-export and fraposa_pgsc, will be installed as follows,
module load ceuadmin/Anaconda3/2023.09-0 pip install poetry pip3 install poetry-plugin-export pip install --use-feature=fast-deps . scripts/run_example.sh
This is necessay since by default
peotry install
will use user's home directory. As indicated frompoetry install --help
:The install command reads the poetry.lock file from the current directory, processes it, and downloads and installs all the libraries and dependencies outlined in that file. If the file does not exist it will look for pyproject.toml and do the same. ↩
-
pgsc_calc
Application, https://pgsc-calc.readthedocs.io/en/latest/index.html
nextflow run pgscatalog/pgsc_calc -profile test,singularity
It appears quarto is called so presumably under icelake. ↩