AlphaPept and PyOpenMS

The prerequisites involve CSD3 location, GNU C, cmake, TeX Live and Miniconda.

CSD3 location

This is set as follows,

export Caprion=/rds/project/jmmh2/rds-jmmh2-projects/Caprion_proteomics
cd ${Caprion}

GNU C, cmake and Tex Live

A number of aspects are worthwhile to highlight:

  1. OpenMS supports forc++17.
  2. It also requires a recent version of cmake, which will recognise its associate directives 1.
  3. TeX Live is not essential for OpenMS to be functional but allows for documentation with pdfTeX – additional packages are required 2.
  4. OpenMS requires somewhat earlier version of ghostscript which is actually available on CSD3 (GPL Ghostscript 9.25 (2018-09-13)) and we point to it via a symbolic link assuming ${HOME}/bin is on top of $PATH.

We have

module load gcc/7 cmake-3.19.7-gcc-5.4-5gbsejo texlive
ln -sf /usr/bin/ghostscript ${HOME}/bin/gs

In so doing gsftopk ntx-Bold-tlf-t1 600, etc. in the documentation building process will work. We obtain the much desired OpenMS_tutorial/latex_output/refman.pdf and OpenMS_tutorial.pdf.


This option is considerably easier and we set up the latest version.

# Step 1. Install Miniconda3
"$(/rds/project/jmmh2/rds-jmmh2-projects/Caprion_proteomics/miniconda3/bin/conda shell.bash hook)"
conda config --set auto_activate_base false
# Step 2. Specify module-like environment
export LD_LIBRARY_PATH=${Caprion}/miniconda3/lib:${LD_LIBRARY_PATH}
export PATH=${Caprion}/miniconda3/bin:${PATH}
export INCLUDE=${Caprion}/miniconda3/include:${INCLUDE}
export PYTHONPATH=${Caprion}/miniconda3/lib/python3.9/site-packages:${PYTHONPATH}

Note that Python 3.9.12 is installed, and in all cases the current environtal variables are carried over.

Only Step 2 is necessary in later calls.


Web: (latest installer).

After loading the Miniconda environment, we proceed with

pip install "alphapept[stable,gui-stable]"

Script for testing is called 3 which takes the following arguments,

Name Description
alphapept_settings.yaml A configuratino file containing relevant information, e.g., paths and n_processes.
szwk021704i19101xms1.raw, szwk021704i19101xms3.raw, szwk021704i19101xms5.raw Raw spectra
2022-07-05-reviewed-contam-UP000005640.fasta Database of protein sequences:



Miniconda installation

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda install -c openms pyopenms

so python responses. Note that currently it uses Python 3.9.6 therefore a slight backtrack which could be remedied by compiling from OpenMS in the next section.


Web: (Contributed software) (GitHub,

This procedure produces many efficient programs which are complementary to pyopenms.

cd ${Caprion}
git clone
cd OpenMS
git submodule update --init contrib
mkdir -p contrib/archives
cd contrib/archives
wget -nd --execute="robots = off" --mirror --convert-links --no-parent --wait=5 \
cd -
cmake -DBUILD_TYPE=ALL contrib
cmake -DOPENMS_CONTRIB_LIBS=${Caprion}/miniconda3/lib -DCMAKE_PREFIX_PATH=contrib -DPYOPENMS=ON ../OpenMS
make targets

The second wget statement is much more efficient to download all the files.

The last statement gives the most important targets for OpenMS, where TOPP refers to The OpenMS Proteomics Pipeline.


The following make targets are available:
    [no target]     builds the OpenMS library, TOPP tools and UTILS tools
    OpenMS          builds the OpenMS library
    TOPP            builds the TOPP tools
    UTILS           builds the UTILS tools
    GUI             builds the GUI tools (TOPPView,...)
    test            executes OpenMS and TOPP tests
                    make sure they are built using the 'all' target
    Tutorials_build builds the code snippets of the tutorials in source/EXAMPLES
    doc             builds the doxygen and class documentation, parameters
                    documentation, and tutorial PDFs
    doc_class_only  builds only the doxygen and class documentation
                    (faster then doc and very useful when writing
    doc_tutorials   builds the PDF tutorials
    help            list all available targets (very verbose)
    pyopenms           builds pyOpenMS inplace
    pyopenms_bdist_egg builds pyOpenMS bdist_egg
    pyopenms_bdist     builds pyOpenMS bdist as zip file
    pyopenms_rpm       builds pyOpenMS rpm

    (Disabled) OpenMS_coverage reporting target is not enabled (to enable use -D OPENMS_COVERAGE=ON).
               Caution: Building with debug and coverage info uses a lot of disk space (>40GB)

Single TOPP tools and UTILS have their own target, e.g. TOPPView


Upon completion, the pyopenms package is within OpenMS/pyOpenMS. It would come handy to link bin/, lib/, etc. in the environmental variables accordingly; in particular pyOpenMS to PYTHONPATH.

Beside the codebase, various other options are possible, e.g.,

wget -O Xerces-C_3_2_0.tar.gz
cmake -DOPENMS_CONTRIB_LIBS=${Caprion}/miniconda3/lib -DCMAKE_PREFIX_PATH=contrib -DMY_CXX_FLAGS="-std=c++17" -DWITH_GUI=OFF ../OpenMS

For a particular version without using GitHub, it is necessary to specify explicitly, e.g.,

cd ${Caprion}
export version=3.0.0
wget -qO-${version}-src.tar.gz | \
tar xfz -
cd ${Caprion}/OpenMS-${version}
mkdir -p contrib/archives
wget -nd --execute="robots = off" --mirror --convert-links --no-parent --wait=5 \
cd -
cmake -DBUILD_TYPE=ALL contrib


As with instances elsewhere, this section is kept not only for historical reasons but also for some useful information.

Python itself is lightweight but more involved.


The installations involve environements for several versions of Python, e.g., Python 3.7/3.8.

module load python/3.7
virtualenv py37
source py37/bin/activate


module load python/3.8
virtualenv py38
source py38/bin/activate

Note the virtualenv py3[7|8] lines are unnecessary after the installations.


This is in regard to use of module on CSD3.

module load miniconda/3
conda create -n miniconda38 python=3.8 ipykernel
conda activate miniconda38
conda deactivate

where conda create is unnecessary in later calls.


This requires Python >= 3.8.

cd ${Caprion}
module load python/3.8
source py38/bin/activate
pip install pyyaml --prefix=${Caprion}/py38
pip install numba --prefix=${Caprion}/py38
pip install tqdm --prefix=${Caprion}/py38
pip install h5py --prefix=${Caprion}/py38
pip install fastcore --prefix=${Caprion}/py38
pip install psutil --prefix=${Caprion}/py38
pip install click --prefix=${Caprion}/py38
pip install Bio --prefix=${Caprion}/py38
pip install networkx --prefix=${Caprion}/py38
pip install sqlalchemy --prefix=${Caprion}/py38
pip install matplotlib --prefix=${Caprion}/py38
pip install sklearn --prefix=${Caprion}/py38
python -m pip install --upgrade numpy --user
wget -qO- | \
tar xvfz -
cd alphapept-0.4.8
python install --prefix=${Caprion}/py38

Be careful with conflicts of packages.

The newest features are through the nightly build/wheel (needs to be version 3.0+).

wget -qO-\?status\=completed

We have

 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
96180971  Defl:N 95271735   1% 09-25-2022 00:41 37b53f6d  pyopenms_nightly-3.0.0.dev20220924-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
76497352  Defl:N 75780910   1% 09-25-2022 00:41 63b654a1  pyopenms_nightly-3.0.0.dev20220924-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
56599302  Defl:N 56080139   1% 09-25-2022 00:41 86fa8bd7  pyopenms_nightly-3.0.0.dev20220924-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
--------          -------  ---                            -------
229277625         227132784   1%                            3 files

It turned out only the first file is approppriate on CSD3, so we proceeded with

module load python/3.7
source py37/bin/activate
pip install pyopenms_nightly-3.0.0.dev20220924-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl --no-cache-dir  --prefix=${Caprion}/py37

However, a call to python gave the following error messages,

Traceback (most recent call last):
  File "", line 36, in <module>
    df = pd.read_pickle(pkl_input_filename)
  File "/rds/project/jmmh2/rds-jmmh2-projects/Caprion_proteomics/py37/lib/python3.7/site-packages/pandas/io/", line 217, in read_pickle
    return pickle.load(handles.handle)  # type: ignore[arg-type]
ValueError: unsupported pickle protocol: 5


As will become obvious, the addresses were random with a hybrid of integrated and manual compiling.

git clone
cd OpenMS
git submodule update --init contrib
cmake -DBUILD_TYPE=list ../OpenMS/contrib
cmake -DBUILD_TYPE=ALL -DNUMBER_OF_JOBS=4 ../OpenMS/contrib
cd archives
wget -O Xerces-C_3_2_0.tar.gz
# manual build is necessary!
wget -O CoinMP-1.8.3-vs22.tar.gz
wget -O kissfft-130.tar.gz
tar xvfz eigen-3.3.4.tar.gz
cd eigen-3.3.4
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=${Caprion}/OpenMS/contrib
make install
cd ${Caprion}/OpenMS

Now pyOpenMS is compiled with the following scripts

cd ${Caprion}/OpenMS
module load python/3.8
source ${Caprion}/bin/activate
pip install setuptools  --prefix=${Caprion}/py38
pip install pip  --prefix=${Caprion}/py38
pip install autowrap  --prefix=${Caprion}/py38
pip install nose  --prefix=${Caprion}/py38
pip install numpy  --prefix=${Caprion}/py38
pip install wheel  --prefix=${Caprion}/py38
make pyopenms

Further information is available from cmake website 1.


Strauss, M.T., et al., AlphaPept, a modern and open framework for MS-based proteomics. bioRxiv, 2021: p. 2021.07.23.453379.

Rost HL, et al., OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Other URLs

  1. The website explains several options which appear unnecessary for cmake 3.19.

    export Python_LIBRARY_DIRS=${Caprion}/miniconda3/lib
    export Python_INCLUDE_DIR=${Caprion}/miniconda3/include
    export Python_EXECUTABLE=${Caprion}/miniconda3/bin/python


  2. tlmgr – the TEX Live Manager

    The required newtx, fontaxes and xtab package can be installed as follows,

    tlmgr init-usertree
    tlmgr option repository --user-mode
    tlmgr install --user-mode newtx
    tlmgr install --user-mode fontaxes
    tlmgr install --user-mode xtab

    For instance tlmgr info newtx command gives the following information,

    package:     newtx
    category:    Package
    shortdesc:   Alternative uses of the TX fonts, with improved metrics
    longdesc:    The bundle splits txfonts.sty (from the TX fonts distribution) into two independent packages, newtxtext.sty and newtxmath.sty, each with fixes and enhancements. newtxmath's metrics have been re-evaluated to provide a less tight appearance, and to provide a libertine option that substitutes Libertine italic and Greek letter for the existing math italic and Greek glyphs, making a mathematics package that matches Libertine text quite well. newtxmath can also use the maths italic font provided with the garamondx package, thus offering a garamond-alike text-with- maths combination.
    installed:   Yes
    revision:    39072
    sizes:       doc: 865k, run: 7429k
    relocatable: Yes
    cat-version: 1.463
    cat-date:    2015-12-11 20:31:08 +0100
    cat-license: lppl1.3
    cat-topics:  font font-maths font-type1
    cat-related: minion2newtx
    collection:  collection-fontsextra

    See also Also related is # Karl's Path SEarch Library WHICH (kpsewhich), e.g., kpsewich tikz

  3. Benchmark

    Download file from Dropbox.

    ln -s alphapept-0.4.8 alphapept

    which gives and above along with data files.