AlphaBase provides all basic Python functionalities for the AlphaPept ecosystem from the Mann Labs at the Max Planck Institute of Biochemistry and the University of Copenhagen. To enable all hyperlinks in this document, please view it on GitHub. For documentation, please see readthedocs.
The infrastructure package of the AlphaX ecosystem for MS proteomics. It was first published with AlphaPeptDeep; see Citations.
- AlphaPeptDeep: deep learning framework for proteomics.
- AlphaRaw: raw data reader for different vendors.
- AlphaDIA: DIA search engine.
- PeptDeep-HLA: personalized HLA-binding peptide prediction.
- AlphaViz: visualization for MS-based proteomics.
- AlphaQuant: quantification for MS-based proteomics.
Wen-Feng Zeng, Xie-Xuan Zhou, Sander Willems, Constantin Ammar, Maria Wahle, Isabell Bludau, Eugenia Voytik, Maximillian T. Strauss & Matthias Mann. AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics. Nat Commun 13, 7238 (2022). https://doi.org/10.1038/s41467-022-34904-3
AlphaBase was developed by the Mann Labs at the Max Planck Institute of Biochemistry and the University of Copenhagen and is freely available with an Apache License. External Python packages (available in the requirements folder) have their own licenses, which can be consulted on their respective websites.
AlphaBase can be installed and used on all major operating systems (Windows, macOS and Linux). There are two different types of installation possible:
- Pip installer: Choose this installation if you want to use AlphaBase as a Python package in an existing Python 3.8 environment (e.g. a Jupyter notebook).
- Developer installer: Choose this installation if you are familiar with conda and Python. This installation gives access to all available features of AlphaBase and even allows you to modify its source code directly. Generally, the developer version of AlphaBase outperforms the precompiled versions, which makes this the installation of choice for high-throughput experiments.
AlphaBase can be installed in an existing Python 3.8 environment with a single bash command. This bash command can also be run directly from within a Jupyter notebook by prepending it with a !:
pip install alphabase
Installing AlphaBase like this avoids conflicts when integrating it in other tools, as this does not enforce strict versioning of dependencies. However, if new versions of dependencies are released, they are not guaranteed to be fully compatible with AlphaBase. While this should only occur in rare cases where dependencies are not backwards compatible, you can always force AlphaBase to use dependency versions which are known to be compatible with:
pip install "alphabase[stable]"
NOTE: You might need to run pip install -U pip before installing AlphaBase like this. Also note the double quotes ".
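To confirm which version ended up in the active environment, a minimal sketch using only the standard library is shown below. The helper name installed_version is hypothetical and not part of AlphaBase; it assumes Python 3.8 or newer, where importlib.metadata is available.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not part of AlphaBase.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if AlphaBase is installed, otherwise None.
print(installed_version("alphabase"))
```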
For those who are really adventurous, it is also possible to directly install any branch (e.g. @development) with any extras (e.g. #egg=alphabase[stable,development]) from GitHub with e.g.
pip install "git+https://github.com/MannLabs/alphabase.git@development#egg=alphabase[stable,development]"
AlphaBase can also be installed in editable (i.e. developer) mode with a few bash commands. This allows you to fully customize the software and even modify the source code to your specific needs. When an editable Python package is installed, its source code is stored in a transparent location of your choice. While optional, it is advised to first (create and) navigate to e.g. a general software folder:
mkdir ~/folder/where/to/install/software
cd ~/folder/where/to/install/software
The following commands assume you do not perform any additional cd commands.
Next, download the AlphaBase repository from GitHub either directly or with a git command. This creates a new AlphaBase subfolder in your current directory.
git clone https://github.com/MannLabs/alphabase.git
For any Python package, it is highly recommended to use a separate conda virtual environment, as otherwise dependency conflicts can occur with already existing packages.
conda create --name alphabase python=3.9 -y
conda activate alphabase
Finally, AlphaBase and all its dependencies need to be installed. To take advantage of all features and allow development (with the -e flag), this is best done by also installing the development dependencies instead of only the core dependencies:
pip install -e "./alphabase[development]"
By default this installs loose dependencies (no explicit versioning), although it is also possible to use stable dependencies (e.g. pip install -e "./alphabase[stable,development]").
By using the editable flag -e, all modifications to the AlphaBase source code folder are directly reflected when running AlphaBase. Note that the AlphaBase folder cannot be moved and/or renamed if an editable version is installed. In case of confusion, you can always retrieve the location of any Python module with e.g. the command import module followed by module.__file__.
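As a small illustration of that lookup, the sketch below wraps it in a hypothetical helper (module_location is not part of AlphaBase). It is demonstrated on the standard-library json module so that it runs without AlphaBase installed; pass "alphabase" to locate an editable install.

```python
import importlib
from typing import Optional


def module_location(module_name: str) -> Optional[str]:
    """Return the file a module was loaded from, or None if it has no file.

    Hypothetical helper for illustration; built-in modules such as `sys`
    have no `__file__` attribute, hence the `getattr` default.
    """
    module = importlib.import_module(module_name)
    return getattr(module, "__file__", None)


# Demonstrated on a standard-library module; use "alphabase" for AlphaBase.
print(module_location("json"))
```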
TODO
In case of issues, check out the following:
- Issues: Try a few different search terms to find out if a similar problem has been encountered before
- Discussions: Check if your problem or feature request has been discussed before.
If you like this software, you can give us a star to boost our visibility! All direct contributions are also welcome. Feel free to post a new issue or clone the repository and create a pull request with a new branch. For an even more interactive participation, check out the discussions and the Contributors License Agreement.
While AlphaBase offers an object-oriented interface, algorithms for manipulating data should be implemented in a functional way and called from class methods. This allows the functions to be reused without instantiating a class.
- Return DataFrames in the same order as they were passed
- Minimize in-place modifications of DataFrames. Mention them explicitly in the docstring
- Implement low-level functions that operate on numpy arrays and return arrays. Use higher-level functions to assign array results to DataFrames
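The layering described above can be sketched as follows. Everything in this block is an assumption made for illustration: the column names mass, charge and mz, the function names, and the PrecursorTable class are hypothetical and do not reflect AlphaBase's actual API.

```python
import numpy as np
import pandas as pd

PROTON_MASS = 1.007276  # Da, rounded; illustrative constant


def calc_mz(neutral_mass: np.ndarray, charge: np.ndarray) -> np.ndarray:
    """Low-level function: arrays in, array out, no DataFrame knowledge."""
    return neutral_mass / charge + PROTON_MASS


def with_mz_column(precursor_df: pd.DataFrame) -> pd.DataFrame:
    """Higher-level function: assigns the array result to a DataFrame.

    Works on a copy, so the input is not modified in place, and keeps
    the row order and index of the input.
    """
    result = precursor_df.copy()
    result["mz"] = calc_mz(
        result["mass"].to_numpy(dtype=np.float64),
        result["charge"].to_numpy(dtype=np.float64),
    )
    return result


class PrecursorTable:
    """Thin object-oriented wrapper that delegates to the functions above."""

    def __init__(self, precursor_df: pd.DataFrame) -> None:
        self.precursor_df = precursor_df

    def add_mz(self) -> None:
        # The method only calls the reusable function.
        self.precursor_df = with_mz_column(self.precursor_df)
```

Because calc_mz and with_mz_column are plain functions, they can be reused and tested without creating a PrecursorTable.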
Avoid making assumptions about:
- Precursor ordering by nAA
- Fragment indices ordering (e.g., frag_start_idx)
- Continuity of frag_start_idx where frag_start_idx[i+1] == frag_stop_idx[i]
- All fragments being assigned to a precursor
Assumptions are only permitted for low-level or optimized functions and should be documented in the docstring.
When performance optimization is needed:
- Implement the general solution first
- Add optimized versions for special cases, such as a refined precursor df or precursors ordered by nAA
- Check conditions at runtime to use optimized versions when applicable
- Include python type hints
- Include docstrings in numpy style (see numpy docstring example)
It is highly recommended to use the provided pre-commit hooks, as the CI pipeline enforces all checks therein to pass in order to merge a branch.
The hooks need to be installed once with:
pre-commit install
You can run the checks yourself using:
pre-commit run --all-files
In order to have release notes automatically generated, pull requests need to be tagged with labels.
The following labels are used (should be self-explanatory): breaking-change, bug, enhancement.
This package uses a shared release process defined in the alphashared repository. Please see the instructions there.
For a full overview of the changes made in each version, see CHANGELOG.md (up to version 1.1.0) and the GitHub release notes (after version 1.1.0).