3D Protein structure prediction (2)
First entry in this series: AlphaFold background.
This entry explains where to find the code and how to run it for free on Google Colab. See also the next entry for practical details.
ColabFold – Making protein folding accessible to all
Milot Mirdita, Konstantin Schütze, Yoshitaka Moriwaki, Lim Heo, Sergey Ovchinnikov, and Martin Steinegger doi: https://doi.org/10.1101/2021.08.15.456425 (version posted February 8, 2022.)
So, where do I find the code?
The AlphaFold code has been released to the public and can be run by anyone.
One of the main premises of AlphaFold is its reliance on multiple sequence alignments (MSAs). Thus, the most accurate version requires installing up to 2 terabytes (TB) of databases, including sequence and structure databases. A smaller database (about 1/10 the size) can also be used.
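Before attempting a local installation, it can help to check whether the target disk has room for the databases. The following is only a minimal sketch: the two size constants are the rough figures quoted above (2 TB and about one tenth of that), and the function names `database_option` and `check_path` are hypothetical helpers, not part of AlphaFold or ColabFold.

```python
import shutil

# Rough figures taken from the text above; actual sizes vary by release.
FULL_DB_BYTES = 2 * 10**12                # "up to 2 terabytes"
REDUCED_DB_BYTES = FULL_DB_BYTES // 10    # "about 1/10 the size"


def database_option(free_bytes: int) -> str:
    """Return the largest database set that fits in free_bytes."""
    if free_bytes >= FULL_DB_BYTES:
        return "full"
    if free_bytes >= REDUCED_DB_BYTES:
        return "reduced"
    return "none"


def check_path(path: str = ".") -> str:
    """Look up the free space on the disk holding path and classify it."""
    free = shutil.disk_usage(path).free
    return database_option(free)


if __name__ == "__main__":
    # Prints "full", "reduced", or "none" for the current directory's disk.
    print(check_path("."))
```

If the answer is "none", running the notebooks on Google Colab avoids the local database installation altogether.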
DeepMind and Google have made the code accessible on GitHub. All the details needed to install AlphaFold locally are on the README page, shown on the lower portion of the ColabFold GitHub page (see my blog post: Google Colab is a free cloud notebook environment).
There are multiple options and versions.
The current version (as of March 2022) is 2.1.0. One of its new features is the ability of AlphaFold to predict complexes, including hetero-complexes (multiple structures).
Colab offers Jupyter notebooks with embedded Python scripts that are ready to run. The user only has to copy and paste the sequence to be modeled and click through the various steps.
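Since the sequence is pasted in by hand, it is worth cleaning it up first. The sketch below is an illustration only and is not taken from the notebooks: `clean_sequence` is a hypothetical helper that drops FASTA header lines, removes whitespace, and rejects anything outside the 20 standard one-letter amino acid codes. The notebooks may accept additional characters, so treat the allowed alphabet as an assumption.

```python
# The 20 standard amino acids in one-letter code (assumed alphabet).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Return an upper-case, whitespace-free protein sequence.

    Lines starting with '>' (FASTA headers) are ignored.
    Raises ValueError if the result is empty or has unexpected letters.
    """
    body_lines = [
        line for line in raw.splitlines()
        if not line.lstrip().startswith(">")
    ]
    # Joining then splitting on whitespace removes spaces, tabs, newlines.
    seq = "".join("".join(body_lines).split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    unexpected = sorted(set(seq) - VALID_RESIDUES)
    if unexpected:
        raise ValueError(f"unexpected characters: {''.join(unexpected)}")
    return seq


if __name__ == "__main__":
    pasted = ">demo\nmkt ayi\nAKQR\n"
    print(clean_sequence(pasted))  # MKTAYIAKQR
```

The cleaned string can then be pasted into the sequence field of the notebook.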
The following table is displayed on GitHub but is subject to change. The columns “mmseqs2” and “jackhmmer” indicate the methods used to create the multiple sequence alignment(s).
Making Protein folding accessible to all via Google Colab!
|AlphaFold2 (from Deepmind)||Yes||Yes||No||Yes||No|
|BETA (in development) notebooks|
See GitHub page for all the details: https://github.com/sokrypton/ColabFold