An illustration of protein structure prediction. Altered AI-generated image by Copilot.

Colabfold in Docker

Tutorial on running Colabfold for AlphaFold2 predictions on a Linux cluster with HTCondor. A companion blog post (AlphaFold2 with ColabFold in Container) describes the process; the scripts themselves are available in the complete tutorial and as plain text files below.

HTML
PDF

Scripts and files can be copied from the tutorial, but for convenience they are also provided as individual files below. Each file name has an extra .txt extension so that the web server accepts it; remove this extension before use. The prefixes `bcc_` and `chtc_` were added to distinguish the files to be run on different servers, as detailed in the tutorial.

All files in single zip file: runfiles.zip

Individual files:

bcc_runaf.sh.txt
bcc_runaf.sub.txt
chtc_dl.sh.txt
chtc_download_weights.sub.txt
chtc_runaf.sh.txt
chtc_runaf.sub.txt
hemoglobin-colab.fa.txt
test.fa.txt
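
After downloading, the extra .txt extension can be removed by hand or in one pass. The following is a minimal sketch, not part of the tutorial: the function name `strip_txt_suffix` is hypothetical, and it assumes all downloaded files sit in a single directory. Files with no inner extension (for example a genuine `notes.txt`) are left untouched.

```python
from pathlib import Path


def strip_txt_suffix(directory: str) -> list[str]:
    """Rename e.g. bcc_runaf.sh.txt -> bcc_runaf.sh and return the new names.

    Hypothetical helper: assumes the downloaded files are all in `directory`.
    """
    renamed = []
    for path in sorted(Path(directory).glob("*.txt")):
        # with_suffix("") drops only the final extension (".txt")
        target = path.with_suffix("")
        # Skip plain .txt files (no inner extension) and avoid overwriting
        if not target.suffix or target.exists():
            continue
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `strip_txt_suffix("runfiles")`, where `runfiles` is an assumed directory name, would turn `chtc_runaf.sub.txt` into `chtc_runaf.sub` and `test.fa.txt` into `test.fa`.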

Large-scale batch predictions

The set-up described in this tutorial is suitable for a few runs, but very large batch runs call for other resources. In particular, this tutorial made a specific effort to provide the information and scripts needed to run on a cluster of Linux computers with HTCondor. The resources below assume direct use of the container.

For Colabfold, the blog post titled Dockerized Colabfold for large-scale batch predictions [Archived – 2024-11-20] provides detailed information on adding a large database.

The blog post titled How to Install ColabFold & Run AI Protein Folding Locally [Archived – 2024-11-20] explains in detail how to install Colabfold on one’s laptop or desktop, using two alternative GitHub repositories.

Both provide details on how Colabfold helps run a prediction faster than a “native” installation.


Image Credits: portion of Copilot-created AI art, retouched with Adobe Firefly.