Summary
Computing a PNG image for each of 1,000 PDB files as a cartoon, color-coded by B factor.
PyMOL without GUI
PyMOL is routinely used by biologists to illustrate molecules through its graphical user interface (GUI). However, there are situations where it is beneficial to run PyMOL without using the mouse. I recently computed 1,000 predicted 3D structures using Omegafold and I wanted to render each of the predicted structures as a ribbon diagram, colored by B factor. Using the mouse would mean opening each of the 1,000 PDB files and then clicking the relevant buttons, or at best typing their command equivalents.
This is where the option to start PyMOL without the GUI is amazingly powerful. This can be accomplished on one’s own laptop, even if PyMOL is not installed, thanks to a Docker image. (See also a previous Docker attempt with PyMOL in the blog entry “Docker PyMOL,” posted on March 3, 2020.)
1,000 PDB files
This large number of files came from sequence-based predictions computed with “evodiff” and subsequently turned into 3D structures by “Omegafold,” which will be detailed in a subsequent post. This post is dedicated to imaging 1,000 sets of 3D coordinates all at once, without manual intervention.
PyMOL Script: reproducibility and sharing
While most users are happy to tweak an image graphically using the PyMOL interface, the best way to work with PyMOL is to create a text script that can re-create the same image at will. The use of scripts is very well illustrated in the book Exploring Protein Structure – Principles and Practice, which provides a PyMOL script for each of the molecular images within the book. I also made an example of this in my “PyMOL scripts book” illustrating the COVID19 spike protein.
The principle is simple: each action within the PyMOL GUI can be done as a command. A list of commands saved into a simple text file creates a “PyMOL script” file, typically saved with the .pml filename extension. In turn, the script can be submitted to PyMOL from either the GUI or a call to PyMOL from the command line, resulting in a saved PNG image.
A command can be “looped” in a terminal therefore allowing the processing of a large number of files.
One script to image them all
One script could probably contain most of the commands necessary to open and present each structure as a cartoon, and then save a graphic file in PNG format. This would be nice, but in many cases it is useful to color the structure based on a property contained within a specific column in the 3D data. In that sense, “one script to rule them all” would probably not provide the best images.
Coloring each structure by B factor
The B factor (sometimes called the “temperature factor”) contained within these 3D predictions can help visualize a measure of the accuracy of the prediction. To “personalize” each structure image it is necessary to provide the minimal and maximal values of the B factor within each PDB file. It is therefore necessary to process each file to obtain these numbers. Once again, the “loop” option of a shell command can integrate that request. The computation for each file will be done with a short awk script, saving the results in temporary shell variables called min and max respectively.
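As a minimal sketch of that extraction, the function below reads a file once and prints both values. It assumes the fixed-column PDB layout, in which the B factor occupies columns 61 to 66 of each ATOM record; the function name bfactor_range is hypothetical and not part of the original script.

```shell
#!/bin/bash
# bfactor_range FILE: print "min max" of the B factor over all ATOM records.
# Assumption: fixed-column PDB format, B factor in columns 61-66.
bfactor_range() {
  awk '/^ATOM/ {
         b = substr($0, 61, 6) + 0           # numeric B factor of this atom
         if (n++ == 0) { min = b; max = b }  # first record initializes both
         if (b < min) min = b
         if (b > max) max = b
       }
       END { if (n > 0) print min, max }' "$1"
}

# Usage sketch: split the two values into shell variables.
# read -r min max < <(bfactor_range SEQUENCE_0.pdb)
```

Reading by column position rather than by whitespace-separated field is a design choice: it keeps working when neighboring columns touch each other, which can happen with large coordinates.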
“Here file” to the rescue
Since we need specific numbers for each file, we need to create a specific .pml script file for each one. How can we make each command specific to each structure without manual intervention? This is the genius of a “here document,” which can create a new document “on the fly.” And since it’s used only once, it can be overwritten each time a structure has been processed.
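The mechanism can be tried in isolation. In this sketch the function name write_pml and the file name commands.pml are illustrative choices; it shows that shell variables are expanded inside an unquoted here document and that the > redirection overwrites the file on every call.

```shell
#!/bin/bash
# write_pml PDB MIN MAX: (re)create commands.pml for one structure.
# Function name and argument order are illustrative assumptions.
write_pml() {
  local pdb=$1 min=$2 max=$3
  # Unquoted EOF delimiter: $pdb, $min and $max are expanded by the shell
  # before the text is written to the file.
  cat > commands.pml <<EOF
load $pdb
spectrum b, blue_white_red, minimum=$min, maximum=$max
png ${pdb%.pdb}.png
EOF
}

# Usage sketch: each call replaces the previous script.
# write_pml SEQUENCE_0.pdb 31.2 92.7
# write_pml SEQUENCE_1.pdb 28.9 95.1
```

The expansion ${pdb%.pdb} strips the .pdb suffix, so the image would be named SEQUENCE_0.png rather than SEQUENCE_0.pdb.png.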
PyMOL Commands
PyMOL is written in the Python language, but most commands have a simpler PyMOL equivalent closer to English. For example, the PyMOL load command is simpler than the Python function cmd.load() while providing the same functionality. What do we need to insert in the script?
1. load the file: load filename.pdb
2. orient the molecule along its longer axis
3. remove solvent, i.e. water molecules (if any)
4. color by B-factor: spectrum b
5. as cartoon to show a ribbon diagram
6. save the image: png command. The default size is 640×480 pixels
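The six steps above can be written to a script file as follows. This is only a sketch: the function name write_render_pml, the file name render.pml, and the image size are assumptions chosen for illustration. The png command accepts optional width, height, and dpi arguments, which is one way to go beyond the 640×480 default.

```shell
#!/bin/bash
# write_render_pml PDB: write the six steps listed above to render.pml.
# Function name, file name and image size are illustrative assumptions.
write_render_pml() {
  cat > render.pml <<EOF
load $1
orient
remove solvent
spectrum b
as cartoon
png $1.png, width=1200, height=900, dpi=300
EOF
}

# Usage sketch:
# write_render_pml filename.pdb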
Calling PyMOL on the command line
OK that sounds great. BUT… How do we call PyMOL from the command line?
On a Linux system where PyMOL is properly installed, the command is simply pymol.
However, this is also possible on a Mac and on Windows if one knows where to look!
Mac option: On macOS, an application is a special kind of folder containing all that is necessary. To look inside an application, right-click (or control-click) the icon named PyMOL or PyMOL.app and choose the second menu item, “Show Package Contents,” then follow the folder cascade by clicking Contents > MacOS, in which we find PyMOL with a black terminal icon. If we right-click (or control-click) on this specific PyMOL icon and then press the “option” key, we can see further down the option to ‘Copy “PyMOL” as Pathname’, which is the “secret” we need… The clipboard will then contain:
/Applications/PyMOL.app/Contents/MacOS/PyMOL
which is the text command we need and should be the same for all Macs if the application is installed in the standard location.
Windows option: PyMOL is typically installed in the “Program Files” or “Program Files (x86)” directory on Windows 10 or 11. Look for a folder named “PyMOL” or a similar name. We want to find a file called pymol.exe, which should be located within this folder. Once you locate the pymol.exe file, note its full path. For example it could be:
C:\Program Files\Schrodinger\PyMOL\pymolwin.exe
However, the exact location of pymol.exe may vary depending on the version of PyMOL and how it was installed.
The current Windows installer for version 2.5.4 proposes to install either “just for me” (i.e. the user) or for all users (requiring an Admin password). The installation locations for these were either one of:
C:\Users\sgro\AppData\Local\pymol
C:\ProgramData\pymol
In both cases the name of the software was PyMOLWin.exe, but upper/lower case does not affect Windows commands.
Linux option: the program is simply called with pymol
Docker option: This Docker image contains the free version of PyMOL 2.5.0 and can be activated with the following command, which shares the current directory containing all PDB files.
docker run -it --rm -v ${PWD}:/data -w /data biopod/pymol:2.5.0
(Note: Macs with an Apple Silicon chip (M1 etc.) may need to add --platform linux/amd64 to “pull” or “run” this Docker image.)
We are almost there… One last thing we need to do is to tell PyMOL that we don’t want the GUI to start when it is invoked on the command line. There are many command line options available. The options of interest for our purpose are:
-c launch in command-line only mode for batch processing
-q suppress startup message
-Q quiet, suppress all text output
We can add -c together with -q or -Q to suppress text output. The options can be combined as follows for the Linux command, and the same works with the appropriately named application on macOS and Windows.
pymol -qc
or
pymol -Qc
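Since the executable differs between systems, a script can pick it once and reuse it. This is a sketch under stated assumptions: the function name pymol_path is hypothetical, the macOS path is the standard install location quoted above, and every other system is assumed to have pymol on the PATH. Windows is not handled because its install location varies.

```shell
#!/bin/bash
# pymol_path OS_NAME: print the PyMOL command for a given `uname -s` value.
# Assumptions: standard macOS install location; `pymol` on the PATH elsewhere.
pymol_path() {
  case "$1" in
    Darwin) echo "/Applications/PyMOL.app/Contents/MacOS/PyMOL" ;;
    *)      echo "pymol" ;;
  esac
}

# Resolve the command once for the current machine.
PYMOL=$(pymol_path "$(uname -s)")

# Usage sketch: batch call, no GUI and no text output.
# "$PYMOL" -Qc commands.pml
```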
One adapted script to rule them all
We can combine all these requirements to create a “here document” that is personalized to each file with a loop. The very last command is calling PyMOL, and this command will depend on the operating system as described just above. The example below is calling the macOS version.
#!/bin/bash
# Render every SEQUENCE*.pdb file as a cartoon colored by B factor.
for f in SEQUENCE*.pdb
do
  # Minimum and maximum B factor: 11th whitespace-separated field of ATOM records
  min=$(awk '/^ATOM/ {if (n++ == 0 || $11+0 < a) a = $11+0} END {print a}' "$f")
  max=$(awk '/^ATOM/ {if (n++ == 0 || $11+0 > a) a = $11+0} END {print a}' "$f")
  # One-off PyMOL script, overwritten on each pass of the loop
  cat > commands.pml <<-EOF
load $f
orient
remove solvent
spectrum b, blue_white_red, minimum=$min, maximum=$max
as cartoon
# cartoon putty
png $f.png
EOF
  # Run PyMOL without GUI (-c) and without text output (-Q)
  /Applications/PyMOL.app/Contents/MacOS/PyMOL -Qc commands.pml
done
To make the command simpler, this version will create a file ending with .pdb.png for each structure.
If the folder contains 1,000 PDB files then 1,000 PNG files will be created… with just a dozen lines of code.
Contact sheets
The final step was to create contact sheets to assemble multiple images together. This was done with yet another Docker image containing the command-line programs of “ImageMagick,” placing 35 PNGs onto each sheet with the program montage. This resulted in 29 contact sheets. The program was called with a docker command from a Docker image:
docker run -it --rm -v ${PWD}:/data -w /data minidocks/imagemagick montage
Then for each set of 35 PNG files within a PNG directory the command looked like:
montage -verbose -label '%f' -font Inconsolata-Regular -pointsize 10 -background '#EEEEEE' -fill 'black' -define jpeg:size=200x200 -geometry 200x200+2+2 -auto-orient -tile 5x7 ./PNG/SEQUENCE_{0..34}.pdb.png contact_0..34.jpg
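Repeating that command for all 29 sheets can itself be looped. The sketch below only builds the list of file names for a given sheet; it assumes the images are numbered from SEQUENCE_0.pdb.png to SEQUENCE_999.pdb.png, and the function name sheet_files is hypothetical.

```shell
#!/bin/bash
# sheet_files SHEET [PER_SHEET] [TOTAL]: print the PNG names for one sheet.
# Assumption: files are numbered SEQUENCE_0.pdb.png ... SEQUENCE_999.pdb.png.
sheet_files() {
  local sheet=$1 per=${2:-35} total=${3:-1000}
  local start=$(( sheet * per ))
  local end=$(( start + per - 1 ))
  # The last sheet may be only partly filled.
  if (( end > total - 1 )); then
    end=$(( total - 1 ))
  fi
  local i
  for (( i = start; i <= end; i++ )); do
    echo "./PNG/SEQUENCE_${i}.pdb.png"
  done
}

# Usage sketch (not run here): one montage call per sheet, 0 to 28.
# for s in $(seq 0 28); do
#   montage -label '%f' -geometry 200x200+2+2 -tile 5x7 \
#           $(sheet_files "$s") "contact_${s}.jpg"
# done
```

With 35 images per sheet, sheets 0 to 27 are full and sheet 28 holds the remaining 20 images.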
The 29 contact sheets are at the top of this page.