This script lets you overwrite the B-factor column of a PDB/CIF file with values of your choice.
This is useful in molecular visualisation programs such as PyMOL, where you can then color a protein by those values, for example using PyMOL's spectrum function to recolor a protein according to mutational effects.
- Install Python and packages
- Prepare input files
- Run script
- Recoloring in PyMOL
- Install Python
https://www.python.org/downloads/windows/
- Download my GitHub project
- Unzip folder
- Right-click folder and select "Open in Terminal"
- Enter the following into the terminal and press Enter:
pip install -r requirements.txt
In the "input" folder, replace "molecule.cif" with your protein of choice. The input file must be named "molecule.cif"; you can change this name in the .py file if you are comfortable editing code.
Similarly, replace the data in the "newvalues.xlsx" file with your own. In the first column, put the positions of the amino acid residues whose B-factor you want to change. In the second column, put the new value for each position.
- Right-click the project folder and select "Open in Terminal", as in the installation steps above
- Enter the following into the terminal and press Enter:
python '.\recode B-factors.py'
- The output is in the "output" folder, named "recoded_molecule.cif"
It is a copy of the input in which all B-factors are overwritten with your input data. Unlisted positions get the value "-999.0" by default, which makes it easy to leave them uncolored later in PyMOL.
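To illustrate the idea of recoding, here is a minimal sketch for the fixed-column PDB format. It is not the code in this project (which reads "molecule.cif" and "newvalues.xlsx"); the function names `format_b_factor` and `recode_b_factors` are made up for this example, and chains and insertion codes are ignored for simplicity.

```python
def format_b_factor(value):
    """Format a value for the 6-character B-factor field of a PDB line."""
    text = f"{value:6.2f}"
    if len(text) > 6:
        # e.g. -999.00 is 7 characters wide, so drop one decimal place
        text = f"{value:6.1f}"
    if len(text) > 6:
        raise ValueError(f"{value} does not fit into 6 characters")
    return text


def recode_b_factors(pdb_lines, new_values, default=-999.0):
    """Return PDB lines with the B-factor of every atom replaced.

    new_values maps residue number (int) to the new B-factor (float).
    Residues that are not listed receive `default`.
    """
    recoded = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            line = line.ljust(80)
            residue_number = int(line[22:26])  # PDB columns 23-26
            value = new_values.get(residue_number, default)
            # The B-factor occupies PDB columns 61-66
            line = line[:60] + format_b_factor(value) + line[66:]
        recoded.append(line)
    return recoded
```

Called as `recode_b_factors(lines, {5: 1.25})`, every atom of residue 5 gets the B-factor 1.25 and every other atom gets -999.0. The fallback to one decimal place is only needed because PDB columns have a fixed width; CIF files do not have that restriction.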
Here I will show how to color the protein, for example with an asymmetric scale.
In this example I mutated an enzyme and got many variants with new reaction speeds. A value of 1 represents no change relative to wild type, 0 would be a dead enzyme, a value of 2 would be double speed relative to wild type and so on. Since there is no upper bound to how much speed a variant can gain, the scale is asymmetric - everything between 0 and 1 is a loss (I will color that with a gradient of blue to white) and everything above 1 is a gain (colored white to red).
I don't want PyMOL to color the positions that I never mutated, so those positions carry the B-factor -999, a number outside of that scale.
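As a rough sketch of what this asymmetric scale means, the function below maps a value to an RGB color. It only illustrates the logic and is not how PyMOL computes its colors internally; the name `asymmetric_color`, the upper bound of 8 and the gray (0.7, 0.7, 0.7) for untouched positions are assumptions chosen for this example.

```python
def asymmetric_color(value, maximum=8.0, untouched=-999.0):
    """Map a relative speed to an (r, g, b) tuple with channels in 0..1.

    0 to 1 runs from blue to white (loss), 1 to `maximum` runs from
    white to red (gain). Positions marked `untouched` are gray.
    """
    if value == untouched:
        return (0.7, 0.7, 0.7)
    if value < 1.0:
        t = max(value, 0.0)  # 0 at a dead enzyme, 1 at wild type
        return (t, t, 1.0)
    # 0 at wild type, 1 at `maximum`; larger values are clamped to red
    t = min((value - 1.0) / (maximum - 1.0), 1.0)
    return (1.0, 1.0 - t, 1.0 - t)
```

The two halves use different ranges on purpose: the loss side always spans 0 to 1, while the gain side stretches to whatever maximum fits your data.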
- Open recoded_molecule.cif in PyMOL
- Create selections for coloring by entering this into PyMOL:
select untouched, (b = -999.0)
select mutation_bad, (b < 1) and not (b = -999.0)
select mutation_good, ((b = 1) or (b >1)) and not (b = -999.0)
- Color
color gray70, untouched
spectrum b, blue_white, mutation_bad, minimum=0, maximum=1
spectrum b, white_red, mutation_good, minimum=1, maximum=8
show surface, recoded_molecule
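The maximum of 8 in the commands above fits my data; yours will likely differ. The following sketch generates the same PyMOL commands with the maximum taken from your own values. The function name `spectrum_commands` is hypothetical, and the fallback maximum of 2 (used when no value exceeds 1) is an arbitrary assumption. PyMOL normally names a loaded object after the file name without its extension, so "recoded_molecule.cif" becomes `recoded_molecule`.

```python
def spectrum_commands(values, object_name, untouched=-999.0):
    """Build PyMOL command lines for the asymmetric blue/white/red scale."""
    measured = [v for v in values if v != untouched]
    if not measured:
        raise ValueError("no measured values to color by")
    upper = max(measured)
    if upper <= 1.0:
        upper = 2.0  # arbitrary fallback so the gain scale has a range
    return [
        f"select untouched, (b = {untouched})",
        f"select mutation_bad, (b < 1) and not (b = {untouched})",
        f"select mutation_good, ((b = 1) or (b > 1)) and not (b = {untouched})",
        "color gray70, untouched",
        "spectrum b, blue_white, mutation_bad, minimum=0, maximum=1",
        f"spectrum b, white_red, mutation_good, minimum=1, maximum={upper:g}",
        f"show surface, {object_name}",
    ]


if __name__ == "__main__":
    # Example values: the second column of the spreadsheet plus the default
    for command in spectrum_commands([0.2, 1.0, 3.5, -999.0], "recoded_molecule"):
        print(command)
```

The printed lines can be pasted into the PyMOL command line one by one.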
Tip: use the custom script spectrumany (see pymolwiki) for more control over the colors, possibly in conjunction with spectrumbar to also render the spectrum as a bar for your figure.
It could be that your most extreme values are in positions that aren't contained in the structure you're using (for example, because they're flexible regions that are hard to image). In that case, you may want to model your protein with something like AlphaFold (check out ColabFold for a user-friendly version that doesn't require downloading anything) and match the missing bits to your template.
Inspired by Professor Tyler Starr, who explained the conceptual approach for recoding B-factors in PDB files to me. This implementation is independently written. I wanted to make a lightweight script that is non-programmer friendly.
I highly recommend looking at his and Allison Greeney's paper for some impressive figures. He also linked me their repo which I believe contains all the code they used for the paper, including the figures.






