HPC: Generate PAE/pLDDT/MSA plots

As discussed before, AlphaFold does not automatically generate all the visual outputs needed to interpret its predictions; instead, it stores the underlying data in Python-specific .pkl (pickle) files. We therefore supply a Python script that generates the plots once the prediction has finished.
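To make the contents of these .pkl files more concrete, here is a minimal sketch of how one could be inspected by hand. It is not the supplied script. It assumes the pickle holds a dictionary with a "plddt" entry and, for models that predict it, a "predicted_aligned_error" entry, which is how AlphaFold result files are usually laid out. The function name summarize_result and the toy file are hypothetical and only serve the illustration. Real result files contain NumPy arrays, so NumPy must be importable when you load them.

```python
import pickle
import tempfile
from pathlib import Path


def summarize_result(pkl_path):
    """Load one result pickle and summarize its confidence metrics."""
    with open(pkl_path, "rb") as handle:
        result = pickle.load(handle)

    # Assumed key: one pLDDT confidence value (0-100) per residue.
    plddt = result["plddt"]
    summary = {
        "n_residues": len(plddt),
        "mean_plddt": float(sum(plddt)) / len(plddt),
    }

    # Assumed key: the PAE matrix, only present for models that predict it.
    pae = result.get("predicted_aligned_error")
    if pae is not None:
        summary["pae_shape"] = (len(pae), len(pae[0]))
    return summary


def demo():
    """Write a tiny synthetic pickle (plain lists, not real data) and summarize it."""
    toy = {
        "plddt": [90.0, 80.0, 70.0],
        "predicted_aligned_error": [
            [0.0, 5.0, 9.0],
            [5.0, 0.0, 4.0],
            [9.0, 4.0, 0.0],
        ],
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "result_model_toy.pkl"  # hypothetical file name
        with open(path, "wb") as handle:
            pickle.dump(toy, handle)
        return summarize_result(path)


if __name__ == "__main__":
    print(demo())
```

On a real run you would call summarize_result on one of the .pkl files in your output directory instead of the toy file.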

In this exercise, you will generate the images either from your own AlphaFold outputs or from the outputs for the SARSCoV2-VHH-E protein complex. You can download the latter here. Upload this file to your $VSC_DATA/alphafold/runs/ directory.

Next, download the Python script here.

To run the visualization, take the following steps:

  • First, we need to load the appropriate modules. Run the following two lines:
module load matplotlib/3.7.2-gfbf-2023a
module load AlphaFold/2.3.2-foss-2023a
  • Then, run the Python script, specifying the directory that contains the input .pkl files and the directory where the outputs should be stored (use a single dot . for the current directory). The general form is:

python visualize_alphafold_results.py --input_dir <input_directory> --output_dir <output_directory> --name <prefix>

Note that --output_dir and --name are optional. By default, the resulting images are placed in the same directory as the input. For example:

python visualize_alphafold_results.py --input_dir runs/RBD/RBD

After this, two .png files should have been created in that directory.
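If you want to confirm this from Python rather than by browsing the directory, the check can be sketched as below. This is only an illustration: list_images is a hypothetical helper, not part of the supplied script, and the demo uses placeholder file names in a temporary directory because the actual output file names are not stated here.

```python
import tempfile
from pathlib import Path


def list_images(directory, suffix=".png"):
    """Return the sorted names of files with the given suffix in a directory."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


def demo():
    """Create placeholder files in a temporary directory and list the images."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("toy_a.png", "toy_b.png", "notes.txt"):
            (Path(tmp) / name).touch()
        return list_images(tmp)


if __name__ == "__main__":
    print(demo())
```

For the example above, you would call list_images("runs/RBD/RBD") and expect two entries.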

If you get the error “Illegal instruction (core dumped)”, this is due to an incompatibility between scripts and modules on the joltik cluster. Swap back to doduo (module swap cluster/doduo) and run the Python script again.

These steps and extra information can also be found at https://elearning.vib.be/courses/alphafold/lessons/alphafold-on-the-hpc/topic/alphafold-outputs/.