MOLECULAR MODELING
1
Introduction

In our project, SpyTag and SpyCatcher dock through the formation of an isopeptide bond, which enables T3 to bind C3 and form a stable mesh structure that helps OTA-degrading enzymes tolerate harsh environments. It also makes the degradation process more sustainable and programmable. However, our knowledge of this docking and of the mesh structure is limited. To better understand the molecular docking between SpyTag and SpyCatcher and the formation of the mesh structure, we used SWISS-MODEL, I-TASSER, and AlphaFold2 to predict the proteins' tertiary structures. We then performed protein docking using ClusPro 2.0, ZDOCK, and GRAMM, and simulated the docking results and the mesh structure in GROMACS to support the wet lab team.

2
SpyCatcher and SpyTag

Prediction of SpyCatcher and SpyTag

Our first task was to construct the three-dimensional structures of SpyCatcher and SpyTag from their amino acid sequences alone. This can be done with structure prediction servers such as SWISS-MODEL and I-TASSER. (SWISS-MODEL is a fully automated protein structure homology-modelling server. I-TASSER (Iterative Threading ASSEmbly Refinement) is a method for protein structure and function prediction.)

Using SWISS-MODEL, we found that the GMQE value of SpyCatcher reaches 0.79, and the other reported quality values are also high enough to support the reliability of the result. (GMQE stands for Global Model Quality Estimation, a metric used to assess the quality of protein models. GMQE ranges from 0 to 1, and a value closer to 1 indicates a higher-quality, more reliable model. Generally, a GMQE value of 0.7 or higher is considered good.)

Fig. 1 SpyCatcher protein predicted by SWISS-MODEL.

However, the amino acid sequence of SpyTag is too short to be modeled effectively by SWISS-MODEL or I-TASSER, which made it difficult to proceed. On the advice of our adviser, we turned to the AlphaFold2 server. (AlphaFold2 is a deep learning-based algorithm for protein structure prediction.) With AlphaFold2, we obtained the following results:

Fig. 2 lDDT of the SpyTag peptide predicted by AlphaFold2.

lDDT (Local Distance Difference Test) evaluates the local distance differences of all atoms in a model. lDDT ranges from 0 to 1, with a higher value indicating higher confidence; AlphaFold2 reports its predicted value (pLDDT) on a scale of 0 to 100.

We therefore selected rank_1, the top-ranked model, as the final structure for our simulations.

Fig. 3 SpyTag peptide predicted by AlphaFold2.
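As a rough illustration of how ranked models can be compared, the sketch below averages the per-residue confidence stored in the B-factor column of AlphaFold2 PDB output and sorts the models by that average. The function names `mean_plddt` and `rank_models` are hypothetical, and this is only an assumed reconstruction of the ranking idea, not the server's own procedure.

```python
from typing import Dict, List, Tuple


def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column (pLDDT in AlphaFold2 output) over C-alpha atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed-width columns (1-based): atom name 13-16, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha atoms found")
    return sum(scores) / len(scores)


def rank_models(models: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (model name, mean pLDDT) pairs sorted from best to worst."""
    scored = [(name, mean_plddt(text)) for name, text in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Passing the text of each ranked PDB file in a dictionary keyed by model name returns the models ordered by mean confidence, so the first entry is the candidate to carry forward.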

Docking between SpyTag and SpyCatcher

With the structure predictions for SpyCatcher and SpyTag complete, we proceeded to docking. We uploaded the PDB files of SpyCatcher and SpyTag to ClusPro 2.0 as the receptor and ligand, respectively. From the more than fifty docking results returned, we selected the top three by score as candidates:

Fig. 4 Docking results of SpyCatcher and SpyTag predicted by ClusPro 2.0 for No. 001, No. 002 and No. 003 from left to right.

In addition, we found a structure of the SpyCatcher-SpyTag complex in the RCSB PDB. (Unfortunately, we did not find individual structures for SpyCatcher and SpyTag.) Compared against this reference, No. 001 appears to be a relatively good choice.

Fig. 5 Reference from RCSB

3
T3 and C3

Prediction of T3 and C3

T3 is composed of three SpyTags and hydrophilic elastin-like polypeptides (ELPs). C3 is composed of three SpyCatchers and hydrophilic ELPs. We obtained the sequences from the wet lab team.

We performed homology modeling using SWISS-MODEL. In the prediction results for T3, the GMQE reached 0.56, which is well above 0.30, so we considered the quality of the predicted model acceptable.

Fig. 6 T3 protein predicted by SWISS-MODEL

However, the prediction results for C3 were less encouraging: the best GMQE value reached only 0.16. (We required a GMQE of at least 0.20 for C3.) We selected the model built from the highest-rated template and found it to be highly similar to the previously predicted SpyCatcher structure, as the figure shows. Because C3 contains three SpyCatchers linked by ELPs, a model that resembles a single SpyCatcher suggests that the template does not cover the full protein and is therefore unreliable.

Fig. 7 Comparison of C3 (from SWISS-MODEL, in green) and SpyCatcher (from SWISS-MODEL, in blue)

Therefore, we relied on the I-TASSER online server to predict the structure of C3 using the fold recognition (threading) method. We selected the first-ranked result as our prediction.

Fig. 8 C3 protein predicted by I-TASSER

Docking between C3 and T3

With the predicted structures of C3 and T3, we proceeded to docking prediction between them. First, we uploaded the PDB files of C3 and T3 to the GRAMM server as the receptor and ligand, respectively, which returned five predictions (shown below):

Fig. 9 Five docking results from GRAMM (receptor color: Green; ligand colors: Blue, Pink, Yellow, Orange, Gray)

The predictions show that the binding positions of the five ligands on the receptor are mostly consistent and satisfy the principle of steric complementarity in protein docking. Based on this observation, we continued with more accurate docking methods, using the position derived from GRAMM as a filter. GRAMM has advantages in speed and simplified flexible modeling, but it is not as accurate as ClusPro 2.0 and ZDOCK. We therefore uploaded the PDB files of C3 and T3 to the ClusPro 2.0 server. Of the more than 30 results obtained, 13 remained after filtering against the GRAMM predictions. We then uploaded the same PDB files to the ZDOCK server and obtained the top ten results. Again filtering against the GRAMM predictions, we selected five results for further analysis.

We then compared the filtered results from both servers and identified the poses common to both. The table below shows the selected results:

Table 1. Four docking results selected after comparing the results from ZDOCK and ClusPro 2.0 (data from ClusPro 2.0).

Note that although ClusPro 2.0 provides some data to assist our judgment, its official website states that this data alone is not sufficient for effective model scoring.
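The filtering and comparison described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: each pose is reduced to the centroid of its ligand coordinates, all poses share the same receptor frame, and the distance cutoffs are chosen by the user. The function names and the cutoff-based criterion are hypothetical; the criterion we actually applied when comparing poses is not reproduced here.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def centroid(coords: List[Point]) -> Point:
    """Geometric center of a set of atom coordinates."""
    n = len(coords)
    return (
        sum(c[0] for c in coords) / n,
        sum(c[1] for c in coords) / n,
        sum(c[2] for c in coords) / n,
    )


def filter_by_reference(poses: Dict[str, Point], reference: Point,
                        cutoff: float) -> Dict[str, Point]:
    """Keep poses whose ligand centroid lies within `cutoff` of the reference position."""
    return {name: c for name, c in poses.items()
            if math.dist(c, reference) <= cutoff}


def common_poses(first: Dict[str, Point], second: Dict[str, Point],
                 cutoff: float) -> List[Tuple[str, str]]:
    """Pair up poses from two servers whose ligand centroids nearly coincide."""
    return [(a, b)
            for a, ca in first.items()
            for b, cb in second.items()
            if math.dist(ca, cb) <= cutoff]
```

In this sketch, the GRAMM consensus position plays the role of `reference`, and `common_poses` is applied to the filtered ClusPro 2.0 and ZDOCK sets.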

Molecular dynamics simulations enable us to evaluate the stability of the predicted complex structures and to obtain conformations that are more likely to exist in solvent, which are also possible states in our wet lab and in a factory setting. Next, we used GROMACS to run dynamics simulations on the complexes listed in the table to help us select the most stable model.

Our GROMACS workflow is derived from the GROMACS tutorial and summarized as the NAU-China 2023 GROMACS protocol, which differs slightly from the tutorial in some details.

To summarize, we started from a PDB file of the complex and generated the corresponding topology. We placed the complex in a box, filled the box with water, and added Na+ ions to balance the charge. We then performed energy minimization using steepest descent (gradient descent). After energy minimization, we equilibrated the system with temperature and pressure coupling, collecting temperature, pressure, and potential energy data for further analysis. Finally, we ran the production simulation, which produced trajectory files that can be visualized. To make the molecular motion easier to observe, we extracted the complex from the box and generated a PDB file that can be viewed directly in PyMOL.
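The steps summarized above correspond roughly to the following sequence of GROMACS commands, written here as a Python helper that only builds and prints the command lines. All file names, the `spce` water model, and the box settings are assumptions borrowed from the public GROMACS tutorial, not the exact values of our protocol; the `.mdp` parameter files are assumed to exist already.

```python
import shlex
from typing import List


def gromacs_commands(pdb: str = "complex.pdb") -> List[List[str]]:
    """Build the command lines for a tutorial-style GROMACS run (nothing is executed)."""
    return [
        # Topology generation (pdb2gmx asks interactively for a force field).
        ["gmx", "pdb2gmx", "-f", pdb, "-o", "processed.gro", "-water", "spce"],
        # Place the complex in a cubic box, 1.0 nm from the box edges.
        ["gmx", "editconf", "-f", "processed.gro", "-o", "boxed.gro",
         "-c", "-d", "1.0", "-bt", "cubic"],
        # Fill the box with water.
        ["gmx", "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
         "-o", "solvated.gro", "-p", "topol.top"],
        # Add ions to neutralize the charge (genion asks which group to replace).
        ["gmx", "grompp", "-f", "ions.mdp", "-c", "solvated.gro",
         "-p", "topol.top", "-o", "ions.tpr"],
        ["gmx", "genion", "-s", "ions.tpr", "-o", "ionized.gro",
         "-p", "topol.top", "-pname", "NA", "-neutral"],
        # Energy minimization.
        ["gmx", "grompp", "-f", "minim.mdp", "-c", "ionized.gro",
         "-p", "topol.top", "-o", "em.tpr"],
        ["gmx", "mdrun", "-v", "-deffnm", "em"],
        # Temperature (NVT) and pressure (NPT) equilibration.
        ["gmx", "grompp", "-f", "nvt.mdp", "-c", "em.gro", "-r", "em.gro",
         "-p", "topol.top", "-o", "nvt.tpr"],
        ["gmx", "mdrun", "-deffnm", "nvt"],
        ["gmx", "grompp", "-f", "npt.mdp", "-c", "nvt.gro", "-r", "nvt.gro",
         "-t", "nvt.cpt", "-p", "topol.top", "-o", "npt.tpr"],
        ["gmx", "mdrun", "-deffnm", "npt"],
        # Production simulation.
        ["gmx", "grompp", "-f", "md.mdp", "-c", "npt.gro", "-t", "npt.cpt",
         "-p", "topol.top", "-o", "md.tpr"],
        ["gmx", "mdrun", "-deffnm", "md"],
        # Export the trajectory as PDB (trjconv asks which groups to center and write).
        ["gmx", "trjconv", "-s", "md.tpr", "-f", "md.xtc",
         "-o", "complex_traj.pdb", "-pbc", "mol", "-center"],
    ]


if __name__ == "__main__":
    for command in gromacs_commands():
        print(shlex.join(command))
```

Keeping the commands as data makes it easy to rerun the same sequence for each docking candidate by changing only the input PDB name.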

During the simulation, we specifically collected potential energy data of the complex, and presented it as follows:

Table 2. Potential energy from GROMACS

(Average: the average potential energy over the simulation. Err.Est: the statistical error estimate of the average, with smaller values indicating more reliable results. RMSD: the root-mean-square fluctuation of the energy around its average, with smaller values indicating smaller fluctuations. Tot-Drift: the total drift of the energy over the simulation, taken from a straight-line fit to the data, with smaller values indicating a more stable system.)

After comparing the data in the table, we discarded result 02 because its Err.Est and Tot-Drift are significantly larger than those of the others.
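To make the energy statistics concrete, the following minimal Python sketch computes an average, an RMSD around that average, and a total drift from a least-squares line for an energy time series. It mirrors the definitions only and is not GROMACS's implementation; the function name `energy_statistics` is hypothetical, and the error estimate is omitted.

```python
import math
from typing import Dict, Sequence


def energy_statistics(times: Sequence[float],
                      values: Sequence[float]) -> Dict[str, float]:
    """Average, RMS fluctuation, and total drift of an energy time series."""
    n = len(values)
    if n < 2 or n != len(times):
        raise ValueError("need at least two samples and matching lengths")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Root-mean-square fluctuation of the values around their average.
    rmsd = math.sqrt(sum((v - mean_v) ** 2 for v in values) / n)
    # Slope of the least-squares straight line through (time, value).
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    # Total drift: difference of the fitted line between last and first time point.
    tot_drift = slope * (times[-1] - times[0])
    return {"average": mean_v, "rmsd": rmsd, "tot_drift": tot_drift}
```

A series that fluctuates around a constant value gives a total drift near zero, whereas a steadily decreasing or increasing energy gives a large drift and signals that the system has not settled.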

According to the protocol, we obtained the simulation results of the complex for 0.1 nanoseconds as follows:

Fig. 10 Molecular dynamics trajectories of 04, 18, and 24 in solution, simulated for 0.1 nanoseconds, from left to right.

Then, let's take a look at the fluctuations in their temperature, pressure, and potential energy:

Fig. 11 Potential energy, temperature, and pressure fluctuations of 04, 18, and 24 during the 0.1-nanosecond simulation. The top three panels show the results for 04, the middle three for 18, and the bottom three for 24.

From the above results, we can see that the structures of all three complexes are very stable. The average temperature is around 300 K, and the average pressure is around 0 bar.
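Averages like these can be read from the `.xvg` text files that GROMACS analysis tools write. The sketch below parses such a file and averages one column; it assumes a simple single-dataset file whose first column is time, and the function names are hypothetical.

```python
from typing import List


def read_xvg(text: str) -> List[List[float]]:
    """Parse the numeric rows of an .xvg file, skipping '#' and '@' header lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(field) for field in line.split()])
    return rows


def column_average(rows: List[List[float]], column: int = 1) -> float:
    """Average one data column (column 0 is time)."""
    if not rows:
        raise ValueError("no data rows")
    return sum(row[column] for row in rows) / len(rows)
```

Reading the temperature and pressure files for each candidate this way gives the numbers to compare against the target temperature and pressure of the simulation.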

After a comprehensive analysis of the data, we found almost no obvious errors that would rule out any of the three. Ultimately, we decided to proceed with further experiments using result 24, which exhibits relatively lower potential energy and more stable temperature and pressure.

4
Mesh structure

According to the literature, the binding of multiple C3 and T3 units can indeed form a stable mesh-like structure. However, considering that it is not feasible to exhaustively explore all possible outcomes, we have utilized two C3 and two T3 units to construct a simplified mesh-like structure. This structure provides insights into the results of protein crosslinking.

Fig. 12 A mesh-like structure formed by two T3 and two C3 units docked with ClusPro 2.0.

We will now proceed with molecular dynamics simulations on the mesh-like structure:

As in the previous simulation, we generated the corresponding topology from the PDB file and placed the structure in a box. We then filled the box with water and added Na+ ions. Energy minimization was performed using steepest descent (gradient descent). After energy minimization, we equilibrated the system with temperature and pressure coupling, monitoring temperature, pressure, and potential energy. Finally, we ran the production simulation. The difference is that in this simulation we extended the time to 1 nanosecond.
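The simulation length is set through the number of integration steps rather than directly in nanoseconds. As a small worked example, assuming the 2 fs (0.002 ps) time step used in the public GROMACS tutorial, which may differ from the value in our protocol, the conversion is:

```python
def steps_for(duration_ns: float, dt_ps: float = 0.002) -> int:
    """Number of integration steps needed to cover `duration_ns` nanoseconds."""
    # 1 ns = 1000 ps; dt_ps is the assumed time step in picoseconds.
    return round(duration_ns * 1000.0 / dt_ps)


if __name__ == "__main__":
    print(steps_for(0.1))  # complex simulations: 0.1 ns
    print(steps_for(1.0))  # mesh simulation: 1 ns
```

Under this assumption, the 0.1 ns runs correspond to 50,000 steps and the 1 ns run to 500,000 steps.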

The following are corresponding parameters of fluctuations in temperature, pressure, and potential energy:

Fig. 13 Temperature, pressure, and potential energy fluctuations of the mesh-like structure during the 1-nanosecond simulation, from top to bottom

From these results, we conclude that the mesh-like structure becomes more stable over the course of the simulation, most clearly in its temperature. The average temperature, average pressure, and average potential energy of the stable structure are approximately 300 K, 2 bar, and -4.61083e+06 kJ/mol, respectively.

In the GROMACS tutorial, the production simulation after equilibration runs for 1 ns in total, so we also ran our simulation for 1 nanosecond. The result for the mesh-like structure is as follows:

Fig. 14 Molecular dynamics trajectories of the mesh-like structure in solution, simulated for 1 nanosecond.

5
References

Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018 Jul 2;46(W1):W296-W303

Gabriel Studer, Christine Rempfer, Andrew M Waterhouse, Rafal Gumienny, Juergen Haas, Torsten Schwede, QMEANDisCo—distance constraints applied on model quality estimation, Bioinformatics, Volume 36, Issue 6, March 2020, Pages 1765-1771.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I-TASSER Suite: protein structure and function prediction. Nature Methods. 2014;12(1):7-8.

Li L, Fierer JO, Rapoport TA, Howarth M. Structural Analysis and Optimization of the Covalent Association between SpyCatcher and a Peptide Tag. J Mol Biol. 2014;426:309-31