| TU-Eindhoven - iGEM 2023

Abstract

Introduction

Methods

Results

Conclusion

Abstract

We used a molecular dynamics (MD) simulation to analyze the folding process of our Elastin-Like Polypeptide (ELP). The goal is to get a better understanding of the behavior of the ELPs below and above the transition temperature (T_t). It is used for more characterization of our main composite part, BBa_K4905006. The ELP is folded for 50 ns in an implicit solvent simulation at 3, 15, 25, and 37°C. The Root Mean Square Deviation (RMSD), Radius of Gyration, Relative Shape Anisotropy, secondary structure analysis, and Solvent Accessible Surface Area (SASA) are analyzed to give a better understanding of the ELPs behavior. The RMSD results show that the ELP folding stabilized within the 50 ns. The simulations taught us that an increase in temperature above the T_t results in a more compact and more spherical ELP due to increased hydrophobicity. This causes more and stronger hydrophobic interactions between the ELPs, which is an important driver of the hydrogel formation.

Introduction

Molecular dynamics (MD) simulations are commonly used to see how proteins or other molecules move over time. This can contribute to the understanding of molecular systems, which cannot be seen by the eye or even with the best microscopes. There are many different software packages available that can be used for MD simulations, each with its own advantages and limitations. However, they are all based on the same principle: calculating the force between all atoms in the system with their known positions. At every time step, the positions and velocities are updated to form a trajectory of the atoms. This timestep is in the range of a few femtoseconds (10^-15 s), which is needed to keep the system numerically stable. That means that the error between the predicted and true value is limited. On molecular scale, a simulation can show how proteins fold to reduce the energy in the system, or how proteins interact with each other^[1].

In our application of forming a hydrogel with Elastin-Like Polypeptides (ELPs), it can be useful to see how the ELPs will behave at different temperatures. Above a certain T_t, there is a disruption in the thermal energy of the water molecules around the ELPs, which causes the water molecules to move away from the ELPs. This makes room for interactions between the ELPs. It has been found that there will be conformational changes in another ELP sequence when the temperature rises above T_t, which makes them more ordered and increases the hydrophobicity^[2].

It is interesting to see how this applies to our own ELP, which helps in the characterization of our main composite part, BBa_K4905006. The MD simulation will give a better understanding of the differences in the structure below and above the T_t and it will help with understanding what the ELP actually looks like. We will analyze different properties of the proteins, like the secondary structure formation, compactness, and the surface area that is accessible for the solvent. This will give information about how the conformation differs at specific temperatures below and above the T_t. It is expected that the ELP will be more compac, have more secondary structure formations and that it decreases in surface area above the T_t^[2].

Methods

MD simulations are typically computationally expensive. This means that it takes a lot of time to run the simulation. For this reason, we shortened our ELP sequence to make the computational time manageable. We chose to shorten the hydrophobic parts to I_[10] instead of I_[60] and to shorten the hydrophilic part to A_[50] instead of A_[120] or A_[100], as shown in Figure 1. The Leucine zippers stay the same. In this length reduction, the proportions are not the same as in the original ELP. However, we wanted to make sure that the hydrophobic parts were not shorter than the Leucine zippers. This maintains a greater distance between the Leucine zippers and the hydrophilic part, and it prevents unrealistic folding. The hydrophilic part is kept long, since this part gives the ELP its length and it will provide relevant information on how the protein behaves when it folds. The choice to change the sequence might affect the outcome a bit, however we still epxect to find relevant information on how the ELP behaves at different temperatures.

Figure 1: ELP construct used for the MD simulation. In this construct, the hydrophobic VPGIG parts are repeated 10 times instead of 60 times and the hydrophilic part in the middle is repeated 10 times instead of 24 or 20 times. At the ends, the two different leucine zippers are located, which remain the same.

Click on the dropdown menus below to see more information about the methods.

In the MD simulation, we used VMD 1.9.4 software for the visualization and NAMD 2.14 software for the simulation itself. This choice is based on the paper from López Barriero et al. (2023)^[3] who also used this software in a related topic. We had a meeting with one of the authors of this paper, Diego López Barreiro, who explained that they chose this combination of software because they were made by the same people and thus have a more integrated workflow. Since there are also clear tutorials available, we chose to use this software. More information about this meeting can be found on the Human Practices Page.

For the MD simulation, different input files are needed. A more detailed description of these input files can be found in Figure 2. PyMol is used to make the PDB files of the ELP, with a FASTA file of the amino acid sequence as input. The PDB file is then imported in VMD, where the PSF file is made with integrated option of the automatic PSF builder. The CHARMM force field topology file for proteins is used in this step, with the corresponding parameter file for the next steps. For the different parts of the MD simulation, configuration files are made. This is the direct input in the MD simulation^[4]. The FASTA, PDB, PSF, and an example of the configuration file can be found in our Gitlab Repository.

Figure 2: Graphic representation of the workflow to make the MD simulation, including the input and output files of the applications that are used. More information about the different files can be found in the PDF file below.

You can download the file formats here.

A generalized born implicit solvent is used to fold the protein. In an implicit solvent simulation, the solvent (in this case water) molecules are not individually modeled, but with an approximation of the mean force from the solvent on the protein. Different theories can be used for this, and we chose to use the generalized born model. Implicit solvent simulations are less computationally expensive compared to a explicit solvent simulations that have individually simulated water molecules. Furthermore, explicit solvent simulations are more detailed than a simulation in vacuum^[5]. The MD simulation is performed at temperatures of 3 °C, 15 °C, 25 °C, and 37 °C, since the transition temperature was found to be 20 °C in the conducted UV-VIS experiments. The simulations are run in triplet to increase the reliability of the results. This is necessary, since the initial velocities are chosen by the NAMD software and thus differ every time. Different velocities can cause differences in the final stable conformations, which can affect the results of the analytical methods.

Just like in López Barreiro et al., first energy minimization is done for 20,000 time steps. Then, Langevin dynamics is used with Generalized Born implicit solvent. The simulation step was 2 fs with a simulation time of 50 ns. The cutoff distance is 18 Å, the switch distance 16 Å^[3].

The results are analyzed with Root Mean Square Distance (RMSD), Radius of Gyration and relative shape anisotropy, Ratio of Secondary Structures, and Solvent Accessible Surface Area (SASA) to show the different behaviors of the ELP at different temperatures. They are calculated with the MDTraj Python library. For the analysis, methods are calculated for the last 5 ns, which contain 1000 frames. This is chosen based on the RMSD results, where a stable conformation is shown at all temperatures. Even when the conformation is stable, there are still some small conformation differences. These differences are taken into account with this distribution of the results of the last 5 ns.

RMSD

RMSD can be used to measure the difference in position between the backbone of a protein and its initial structural conformation. This will give a measure about if and when the folding is finished and how stable the final conformation is. The individual frame from the final conformation and its initial conformation are first positioned as close as possible to each other with a least-squares fit^[6]. As shown in Figure 3a, RMSD calculates for each atom the distance between these two structures, sums them and divides it through the number of atoms. Lastly, the square root is taken which results in the RMSD. During the folding process, the RMSD curve rises until the point where the folding stops and the conformation is stable^[7].

Radius of Gyration and shape

The radius of gyration is a measure of how compact a protein is. It is used in many applications, but for proteins it is the same as the root-mean-square distance of the atoms to the center of mass^[8]. Figure 3b schematically shows what this will look like. With this information, it indicates the total size of the chain molecule. This can be used as a measure for the variation of the structure of the protein during the MD simulation^[9]. The radius of gyration will start very high at the beginning of the MD simulation and will drop until the protein comes into a stable conformation. Literature taught us that there is a drop in the average radius of gyration around the critical temperature when comparing the stable regions of the protein at different temperatures^[2].

To get more insight in the compactness and shape of the ELPs, we will also compute the relative shape anisotropy. In this method, the result will be between 0 (symmetric sphere) and 1 (linear chain)^[10]. This is also schematically shown in Figure 3c. It is calculated with the eigenvalues of the gyration tensor, which is a matrix with the sum of the differences between the atoms and the center of mass in the x, y, and z direction^[11]. This is the same matrix from which the radius of gyration is derived.

Secondary structure analysis

With an analysis of the secondary structure, we can assess how structured the ELP is at the different temperatures. The DSSP method is used, this is standard method for assigning the secondary structure to the amino acids of a protein, given the atomic-resolution coordinates of the protein. In this method, each residue is classified into coils, α-helices, or β-sheets. For every frame, the ratio of them is calculated by dividing them with the total amount of residues. A schematic overview of this is shown in Figure 3d. It results in three plots with the ratios of coils, α-helices and β-sheets at different temperatures. It is observed in literature that ELPs form more secondary structures at higher temperatures^[2].

SASA

SASA is a way to define the peptide surface and interior. As shown in Figure 3e, it measures the surface area of the protein that is accessible by solvent molecules^[12]. A Vanderwaals border can first be defined at the surface of the protein. A sphere representing the solvent molecule can then be fitted around this surface, where the center of it will define the SASA surface. It is suggested in the literature that the behavior of ELPs related to the transition temperature abruptly decreases SASA^[2]. It is also suggested that there is less heterogeneity in the structure when the temperature is increased.

Figure 3: Graphic representation of the analysis methods used for the MD simulation, with a) the RMSD, b) the radius of gyration, c) the relative shape anisotropy, d) the secondary structure analysis, and e) SASA.

Results

In Figure 4, a video is shown from one of the simulations at 25°C. Not all videos are included, since the process is comparable for the other simulations, especially when comparing them by eye. Some frames of those simulations can be found in the file below. When comparing the simulations at the four different temperatures, they all start with folding at the ends of the ELP. This is in line with our expectations since the ends of the proteins are the most hydrophobic. Some of the simulations also show the folding of the ELP in the middle at the more hydrophilic part. However, there is no clear relationship between temperature and this folding behavior, which makes the reason for it unclear.

Figure 4: Video with one of the MD simulations at 25°C. The red residues represent the Leucine zipper, the blue residues the hydrophobic parts, and the light-blue residues represent the hydrophilic parts.

You can download the simulation frames here.

Click on the buttons below to see the results for each analysis method.

RMSD

Radius of Gyration

Secondary Structure

SASA

The radius of gyration results of the individual simulations are shown in Figure 5a. The individual simulation results of 3 and 37°C show large differences in the results. This can be caused by the different velocities that are chosen for the single simulations. Since the individual simulations of 3 and 37°C differ a lot, the total boxplot in Figure 5b shows a large distribution of the results.

Figure 5: Radius of gyration results with a) three individual simulations for each of the four temperatures and b) the overall results with the different simulations per temperature taken together.

There is a clear difference between the radius of gyration at 15°C and 25°C, which are just below and just above the transition temperature. When looking at the median at 37°C, these results also show a lower radius of gyration than 15°C. At 3°C, the radius of gyration is very low for two of the simulations compared to the other temperatures. When looking at the final conformations of the simulations, this can be caused by the shape of the stable configurations. When the shape is more cylindrical instead of spherical, which seems especially the case for the simulations at 15°C, the radius of gyration will increase. As mentioned in the methods section, we thus also computed the relative shape anisotropy.

Figure 6: Relative shape anisotropy results with a) three individual simulations for each of the four temperatures and b) the overall results with the different simulations per temperature taken together.

In Figure 6, the relative shape anisotropy is shown for the individual simulations. There is clearly a difference in the shape between the different simulations in Figure 6a, also in those with the same temperature. For example 25°C, which has a clear variability between the simulations. Some of the differences were already expected after the analysis of the radius of gyration, like the second simulation at 3°C that is clearly less spherical than the other two simulations at that temperature. Overall, when looking at Figure 6b, the shape of the ELP at 15°C is less spherical than at 25°C. Because of the large variabilities between the measurements at 3 and 37°C, there cannot be concluded how this is affected at respectively lower and higher temperatures.

This analysis suggests that there is a difference in the compactness of the ELP below and above the transition temperature. Below the transition temperature, the ELP is less compact. At a temperature above the transition temperature, the ELP will become more hydrophobic with fewer solvent molecules around it. This makes the folded ELP more compact, but also more likely to induce interactions with other ELPs. However, for harder conclusions with the role of solvent molecules included, explicit simulations are needed in the future.

Conclusion

The MD simulation of the folding of a shortened version of our ELP sequence taught us that an increase in temperature above the T_t results in a more compact and more spherical ELP due to increased hydrophobicity. This causes more and stronger interactions between the ELPs, which is an important driver of the hydrogel formation^[2].

The greatest difference in results is found for the radius of gyration and relative shape anisotropy between 15 and 25°C. SASA also shows a small difference, but this is not very pronounced. We have found no difference in the secondary structure of the protein between the different temperatures. When comparing the triplicate simulations, there are large differences for especially 3°C but also 37°C. This makes these results less reliable, but there still seems to be a difference in properties between 15 and 37°C. In further research, explicit solvent simulations can be used with more simulated temperatures. This would give more details for better and possibly more convincing results. Also, more extensive research can be done on the interactions between the ELPs at different temperatures and conditions.

[1] S. A. Hollingsworth and R. O. Dror, “Molecular dynamics simulation for all,” Neuron, vol. 99, no. 6, p. 1129, Sep. 2018, doi: 10.1016/J.NEURON.2018.08.011.

[2] N. K. Li, F. G. Quiroz, C. K. Hall, A. Chilkoti, and Y. G. Yingling, “Molecular description of the lcst behavior of an elastin-like polypeptide,” Biomacromolecules, vol. 15, no. 10, pp. 3522–3530, Oct. 2014, doi: 10.1021/BM500658W/SUPPL_FILE/BM500658W_SI_001.PDF.

[3] D. López Barreiro, A. Folch-Fortuny, I. Muntz, J. C. Thies, C. M. J. Sagt, and G. H. Koenderink, “Sequence Control of the Self-Assembly of Elastin-Like Polypeptides into Hydrogels with Bespoke Viscoelastic and Structural Properties,” Biomacromolecules, vol. 24, no. 1, pp. 489–501, Jan. 2023, doi: 10.1021/ACS.BIOMAC.2C01405/ASSET/IMAGES/LARGE/BM2C01405_0003.JPEG.

[4] T. Isgro et al., “NAMD TUTORIAL,” 2017, Accessed: Sep. 07, 2023. [Online]. Available: http://www.ks.uiuc.edu/Training/Tutorials/

[5] J. Kleinjung and F. Fraternali, “Design and application of implicit solvent models in biomolecular simulations,” Curr Opin Struct Biol, vol. 25, no. 100, p. 126, 2014, doi: 10.1016/J.SBI.2014.04.003.

[6] K. Sargsyan, C. Grauffel, and C. Lim, “How Molecular Size Impacts RMSD Applications in Molecular Dynamics Simulations,” J Chem Theory Comput, vol. 13, no. 4, pp. 1518–1524, Apr. 2017, doi: 10.1021/ACS.JCTC.7B00028/ASSET/IMAGES/LARGE/CT-2017-000289_0005.JPEG.

[7] I. Aier, P. K. Varadwaj, and U. Raj, “Structural insights into conformational stability of both wild-type and mutant EZH2 receptor,” Scientific Reports 2016 6:1, vol. 6, no. 1, pp. 1–10, Oct. 2016, doi: 10.1038/srep34984.

[8] A. Rudin and P. Choi, “Practical Aspects of Molecular Weight Measurements,” The Elements of Polymer Science & Engineering, pp. 89–148, Jan. 2013, doi: 10.1016/B978-0-12-382178-2.00003-1.

[9] S. Ghahremanian, M. M. Rashidi, K. Raeisi, and D. Toghraie, “Molecular dynamics simulation approach for discovering potential inhibitors against SARS-CoV-2: A structural review,” J Mol Liq, vol. 354, p. 118901, May 2022, doi: 10.1016/J.MOLLIQ.2022.118901.

[10] J. Iwata and T. Ando, “Molecular Dynamics Study on Behavior of Resist Molecules in UV-Nanoimprint Lithography Filling Process,” Nanomaterials, vol. 12, no. 15, Aug. 2022, doi: 10.3390/NANO12152554/S1.

[11] H. Arkin and W. Janke, “Gyration tensor based analysis of the shapes of polymer chains in an attractive spherical cage,” Journal of Chemical Physics, vol. 138, no. 5, p. 54904, Feb. 2013, doi: 10.1063/1.4788616/193129.

[12] “How to compute the Solvent Accessible Surface Areas (SASA) with GROMACS - Compchems.” Accessed: Sep. 18, 2023. [Online]. Available: https://www.compchems.com/how-to-compute-the-solvent-accessible-surface-areas-sasa-with-gromacs/#solvent-accessible-surface-area-sasa

Simulation

Simulation

Abstract

Introduction

Methods

Software

Implicit solvent simulation

Analysis methods

Results

Conclusion

References