Dynamics Simulation



Purpose

The core of our project is a dodecahedral protein cage assembled from 60 Mi3 subunits. Because accurate detection depends on the protein components remaining stable on the test paper, our primary objective is to investigate the thermal stability of this protein cage. We also use specific protein coupling systems, and we apply molecular dynamics simulations to assess their stability in aqueous solution. The results of this analysis guide our selection of coupling elements.

Method

Protein modeling is an essential step before molecular dynamics simulation. In our project, we used the online tool ColabFold to predict the structure of the fusion protein. To minimize the influence of linker choice on the predicted structures, we used only GGS linkers to separate the fused proteins during modeling. Protein structures were visualized with PyMOL and VMD. Molecular dynamics simulations were then run with GROMACS on the AutoDL server platform.

Molecular dynamics simulation is a computational technique that follows the motion of atoms and molecules using classical mechanics and a potential energy function assigned to them. This potential energy is defined by parameters for bonded interactions (bond stretching, bond angles, and dihedral angles) and non-bonded interactions (van der Waals and electrostatic). These parameters are typically fitted to experimental data and to quantum mechanical calculations, which makes molecular dynamics simulation a valuable and practical tool. Below, we give a detailed overview of the main workflow using GROMACS, a widely used molecular dynamics simulation package.
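As a rough illustration of the terms listed above, the sketch below evaluates one bonded term (a harmonic bond) and the two non-bonded terms (Lennard-Jones and Coulomb). The functional forms are common textbook conventions written in GROMACS units (nm, kJ/mol, elementary charges). The function names and all numerical parameters are illustrative assumptions, not values taken from the force field used in our simulations.

```python
# Electric conversion factor 1/(4*pi*eps0) in GROMACS units (kJ mol^-1 nm e^-2).
COULOMB_CONSTANT = 138.935458


def harmonic_bond(b, b0, kb):
    """Bond stretching energy 0.5 * kb * (b - b0)^2, in kJ/mol."""
    return 0.5 * kb * (b - b0) ** 2


def lennard_jones(r, sigma, epsilon):
    """van der Waals energy 4*eps*[(sigma/r)^12 - (sigma/r)^6], in kJ/mol."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj):
    """Electrostatic energy of two point charges in vacuum, in kJ/mol."""
    return COULOMB_CONSTANT * qi * qj / r


# Made-up parameters, chosen only to show how the terms are evaluated.
print("bond   :", harmonic_bond(b=0.16, b0=0.15, kb=250000.0))
print("LJ     :", lennard_jones(r=0.35, sigma=0.32, epsilon=0.5))
print("Coulomb:", coulomb(r=0.35, qi=0.4, qj=-0.4))
```

A real force field sums such terms over all bonds, angles, dihedrals, and non-excluded atom pairs, and may include further terms that are not shown here.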


Fig. 1. A typical MD simulation workflow for proteins

As shown in Fig. 1, the workflow of MD simulations is typically divided into four main parts: (1) system preparation, (2) equilibration, (3) production simulation, and (4) analysis.

Step 1: Obtain the Initial Structure

The PDB file of the main protein can be retrieved from the RCSB Protein Data Bank (PDB). PyMOL can then be used to remove any unnecessary structures or components from the protein. If amino acid residues are missing, CHARMM-GUI can be used to model them in. CHARMM-GUI is a web-based graphical interface that offers tools for protein modeling, refinement, and simulation setup with the CHARMM force field, and it prepares the protein structure for further analysis or simulation. If the protein of interest has no structure in the database, an initial structure can be obtained through homology modeling or ab initio modeling. In this project, the structures of certain fusion proteins were predicted using ColabFold.

Step 2: Generate the Protein Topology

The PDB file is processed with GROMACS to generate the protein's topology. The topology contains the parameters for both non-bonded interactions (atom types, charges) and bonded interactions (bonds, angles, dihedrals) of the protein in the simulation. These parameters come from a force field, which consists of the equations and associated constants that describe the physics of the system. In our model, we use the CHARMM36 force field.
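The topology step can be sketched as a call to `gmx pdb2gmx`. The small Python helper below only assembles the command line; the helper name, the file names, and the force-field directory name `charmm36` are placeholders (assumptions), because the exact name depends on which CHARMM36 port is installed alongside GROMACS.

```python
import subprocess


def pdb2gmx_command(pdb_in, force_field="charmm36", water="tip3p"):
    """Build the `gmx pdb2gmx` call that writes coordinates and a topology.

    `force_field` must match a force-field directory visible to GROMACS
    (given without the .ff suffix); "charmm36" is only a placeholder.
    """
    return [
        "gmx", "pdb2gmx",
        "-f", pdb_in,            # input structure
        "-o", "processed.gro",   # processed coordinates (hypothetical name)
        "-p", "topol.top",       # generated topology (hypothetical name)
        "-ff", force_field,      # force field to use
        "-water", water,         # water model written into the topology
    ]


cmd = pdb2gmx_command("fusion_protein.pdb")
print(" ".join(cmd))
# On a machine with GROMACS installed, the command could be run with:
# subprocess.run(cmd, check=True)
```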

Step 3: Define Box and Solvate

In this step, we build a simulation box around the protein based on its dimensions. The box encloses the protein structure, and water molecules are added inside the box using GROMACS commands. We use the TIP3P water model.
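A minimal sketch of this step uses `gmx editconf` to define the box and `gmx solvate` to fill it with water. The helper name, the file names, the 1.0 nm margin, and the cubic box type are illustrative assumptions, since the text does not state the values we used. `spc216.gro` is the pre-equilibrated three-site water box shipped with GROMACS, which is commonly used for TIP3P as well.

```python
def box_and_solvate_commands(gro_in, margin_nm=1.0, box_type="cubic"):
    """Return the editconf and solvate command lines as lists of arguments."""
    editconf = [
        "gmx", "editconf",
        "-f", gro_in,
        "-o", "boxed.gro",
        "-c",                    # center the protein in the box
        "-d", str(margin_nm),    # minimum distance between protein and box edge
        "-bt", box_type,         # box shape
    ]
    solvate = [
        "gmx", "solvate",
        "-cp", "boxed.gro",      # solute configuration
        "-cs", "spc216.gro",     # solvent configuration
        "-o", "solvated.gro",
        "-p", "topol.top",       # topology is updated with the added water
    ]
    return editconf, solvate


for command in box_and_solvate_commands("processed.gro"):
    print(" ".join(command))
```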

Step 4: Add Neutralizing Ions

To mimic the protein's environment in aqueous solution more accurately, sodium or chloride ions are added in this step to neutralize the net charge of the protein. Other ion types can also be chosen, and a target ion concentration can be specified instead of adding only the neutralizing ions.
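The arithmetic behind this step can be sketched as follows: counter-ions cancel the net charge, and optional salt pairs approximate a target concentration. The function name and the example numbers are assumptions for illustration. In practice `gmx genion` does this bookkeeping itself (for example through its `-neutral` and `-conc` options), and its result may differ slightly from this estimate.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def ion_counts(net_charge, box_volume_nm3=0.0, salt_molar=0.0):
    """Return (n_sodium, n_chloride) for a monovalent salt.

    Counter-ions first cancel the protein's net charge; salt pairs are then
    added to approximate the requested bulk concentration in the box volume.
    """
    litres = box_volume_nm3 * 1e-24           # 1 nm^3 = 1e-24 L
    pairs = round(salt_molar * litres * AVOGADRO)
    n_na = pairs + max(-net_charge, 0)        # Na+ offsets a negative protein
    n_cl = pairs + max(net_charge, 0)         # Cl- offsets a positive protein
    return n_na, n_cl


# Hypothetical example: net charge of -6 e in a 1000 nm^3 box at 0.15 mol/L salt.
print(ion_counts(-6, box_volume_nm3=1000.0, salt_molar=0.15))
```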

Step 5: Energy Minimization

To prevent the simulation from failing because of large forces caused by steric clashes or inappropriate geometries, these problems must be removed by minimizing the system's energy. Minimization also helps refine low-resolution experimental structures. In our project, the initial energy minimization step size is set to 0.01 nm. If the system is difficult to converge, a smaller step size can be tried, or problems in the previously constructed structure should be fixed. Minimization stops when the maximum force falls below 10.0 kJ/(mol·nm).
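The stopping rule and the role of the step size can be illustrated with a toy steepest-descent minimizer. This is a simplified sketch in the spirit of the steepest-descent scheme described in the GROMACS manual (move along the force, grow the step after an accepted move, shrink it after a rejected one), not the GROMACS implementation. The one-dimensional two-particle "bond" and its parameters are made up for the example.

```python
def steepest_descent(x, energy_and_forces, emstep=0.01, emtol=10.0, nsteps=5000):
    """Move along the forces until the largest force component is below emtol."""
    h = emstep
    energy, forces = energy_and_forces(x)
    for step in range(nsteps):
        fmax = max(abs(f) for f in forces)
        if fmax < emtol:                      # converged: forces are small enough
            return x, energy, fmax, step
        trial = [xi + fi / fmax * h for xi, fi in zip(x, forces)]
        e_trial, f_trial = energy_and_forces(trial)
        if e_trial < energy:                  # accept the move and grow the step
            x, energy, forces = trial, e_trial, f_trial
            h *= 1.2
        else:                                 # reject the move and shrink the step
            h *= 0.5
    return x, energy, max(abs(f) for f in forces), nsteps


def bond_energy_and_forces(x, k=1000.0, b0=0.15):
    """Toy system: two particles on a line joined by a harmonic bond."""
    d = x[1] - x[0]
    stretch = abs(d) - b0
    direction = 1.0 if d >= 0 else -1.0
    f1 = -k * stretch * direction             # force on the second particle
    return 0.5 * k * stretch ** 2, [-f1, f1]


x, energy, fmax, steps = steepest_descent([0.0, 0.30], bond_energy_and_forces)
print(f"bond length {x[1] - x[0]:.4f} nm after {steps} steps, Fmax = {fmax:.3f}")
```

Here each coordinate stands for one particle, so the largest force component plays the role of the maximum force on any atom.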

Step 6: Restrained Equilibration

Before the protein is allowed to move freely, the solvent molecules and ions around it must be equilibrated while the protein is held close to its starting structure. If an unrestrained simulation is started directly, the system may become unstable and the structure may collapse. Position restraints are applied through a harmonic potential on each heavy atom of the protein. The temperature is first brought to the target value in the NVT ensemble, and the pressure is then brought to the target value in the NPT ensemble.
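The harmonic potential applied to each heavy atom can be written as V = 0.5 k |r - r_ref|^2, with a restoring force F = -k (r - r_ref). The sketch below evaluates both for one atom. The function name is hypothetical, and the force constant of 1000 kJ/(mol·nm²) is an assumption based on the value that GROMACS tools commonly write by default, since the value used in our simulations is not stated here.

```python
def position_restraint(r, r_ref, k=1000.0):
    """Return (energy, force) of a harmonic position restraint on one atom.

    r and r_ref are (x, y, z) positions in nm; k is in kJ/(mol nm^2).
    """
    disp = [a - b for a, b in zip(r, r_ref)]
    energy = 0.5 * k * sum(d * d for d in disp)
    force = [-k * d for d in disp]            # pulls the atom back to r_ref
    return energy, force


# An atom displaced by 0.1 nm along x from its reference position.
print(position_restraint([1.1, 2.0, 3.0], [1.0, 2.0, 3.0]))
```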

Step 7: Equilibration without Position Restraints

In the second stage of equilibration, the position restraints on the protein are removed, and the simulation is run in the NPT ensemble.

Step 8: Production Simulation

During the production simulation, the equilibrated system is used as input, and the parameters are the same as in the equilibration without position restraints, except for the simulation time. To ensure reliable results, the convergence of thermodynamic quantities such as temperature, pressure, and potential energy should be verified, as in the equilibration stage.
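One simple way to look for a lack of convergence is to fit a straight line to a thermodynamic time series and inspect the slope: a slope that is large compared with the fluctuations suggests the quantity is still drifting. The sketch below is only one possible check, the function name is hypothetical, and the numbers are made up.

```python
def linear_drift(times, values):
    """Least-squares slope of values against times (value units per time unit)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


# Made-up temperature samples (K) at 0, 1, 2 and 3 ns.
print(linear_drift([0.0, 1.0, 2.0, 3.0], [299.8, 300.1, 299.9, 300.2]))
```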

Step 9: Analysis

GROMACS comes with a variety of analysis tools, and in our project, we primarily utilized the following methods: 

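1. RMSD
The root mean square deviation (RMSD) is defined by the following equation,

$$\mathrm{RMSD}(t) = \sqrt{\frac{1}{M}\sum_{i=1}^{N} m_i \left\| \mathbf{r}_i(t) - \mathbf{r}_i^{\mathrm{ref}} \right\|^2}$$

where $N$ is the number of atoms, $m_i$ is the mass of atom i, $M = \sum_{i=1}^{N} m_i$ is the total mass, $\mathbf{r}_i(t)$ is the position of atom i at time t, and $\mathbf{r}_i^{\mathrm{ref}}$ is its position in the reference structure after least-squares fitting.

RMSD quantifies the deviation between the protein's conformation during the simulation and a reference structure (typically a crystal structure). We compute it with the RMSD analysis tool in GROMACS. It measures the average distance between the atoms of the simulated protein and the corresponding atoms of the reference structure, and thus provides a metric of structural similarity or deviation.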
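As a minimal sketch of this calculation, the function below computes a mass-weighted RMSD between two conformations with the same atom order. It assumes the two structures have already been superimposed, and it uses equal weights when no masses are given. The function name and the coordinates are illustrative assumptions.

```python
import math


def rmsd(coords, ref_coords, masses=None):
    """Mass-weighted RMSD between two pre-fitted conformations."""
    if masses is None:
        masses = [1.0] * len(coords)          # equal weights as a fallback
    total_mass = sum(masses)
    weighted = sum(
        m * sum((a - b) ** 2 for a, b in zip(r, r_ref))
        for m, r, r_ref in zip(masses, coords, ref_coords)
    )
    return math.sqrt(weighted / total_mass)


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]   # every atom moved 1 nm along z
print(rmsd(shifted, reference))
```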

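2. RMSF
The root mean square fluctuation (RMSF) of atom i is defined as follows,

$$\mathrm{RMSF}_i = \sqrt{\frac{1}{T}\sum_{t=1}^{T} \left\| \mathbf{r}_i(t) - \langle \mathbf{r}_i \rangle \right\|^2}$$

where $T$ is the number of frames, $\mathbf{r}_i(t)$ is the position of atom i in frame t, and $\langle \mathbf{r}_i \rangle$ is its time-averaged position.

RMSF measures how much each atom's position fluctuates around its average over time. It characterizes the flexibility and extent of motion of individual amino acid residues over the entire simulation.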
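A minimal sketch of a per-atom RMSF calculation is given below: for each atom, it averages the position over all frames and then takes the root mean square deviation from that average. It assumes the frames have already been fitted to a common reference. The function name and the two-frame example are illustrative assumptions.

```python
import math


def rmsf(frames):
    """Per-atom RMSF; frames is a list of frames, each a list of (x, y, z)."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation of atom i from its average position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf(frames))  # the first atom is rigid, the second one fluctuates
```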

3. Radius of Gyration
The radius of gyration is defined as follows,

$$R_g = \sqrt{\frac{\sum_i m_i \left\| \mathbf{r}_i \right\|^2}{\sum_i m_i}}$$

where $m_i$ and $\mathbf{r}_i$ are the mass and position of atom i with respect to the center of mass of the molecule, respectively. $R_g$ is the mass-weighted root mean square distance of the protein's atoms from its center of mass. This parameter describes the size or compactness of the protein during the simulation. For a cage protein, whose atoms lie in a thin shell, $R_g$ is approximately equal to the radius of the cage.
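The definition can be checked on a toy "cage": six equal masses placed on a sphere of radius 5 nm should give a radius of gyration of 5 nm. The sketch below is a direct transcription of the mass-weighted definition; the function name and the coordinates are made up for illustration.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of the atoms from their center of mass."""
    total = sum(masses)
    com = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)
    ]
    weighted = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


# Six unit masses on the axes, all 5 nm from the origin (a crude hollow shell).
shell = [(5, 0, 0), (-5, 0, 0), (0, 5, 0), (0, -5, 0), (0, 0, 5), (0, 0, -5)]
print(radius_of_gyration(shell, [1.0] * 6))
```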

4. Radial Distribution Function
In statistical mechanics, the Radial Distribution Function (RDF) refers to the probability distribution of finding a particle at a given distance from a reference particle. In our project, we use this function to describe the thickness of the protein cage. The RDF provides information about the spatial arrangement and distribution of particles around the reference particle, helping us understand the protein's structural characteristics and interactions with its surroundings.
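To show how a radial profile can reveal the thickness of a cage, the sketch below bins atoms by their distance from a chosen center and divides each count by the volume of its spherical shell. This is a simplified density profile, not the full RDF (it is not normalized by the bulk density), and it is not the GROMACS implementation. The function name, bin width, and coordinates are illustrative assumptions.

```python
import math


def radial_density(coords, center, bin_width, r_max):
    """Return a list of (shell midpoint, number density) around center."""
    n_bins = int(round(r_max / bin_width))
    counts = [0] * n_bins
    for r in coords:
        idx = int(math.dist(r, center) / bin_width)
        if idx < n_bins:                      # ignore atoms beyond r_max
            counts[idx] += 1
    profile = []
    for i, count in enumerate(counts):
        r_in, r_out = i * bin_width, (i + 1) * bin_width
        shell_volume = 4.0 / 3.0 * math.pi * (r_out ** 3 - r_in ** 3)
        profile.append(((r_in + r_out) / 2.0, count / shell_volume))
    return profile


# Toy cage: six atoms 5.5 nm from the center; only one shell is populated.
cage = [(5.5, 0, 0), (-5.5, 0, 0), (0, 5.5, 0), (0, -5.5, 0), (0, 0, 5.5), (0, 0, -5.5)]
for midpoint, density in radial_density(cage, (0.0, 0.0, 0.0), 1.0, 10.0):
    print(midpoint, density)
```

For a real cage, the range of distances over which the density is clearly non-zero gives an estimate of the wall thickness.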

5. Dictionary of Protein Secondary Structure (DSSP)
In molecular dynamics simulation studies, it is common to analyze the changes in protein secondary structure, which includes α-helices, β-sheets, turns, bends, coils, and more. In our project, we utilized the relevant program in GROMACS to calculate the variations in the secondary structure of certain proteins throughout the entire simulation process. This analysis helps us understand how the protein's secondary structure evolves and responds to the simulated environment.

6. Solvent-Accessible Surface Area
The solvent-accessible surface area (SASA) directly shows how the protein's exposure to the solvent changes during the simulation, and it is also related to protein stability. A lower SASA generally indicates a more compact fold, which is often associated with higher thermodynamic stability.