
Bioinformatics

In this part, we first constructed a metabolic pathway map of our engineered bacteria based on KEGG. We then used bioinformatics methods to analyze the structure, stability, interactions and active centers of the fusion protein we constructed.

Metabolic pathway map construction

The metabolic pathway of curcumin synthesis involves several enzymatic steps. 4-coumaroyl-CoA ligase (4CL) catalyzes the conversion of ferulic acid to feruloyl-CoA, and acetyl-CoA carboxylase (ACC) catalyzes the conversion of acetyl-CoA to malonyl-CoA. Diketide-CoA synthase (DCS) then condenses feruloyl-CoA and malonyl-CoA into a β-keto intermediate (feruloyldiketide-CoA), and curcumin synthase (CURS) combines this intermediate with a second molecule of feruloyl-CoA to form curcumin. Based on this pathway, we planned to clone the key enzyme genes 4CL, ACC, DCS and CURS into E. coli to construct a curcumin-producing engineered strain. We represented curcumin production with the KEGG pathway map00945 (Stilbenoid, diarylheptanoid and gingerol biosynthesis), with some modifications, and highlighted the steps specific to our design in yellow.
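The pathway described above can also be written down as a small data structure, which makes it easy to check which enzymes and which external substrates a design depends on. The following is a minimal sketch in Python, not part of our actual workflow; the label "beta-keto intermediate" for the DCS product and the helper names are our own illustrative choices.

```python
# Each reaction is (enzyme, substrates, products), following the text above.
# "beta-keto intermediate" is an illustrative label for the DCS product.
PATHWAY = [
    ("4CL", ["ferulic acid"], ["feruloyl-CoA"]),
    ("ACC", ["acetyl-CoA"], ["malonyl-CoA"]),
    ("DCS", ["feruloyl-CoA", "malonyl-CoA"], ["beta-keto intermediate"]),
    ("CURS", ["feruloyl-CoA", "beta-keto intermediate"], ["curcumin"]),
]


def external_inputs(pathway):
    """Return metabolites that are consumed but never produced by the pathway."""
    produced = {p for _, _, products in pathway for p in products}
    consumed = {s for _, substrates, _ in pathway for s in substrates}
    return sorted(consumed - produced)


def enzymes_required(target, pathway):
    """Walk backwards from a target metabolite and collect the enzymes needed."""
    producers = {
        product: (enzyme, substrates)
        for enzyme, substrates, products in pathway
        for product in products
    }
    needed, seen, stack = set(), set(), [target]
    while stack:
        metabolite = stack.pop()
        if metabolite in seen or metabolite not in producers:
            continue  # already visited, or supplied from outside the pathway
        seen.add(metabolite)
        enzyme, substrates = producers[metabolite]
        needed.add(enzyme)
        stack.extend(substrates)
    return sorted(needed)


if __name__ == "__main__":
    print("External inputs:", external_inputs(PATHWAY))
    print("Enzymes for curcumin:", enzymes_required("curcumin", PATHWAY))
```

With this encoding, ferulic acid and acetyl-CoA come out as the only external inputs, and all four enzymes are required to reach curcumin.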


Computational structural biology

Introduction

DCS and CURS are the key rate-limiting enzymes for curcumin synthesis [1]; both were identified in turmeric (Curcuma longa), a member of the ginger family. In this project, we joined the protein sequences of these two enzymes with a GSG linker to construct a fusion expression system in Escherichia coli. This improved the output of the two enzymes and thereby increased the curcumin yield.

Linker Determination

The design principles and evaluation process for a linker between two proteins in a fusion construct are complicated and require in-depth expertise, so designing a linker from scratch is very challenging. We therefore did not attempt an in-depth design and instead adopted a linker recommended in the literature. The linker sequence we selected acts as a flexible connection, allowing the two domains to move freely while the structure remains stable.
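As an illustration of how such a construct is assembled at the sequence level, the sketch below joins two protein sequences with a GSG linker and records which residue range each part occupies. The function name, the DCS-linker-CURS order (taken from the Figure 1 caption) and the short placeholder sequences are assumptions for demonstration only; they are not the real DCS or CURS sequences.

```python
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def build_fusion(n_part, c_part, linker="GSG"):
    """Join two protein sequences with a linker.

    Returns the fusion sequence and the 1-based, inclusive residue
    range of each part within the fusion.
    """
    parts = {"n_part": n_part.upper(), "linker": linker.upper(), "c_part": c_part.upper()}
    for name, seq in parts.items():
        invalid = set(seq) - VALID_AA
        if not seq or invalid:
            raise ValueError(f"{name} is empty or has invalid residues: {sorted(invalid)}")

    fusion = parts["n_part"] + parts["linker"] + parts["c_part"]
    n_end = len(parts["n_part"])
    linker_end = n_end + len(parts["linker"])
    regions = {
        "n_part": (1, n_end),
        "linker": (n_end + 1, linker_end),
        "c_part": (linker_end + 1, len(fusion)),
    }
    return fusion, regions


if __name__ == "__main__":
    # Placeholder sequences only; substitute the real DCS and CURS sequences.
    dcs_placeholder = "MEANGYRI"
    curs_placeholder = "MANLHALR"
    fusion, regions = build_fusion(dcs_placeholder, curs_placeholder)
    print(fusion)
    print(regions)
```

Keeping the residue ranges alongside the sequence is convenient later, for example when mapping a fusion-protein residue number back to the enzyme it belongs to.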

Modeling of fusion proteins

The fusion protein model predicted by AlphaFold2 is shown in Figure 1. The pLDDT values in most regions of the model are above 90, indicating that the model is of high quality. In addition, because the two parts of the fusion protein pack closely against each other, we speculated that they interact. We therefore explored the interaction between the two proteins further using ColabFold.

Figure 1. 3D visualization of the AlphaFold2 prediction model of the fusion protein. (a) The model colored by per-residue pLDDT confidence. (b) The three parts of the fusion protein: DCS, linker and CURS. DCS is green and CURS is blue

The RMSD values between the DCS and CURS parts of the AlphaFold2-predicted fusion model and the corresponding crystal structures were 0.700 Å and 0.501 Å, respectively, indicating that both parts of the fusion protein are highly consistent with their crystal structures.
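For readers unfamiliar with this comparison, an RMSD of this kind is usually computed after optimally superposing the two sets of matched atoms. The sketch below shows one common way to do this, the Kabsch algorithm, using NumPy. It assumes the two coordinate arrays already contain the same atoms in the same order (for example matched Cα atoms); it is an illustration and not necessarily the tool used to obtain the values above.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    P and Q must list the same atoms in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Optimal rotation from the SVD of the covariance matrix (Kabsch).
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    P_rot = Pc @ R.T  # rotate P onto Q
    diff = P_rot - Qc
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(50, 3))
    # A rotated and shifted copy should superpose almost perfectly.
    theta = np.pi / 6
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
    moved = coords @ Rz.T + np.array([1.0, -2.0, 3.0])
    print(f"RMSD after superposition: {kabsch_rmsd(coords, moved):.6f}")
```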

Figure 2. 3D model of DCS and CURS complex predicted by ColabFold

The interaction between the two parts of the fusion protein

From a structural biology point of view, the complex conformation formed by the two parts of the fusion protein is plausible, so we speculated that the two protein monomers used to build the fusion protein interact. We therefore analyzed the interaction between the two proteins further using ColabFold. The ColabFold prediction for the two-protein complex is shown in the figure. We evaluated the quality of the complex model with the FoldDock scoring function: the pDockQ value of the model is 0.714 (greater than 0.5) and the PPV is 0.981 (greater than 0.9), which supports an interaction between the two proteins (Bryant et al., 2022). Moreover, the ipTM value that ColabFold assigns to the model is 0.953, which is greater than 0.75 and also supports an interaction between the two proteins.
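To make the pDockQ score more concrete, the sketch below evaluates the sigmoid published with FoldDock (Bryant et al., 2022), in which the score depends on the mean pLDDT of the interface residues multiplied by the base-10 logarithm of the number of interface contacts. The constants are the fitted values reported in that paper as we understand them. How the interface and its contacts are extracted from the model is left to the caller, and returning 0 when there are no contacts is our own convention here.

```python
import math

# Sigmoid parameters reported for pDockQ by Bryant et al. (2022).
L_MAX = 0.724
X0 = 152.611
K = 0.052
B = 0.018


def pdockq(mean_interface_plddt, n_interface_contacts):
    """pDockQ from the mean interface pLDDT and the number of interface contacts.

    Both inputs are assumed to be computed elsewhere from the predicted complex.
    """
    if n_interface_contacts < 1:
        return 0.0  # convention used in this sketch: no interface, no score
    x = mean_interface_plddt * math.log10(n_interface_contacts)
    return L_MAX / (1.0 + math.exp(-K * (x - X0))) + B


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for plddt, contacts in [(60.0, 20), (85.0, 100), (95.0, 400)]:
        print(plddt, contacts, round(pdockq(plddt, contacts), 3))
```

Because the sigmoid saturates at L_MAX + B, the score can never exceed about 0.742, so a value of 0.714 is close to the upper end of the scale.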

Figure 3. Residue interactions between DCS and CURS

We further analyzed and visualized the interaction between DCS and CURS. The results showed a large number of hydrogen bonds between residues on the interaction surface of the two proteins (green dashed lines in the figure), and some residues also formed salt bridges (red dashed lines in the figure). These interacting amino acid residues are thought to play a key role in maintaining the interaction between the two proteins.

Figure 4. Residue interactions between DCS and CURS

Molecular dynamics simulations demonstrate PPI

The interaction between the two monomers of the complex was further verified by molecular dynamics simulation. First, a simulation system was built around the complex model: a simulation box was constructed and filled with water molecules and ions. Then the energy of the system was minimized, and the temperature, pressure and density of the system were equilibrated.

Figure 5. Construction of the molecular dynamics simulation system. (a) The molecular dynamics simulation box containing the protein, water molecules and ions, in which the simulation is carried out. (b) The system reaches an energy-minimized state; the temperature reaches 300 K and remains stable for the rest of the equilibration, and the pressure and density of the system are equilibrated

The results of the 50 ns molecular dynamics simulation of the DCS and CURS complex are shown in the figure. We calculated the backbone RMSD between the initial structure of the complex and the simulated structures. The complex stabilizes at about 10 ns, after which the RMSD stays at ~0.15 nm (1.5 Å). This further supports that the complex is structurally stable and that the two proteins interact.

Figure 6. Molecular dynamics simulation of the DCS and CURS complex model. (a) Backbone RMSD between the initial structure and the simulated structures of the DCS and CURS complex. The simulation time was 50 ns, and the RMSD of the complex model was ~0.15 nm (1.5 Å). (b) Molecular dynamics trajectory of the DCS and CURS complex in solution over the 50 ns simulation. (c) Comparison of the initial structure (red) with the structure after 50 ns of simulation (blue)

Based on the 50 ns molecular dynamics simulation, we further evaluated the stability of each part of the complex. The N-terminal and C-terminal residues of both proteins, as well as residues in other regions of the complex, had low B-factors and RMSF values, indicating that these regions of the complex are stable.

Figure 7. Stability of the different parts of the DCS and CURS complex. (a) 3D model of the DCS and CURS complex, colored by B-factor. (b) RMSF of each DCS residue relative to the starting model during the 50 ns simulation. (c) RMSF of each CURS residue relative to the starting model during the 50 ns simulation
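The RMSF used above measures how much each atom fluctuates over the trajectory. As a minimal sketch of the calculation, the function below takes a trajectory that has already been superposed on a common frame and returns one RMSF value per atom. By default it measures fluctuations around the time-averaged position, which is the usual definition; passing the starting model as `reference` measures them relative to that structure instead. The array layout and function name are assumptions for illustration, not the output format of any particular MD package.

```python
import numpy as np


def rmsf(trajectory, reference=None):
    """Per-atom RMSF for a trajectory of shape (n_frames, n_atoms, 3).

    The frames are assumed to be superposed already. If `reference`
    (shape (n_atoms, 3)) is None, the time-averaged positions are used.
    """
    traj = np.asarray(trajectory, dtype=float)
    if traj.ndim != 3 or traj.shape[2] != 3:
        raise ValueError("trajectory must have shape (n_frames, n_atoms, 3)")
    ref = traj.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    if ref.shape != traj.shape[1:]:
        raise ValueError("reference must have shape (n_atoms, 3)")

    # Squared displacement of every atom in every frame, then average over time.
    sq_disp = ((traj - ref) ** 2).sum(axis=2)  # (n_frames, n_atoms)
    return np.sqrt(sq_disp.mean(axis=0))       # (n_atoms,)


if __name__ == "__main__":
    # Toy trajectory: atom 0 is rigid, atom 1 oscillates by +/- 0.1 nm along x.
    frames = np.array([
        [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0]],
        [[0.0, 0.0, 0.0], [0.9, 0.0, 0.0]],
    ])
    print(rmsf(frames))
```

Per-residue values, as plotted in Figure 7, can then be obtained by selecting one atom per residue (for example Cα) before calling the function.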

Prediction of fusion protein pockets

To explore how the constructed fusion protein catalyzes its substrates, we used a machine learning algorithm to predict the binding pockets of the fusion protein model. The results showed two pockets in the fusion protein: Pocket-1 is located in the CURS part and Pocket-2 is located in the DCS part.

Figure 8. Pocket prediction for the fusion protein using a machine learning algorithm. Blue is Pocket-1, located in the CURS part, and red is Pocket-2, located in the DCS part

Molecular docking of fusion proteins to substrates

AutoDock Vina was used to dock each of the two sets of substrate molecules into its corresponding pocket of the fusion protein, giving the predicted complexes between the substrate molecules and the fusion protein.

Figure 9. Molecular docking results of the substrate molecules and the constructed fusion protein. The blue ligand is β-keto acid, the green ligand is feruloyl-CoA, and the orange ligand is malonyl-CoA. The figure shows the receptor-ligand complex conformation with the lowest predicted binding energy, i.e. the strongest predicted affinity
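Docking into a predicted pocket requires a search box that encloses the pocket. The sketch below shows one simple way to derive such a box from the coordinates of the pocket atoms and to print it using the box option names of an AutoDock Vina configuration file (center_x, size_x and so on). The 5 Å padding, the function names and the example coordinates are assumptions chosen for illustration; they are not the settings used in our docking runs.

```python
def docking_box(coords, padding=5.0):
    """Axis-aligned box around pocket atoms.

    coords:  iterable of (x, y, z) tuples in angstroms
    padding: extra margin added on every side, in angstroms
    Returns (center, size), each a 3-tuple.
    """
    coords = list(coords)
    if not coords:
        raise ValueError("coords must contain at least one point")
    axes = list(zip(*coords))  # [(x1, x2, ...), (y1, ...), (z1, ...)]
    center = tuple((max(a) + min(a)) / 2.0 for a in axes)
    size = tuple((max(a) - min(a)) + 2.0 * padding for a in axes)
    return center, size


def vina_box_lines(center, size):
    """Format the box as lines for a Vina configuration file."""
    lines = [f"center_{ax} = {c:.3f}" for ax, c in zip("xyz", center)]
    lines += [f"size_{ax} = {s:.3f}" for ax, s in zip("xyz", size)]
    return lines


if __name__ == "__main__":
    # Hypothetical pocket atom coordinates, for illustration only.
    pocket_atoms = [(12.1, 4.0, -3.5), (18.4, 9.2, 1.0), (15.0, 6.5, -1.2)]
    center, size = docking_box(pocket_atoms)
    print("\n".join(vina_box_lines(center, size)))
```

A box that is too small can exclude valid poses, while a very large one makes the search less thorough for the same effort, so the padding is worth adjusting to the size of the ligand.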

Interaction of substrate with fusion proteins

Based on the docking results, we analyzed the possible interaction mechanisms between the fusion protein and its substrates. In Pocket-1, β-keto acid forms a hydrogen bond with glycine 697 of the fusion protein, while feruloyl-CoA forms hydrogen bonds with glutamine 662, lysine 665 and four other amino acids.

Figure 10. Interaction patterns between residues in Pocket-1 of the fusion protein and the substrate molecules β-keto acid and feruloyl-CoA

In Pocket-2, hydrogen bonds between multiple residues and the substrate molecules were also observed, and some residues formed salt bridges with the substrate molecules. These residues are likely to play a key role in the catalysis of the substrates by the fusion protein, which provides theoretical guidance for further design and modification of the fusion protein.



Figure 11. Interaction patterns between residues in Pocket-2 of the fusion protein and the substrate molecules malonyl-CoA and feruloyl-CoA

References

Bryant, P., Pozzati, G. & Elofsson, A. Improved prediction of protein-protein interactions using AlphaFold2. Nat Commun 13, 1265 (2022).
Katsuyama, Y., Kita, T., Funa, N. et al. Curcuminoid biosynthesis by two type III polyketide synthases in the herb Curcuma longa. J Biol Chem 284(17), 11160-11170 (2009).