Model | NJTech-China-B

Overview

Escherichia coli has a complex metabolic network inside, although a large number of researchers have conducted a lot of research on it, but random or semi-rational regulatory strategies often consume a lot of manpower, material resources and financial resources, can’t meet the construction of biological factories to produce specific products. By constructing a genome-scale metabolic network, the relationship between genome, metabolic response and protein is characterized, which provides a powerful means for the targeted modification of microorganisms. In this part, we attempted to construct a genome-scale metabolic network model of E.coli, analyze the complex metabolic network, and combine different computational methods to predict the efficient metabolic pathway of E.coli to synthesize 1,5-pentanediol (1,5-PDO), and provide effective optimization strategies for relevant protein sites.

1.Metabolic network model Construction

In the Wet Lab, 6-step heterologous reaction was added to the original E.coli metabolic pathway to enable E.coli to produce 1, 5-pentanediol (1,5-PDO). In order to further improve the yield of 1,5-PDO, we need to build a metabolic network model of E.coli genome including 6-step heterologous reaction (Table 1,2).

Table 1 Metabolites designed for 6-step allogenic reactions

Metabolite	BIGG Metabolite ID
5-Aminopentanamide	5apentam_c
5-Aminopentanoate	5aptn_c
Glutarate semialdehyde	oxptn_c
5-Hydroxypentanoate	5hptn_c
5-Hydroxypentanal	5hptnal_c
1,5-Pentanediol	15ptnd_c

Table 2 Reactions designed for 6-step allogenic reactions

Reaction	BIGG Reaction ID	BIGG Reaction
L-lysine monooxygenase	LYSMO	1 lys__L_c + 1 o2_c -> 1 co2_c + 1 h_c + 1 h2o_c + 1 5apentam_c
5-aminopentanamidase	APENTAMAH2	1 h_c + 1 h2o_c + 1 5apentam_c -> 1 nh4_c + 1 5aptn_c
4-aminobutyrate aminotransferase GabT	APTNAT	1 5aptn_c + 1 akg_c -> 1 glu__L_c + 1 oxptn_c
aldehyde reductase, NADPH-dependent	ARND	1 oxptn_c + 1 nadp_c -> 1 5hptn_c + 1 nadph_c + 1 h_c
carboxylic acid	CARs	1 5hptn_c + 1 atp_c + 1 nadph_c + 1 h_c -> 1 amp_c + 1 5hptnal_c + 1 nadp_c + 1 ppi_c
aldehyde reductase, NADPH-dependent	ARND2	1 5hptnal_c + 1 nadp_c -> 1 15ptnd_c + 1 nadph_c + 1 h_c

We chose the latest iML1515 model (Figure 1) as the chassis model of the metabolic network model, which contains 2,712 metabolic reactions and 1,877 metabolites, with the addition of the three-dimensional structure of all enzymes.

Figure 1 The contents of the iML1515 chassis model

After the metabolic network chassis was selected, we added the above 6-step heterologous reaction of E.coli production of 1,5-PDO to the model, and supplemented the corresponding exchange reaction.

First of all, it is necessary to add the reaction metabolites that are not in the model (Figure 2), and give them the corresponding Metabolite ID value, which is convenient to directly call the Metabolite ID to complete the addition of the reaction in the later stage.

Figure 2 The addition of metabolites not present in the model

After adding the metabolites of the reaction, it is necessary to define the added heterologous reaction, and also give the corresponding Reaction ID for calling (Figure 3).

Figure 3 Reaction creation

After completing the above operations, it is necessary to start adding the reaction equation and defining the stoichiometric number of the reaction (Figure 4). The following operations can be done quickly by using the Metabolite ID and Reaction ID.

Figure 4 Add reaction metabolites and stoichiometric numbers

Finally, it is necessary to bind the genes related to each reaction to the reaction in order to optimize the subsequent gene strategy (Figure 5).

Figure 5 Gene addition

We named the new model STC1515
The number of metabolites increased to 1885 and the number of reactions increased to 2721(Figure 6).

Figure 6 The content after the model is added

2.Metabolic network model Validation

2.1.Maximum flux of metabolic network model

FBA Optimize

FBA algorithm uses a stoichiometric matrix (S) of size m * n (m compounds and n reactions) to represent metabolic reactions. The flux through these all of the reactions is represented by a vector v. The objective of FBA is to maximize or minimize an objective function:

Where c is a vector of weights.

The output of FBA is a particular flux distribution v which is used of linear programming to solve the equation Sv=0 given a set of upper and lower bounds on that maximizes or minimizes the objective function.

For our STC1515 metabolic network model:

pFBA algorithm optimizes a flux rate under the given steady-state network constraints, but also minimizes the sum of absolute fluxes to achieve this optimum.

pFBA Optimize

After the establishment of the metabolic network model, we tested the prediction ability of the model. We took the Biomass Equation as the target equation and applied the FBA algorithm to solve it.

Figure 7 Comparison of model growth rate before and after addition

The growth rate obtained was consistent with that of the E.coli model before any changes were made, both being 0.877 mmol/gDW/h. It is proved that the newly added heterologous reaction has no effect on the prediction ability of the model (Figure 7).

After verifying the prediction ability of the model, we took EX_15ptnd_e as the target equation, took glucose as the carbon source and the feed rate was 10mmol/gDW/h, and solved it using the pFBA algorithm.

The following results show that our model can synthesize 1, 5-pentanediol from glucose, and the synthesis rate of 1, 5-pentanediol in the target reaction DM_15ptnd_c is 8.467 mmol/gDW/h (Table 3).

Table 3 Reactions and fluxes involved in 1,5-PDO synthesis

Reaction ID	Reaction Name	pFBA Flux
EX_glc__D_e	D-Glucose exchange	10
GLCptspp	D-glucose transport via PEP:Pyr PTS (periplasm)	8.76
GLCt2pp	D-glucose transport in via proton symport (periplasm)	1.24
HEX1	Hexokinase (D-glucose:ATP)	1.24
G6PDH2r	Glucose 6-phosphate dehydrogenase	8.32
GND	Phosphogluconate dehydrogenase	8.32
RPE	Ribulose 5-phosphate 3-epimerase	5.546
RPI	Ribose-5-phosphate isomerase	2.773
PGI	Glucose-6-phosphate isomerase	1.68
TKT1	Transketolase	2.773
TKT2	Transketolase	2.773
PFK	Phosphofructokinase	4.454
PFK_3	Phosphofructokinase (s7p)	2.773
FBA	Fructose-bisphosphate aldolase	4.454
FBA3	Sedoheptulose 1,7-bisphosphate D-glyceraldehyde-3-phosphate-lyase	2.773
TPI	Triose-phosphate isomerase	7.227
GAPD	Glyceraldehyde-3-phosphate dehydrogenase	17.227
PGK	Phosphoglycerate kinase	17.227
PGM	Phosphoglycerate mutase	17.227
ENO	Enolase	17.227
PPC	Phosphoenolpyruvate carboxylase	8.467
PDH	Pyruvate dehydrogenase	0.294
CS	Citrate synthase	0.294
ASPTA	Aspartate transaminase	8.467
ASPK	Aspartate kinase	8.467
ASAD	Aspartate-semialdehyde dehydrogenase	8.467
DHDPS	Dihydrodipicolinate synthase	8.467
DHDPRy	Dihydrodipicolinate reductase (NADPH)	8.467
THDPS	Tetrahydrodipicolinate succinylase	8.467
SDPTA	Succinyldiaminopimelate transaminase	8.467
SDPDS	Succinyl-diaminopimelate desuccinylase	8.467
DAPE	Diaminopimelate epimerase	8.467
DAPDC	Diaminopimelate decarboxylase	8.467
LYSMO	Lysine oxygenase	8.467
APENTAMAH2	5 aminopentanamide amidohydrolase	8.467
APTNAT	4-aminobutyrate aminotransferase GabT	8.467
ARND	aldehyde reductase,NADPH-dependent	8.467
CARs	carboxylic acid reductase	8.467
ARND2	aldehyde reductase, NADPH-dependent	8.467
DM_15ptnd_c	1,5-Pentanediol demand	8.467

2.2.Visualization of result

In order to make the metabolic pathways clearer and easier for the Wet Lab to understand, we drew the metabolic pathways and flux diagrams with ChemDraw according to the data in the table (Figure 8).

Figure 8 Main pathways and fluxes of 1,5 pentanediol metabolism

3.Prediction of genetic modification

After communicating with Dr. Zhou of Tianjin University, we learned that the FSEOF algorithm can optimize the reaction flux related to the target product in the metabolic network model through simulation iteration, search the reaction that conforms to the synthesis trend of the target product, find the gene target, and simulate and optimize the production of 1, 5-pentadiol.

Through literature review, we learned that FSEOF algorithm has a very significant effect in predicting target overexpression, but the prediction of weakened and knockout targets by FSEOF algorithm is more likely to be inconsistent with the actual situation in biology, while the predicted overexpression targets will not appear in this situation.

In this regard, we combine the FSEOF algorithm with the OptKnock algorithm.. The weakening and knockout targets obtained by FSEOF algorithm were tested by OptKncok algorithm.(FSEOF and OptKncok are shown below).

Finally, by iterating the reaction flux with the target product, we observe the difference of the synthesis trend between relevant reactions and target product reaction in the metabolic network model to confirm the reaction target，which will simulate and optimize of 1, 5-pentanediol production.

3.1.FSEOF

The FSEOF algorithm selects the Overexpression, Weaken and Knockout target for enhanced product formation. In its implementation, The FSEOF algorithm simulates and iterates the differences in the trend of each reaction and the reaction of the target product in the process of maximizing the biomass reaction, and selects the corresponding reaction as the Overexpression, Weaken or Knockout target according to the differences in these reactions.

Figure 9 Schematic diagram of FSEOF algorithm principle

By FSEOF algorithm, ARND2 was used as the target equation for the production of 1, 5-PDO, BIOMASS_Ec_STC1515_core_75p37M was used as the biomass reation, and the flux changes and related parts of all reactions were obtained through 20 iterations of optimization (Figure10).

Figure 10 Part of the FSEOF algorithm optimizes the flux change graph

With the V_product and V_biomass obtained by FSEOF, we are able to calculate the Q_slope, and based on the value of Q_slope, we are able to determine which points need to be overexpressed, weakened, or knocked out. In addition to calculating the Q_slope of the target reaction, FSEOF also calculates all the relevant reactions calculated by pFBA. If the slope of the calculated results is positive, it is Overexpressed, and the higher the slope, the higher the ranking. But if the slope is less than 0.1, it is Weaken; If the slope is negative, it is Knockout, and the greater the absolute value of the slope, the higher the ranking.The Q_slope formula is as follows and Gene optimization strategy figure is as follows (Figure 11).

Figure 11 Gene optimization strategy

3.2.OptKnock

Although the genes to be knocked out were obtained by FSEOF, the predicted weakening and knockout targets were not accurate and needed to be tested with OptKnock.

OptKnock is based on a two-layer optimization problem：

The nested optimization is converted to a single-layer problem, which can then be solved as a mixed integer linear problem (MILP).

Converting a nested optimization to a single-level optimization results:

Using EX_15ptnd_e as the target equation, Set the number of knockout outputs to 5, set the output strategy to 10 and select one of them for display.The following five knockout reactions were obtained by OptKnock Solver (Table 4).

Table 4 OptKnock prediction results

Reaction ID	Reaction Name	Reaction
ACKr	Acetate kinase	Acetate +ATP <=>Acetyl phosphate+ADP
PTAr	Phosphotransacetylase	Acetyl-CoA + Phosphate <=> Acetyl phosphate + Coenzyme A
PDH	Pyruvate dehydrogenase	Coenzyme A + Nicotinamide adenine dinucleotide + Pyruvate --> Acetyl-CoA + CO₂ + Nicotinamide adenine dinucleotide - reduced
GCALDD	Glycolaldehyde dehydrogenase	Glycolaldehyde + H₂O + NAD <=> Glycolate + 2H⁺ +NADH
ASAD	Aspartate-semialdehyde dehydrogenase	L-Aspartate 4-semialdehyde + Nicotinamide adenine dinucleotide phosphate + Phosphate <=> 4-Phospho-L-aspartate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced

3.3.Gene Optimization Strategy

In view of the fact that FSEOF's prediction of Knockout targets is inconsistent with reality, we compare the knockout targets obtained by FSEOF with those obtained by OptKnock, and select the four targets with the highest recurrence rate for optimization.

Combining the different results of FSEOF and OptKnock, we divide the optimizationstrategies into Overexpression, Weaken, and Knockout (Table 5).

Table 5 Gene optimization strategy

Reaction ID	Reaction	Enzyme or gene	Optimization strategy
ARND2	5-Hydroxypentanal + Nicotinamide adenine dinucleotide phosphate --> 1,5-Pentanediol + H⁺ + Nicotinamide adenine dinucleotide phosphate - reduced	1.1.1.2	Overexpression
APENTAMAH2	5-Aminopentanamide + H₂O + H⁺ --> 5-Aminopentanoate + Ammonium	3.5.1.30	Overexpression
CARs	5-hydroxypentanoate + ATP + H⁺ + Nicotinamide adenine dinucleotide phosphate - reduced --> 5-Hydroxypentanal + AMP + Nicotinamide adenine dinucleotide phosphate + Diphosphate	1.2.1.30	Overexpression
LYSMO	L-Lysine + O₂ --> 5-Aminopentanamide + CO₂ + H₂O + H⁺	1.13.12.2	Overexpression
APTNAT	5-Aminopentanoate + 2-Oxoglutarate --> L-Glutamate + 5-Oxopentanoate	2.6.1.48	Overexpression
ARND	Nicotinamide adenine dinucleotide phosphate + 5-Oxopentanoate --> 5-hydroxypentanoate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced	1.1.1.2	Overexpression
DAPDC	Meso-2,6-Diaminoheptanedioate + H⁺ --> CO₂ + L-Lysine	4.1.1.20	Overexpression
DAPE	LL-2,6-Diaminoheptanedioate <=> Meso-2,6-Diaminoheptanedioate	5.1.1.7	Overexpression
SDPDS	H₂O + N-Succinyl-LL-2,6-diaminoheptanedioate --> LL-2,6-Diaminoheptanedioate + Succinate	3.5.1.18	Overexpression
SDPTA	2-Oxoglutarate + N-Succinyl-LL-2,6-diaminoheptanedioate <=> L-Glutamate + N-Succinyl-2-L-amino-6-oxoheptanedioate	2.6.1.17	Overexpression
ASPK	L-Aspartate + ATP <=> 4-Phospho-L-aspartate + ADP	2.7.2.4	Overexpression
THDPS	H₂O + Succinyl-CoA + 2,3,4,5-Tetrahydrodipicolinate --> Coenzyme A + N-Succinyl-2-L-amino-6-oxoheptanedioate	2.3.1.117	Overexpression
DHDPRy	2,3-Dihydrodipicolinate + H⁺ + Nicotinamide adenine dinucleotide phosphate - reduced --> Nicotinamide adenine dinucleotide phosphate + 2,3,4,5-Tetrahydrodipicolinate	1.3.1.26	Overexpression
DHDPS	L-Aspartate 4-semialdehyde + Pyruvate --> 2,3-Dihydrodipicolinate + 2.0 H₂O + H⁺	b2478	Overexpression
ACCOAL	ATP + Coenzyme A + Propionate (n-C3:0) --> ADP + Phosphate + Propanoyl-CoA	6.2.1.13	Overexpression
PPCSCT	Propanoyl-CoA + Succinate --> Propionate (n-C3:0) + Succinyl-CoA	2.8.3.1	Overexpression
ASPK	L-Aspartate + ATP <=> 4-Phospho-L-aspartate + ADP	2.7.2.4	Overexpression
HEX7	ATP + D-Fructose --> ADP + D-Fructose 6-phosphate + H⁺	2.7.1.1	Weaken
GLCt2pp	D-Glucose + H⁺ --> D-Glucose + H⁺	b2943	Weaken
KARA1	(R)-2,3-Dihydroxy-3-methylbutanoate + Nicotinamide adenine dinucleotide phosphate <=> (S)-2-Acetolactate + H⁺ + Nicotinamide adenine dinucleotide phosphate - reduced	1.1.1.86	Weaken
ADK3	AMP + GTP <=> ADP + GDP	2.7.4.10	Weaken
HSDy	L-Homoserine + Nicotinamide adenine dinucleotide phosphate <=> L-Aspartate 4-semialdehyde + H⁺ + Nicotinamide adenine dinucleotide phosphate - reduced	1.1.1.3	Weaken
IMPC	H2O + IMP <=> 5-Formamido-1-(5-phospho-D-ribosyl)imidazole-4-carboxamide	2.1.2.3，3.5.4.10	Weaken
IPPMIb	2-Isopropylmaleate + H₂O <=> 3-Carboxy-3-hydroxy-4-methylpentanoate	4.2.1.33	Weaken
IPPMIa	3-Carboxy-2-hydroxy-4-methylpentanoate <=> 2-Isopropylmaleate + H₂O	4.2.1.33	Weaken
ASAD	L-Aspartate 4-semialdehyde + Nicotinamide adenine dinucleotide phosphate + Phosphate <=> 4-Phospho-L-aspartate + H⁺ + Nicotinamide adenine dinucleotide phosphate - reduced	1.2.1.11	Knockout
ACKr	Acetate +ATP <=>Acetyl phosphate+ADP	2.7.2.15, 2.7.2.1	Knockout
PTAr	Acetyl-CoA + Phosphate <=> Acetyl phosphate + Coenzyme A	2.3.1.8	Knockout
PDH	Coenzyme A + Nicotinamide adenine dinucleotide + Pyruvate --> Acetyl-CoA + CO₂ + Nicotinamide adenine dinucleotide - reduced	b0115 and b0116 and b0114	Knockout

4.Improvement of the wet experiment by the model

4.1.Identifying the theoretically feasibility of our designed 1,5-PDO pathway

Because none of natural pathways for 1,5-pentanediol (1,5-PDO) biosynthesis is present, Wet Lab first designed an artificial 1,5-PDO biosynthetic pathway in our work. By adding heterologous enzymes into the E.coli iML1515 model, the modeling group constructed a metabolic model of E.coli capable of producing 1,5-PDO. By using the FBA algorithm, the maximum theoretical yield of 1,5-PDO produced by the designed pathway was 8.467 mmol/gDW/h, indicating that our designed 1,5-PDO synthetic pathway was theoretically feasible.

Figure 12 A proposed biosynthesis route for de novo production of 1,5-PDO from glucose

4.2.Optimizing the enzyme expression and activity to improve the efficiency of limiting module

After establishment of the 1,5-PDO metabolic network in E.coli, we performed a simulation iteration by using FSEOF algorithm combined with OptKncok algorithm. Several genetic targets that can be used to optimize the fluxes to produce the production of 1, 5-pentadiol are predicted. In addition, in combination with the wet experiment result analysis that intermediate of 5-HV was largely accumulated, the experimental students focused on the enzymes of CARs and ARND2 in the 1,5-PDO synthesis module (Figure 13, Table5). They first changed the expression of CARs and ARND2 to a a high copy number plasmid of pRSFDuet from a low copy number plasmid of pACYCDuet, which increased 1,5-PDO production by 3.5 fold. These results indicated the success of strain optimization guided by the dry experiment. For further overexpression of CARs, experimental students further employed the enzyme engineering and assembly as another alternative way to increase the activity of targeted enzymes, and the 1,5-PDO production ability by the engineered strain was correspondingly improved.

4.3.Gene Knockout of branched pathway

Combined with the knockout results obtained, without affecting the normal growth of E. coli and the occurrence of the exchange reaction, the Wet Lab decided to knockout the two genes ackA and pta, the acka-pta gene encodes enzymes that convert pyruvate to acetic acid, the main by-product during the 1,5-PDO fermentation. Therefore, Wet Lab designed to knock out these two genes to assess the effect on 1, 5-PDO.

In addition, combined with previous laboratory studies and the flux trend prediction of the model, the Wet Lab found that the dehydrogenase of YcjQ participated in the degradation of 1,5-PDO, which was completely opposite to the flux trend of EX_15ptnd_e, the target equation for producing 1,5-PDO. Therefore, in order to increase the yield, Wet Lab also decided to knock out the YcjQ gene.

Finally, Wet Lab constructed two strains NT1003-Δacka-pta and NT1003-ΔYcjQ via CRISPR/Cas9 (Figure 13). When using these two strain as the chassis, the 1,5-PDO production could be increased by 16% and 75% respectively. These results are also another success of strain optimization guided by the dry experiment.

Figure 13 Metabolic pathway for 1,5-PDO production in NT1003. The genes marked red would be knocked out.

Software and Tools

Table 6 List of Software and Tools

Software and Tools	Version	Website
Python	3.7	https://www.python.org/
Cobra	0.26.3	https://pypi.org/project/cobra/
Gurobipy	10.0.2	https://pypi.org/project/gurobipy/
Cameo	0.13.6	https://pypi.org/project/cameo/
Straindesign	1.1	https://straindesign.readthedocs.io/en/latest/index.html
Escher	1.7.3	https://escher.readthedocs.io/en/latest/index.html
CNApy	1.1.9	https://cnapy-org.github.io/CNApy/
ChemDraw	20.0.0	http://www.chemdraw.net.cn/
VS Code	1.83	https://code.visualstudio.com/
BiGG	1.83	http://bigg.ucsd.edu/
KEGG	1.83	https://www.kegg.jp/
Pycharm	2023.2	https://www.jetbrains.com/pycharm

Reference

[1].Orth, J., Thiele, I. & Palsson, B. What is flux balance analysis?. Nat Biotechnol 28, 245–248 (2010). https://doi.org/10.1038/nbt.1614

[2].Lewis NE, Hixson KK, Conrad TM, Lerman JA, Charusanti P, Polpitiya AD, Adkins JN, Schramm G, Purvine SO, Lopez-Ferrer D, Weitz KK, Eils R, König R, Smith RD, Palsson BØ. Omic data from evolved E. coli are consistent with computed optimal growth from genome-scale models. Mol Syst Biol. 2010 Jul;6:390. doi: 10.1038/msb.2010.47. PMID: 20664636; PMCID: PMC2925526.

[3].Choi HS, Lee SY, Kim TY, Woo HM. In silico identification of gene amplification targets for improvement of lycopene production. Appl Environ Microbiol. 2010 May;76(10):3097-105. doi: 10.1128/AEM.00115-10. Epub 2010 Mar 26. PMID: 20348305; PMCID: PMC2869140.

[4].Burgard, A.P., Pharkya, P. and Maranas, C.D. (2003), Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng., 84: 647-657. https://doi.org/10.1002/bit.10803

[5].https://cobrapy.readthedocs.io/en/0.26.3/index.html

[6].https://cameo.bio/tutorials.html

[7].https://straindesign.readthedocs.io/en/latest/index.html

[8].https://escher.readthedocs.io/en/latest/index.html