Escherichia coli has a complex metabolic network. Although it has been studied extensively, random or semi-rational engineering strategies consume considerable manpower, materials and funding, and cannot meet the demands of building cell factories for specific products. A genome-scale metabolic network model characterizes the relationships among genes, proteins and metabolic reactions, and therefore provides a powerful tool for the targeted modification of microorganisms. In this part, we constructed a genome-scale metabolic network model of E. coli, analyzed the network, and combined several computational methods to predict an efficient pathway for the synthesis of 1,5-pentanediol (1,5-PDO) in E. coli and to propose optimization strategies for the relevant gene and protein targets.

1.Metabolic network model Construction

In the Wet Lab, a six-step heterologous pathway was added to the native E. coli metabolism to enable E. coli to produce 1,5-pentanediol (1,5-PDO). To further improve the yield of 1,5-PDO, we needed to build a genome-scale metabolic network model of E. coli that includes these six heterologous reactions (Tables 1 and 2).

Table 1 Metabolites involved in the six heterologous reactions

| Metabolite | BiGG Metabolite ID |
|---|---|
| 5-Aminopentanamide | 5apentam_c |
| 5-Aminopentanoate | 5aptn_c |
| Glutarate semialdehyde | oxptn_c |
| 5-Hydroxypentanoate | 5hptn_c |
| 5-Hydroxypentanal | 5hptnal_c |
| 1,5-Pentanediol | 15ptnd_c |

Table 2 Reactions of the six-step heterologous pathway

| Reaction | BiGG Reaction ID | BiGG Reaction |
|---|---|---|
| L-lysine monooxygenase | LYSMO | 1 lys__L_c + 1 o2_c -> 1 co2_c + 1 h_c + 1 h2o_c + 1 5apentam_c |
| 5-aminopentanamidase | APENTAMAH2 | 1 h_c + 1 h2o_c + 1 5apentam_c -> 1 nh4_c + 1 5aptn_c |
| 4-aminobutyrate aminotransferase GabT | APTNAT | 1 5aptn_c + 1 akg_c -> 1 glu__L_c + 1 oxptn_c |
| aldehyde reductase, NADPH-dependent | ARND | 1 oxptn_c + 1 nadp_c -> 1 5hptn_c + 1 nadph_c + 1 h_c |
| carboxylic acid reductase | CARs | 1 5hptn_c + 1 atp_c + 1 nadph_c + 1 h_c -> 1 amp_c + 1 5hptnal_c + 1 nadp_c + 1 ppi_c |
| aldehyde reductase, NADPH-dependent | ARND2 | 1 5hptnal_c + 1 nadp_c -> 1 15ptnd_c + 1 nadph_c + 1 h_c |
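
As a quick consistency check on Table 2, the six reaction strings can be summed to obtain the net conversion of the heterologous pathway. The sketch below is only an illustration in plain Python; the helper names (`parse_side`, `parse_reaction`, `net_conversion`) are hypothetical and are not part of the modeling toolchain used for STC1515.

```python
from collections import defaultdict

# Reaction strings copied verbatim from Table 2 (BiGG-style IDs).
REACTIONS = {
    "LYSMO": "1 lys__L_c + 1 o2_c -> 1 co2_c + 1 h_c + 1 h2o_c + 1 5apentam_c",
    "APENTAMAH2": "1 h_c + 1 h2o_c + 1 5apentam_c -> 1 nh4_c + 1 5aptn_c",
    "APTNAT": "1 5aptn_c + 1 akg_c -> 1 glu__L_c + 1 oxptn_c",
    "ARND": "1 oxptn_c + 1 nadp_c -> 1 5hptn_c + 1 nadph_c + 1 h_c",
    "CARs": "1 5hptn_c + 1 atp_c + 1 nadph_c + 1 h_c -> 1 amp_c + 1 5hptnal_c + 1 nadp_c + 1 ppi_c",
    "ARND2": "1 5hptnal_c + 1 nadp_c -> 1 15ptnd_c + 1 nadph_c + 1 h_c",
}


def parse_side(side):
    """Turn '1 a_c + 2 b_c' into {'a_c': 1.0, 'b_c': 2.0}."""
    terms = defaultdict(float)
    for term in side.split(" + "):
        coefficient, metabolite = term.split()
        terms[metabolite] += float(coefficient)
    return dict(terms)


def parse_reaction(equation):
    """Signed coefficients: negative for substrates, positive for products."""
    left, right = equation.split(" -> ")
    stoichiometry = defaultdict(float)
    for metabolite, coefficient in parse_side(left).items():
        stoichiometry[metabolite] -= coefficient
    for metabolite, coefficient in parse_side(right).items():
        stoichiometry[metabolite] += coefficient
    return dict(stoichiometry)


def net_conversion(reactions):
    """Sum all reactions once each and drop metabolites that cancel out."""
    total = defaultdict(float)
    for equation in reactions.values():
        for metabolite, coefficient in parse_reaction(equation).items():
            total[metabolite] += coefficient
    return {m: c for m, c in total.items() if abs(c) > 1e-9}


net = net_conversion(REACTIONS)
for metabolite, coefficient in sorted(net.items(), key=lambda item: item[1]):
    print(f"{metabolite:>10s} {coefficient:+.0f}")
```

With the stoichiometry exactly as written in Table 2, all five pathway intermediates and water cancel, and one L-lysine is converted into one 1,5-pentanediol. This is why the lysine biosynthesis reactions carry the same flux as the heterologous steps in the later flux table.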

We chose the latest model, iML1515 (Figure 1), as the chassis for our metabolic network model. It contains 2,712 metabolic reactions and 1,877 metabolites, and also incorporates the three-dimensional structures of its enzymes.

Figure 1 The contents of the iML1515 chassis model

After selecting the chassis, we added the six heterologous reactions for 1,5-PDO production to the model and supplemented the corresponding exchange reaction.

First, the metabolites that are not yet in the model must be added (Figure 2), each with its own Metabolite ID, so that later steps can refer to a metabolite directly by its ID when building reactions.

Figure 2 The addition of metabolites not present in the model

Next, each heterologous reaction is defined and given a Reaction ID by which it can be called (Figure 3).

Figure 3 Reaction creation

The reaction equations are then filled in by assigning the metabolites and their stoichiometric coefficients to each reaction (Figure 4). With the Metabolite IDs and Reaction IDs in place, this step is quick.

Figure 4 Add reaction metabolites and stoichiometric numbers

Finally, the genes associated with each reaction are bound to that reaction, so that gene-level optimization strategies can be derived later (Figure 5).

Figure 5 Gene addition

We named the new model STC1515. The number of metabolites increased to 1,885 and the number of reactions to 2,721 (Figure 6).

Figure 6 Contents of the model after the additions

2.Metabolic network model Validation
2.1.Maximum flux of metabolic network model
FBA Optimize

The FBA algorithm represents the metabolic reactions with a stoichiometric matrix S of size m × n (m compounds and n reactions). The flux through all of the reactions is represented by a vector v. The objective of FBA is to maximize or minimize an objective function:

$$ Z = c^{T} v $$

where c is a vector of weights.

The output of FBA is a particular flux distribution v that maximizes or minimizes the objective function. It is found by using linear programming to solve Sv = 0 subject to a set of upper and lower bounds on v.

For our STC1515 metabolic network model:

The pFBA (parsimonious FBA) algorithm also optimizes the objective flux under the given steady-state network constraints, and in addition minimizes the sum of absolute fluxes needed to achieve this optimum.
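
For reference, the two problems can be written side by side in the notation used above (S, v and c). The bound vectors $v^{\min}$ and $v^{\max}$ and the symbol $Z^{*}$ for the FBA optimum are notation assumed here for illustration.

```latex
\begin{aligned}
\text{FBA:}\quad & Z^{*} = \max_{v}\; c^{T} v
  \quad \text{s.t.}\quad S v = 0,\;\; v^{\min} \le v \le v^{\max} \\
\text{pFBA:}\quad & \min_{v}\; \sum_{i=1}^{n} \lvert v_i \rvert
  \quad \text{s.t.}\quad S v = 0,\;\; v^{\min} \le v \le v^{\max},\;\; c^{T} v = Z^{*}
\end{aligned}
```

pFBA therefore returns the same objective value as FBA, but selects among the optimal flux distributions the one with the smallest total flux.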

pFBA Optimize

After establishing the metabolic network model, we tested its predictive ability. We took the biomass equation as the objective function and solved the model with the FBA algorithm.

Figure 7 Comparison of model growth rate before and after addition

The growth rate obtained was identical to that of the E. coli model before any changes were made: both were 0.877 h⁻¹. This shows that the newly added heterologous reactions do not affect the growth prediction of the model (Figure 7).

After verifying the predictive ability of the model, we set EX_15ptnd_e as the objective, used glucose as the carbon source at an uptake rate of 10 mmol/gDW/h, and solved the model with the pFBA algorithm.

The results show that our model can synthesize 1,5-pentanediol from glucose, with a flux of 8.467 mmol/gDW/h through the target reaction DM_15ptnd_c (Table 3).

Table 3 Reactions and fluxes involved in 1,5-PDO synthesis

| Reaction ID | Reaction Name | pFBA Flux (mmol/gDW/h) |
|---|---|---|
| EX_glc__D_e | D-Glucose exchange | 10 |
| GLCptspp | D-glucose transport via PEP:Pyr PTS (periplasm) | 8.76 |
| GLCt2pp | D-glucose transport in via proton symport (periplasm) | 1.24 |
| HEX1 | Hexokinase (D-glucose:ATP) | 1.24 |
| G6PDH2r | Glucose 6-phosphate dehydrogenase | 8.32 |
| GND | Phosphogluconate dehydrogenase | 8.32 |
| RPE | Ribulose 5-phosphate 3-epimerase | 5.546 |
| RPI | Ribose-5-phosphate isomerase | 2.773 |
| PGI | Glucose-6-phosphate isomerase | 1.68 |
| TKT1 | Transketolase | 2.773 |
| TKT2 | Transketolase | 2.773 |
| PFK | Phosphofructokinase | 4.454 |
| PFK_3 | Phosphofructokinase (s7p) | 2.773 |
| FBA | Fructose-bisphosphate aldolase | 4.454 |
| FBA3 | Sedoheptulose 1,7-bisphosphate D-glyceraldehyde-3-phosphate-lyase | 2.773 |
| TPI | Triose-phosphate isomerase | 7.227 |
| GAPD | Glyceraldehyde-3-phosphate dehydrogenase | 17.227 |
| PGK | Phosphoglycerate kinase | 17.227 |
| PGM | Phosphoglycerate mutase | 17.227 |
| ENO | Enolase | 17.227 |
| PPC | Phosphoenolpyruvate carboxylase | 8.467 |
| PDH | Pyruvate dehydrogenase | 0.294 |
| CS | Citrate synthase | 0.294 |
| ASPTA | Aspartate transaminase | 8.467 |
| ASPK | Aspartate kinase | 8.467 |
| ASAD | Aspartate-semialdehyde dehydrogenase | 8.467 |
| DHDPS | Dihydrodipicolinate synthase | 8.467 |
| DHDPRy | Dihydrodipicolinate reductase (NADPH) | 8.467 |
| THDPS | Tetrahydrodipicolinate succinylase | 8.467 |
| SDPTA | Succinyldiaminopimelate transaminase | 8.467 |
| SDPDS | Succinyl-diaminopimelate desuccinylase | 8.467 |
| DAPE | Diaminopimelate epimerase | 8.467 |
| DAPDC | Diaminopimelate decarboxylase | 8.467 |
| LYSMO | Lysine oxygenase | 8.467 |
| APENTAMAH2 | 5-aminopentanamide amidohydrolase | 8.467 |
| APTNAT | 4-aminobutyrate aminotransferase GabT | 8.467 |
| ARND | aldehyde reductase, NADPH-dependent | 8.467 |
| CARs | carboxylic acid reductase | 8.467 |
| ARND2 | aldehyde reductase, NADPH-dependent | 8.467 |
| DM_15ptnd_c | 1,5-Pentanediol demand | 8.467 |

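
The fluxes in Table 3 can be turned into yields with a few lines of arithmetic. The sketch below uses only values copied from the table; the carbon numbers (6 for glucose, 5 for 1,5-pentanediol) follow from the molecular formulas, and the node balance assumes that the listed reactions are the only ones carrying flux at glucose 6-phosphate in this solution.

```python
# Flux values copied from Table 3 (mmol/gDW/h).
flux = {
    "EX_glc__D_e": 10.0,
    "GLCptspp": 8.76,
    "HEX1": 1.24,
    "G6PDH2r": 8.32,
    "PGI": 1.68,
    "DM_15ptnd_c": 8.467,
}

CARBONS_GLUCOSE = 6  # C6H12O6
CARBONS_PDO = 5      # C5H12O2

# Molar yield: mol 1,5-PDO formed per mol glucose taken up.
molar_yield = flux["DM_15ptnd_c"] / flux["EX_glc__D_e"]

# Carbon yield: fraction of glucose carbon recovered in 1,5-PDO.
carbon_yield = molar_yield * CARBONS_PDO / CARBONS_GLUCOSE

# Steady-state check at glucose 6-phosphate: production must equal consumption.
g6p_produced = flux["GLCptspp"] + flux["HEX1"]
g6p_consumed = flux["G6PDH2r"] + flux["PGI"]

print(f"molar yield:  {molar_yield:.4f} mol/mol")
print(f"carbon yield: {carbon_yield:.4f}")
print(f"G6P produced: {g6p_produced:.2f}, consumed: {g6p_consumed:.2f}")
```

The predicted optimum corresponds to about 0.85 mol of 1,5-PDO per mol of glucose, or roughly 71% of the glucose carbon, and the glucose 6-phosphate node balances at 10 mmol/gDW/h on both sides.
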
2.2.Visualization of result

To make the metabolic pathways clearer and easier for the Wet Lab to understand, we drew the pathway and flux diagram in ChemDraw from the data in Table 3 (Figure 8).

Figure 8 Main pathways and fluxes of 1,5-pentanediol metabolism

3.Prediction of genetic modification

After communicating with Dr. Zhou of Tianjin University, we learned that the FSEOF algorithm scans the reaction fluxes of the metabolic network model through iterative simulation, searches for reactions whose flux follows the synthesis trend of the target product, and thereby finds gene targets for simulating and optimizing the production of 1,5-pentanediol.

From the literature, we learned that FSEOF is very effective at predicting overexpression targets, whereas its predictions of weakening and knockout targets are more likely to be inconsistent with the actual biology.

We therefore combined the FSEOF algorithm with the OptKnock algorithm: the weakening and knockout targets obtained by FSEOF were checked with OptKnock (both algorithms are described below).

Finally, by iterating over the flux of the target product reaction, we compared the flux trend of each relevant reaction in the metabolic network model with that of the target product reaction to confirm the reaction targets, which were then used to simulate and optimize 1,5-pentanediol production.


The FSEOF algorithm selects overexpression, weakening and knockout targets for enhanced product formation. It repeatedly simulates the network while maximizing the biomass reaction, compares the flux trend of each reaction with that of the target product reaction across the iterations, and classifies each reaction as an overexpression, weakening or knockout target according to these trends.

Figure 9 Schematic diagram of FSEOF algorithm principle

Using the FSEOF algorithm with ARND2 as the target reaction for 1,5-PDO production and BIOMASS_Ec_STC1515_core_75p37M as the biomass reaction, we obtained the flux changes of all reactions over 20 iterations of optimization; part of the result is shown in Figure 10.

Figure 10 Part of the flux changes obtained from the FSEOF optimization

With the Vproduct and Vbiomass values obtained by FSEOF, we can calculate Qslope and, based on its value, determine which targets should be overexpressed, weakened or knocked out. In addition to the Qslope of the target reaction, FSEOF calculates Qslope for all relevant reactions computed by pFBA. If the slope is positive, the reaction is an overexpression target, and the higher the slope, the higher its ranking. If the slope is positive but less than 0.1, the reaction is a weakening target. If the slope is negative, the reaction is a knockout target, and the greater the absolute value of the slope, the higher its ranking. The Qslope formula and the resulting gene optimization strategy are as follows (Figure 11).

Figure 11 Gene optimization strategy
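
The classification rule described above can be sketched in plain Python. Because the Qslope formula itself is not reproduced in the text, the sketch assumes that Qslope is the least-squares slope of a reaction's flux against the enforced product flux; it also assumes that a slope of exactly zero falls into the weakening class. The reaction names and flux values are hypothetical toy data, not results from STC1515.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify(slope, weaken_threshold=0.1):
    """Map a slope to a strategy using the thresholds given in the text."""
    if slope < 0:
        return "Knockout"
    if slope < weaken_threshold:
        return "Weaken"
    return "Overexpression"


# Hypothetical scan: enforced product flux and the resulting reaction fluxes.
product_flux = [0.0, 2.0, 4.0, 6.0, 8.0]
reaction_flux = {
    "R_up": [0.0, 2.0, 4.0, 6.0, 8.0],    # rises with the product
    "R_flat": [1.0, 1.1, 1.2, 1.3, 1.4],  # barely responds
    "R_down": [4.0, 3.0, 2.0, 1.0, 0.0],  # falls as the product rises
}

slopes = {rid: least_squares_slope(product_flux, v) for rid, v in reaction_flux.items()}
strategy = {rid: classify(s) for rid, s in slopes.items()}

# Rank overexpression targets by slope and knockout targets by |slope|.
overexpression_ranked = sorted(
    (r for r, s in strategy.items() if s == "Overexpression"),
    key=lambda r: slopes[r], reverse=True,
)
knockout_ranked = sorted(
    (r for r, s in strategy.items() if s == "Knockout"),
    key=lambda r: abs(slopes[r]), reverse=True,
)

for rid in reaction_flux:
    print(f"{rid}: slope={slopes[rid]:+.3f} -> {strategy[rid]}")
```

The threshold of 0.1 is exposed as a parameter so that the boundary between weakening and overexpression can be adjusted if the original definition of Qslope uses a different scale.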


Although FSEOF proposed genes to knock out, its predicted weakening and knockout targets are not reliable and needed to be checked with OptKnock.

OptKnock is based on a two-layer optimization problem:
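
In the spirit of the formulation in [4], and reusing S and v from the FBA section, the two-layer problem can be sketched as below. The binary variables $y_j$ (0 if reaction j is knocked out), the knockout limit $K$ and the bounds $v_j^{\min}$, $v_j^{\max}$ are notation assumed here; additional constraints of the original formulation, such as a fixed substrate uptake and a minimum biomass level, are omitted for brevity.

```latex
\begin{aligned}
\max_{y}\quad & v_{\text{product}} \\
\text{s.t.}\quad
& \left[
  \begin{aligned}
  \max_{v}\quad & v_{\text{biomass}} \\
  \text{s.t.}\quad & S v = 0, \\
  & v_j^{\min}\, y_j \le v_j \le v_j^{\max}\, y_j, \quad j = 1, \dots, n
  \end{aligned}
  \right] \\
& y_j \in \{0, 1\}, \quad j = 1, \dots, n, \\
& \sum_{j=1}^{n} (1 - y_j) \le K
\end{aligned}
```

The outer problem chooses which reactions to remove so that product formation is maximized, while the inner problem lets the cell respond by maximizing its own growth.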

The nested optimization is converted to a single-layer problem, which can then be solved as a mixed-integer linear program (MILP).

Converting a nested optimization to a single-level optimization results:

Using EX_15ptnd_e as the target reaction, we set the maximum number of knockouts to 5 and the number of output strategies to 10, and selected one of the strategies for display. The following five knockout reactions were obtained with the OptKnock solver (Table 4).

Table 4 OptKnock prediction results

| Reaction ID | Reaction Name | Reaction |
|---|---|---|
| ACKr | Acetate kinase | Acetate + ATP <=> Acetyl phosphate + ADP |
| PTAr | Phosphotransacetylase | Acetyl-CoA + Phosphate <=> Acetyl phosphate + Coenzyme A |
| PDH | Pyruvate dehydrogenase | Coenzyme A + Nicotinamide adenine dinucleotide + Pyruvate --> Acetyl-CoA + CO2 + Nicotinamide adenine dinucleotide - reduced |
| GCALDD | Glycolaldehyde dehydrogenase | Glycolaldehyde + H2O + NAD <=> Glycolate + 2 H+ + NADH |
| ASAD | Aspartate-semialdehyde dehydrogenase | L-Aspartate 4-semialdehyde + Nicotinamide adenine dinucleotide phosphate + Phosphate <=> 4-Phospho-L-aspartate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced |

3.3.Gene Optimization Strategy

Because FSEOF's predictions of knockout targets can be inconsistent with reality, we compared the knockout targets obtained by FSEOF with those obtained by OptKnock and selected the four targets that recurred most often for optimization.

Combining the results of FSEOF and OptKnock, we divided the optimization strategies into overexpression, weakening and knockout (Table 5).

Table 5 Gene optimization strategy

| Reaction ID | Reaction | Enzyme or gene | Optimization strategy |
|---|---|---|---|
| ARND2 | 5-Hydroxypentanal + Nicotinamide adenine dinucleotide phosphate --> 1,5-Pentanediol + H+ + Nicotinamide adenine dinucleotide phosphate - reduced | | Overexpression |
| APENTAMAH2 | 5-Aminopentanamide + H2O + H+ --> 5-Aminopentanoate + Ammonium | | Overexpression |
| CARs | 5-hydroxypentanoate + ATP + H+ + Nicotinamide adenine dinucleotide phosphate - reduced --> 5-Hydroxypentanal + AMP + Nicotinamide adenine dinucleotide phosphate + Diphosphate | | Overexpression |
| LYSMO | L-Lysine + O2 --> 5-Aminopentanamide + CO2 + H2O + H+ | | Overexpression |
| APTNAT | 5-Aminopentanoate + 2-Oxoglutarate --> L-Glutamate + 5-Oxopentanoate | | Overexpression |
| ARND | Nicotinamide adenine dinucleotide phosphate + 5-Oxopentanoate --> 5-hydroxypentanoate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced | | Overexpression |
| DAPDC | Meso-2,6-Diaminoheptanedioate + H+ --> CO2 + L-Lysine | | Overexpression |
| DAPE | LL-2,6-Diaminoheptanedioate <=> Meso-2,6-Diaminoheptanedioate | | Overexpression |
| SDPDS | H2O + N-Succinyl-LL-2,6-diaminoheptanedioate --> LL-2,6-Diaminoheptanedioate + Succinate | | Overexpression |
| SDPTA | 2-Oxoglutarate + N-Succinyl-LL-2,6-diaminoheptanedioate <=> L-Glutamate + N-Succinyl-2-L-amino-6-oxoheptanedioate | | Overexpression |
| ASPK | L-Aspartate + ATP <=> 4-Phospho-L-aspartate + ADP | | Overexpression |
| THDPS | H2O + Succinyl-CoA + 2,3,4,5-Tetrahydrodipicolinate --> Coenzyme A + N-Succinyl-2-L-amino-6-oxoheptanedioate | | Overexpression |
| DHDPRy | 2,3-Dihydrodipicolinate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced --> Nicotinamide adenine dinucleotide phosphate + 2,3,4,5-Tetrahydrodipicolinate | | Overexpression |
| DHDPS | L-Aspartate 4-semialdehyde + Pyruvate --> 2,3-Dihydrodipicolinate + 2.0 H2O + H+ | b2478 | Overexpression |
| ACCOAL | ATP + Coenzyme A + Propionate (n-C3:0) --> ADP + Phosphate + Propanoyl-CoA | | Overexpression |
| PPCSCT | Propanoyl-CoA + Succinate --> Propionate (n-C3:0) + Succinyl-CoA | | Overexpression |
| HEX7 | ATP + D-Fructose --> ADP + D-Fructose 6-phosphate + H+ | | Weaken |
| GLCt2pp | D-Glucose + H+ --> D-Glucose + H+ | b2943 | Weaken |
| KARA1 | (R)-2,3-Dihydroxy-3-methylbutanoate + Nicotinamide adenine dinucleotide phosphate <=> (S)-2-Acetolactate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced | | Weaken |
| ADK3 | AMP + GTP <=> ADP + GDP | | Weaken |
| HSDy | L-Homoserine + Nicotinamide adenine dinucleotide phosphate <=> L-Aspartate 4-semialdehyde + H+ + Nicotinamide adenine dinucleotide phosphate - reduced | | Weaken |
| IMPC | H2O + IMP <=> 5-Formamido-1-(5-phospho-D-ribosyl)imidazole-4-carboxamide | | Weaken |
| IPPMIb | 2-Isopropylmaleate + H2O <=> 3-Carboxy-3-hydroxy-4-methylpentanoate | | Weaken |
| IPPMIa | 3-Carboxy-2-hydroxy-4-methylpentanoate <=> 2-Isopropylmaleate + H2O | | Weaken |
| ASAD | L-Aspartate 4-semialdehyde + Nicotinamide adenine dinucleotide phosphate + Phosphate <=> 4-Phospho-L-aspartate + H+ + Nicotinamide adenine dinucleotide phosphate - reduced | | Knockout |
| ACKr | Acetate + ATP <=> Acetyl phosphate + ADP | | Knockout |
| PTAr | Acetyl-CoA + Phosphate <=> Acetyl phosphate + Coenzyme A | | Knockout |
| PDH | Coenzyme A + Nicotinamide adenine dinucleotide + Pyruvate --> Acetyl-CoA + CO2 + Nicotinamide adenine dinucleotide - reduced | b0115 and b0116 and b0114 | Knockout |

4.Improvement of the wet experiment by the model
4.1.Identifying the theoretical feasibility of our designed 1,5-PDO pathway

Because no natural pathway for 1,5-pentanediol (1,5-PDO) biosynthesis exists, the Wet Lab first designed an artificial 1,5-PDO biosynthetic pathway. By adding the heterologous enzymes to the E. coli iML1515 model, the modeling group constructed a metabolic model of E. coli capable of producing 1,5-PDO. Using the FBA algorithm, the maximum theoretical production rate of 1,5-PDO through the designed pathway was 8.467 mmol/gDW/h at a glucose uptake rate of 10 mmol/gDW/h, indicating that our designed 1,5-PDO synthetic pathway is theoretically feasible.

Figure 12 A proposed biosynthesis route for de novo production of 1,5-PDO from glucose

4.2.Optimizing the enzyme expression and activity to improve the efficiency of limiting module

After establishing the 1,5-PDO metabolic network in E. coli, we performed iterative simulations using the FSEOF algorithm combined with the OptKnock algorithm, and predicted several genetic targets that can be used to redirect flux toward 1,5-pentanediol production. In addition, because the wet experiments showed that the intermediate 5-HV accumulated substantially, the Wet Lab members focused on the enzymes CARs and ARND2 in the 1,5-PDO synthesis module (Figure 13, Table 5). They first moved the expression of CARs and ARND2 from the low-copy-number plasmid pACYCDuet to the high-copy-number plasmid pRSFDuet, which increased 1,5-PDO production by 3.5-fold. These results indicate that the strain optimization guided by the dry experiment was successful. To raise CARs activity further, the Wet Lab members also applied enzyme engineering and enzyme assembly as an alternative way to increase the activity of the targeted enzymes, and the 1,5-PDO production of the engineered strain improved correspondingly.

4.3.Gene Knockout of branched pathway

Based on the knockout results, and on the condition that neither the normal growth of E. coli nor the exchange reactions be affected, the Wet Lab decided to knock out the two genes ackA and pta. The ackA-pta genes encode the enzymes that convert acetyl-CoA to acetate (reactions PTAr and ACKr in Table 4), the main by-product of 1,5-PDO fermentation. The Wet Lab therefore planned to knock out these two genes and assess the effect on 1,5-PDO production.

In addition, combining previous laboratory studies with the flux trend predicted by the model, the Wet Lab found that the dehydrogenase YcjQ participates in the degradation of 1,5-PDO, which runs directly against the flux trend of EX_15ptnd_e, the target reaction for 1,5-PDO production. To increase the yield, the Wet Lab therefore also decided to knock out the YcjQ gene.

Finally, the Wet Lab constructed two strains, NT1003-Δacka-pta and NT1003-ΔYcjQ, via CRISPR/Cas9 (Figure 13). With these two strains as the chassis, 1,5-PDO production increased by 16% and 75%, respectively. These results are another success of strain optimization guided by the dry experiment.

Figure 13 Metabolic pathway for 1,5-PDO production in NT1003. The genes marked red would be knocked out.

Software and Tools

Table 6 List of Software and Tools

| Software and Tools | Version | Website |
|---|---|---|
| Python | 3.7 | https://www.python.org/ |
| Cobra | 0.26.3 | https://pypi.org/project/cobra/ |
| Gurobipy | 10.0.2 | https://pypi.org/project/gurobipy/ |
| Cameo | 0.13.6 | https://pypi.org/project/cameo/ |
| Straindesign | 1.1 | https://straindesign.readthedocs.io/en/latest/index.html |
| Escher | 1.7.3 | https://escher.readthedocs.io/en/latest/index.html |
| CNApy | 1.1.9 | https://cnapy-org.github.io/CNApy/ |
| ChemDraw | 20.0.0 | http://www.chemdraw.net.cn/ |
| VS Code | 1.83 | https://code.visualstudio.com/ |
| BiGG | 1.83 | http://bigg.ucsd.edu/ |
| KEGG | 1.83 | https://www.kegg.jp/ |
| Pycharm | 2023.2 | https://www.jetbrains.com/pycharm |

[1].Orth, J., Thiele, I. & Palsson, B. What is flux balance analysis?. Nat Biotechnol 28, 245–248 (2010). https://doi.org/10.1038/nbt.1614

[2].Lewis NE, Hixson KK, Conrad TM, Lerman JA, Charusanti P, Polpitiya AD, Adkins JN, Schramm G, Purvine SO, Lopez-Ferrer D, Weitz KK, Eils R, König R, Smith RD, Palsson BØ. Omic data from evolved E. coli are consistent with computed optimal growth from genome-scale models. Mol Syst Biol. 2010 Jul;6:390. doi: 10.1038/msb.2010.47. PMID: 20664636; PMCID: PMC2925526.

[3].Choi HS, Lee SY, Kim TY, Woo HM. In silico identification of gene amplification targets for improvement of lycopene production. Appl Environ Microbiol. 2010 May;76(10):3097-105. doi: 10.1128/AEM.00115-10. Epub 2010 Mar 26. PMID: 20348305; PMCID: PMC2869140.

[4].Burgard, A.P., Pharkya, P. and Maranas, C.D. (2003), Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng., 84: 647-657. https://doi.org/10.1002/bit.10803