Dynamics modeling of RebM synthesis through a cascade

enzyme reaction from Steviol

1 Enzyme Cascade Reaction Introduction

Biochemical reactions critical to cellular metabolism are typically orchestrated by enzymes working in a coordinated manner. Inspired by nature, enzyme-coupled reactions combine multiple catalytic steps in a single reaction vessel to produce a wide range of chemical products, offering a time-saving, cost-effective alternative to single-step reactions.

According to the timeline of catalytic steps in the cascade enzyme reaction, it can be divided into several categories (Figure 1). In vitro enzyme cascade reactions offer several advantages, including the ability to synthesize complex molecules without the need for intermediate product separation and the potential to overcome unfavorable equilibria towards the final product. Measuring and optimizing a multi-enzyme system is a challenging task due to the influence of various conditions, such as the reaction temperatures of each enzyme, pH levels, and the consistency of the buffering systems used, as well as whether the products of the previous steps inhibit the activity of other enzymes, and more. Establishing mathematical models to assess and predict a multi-enzyme system can provide guidance for optimizing the entire system.

Figure 1: Classification of In Vitro Enzyme Cascade Reactions. a. Cascade reactions are classified based on the reaction environment and catalysts. In vivo, enzyme reactions occur within subcellular compartments of natural cells. In vitro, enzymes are purified, and reactions take place in test tubes. Chemical enzymatic methods combine chemical catalysts and enzyme reactions for synthesizing substances. b. Cascade reactions are classified based on the timeline of catalytic steps. Linear cascade reactions involve the continuous conversion of a substrate into a product through consecutive enzyme reaction steps. Parallel reactions link two enzyme reactions through complementary cofactors or shared substrates. Orthogonal reactions couple product formation with the regeneration of shared substrates. Circular reactions allow the direct recycling of byproducts to form shared substrates. Triangular reactions involve multiple catalytic reactions occurring simultaneously from the substrate, converging to yield a single product. S, substrate; I, intermediate; P/P2, target product; P1, byproduct.

2 Single-Step Enzyme Reaction Model

A cascade enzyme reaction is defined as a series of consecutive enzyme reactions where the product of one reaction serves as the substrate for the next. Before studying the kinetics of a cascade reaction, we first investigate an individual enzyme reaction, which can be expressed as follows:

In this context, S represents the substrate, E represents the enzyme, and P represents the product. During the catalytic process, the enzyme first binds to the substrate to form an enzyme-substrate complex (ES). Therefore, when the substrate concentration reaches a certain level, all enzyme molecules will bind to the substrate, leading to enzyme saturation. This saturation point is when Michaelis-Menten enzyme kinetics is achieved. Therefore, the reaction catalyzed by Michaelis-Menten enzyme can be represented as:

The total concentration of enzymes [Et] is defined as:


As a result, the rate of formation of ES is given by:


And the rate of decomposition of ES is:


When the reaction is at a steady state, the rate of ES formation equals the rate of ES decomposition, which can be expressed as:


After rearranging the equation, we get the equation for [ES]:


By dividing both the numerator and denominator on the right side of the equation by k1, we obtain:


At this point, let's define (k-1 + k2)/k1 as the Michaelis constant (Km). Therefore, the equation can be rewritten as:


The rate of product formation depends on [ES], which can be expressed as: v=k2[ES]

So, the original equation can also be rewritten as:


When enzyme molecules reach saturation, meaning that all enzyme molecules are bound to substrate molecules, i.e., [Et]=[ES], the reaction rate reaches its maximum. At this point, the maximum reaction rate (vmax) is given by vmax=k2*[Et]. Therefore, the original equation can be rewritten as v=vmax*[S]/(Km+[S]).

The relationship between reaction rate and substrate concentration is shown in the graph below:

The relationship between substrate reaction rate and substrate concentration is shown in Figure 2. When the substrate concentration is very low, the catalytic sites of enzyme molecules are not fully occupied. As the substrate concentration increases, the reaction rate increases almost linearly. When the substrate concentration is very high, the enzyme becomes saturated, and the reaction rate reaches its maximum value, vmax

When the substrate concentration is abundant, all enzyme molecules bind to the substrate to carry out the reaction, reaching saturation, and the reaction rate reaches its maximum. At this point, the reaction rate is solely dependent on the enzyme concentration, which means:


kcat is the turnover number, measured in s-1, and can be understood as the number of substrate molecules converted by each enzyme molecule per second when the enzyme is saturated. A higher kcat value indicates a higher catalytic efficiency of the enzyme.

When considering abundant substrate, the enzyme reaction rate is directly proportional to the enzyme concentration. Therefore, a simple kinetic equation for an enzyme reaction can be written as:


( v = reaction rate; k = constant; E = enzyme concentration)

3 Pathway construction and enzyme kinetics parameter determination

The pathway design for the synthesis of the Reb M cascade enzyme reaction is divided into six steps in total, with each step corresponding to an enzyme labeled in Figure 3. In the paper "De novo biosynthesis of rubusoside and rebaudiosides in engineered yeasts," it is mentioned that UGT91D2 can have the same catalytic function as EUGT11. Therefore, here we use EUGT11 with enzyme parameters.

Figure 3: Schematic Diagram of the Pathway for Steviol Synthesis by Reb M and the Enzyme Names Used in Each Step

Search for the kinetic parameters of various enzymes on Google Scholar, compile them into a table, and list the references at the end.

step Enzyme Kcat Km
1 UGT85C2 0.1138/s 0.1538mM
2 EUGT11 17450/s 110.5uM
3 UGT74G1 0.1294/s 0.6636mM
4 UGT76G1 33.8/min 360uM
5 EUGT11(F379A) 21060/s 61.2uM
6 UGT76G1 26.36/min 442.8uM

In order to facilitate subsequent calculations, convert them to consistent units and define the concentration of enzymes involved in each step of the cascade enzyme reaction as 100μM. Additionally, calculate the maximum reaction rates of each step reaction based on vmax=kcat*[E].

step Enzyme Kcat(min-1) Km(μM) vmax(μM/min)
1 UGT85C2 6.828 153.8 682.8
2 EUGT11 1047000 110.5 104700000
3 UGT74G1 7.764 663.6 776.4
4 UGT76G1 33.8 360 3380
5 EUGT11(F379A) 1263600 61.2 126360000
6 UGT76G1 26.36 442.8 2636

The definition of the reaction rates for each step of the enzyme reaction are respectively denoted as v1, v2, v3, v4, v5, v6, and the maximum reaction rates as vmax1, vmax2, vmax3, vmax4, vmax5, vmax6. The product concentrations are represented as P1, P2, P3, P4, P5, P6, all as functions of time variable t. Our ultimate goal is to calculate P6.

4 Cascade Enzyme Reaction Modeling

When considering a cascade enzyme reaction composed of multiple enzyme reactions, the product of the first-step enzyme reaction will increase linearly over time (with enzymes being in a saturated state, and the reaction rate consistently maintained at the maximum reaction rate, vmax). The product concentration at time t is:




P1, acting as the substrate for the second reaction, influences the rate of the second-step reaction.




And P2 is a function of the time variable "t", and the rate of the second-step reaction can be expressed as the derivative of P2 with respect to time, which is dP2/dt = v2 = 104700000 * 682.8 * t / (110.5 + 682.8 * t).

According to the principles of calculus, P2 is the indefinite integral of the function v2 with respect to the variable "t"

To solve indefinite integrals using Python, the code is as follows:

from sympy import *

x,y = symbols('x y')




The output result is:

104700000.0*x - 16943980.6678383*ln(682.8*x + 110.5)

Hence, it is determined.

P2=104700000.0*t - 16943980.6678383*ln(682.8*t + 110.5)

P2, also serving as a substrate for the third reaction, influences the rate of the third step reaction.


=776.4*(104700000.0*t - 16943980.6678383*ln(682.8*t + 110.5))/(663.6+104700000*x-16943980.6678383*ln(682.8*x+110.5))

Similarly, P3 involves solving an indefinite integral for v3.

from sympy import *

x,y = symbols('x y')

expr=776.4*(104700000.0*x - 16943980.6678383*ln(682.8*x + 110.5))/(663.6+104700000.0*x - 16943980.6678383*ln(682.8*x + 110.5))



The output result is:

776.4*(1.0*Integral(1.0*x/(1.0*x - 0.161833626244874*log(682.8*x + 110.5) + 6.33810888252149e-6), x) + 1.0*Integral(-0.161833626244874*log(682.8*x + 110.5)/(1.0*x - 0.161833626244874*log(682.8*x + 110.5) + 6.33810888252149e-6), x))

The function is too complex, and it contains integrals, making it unsuitable for further application in the next step of indefinite integration.

We may want to take a look back at the functions of v3, examine the characteristics of v3, and plot the function graph of v3 as it varies over time.

v3stabilizes rapidly and maintains a rate of 776.4 μM/min, so it can be considered constant in the subsequent calculation process, which is:

v3=776.4 μM/min

Then, P3=v3*t=776.4*t

In addition, P3 serves as a substrate for the fourth step reaction, influencing v4



from sympy import *

x,y = symbols('x y')




The output result is:

3380.0*x - 1567.23338485317*log(776.4*x + 360.0)

即P4=3380.0*t- 1567.23338485317*ln(776.4*t + 360.0)

Substituting P4 into the calculation formula for v5

v5=vmax5*P4/(Km5+P4)=126360000*(3380.0*t- 1567.23338485317*ln(776.4*t + 360.0))/(61.2+3380.0*t- 1567.23338485317*ln(776.4*t + 360.0))

from sympy import *

x,y = symbols('x y')

expr=126360000*(3380.0*x - 1567.23338485317*ln(776.4*x + 360.0))/(61.2+3380.0*x - 1567.23338485317*ln(776.4*x + 360.0))



The output result is:

126360000.0*(1.0*Integral(1.0*x/(1.0*x - 0.463678516228749*log(776.4*x + 360.0) + 0.0181065088757396), x) + 1.0*Integral(-0.463678516228749*log(776.4*x + 360.0)/(1.0*x - 0.463678516228749*log(776.4*x + 360.0) + 0.0181065088757396), x))

Similarly involving integrals, they cannot be used for the next step in finding indefinite integrals. Let's examine the characteristics of the function v5.

v5 tends towards the constant 126335864.4, and thus, v5 is defined as 126335864.4 μM/min.




from sympy import *

x,y = symbols('x y')




The output result is:

2636.0*x - 0.00923902967335069*log(126335864.4*x + 442.8)

Which is P6=2636.0*t - 0.00923902967335069*ln(126335864.4*t + 442.8)

Drawing the function of P6 changing over time.

Therefore, in the entire cascade reaction, assuming an abundant supply of sugar donor, with enzyme concentrations for each step at 100 μM, the concentration of the final product Reb M gradually increases over time. Approximately 0.158 M of the product is generated after 60 minutes of reaction (the amount of the product at any given time can be calculated using a function).


ugt76g1 Enzymatic parameter reference Molecular basis for branched steviol glucoside biosynthesis

ugt85c2 and ugt74g1 Enzymatic parameter reference Diterpenoid UDP-Glycosyltransferases from Chinese Sweet Tea and Ashitaba Complete the Biosynthesis of Rubusoside

Structural Insights into the Catalytic Mechanism of a Plant Diterpene Glycosyltransferase SrUGT76G1

UGT91D can be replaced by EUGT11 Catalytic flexibility of rice glycosyltransferase OsUGT91C1 for the production of palatable steviol glycosides

Modeled structure-based computational redesign of a glycosyltransferase for the synthesis of rebaudioside D from rebaudioside A

Other References

Biological insights into non-model microbial hosts through stable-isotope metabolic flux analysis

Characterization of Panax ginseng UDP-Glycosyltransferases Catalyzing Protopanaxatriol and Biosyntheses of Bioactive Ginsenosides F1 and Rh1 in Metabolically Engineered Yeasts

De novo biosynthesis of rubusoside and rebaudiosides in engineered yeasts

Enzyme Cascade Kinetic Modelling

Enzyme-Catalyzed Glycosylation of Curcumin and Its Analogues by Glycosyltransferases from Bacillus subtilis ATCC 6633

Getting the Most Out of Enzyme Cascades Strategies to Optimize In Vitro Multi-Enzymatic Reactions H. The Kinetics of Enzyme Cascade Systems General kinetics of enzyme cascades

In silico evaluation of a complex multi-enzymatic system using one-pot and modular approaches Application to the high-yield production of hydrogen from a synthetic metabolic pathway

Pathway mining-based integration of critical enzyme parts for de novo biosynthesis of steviolglycosides sweetener in Escherichia coli

Reaction modelling and simulation to assess the integrated use of transketolase and -transaminase for the synthesis of an aminotriol

Structural and biochemical studies of the glycosyltransferase Bs-YjiC from Bacillus subtilis