Biological Modelling | Sheffield

Introduction

The building of genetic constructs encoded on bacterial plasmids has represented a fundamental strategy for the study of gene function in the molecular and synthetic biology fields. A typical transcriptional unit in prokaryotes contains a promoter (the binding region for the RNA Polymerase), a ribosome binding site (RBS), a coding sequence (gene) and a terminator (the unbinding site of the transcription complex). Depending on its activation dynamics, a promoter can be classified as constitutive, which is always able to initiate mRNA transcription or inducible, when it requires the presence of an external molecule (typically a small molecule inducer and its associated transcription factor protein) to modulate the formation of the transcription complex.

When testing the function of a heterologous plasmid and its contained genes, two critical factors must be considered to ensure the expected function in the cell: an uninterrupted expression of the transcriptional unit at the right times and amounts; and the absence of undesired effects of this transcription that may otherwise compromise normal cell function. With this in mind, fluorescent reporter genes have become a valuable tool as they enable the analysis of transcriptional unit expression by the production of a single molecule of a fluorescent protein with each round of transcription. Also, their sensitivity, stability and variety of optical properties have earned them reliability and provide them with versatility for these applications.

To test the expression and function of our genetic system, and with the aim of developing a model to describe its expression, we created a computational simulation of how a contained vsfGFP gene would be expressed as part of our system under optimal conditions. Several parameters that dictate the mechanisms of the involved reactions were considered. Both constitutive and IPTG-inducible promoters were included, and their promoter strength metrics were calculated. With this information, the actual transcription level of our developed set of plasmids could be estimated in each condition. Finally, the resulting values were compared with the experimental data to test the validity of our initial model.

Method

Determining promoter strength allows the characterisation of inducible promoters and, within PARSE, enables pLac strength to be determined in relation to the Anderson scale. This comparison was completed using RStudio, which enabled us to generate plots for easier visualisation and calculation of the promoter strength from constructs of pLac-vsfGFP in pET28 and pLac-GS in pET28. Promoter activity (P) is the amount of gene transcription into mRNA. It can also be defined as the number of polymerases that bind to the promoter and process the entire gene sequence [1], or as the amount of polypeptide produced by the process [2]. Using fluorescence to represent the quantity of vsfGFP confines our definition to the amount of vsfGFP. In reality, nascent vsfGFP polypeptides then undergo a crucial post-translational modification process to produce active vsfGFP or, in the case of aberrant expression or folding, are degraded by quality control proteases.

However, our model, illustrated in Figure 1 below, assumes that the degradation of vsfGFP (D), active or inactive, is 0 (D=0). This means it is assumed that all vsfGFP produced by expression of the gene will fluoresce. This assumption is made because vsfGFP is closely related to sfGFP, which is a GFP with enhanced stability and a higher thermostability than standard GFPs [3].

Figure 1: The modelling structure of the protein expression of fluorescent GFP. The rate-determining step is assumed to be the transcription and translation of GFP, which is generalised as the promoter activity $P$ and quantified as $GFP_{n}$ per cell per hour.

To determine the promoter activity, Equations $1$ and $2$ below are first used to describe the general reaction pathway of inactive and active vsfGFP, as illustrated in Figure 1 above. $$\begin{equation}\frac{\partial n}{\partial t}= P- m\cdot n - \mu \cdot n -D_n\end{equation}$$ Where $P$ is the promoter activity ($GFP_{n}$ per cell per hour), $m$ is the maturation constant ($h^{-1}$), $n$ is the amount of inactive GFP (yet to be folded), $\mu$ is the specific growth rate ($h^{-1}$), and $D_n$ is the degradation rate of inactive GFP. $$\begin{equation}\frac{\partial f}{\partial t}= m \cdot n - \mu \cdot f -D_f\end{equation}$$ Where $f$ is the amount of active, fluorescing GFP, and $D_f$ is the degradation rate of active GFP.
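As a minimal sketch of how Equations 1 and 2 behave, the two rate equations can be integrated numerically in Python with degradation set to zero. The function name `simulate_gfp`, the fixed-step fourth-order Runge-Kutta scheme and the values of `P` and `mu` are illustrative assumptions; this is not the SimBiology model used in the project.

```python
import math


def simulate_gfp(P, m, mu, D_n=0.0, D_f=0.0, t_end=10.0, dt=0.001):
    """Integrate dn/dt = P - m*n - mu*n - D_n and df/dt = m*n - mu*f - D_f.

    Starts from n = f = 0 and returns (n, f) per cell at t_end.
    Uses the classical fixed-step RK4 scheme (an illustrative choice).
    """
    def rhs(n, f):
        dn = P - m * n - mu * n - D_n
        df = m * n - mu * f - D_f
        return dn, df

    n, f = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1n, k1f = rhs(n, f)
        k2n, k2f = rhs(n + 0.5 * dt * k1n, f + 0.5 * dt * k1f)
        k3n, k3f = rhs(n + 0.5 * dt * k2n, f + 0.5 * dt * k2f)
        k4n, k4f = rhs(n + dt * k3n, f + dt * k3f)
        n += dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6.0
        f += dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6.0
    return n, f


# Maturation constant (1/h) from a 13.6 min maturation time.
m = math.log(2) / (13.6 / 60.0)

# P (GFP per cell per hour) and mu (1/h) are made-up illustrative values.
n_end, f_end = simulate_gfp(P=100.0, m=m, mu=0.5, t_end=40.0)
print(f"n = {n_end:.3f}, f = {f_end:.3f} per cell after 40 h")
```

With degradation neglected, the integrated values should settle at $n = P/(m+\mu)$ and $f = m \cdot n/\mu$, which is a quick way to check that a simulation has reached steady state.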

The term $m \cdot n$ quantifies the amount of inactive vsfGFP that is converted into active GFP in a given time. $m$ is the maturation constant ($h^{-1}$), which captures the temporal delay between vsfGFP expression and the onset of active fluorescent protein generation in the cells. The terms $\mu\cdot n$ and $\mu\cdot f$ quantify the amounts of inactive and active GFP, respectively, that are lost per cell through dilution, which depends on the specific growth rate of the microbial culture. In the SimBiology simulation, a greater growth rate was associated with less accumulation of viable bacteria. This in turn reduces the amount of inactive and active vsfGFP, making dilution one of the reaction pathways that limits the accumulation of fluorescent GFP.

Within a steady state, the amount of inactive vsfGFP can be determined by the equation below [5]: $$\begin{equation}n_{ss}=\frac{P\:-\:D_{n_{\mathrm{ss}}}}{m\:+\:\mu\:}\end{equation}$$ Where $n_{ss}$ is the amount of inactive vsfGFP at steady state, and $D_{n_{\mathrm{ss}}}$ is the degradation rate of inactive vsfGFP at steady state.

If we assume that $D_{n_{\mathrm{ss}}}=0$ and $D_{f_{\mathrm{ss}}}=0$, because degradation of vsfGFP is very slow owing to its high stability compared to other GFPs, then Equation 3 simplifies to: $$\begin{equation}n_{ss}= \frac{P}{m+ \mu}\end{equation}$$ The same approach gives the amount of active vsfGFP at steady state: $$\begin{equation}f_{\mathrm{ss}}=\frac{m \cdot n_{\mathrm{ss}}-D_{f_{\mathrm{ss}}}}{\mu}\end{equation}$$ or, for a system that assumes $D=0$: $$\begin{equation}f_{\mathrm{ss}}=\frac{m \cdot n_{\mathrm{ss}}}{\mu}\end{equation}$$ Where $f_{ss}$ is the amount of active GFP at steady state, and $D_{f_{\mathrm{ss}}}$ is the degradation rate of active vsfGFP at steady state.

The change in fluorescence with time, illustrated by the F vs t plots from the experimental results, is expressed in Equation $5$ below. $$\begin{equation}\frac{\partial F}{\partial t}= m \cdot N -OD \cdot D_f\end{equation}$$ Where $F$ is the amount of fluorescence, $N$ is the amount of non-fluorescent GFP in the culture and $OD$ is the optical density. As the amount of non-fluorescent GFP increases, the fluorescence intensity also increases; however, this happens as the specific growth rate decreases.

The trend of the optical density $OD$ with time collected from experimental runs is plotted to visualise the growth behaviour of the bacteria, given the equation: $$\begin{equation}\frac{\partial OD}{\partial t}=\mu \cdot OD\end{equation}$$ Where $OD$ is the optical density, $\mu$ is the specific growth rate ($h^{-1}$), and $t$ is time ($h$).

The plot is further linearised to $\log_{2} OD$ versus time to determine the specific growth rate ($h^{-1}$), which is derived from Equations (10)-(13).

When both equations are combined, $\frac{\partial F}{\partial OD}$ can be determined, and this corresponds to the amount of fluorescence produced by a single cell, as shown in Equation 7 below. $$\begin{equation}\begin{aligned}\frac{\partial F}{\partial OD} &=\frac{\left(m \cdot N-OD \cdot D_f\right) \cdot \partial t}{(\mu \cdot OD) \cdot \partial t} \\ &=\frac{m \cdot \frac{N}{OD}-D_f}{\mu} \\ &=\frac{m \cdot n-D_f}{\mu}\end{aligned}\end{equation}$$

Taking $f_{ss}$ to be equal to $\frac{\partial F}{\partial OD}$, $n_{ss}$ from Equation 3 is substituted into Equation 4, so that the promoter activity can be determined from the known $f_{ss}$, the calculated specific growth rate and the maturation constant. Because the maturation constant of vsfGFP is unknown, it is estimated as $\ln 2$ divided by the maturation time of sfGFP, which is 13.6 min [6].

$$\begin{equation}P=f_{ss} \cdot \mu \cdot\left(1+\frac{\mu}{m}\right)+D_{n_{\mathrm{ss}}}+D_{f_{\mathrm{ss}}}\cdot\left(1+\frac{\mu}{m}\right)\end{equation}$$ Under the assumption that degradation is null, Equation 8 simplifies to the following: $$\begin{equation}P=f_{ss} \cdot \mu \cdot\left(1+\frac{\mu}{m}\right)\end{equation}$$
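The degradation-free form of the promoter activity can be sketched in a few lines of Python. The function names and the example values of `f_ss` and `mu` are hypothetical and chosen only for illustration; the 13.6 min maturation time is the sfGFP value quoted above.

```python
import math


def maturation_constant(maturation_time_min):
    """Maturation constant m (1/h): ln(2) divided by the maturation time."""
    return math.log(2) / (maturation_time_min / 60.0)


def promoter_activity(f_ss, mu, m):
    """Promoter activity P with degradation neglected (D = 0)."""
    return f_ss * mu * (1.0 + mu / m)


def steady_state(P, mu, m):
    """Steady-state inactive and active GFP per cell for D = 0."""
    n_ss = P / (m + mu)
    f_ss = m * n_ss / mu
    return n_ss, f_ss


m = maturation_constant(13.6)                     # about 3.06 per hour
P = promoter_activity(f_ss=150.0, mu=0.8, m=m)    # f_ss and mu are made up
n_ss, f_check = steady_state(P, mu=0.8, m=m)
print(f"m = {m:.3f} 1/h, P = {P:.2f} per cell per hour")
```

Feeding the computed `P` back through `steady_state` should return the original `f_ss`, which is a simple consistency check on the algebra.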

Calculating the Specific Growth Rate $\mu$

In order to calculate the specific growth rate, the following equation was applied to describe the binary fission of bacterial cells: $$\begin{equation}N=2^{\frac{t}{t_d}}\cdot N_0\end{equation}$$ Where $N$ is the population after time $t$, $N_0$ is the initial population size, and $t_d$ is the doubling time.

This can be rearranged to: $$\begin{equation}\log_{2}N=\frac{t}{t_d}+\log_{2} N_0\end{equation}$$ Given that: $$\begin{equation}\mu = \frac{(\ln{X} - \ln{X_0})}{t}\end{equation}$$ Where $X_0$ is the initial cell concentration, $X$ is the final cell concentration, and $t$ is the time interval ($h$).

Then after one doubling time ($t = t_d$), $X = 2X_0$.

This allows rearrangement of the equation to: $$\begin{equation}\mu t_d = \ln 2\end{equation}$$ which was then applied to calculate the growth rates from the plots of $\log_{2} OD$ versus time for the experimental results.
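The growth-rate calculation can be illustrated with a short Python sketch: the slope of $\log_{2} OD$ against time gives the number of doublings per hour, and multiplying by $\ln 2$ gives $\mu$. The function name `specific_growth_rate` and the synthetic OD readings are assumptions made for this example; the project's own analysis was carried out in RStudio.

```python
import math


def specific_growth_rate(times_h, od_values):
    """Fit log2(OD) vs time by least squares and return (mu, t_d).

    The slope is the number of doublings per hour (1 / t_d),
    and mu = ln(2) / t_d.
    """
    y = [math.log2(od) for od in od_values]
    count = len(times_h)
    t_mean = sum(times_h) / count
    y_mean = sum(y) / count
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(times_h, y))
    slope = sxy / sxx
    t_d = 1.0 / slope
    mu = slope * math.log(2)
    return mu, t_d


# Synthetic exponential-phase readings that double every 0.5 h.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
ods = [0.05 * 2 ** (t / 0.5) for t in times]
mu, t_d = specific_growth_rate(times, ods)
print(f"mu = {mu:.3f} 1/h, doubling time = {t_d:.3f} h")
```

Only readings from the exponential phase should be passed to such a fit, since the linear relationship between $\log_{2} OD$ and time does not hold in lag or stationary phase.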

To Improve

Within system modelling, assumptions had to be made about the models. These create limitations that shall be explored below.

Firstly, RNA polymerase and ribosomes are assumed to always be present in sufficient concentrations for the reaction to proceed, meaning their availability does not limit the reaction rate. If there were insufficient amounts of these elements, the rates of transcription and translation would slow down, meaning an extra step would need to be considered in future SimBiology models.

The process of protein binding and unbinding is also assumed to not be a time-limiting step in transcription and translation.

The rate-determining steps in the production of vsfGFP and the GS proteins are the transcription and translation of the genes being expressed. Other factors have been considered, such as the maturation rate of the GFP. The maturation rate of the GFP was important to consider as the folding of the GFP dictates the activation of its fluorophore.

vsfGFP is assumed to be so stable that no degradation occurs during the 24 hour period of the experiment. If this assumption were changed, the Michaelis-Menten equation would be applied to calculate the degradation rate.

The maturation constant was assumed from sfGFP, and this may differ due to the change in structure of the vsfGFP-0 used, particularly since vsfGFP is dimeric in contrast to the sfGFP monomer.

SimBiology Simulation and Compartment Analysis

With reference to Figure 1 and the structured modelling of vsfGFP expression, the process was simulated using SimBiology to estimate the amount of vsfGFP expressed following induction with IPTG. The simulation was used as a visualisation technique to predict the overall performance for estimated kinetic parameters and to understand how they affect the reaction process. Processes analysed range from the uptake of IPTG across the cell membrane via active transport to the transcription and translation that produce non-fluorescent vsfGFP ($GFP_{n}$). $GFP_{n}$ then matures into fluorescent GFP ($GFP_{f}$), which makes the bacteria fluoresce. Because the bacteria grow, the concentration of vsfGFP also varies with time. Therefore, the amount of GFP is estimated using the Gompertz modelling equation as an estimate of bacterial growth.

Figure 3 below shows the compartment and the species used in the simulation of bacterial growth.

Figure 3: The growth compartment used to simulate the growth of microbial cells, which consists of a number of dormant bacteria that later divide at a specific activation rate. The model is used to estimate the cell concentration; depending on the activation rate, the dormant bacteria take a longer or a shorter time to enter exponential growth.

The general equations used to compute the bacterial growth are: $$\begin{equation}\frac{dn_D}{dt} = -\alpha n_D\end{equation}$$ $$\begin{equation}\frac{dn_A}{dt} = r_a + r_g = \alpha n_D + \mu_{max}\left(1 - \frac{n_A}{N}\right)n_A\end{equation}$$ Where $\mu_{max}$ is the maximum growth rate ($h^{-1}$), $N$ is the maximum concentration of the biomass at the point at which the cells are fed with limited nutrients, $n_A$ is the concentration of active bacteria, $n_D$ is the concentration of dormant bacteria, $\alpha$ is the activation rate of the dormant cells ($h^{-1}$), and $t$ is the time in hours. The sum of the populations of dormant and active bacteria, $n$, is then expressed as: $$\begin{equation}n = n_A + n_D\end{equation}$$ To finalise the simulation for the production of GFP, another compartment is added, as illustrated in Figure 4 below.
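The growth equations above can be explored outside SimBiology with a small forward-Euler sketch in Python. The function name `simulate_growth`, the step size and the choice of zero active bacteria at the start are assumptions for illustration; the values of $N$, the dormant fraction, $\mu_{max}$ and $\alpha$ are taken from the simulation settings reported on this page.

```python
def simulate_growth(n_dormant0, n_active0, alpha, mu_max, N, t_end, dt=0.001):
    """Integrate the dormant/active growth model with forward Euler.

    dn_D/dt = -alpha * n_D
    dn_A/dt = alpha * n_D + mu_max * (1 - n_A / N) * n_A
    Returns a list of (t, n_D, n_A) tuples.
    """
    n_D, n_A = float(n_dormant0), float(n_active0)
    trajectory = [(0.0, n_D, n_A)]
    for step in range(1, int(round(t_end / dt)) + 1):
        r_a = alpha * n_D                        # activation of dormant cells
        r_g = mu_max * (1.0 - n_A / N) * n_A     # growth limited by capacity N
        n_D += dt * (-r_a)
        n_A += dt * (r_a + r_g)
        trajectory.append((step * dt, n_D, n_A))
    return trajectory


N = 2e9
trajectory = simulate_growth(n_dormant0=0.5 * N, n_active0=0.0,
                             alpha=1.2, mu_max=1.0, N=N, t_end=30.0)
t_final, n_D_final, n_A_final = trajectory[-1]
print(f"t = {t_final:.0f} h: dormant = {n_D_final:.3g}, active = {n_A_final:.3g}")
```

Lowering `alpha` in this sketch delays the point at which the active population starts to rise, which is the behaviour the compartment model is intended to capture.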

Figure 4: The compartment of IPTG intake within a cell system. The reaction processes occurring within the compartment consist of the mass transfer of IPTG across the cell membrane (influenced by the co-transport of lactose using the energy associated with the transmembrane proton motive force), the overall promoter activity, and the dilution and maturation rates of GFP proteins.

Because of the cell membrane and the lactose permease of E. coli, the mass transfer of IPTG is considered to be driven by the difference between the intracellular and extracellular concentrations of the IPTG inducer [4]. Therefore, with the initial condition that the intracellular IPTG concentration is 0, the governing equation for IPTG uptake is given in Equation 17 below [4]: $$\begin{equation}\frac{\mathrm{d}[IPTG]_i}{\mathrm{d}t} = k_c\left([IPTG]_e - [IPTG]_i\right) + K^{\prime}[IPTG]_e\end{equation}$$ Where $[IPTG]_e$ is the extracellular IPTG concentration ($\mu$M), $[IPTG]_i$ is the intracellular IPTG concentration ($\mu$M), $k_c$ is the mass transfer coefficient ($h^{-1}$), and $K^{\prime} = \frac{k}{K_i}$, assuming that the initial IPTG concentration is less than $K_i$.
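A rough Python sketch of the uptake behaviour is given here. It assumes the rate law $\frac{\mathrm{d}[IPTG]_i}{\mathrm{d}t} = k_c([IPTG]_e - [IPTG]_i) + K^{\prime}[IPTG]_e$ with a constant extracellular concentration and $[IPTG]_i(0) = 0$, for which a closed-form solution exists. The function names and the constant-$[IPTG]_e$ simplification are assumptions of this example; the default values of $k_c$ and $K^{\prime}$ are the ones used in the simulation on this page.

```python
import math


def iptg_intracellular(t_h, iptg_e, k_c=0.213, K_prime=0.0893):
    """Closed-form [IPTG]_i(t) in uM for constant [IPTG]_e and [IPTG]_i(0) = 0."""
    plateau = iptg_e * (1.0 + K_prime / k_c)
    return plateau * (1.0 - math.exp(-k_c * t_h))


def iptg_euler(t_end, iptg_e, k_c=0.213, K_prime=0.0893, dt=0.001):
    """Forward-Euler integration of the same rate law, as a cross-check."""
    iptg_i = 0.0
    for _ in range(int(round(t_end / dt))):
        rate = k_c * (iptg_e - iptg_i) + K_prime * iptg_e
        iptg_i += dt * rate
    return iptg_i


# Extracellular IPTG of 1 uM, evaluated after 10 h.
analytic = iptg_intracellular(10.0, iptg_e=1.0)
numeric = iptg_euler(10.0, iptg_e=1.0)
print(f"analytic = {analytic:.4f} uM, Euler = {numeric:.4f} uM")
```

Under these assumptions the intracellular concentration rises towards a plateau of $[IPTG]_e(1 + K^{\prime}/k_c)$ on a timescale of $1/k_c$, so the saturation seen in the uptake profile is set mainly by the mass transfer coefficient.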

Due to the increase in the number of active bacteria, it is predicted that the amount of fluorescent GFP increases with time, as expressed in the equations below [7]: $$\begin{equation}\frac{d[GFP_n]}{dt} = gX_A(t) - k_m [GFP_n] - \mu \cdot [GFP_n]- \frac{\gamma [GFP_n]}{[GFP_n] +[GFP_f] + M}\end{equation}$$ $$\begin{equation}\frac{d[GFP_f]}{dt} = k_m[GFP_n] -\mu \cdot [GFP_f]- \frac{\gamma [GFP_f]}{[GFP_n] +[GFP_f] + M}\end{equation}$$ Where $[GFP_n]$ is the concentration of non-fluorescent GFP, $[GFP_f]$ is the concentration of fluorescent GFP, $X_A(t)$ is the amount of active bacteria at time $t$, $M$ is the degradation capacity during the degradation process, $g$ is the generation rate, $k_m$ is the maturation constant and $\gamma$ is the degradation rate [7].

The simulation was then performed with the following assumptions: the degradation of GFP is negligible; the maturation constant is $\ln 2$ divided by the maturation time of non-fluorescent GFP, which is assumed to be 13.6 min; the maximum concentration of biomass $N$ is $2\times 10^{9}$; the number of dormant bacteria is $0.5 \times N$; the mass transfer coefficient is 0.213 $h^{-1}$; $K^{\prime}$ is 0.0893 $h^{-1}$; $\mu_{max}$ is 1 $h^{-1}$; and the generation rate is 0.76 $h^{-1}$.

Results and Analysis

Figure 5 below plots the amounts of $GFP_f$ and $GFP_n$, together with the biomass concentration, against time.

Figure 5: The concentration profiles of the active and dormant bacteria with the amounts of $GFP_n$ and $GFP_f$, where $k_0$ is the maximum specific growth rate $\mu_{max}$.

It is observed that the number of active bacteria first increases with time and then decreases. The amounts of $GFP_f$ and $GFP_n$ increase over time until they reach a maximum value. This is because of the influx of IPTG into the bacteria, at which point its concentration becomes saturated.

Figure 6: The concentration profiles of the active and dormant bacteria with the amounts of $GFP_n$ and $GFP_f$, where $k_0$ is the maximum specific growth rate $\mu_{max}$. The maximum growth rate and the activation rate are manipulated to observe significant changes to the overall product formation.

When the activation rate is decreased, the time taken to produce GFP becomes greater than when the activation rate is 1.2. In Figure 6 above, the first run uses an activation rate constant of 0.98, while the second run uses 1.2. When the maximum specific growth rate $k_{0}$ is reduced, the slope of the bacterial growth curve is expected to become less steep. However, Figure 6 shows that more active bacteria accumulate at the lower growth rate. This could be because the value of $\frac{\mu_{max}}{N}$ is less than $\alpha$.

This shows that the activation rate influences the time taken for the production of $GFP_f$: the higher the activation rate, the less time is needed to produce $GFP_f$. Also, the maximum growth rate has less effect on the growth of the microbial cells.

For the IPTG uptake, given an initial concentration of 1 $\mu$M and an assumed mass transfer coefficient of 0.213 $h^{-1}$, the overall profile of the IPTG intake is shown in Figure 7 below.

Figure 7: The concentration profile of IPTG intake to the cell over time with change of the initial IPTG concentration.

Supporting documentation for the SimBiology simulation is available on iGEM Sheffield's GitLab.

References

[1] J. R. Kelly et al., “Measuring the activity of BioBrick promoters using an in vivo reference standard,” Journal of Biological Engineering, vol. 3, no. 1, p. 4, 2009, doi: https://doi.org/10.1186/1754-1611-3-4.
[2] J. H. J. Leveau and S. E. Lindow, “Predictive and Interpretive Simulation of Green Fluorescent Protein Expression in Reporter Bacteria,” Journal of Bacteriology, vol. 183, no. 23, pp. 6752–6762, Dec. 2001, doi: https://doi.org/10.1128/jb.183.23.6752-6762.2001.
[3] E. Frenzel, J. Legebeke, A. van Stralen, R. van Kranenburg, and O. P. Kuipers, “In vivo selection of sfGFP variants with improved and reliable functionality in industrially important thermophilic bacteria,” Biotechnology for Biofuels, vol. 11, no. 1, Jan. 2018, doi: https://doi.org/10.1186/s13068-017-1008-5.
[4] D. Calleja, A. Fernández-Castañé, M. Pasini, C. de Mas, and J. López-Santín, “Quantitative modeling of inducer transport in fed-batch cultures of Escherichia coli,” Biochemical Engineering Journal, vol. 91, pp. 210–219, Oct. 2014, doi: https://doi.org/10.1016/j.bej.2014.08.017.
[5] H. Alper, C. Fischer, E. Nevoigt, and G. Stephanopoulos, “Tuning genetic control through promoter engineering,” Proceedings of the National Academy of Sciences, vol. 102, no. 36, pp. 12678–12683, Aug. 2005, doi: https://doi.org/10.1073/pnas.0504604102.
[6] J.-D. Pédelacq, T. Tran, T. C. Terwilliger, G. S. Waldo, and S. Cabantous, “Engineering and characterization of a superfolder green fluorescent protein,” Nature Biotechnology, vol. 24, no. 1, pp. 79–88, 2006, doi: https://doi.org/10.1038/nbt1172.
[7] V. R. Krishnamurthi, I. I. Niyonshuti, J. Chen, and Y. Wang, “A new analysis method for evaluating bacterial growth with microplate readers,” PLOS ONE, vol. 16, no. 1, p. e0245205, Jan. 2021, doi: https://doi.org/10.1371/journal.pone.0245205.