Gene Circuit : An Overview

August 19: Our iGEM project commenced its dry lab modeling journey, a vital component of our mission to engineer biofilms through a gene expression circuit. We established the foundational framework, creating an Ordinary Differential Equation (ODE)/Stochastic Differential Equation (SDE) network to observe the circuit's behavior over time.

August 22: Assumptions began to take shape, assuming that the transcription rate greatly exceeded the natural degradation rate and that protein-protein interactions were robust. We applied the concept of non-cooperative binding, signifying independent events, while also assuming relatively weak protein-protein inhibition.

September 3: We delved into the heart of our modeling approach, working with rate parameters like mRNA spontaneous degradation rates and mRNA-to-protein translation rates, which were vital in governing the circuit's behavior.

September 15: We introduced the Hill equation into our model, enabling us to characterize the binding of ligands to macromolecules. Parameters like the dissociation constant and Hill coefficient were critical in determining the strength and cooperativity of these bindings.

September 21: We incorporated an input signal into our model, offering flexibility that could be adjusted to cater to the specific requirements of our project.

September 30: Our modeling was fully underway, implemented in Julia and utilizing packages such as Catalyst, Plots, DifferentialEquations, Latexify, GraphViz, and Random to facilitate the modeling process.

October 5: Dry lab modeling has provided us with invaluable insights into the behavior of the gene expression circuit, allowing us to optimise its design with the ultimate goal of enhancing biofilm production.

Biofilm genetic circuit model diagram
Figure 1: Biofilm genetic circuit model diagram1

Gene Circuit : Methods & Motivations

ODE/SDE network: Our modelling approach is based on an ordinary differential equations (ODE)/stochastic differentials equation (SDE) network, which allows us to simulate the behavior of the circuit over time. This approach is motivated by the fact that gene expression is a dynamic process that changes over time, and ODEs are a natural way to model such processes. The ODE network consists of a set of differential equations that describe the rates of change of the concentrations of the different species in the circuit. The equations are solved numerically using a variety of methods, such as the Runge-Kutta method or the Euler method; the specific implementation is done using the numerical solver in Julia.

\begin{align} \frac{\mathrm{d} m_{spo0A\_P}\left( t \right)}{\mathrm{d}t} =& \gamma - \delta m_{spo0A\_P}\left( t \right) \\ \frac{\mathrm{d} m_{abrB}\left( t \right)}{\mathrm{d}t} =& \gamma - \delta m_{abrB}\left( t \right) + \mathrm{hillr}\left( P_{Spo0A\_P}\left( t \right), \alpha, K, n \right) \\ \frac{\mathrm{d} m_{abbA}\left( t \right)}{\mathrm{d}t} =& \gamma - \delta m_{abbA}\left( t \right) + \mathrm{hillr}\left( K, \alpha, P_{Spo0A\_P}\left( t \right), n \right) \\ \frac{\mathrm{d} m_{sinI}\left( t \right)}{\mathrm{d}t} =& \gamma - \delta m_{sinI}\left( t \right) + \mathrm{hillr}\left( K, \alpha, P_{Spo0A\_P}\left( t \right), n \right) \\ \frac{\mathrm{d} m_{sinR}\left( t \right)}{\mathrm{d}t} =& \gamma - \delta m_{sinR}\left( t \right) \\ \frac{\mathrm{d} m_{eps}\left( t \right)}{\mathrm{d}t} =& \gamma - \delta m_{eps}\left( t \right) + \mathrm{hillr}\left( K, \alpha, P_{AbrB}\left( t \right), n \right) + \mathrm{hillr}\left( K, \alpha, P_{SinR}\left( t \right), n \right) \\ \frac{\mathrm{d} P_{Spo0A\_P}\left( t \right)}{\mathrm{d}t} =& \beta_1 m_{spo0A\_P}\left( t \right) - \mu P_{Spo0A\_P}\left( t \right) \\ \frac{\mathrm{d} P_{AbrB}\left( t \right)}{\mathrm{d}t} =& \beta_2 m_{abrB}\left( t \right) - \mu P_{AbrB}\left( t \right) - \tau P_{AbbA}\left( t \right) P_{AbrB}\left( t \right) \\ \frac{\mathrm{d} P_{AbbA}\left( t \right)}{\mathrm{d}t} =& \beta_3 m_{abbA}\left( t \right) - \mu P_{AbbA}\left( t \right) - \tau P_{AbbA}\left( t \right) P_{AbrB}\left( t \right) \\ \frac{\mathrm{d} P_{SinI}\left( t \right)}{\mathrm{d}t} =& \beta_4 m_{sinI}\left( t \right) - \mu P_{SinI}\left( t \right) - \tau P_{SinI}\left( t \right) P_{SinR}\left( t \right) \\ \frac{\mathrm{d} P_{SinR}\left( t \right)}{\mathrm{d}t} =& \beta_5 m_{sinR}\left( t \right) - \mu P_{SinR}\left( t \right) - \tau P_{SinI}\left( t \right) P_{SinR}\left( t \right) \\ \frac{\mathrm{d} P_{Eps}\left( t \right)}{\mathrm{d}t} =& \beta_6 m_{eps}\left( t \right) - \mu P_{Eps}\left( t \right) \end{align}
Equation 1: System of differential equations that governs the gene expression model
  • All species with m denotes mRNA; all species with P denotes protein
  • i is the rate at which input signal is transmitted to m_Spo0A_P
  • δ is the spontaneous degradation rate of mRNA
  • γ is the spontaneous transcription rate of mRNA
  • μ is the spontaneous degradation rate of protein
  • β₁ - β₆ are translation rate of mRNA to protein
  • K, α, and n are parameters in the Hill equation governing the binding of ligands to macromolecules

Rate parameters: We use rate parameters such as the spontaneous degradation rate of mRNA and the translation rate of mRNA to protein to govern the behavior of the circuit. These rate parameters are motivated by the fact that they reflect the underlying biochemical processes that govern gene expression. They can be estimated from experimental data or from the literature, or they can be optimised using parameter fitting techniques.


Gene expression model connected graph
Figure 2: Gene expression model connected graph

Assumptions: We make several assumptions in our modelling approach, such as assuming that the transcription rate is much higher than the natural degradation rate, and that the protein-protein interaction is strong. These assumptions are motivated by the fact that they simplify the model and make it easier to analyze. However, they may not always hold true in real biological systems, so it's important to be aware of their limitations.


Hill equation: The Hill equation is used to model the binding of ligands to macromolecules, with parameters such as the dissociation constant and the Hill coefficient determining the strength and cooperativity of the binding. This equation is motivated by the fact that it is a simple and widely used model for describing the binding of ligands to macromolecules. The parameters can be estimated from experimental data or from the literature, or they can be optimised using parameter fitting techniques.


\begin{align*} \mathrm{input1} &\xrightarrow{i} \mathrm{input1} + \mathrm{m_{spo0A_P}} \\ \mathrm{m_{spo0A_P}} &\rightleftharpoons^\gamma_\delta\varnothing \\ \mathrm{m_{abrB}} &\rightleftharpoons^\gamma_\delta \varnothing \\ \mathrm{m_{abbA}} &\rightleftharpoons^\gamma_\delta\varnothing \\ \mathrm{m_{sinI}} &\rightleftharpoons^\gamma_\delta\varnothing \\ \mathrm{m_{sinR}} &\rightleftharpoons^\gamma_\delta\varnothing \\ \mathrm{m_{eps}} &\rightleftharpoons^\gamma_\delta \varnothing \\ \mathrm{P_{Spo0A\_P}} &\xrightarrow{\mu} \varnothing \\ \mathrm{P_{AbrB}} &\xrightarrow{\mu} \varnothing \\ \mathrm{P_{AbbA}} &\xrightarrow{\mu} \varnothing \\ \mathrm{P_{SinI}} &\xrightarrow{\mu} \varnothing \\ \mathrm{P_{SinR}} &\xrightarrow{\mu} \varnothing \\ \mathrm{P_{Eps}} &\xrightarrow{\mu} \varnothing \\ \\ \end{align*}
\begin{align*} \varnothing &\xrightarrow{m_{spo0A\_P} \beta_1} \mathrm{P_{Spo0A\_P}} \\ \varnothing &\xrightarrow{m_{abrB} \beta_2} \mathrm{P_{AbrB}} \\ \varnothing &\xrightarrow{m_{abbA} \beta_3} \mathrm{P_{AbbA}} \\ \varnothing &\xrightarrow{m_{sinI} \beta_4} \mathrm{P_{SinI}} \\ \varnothing &\xrightarrow{m_{sinR} \beta_5} \mathrm{P_{SinR}} \\ \varnothing &\xrightarrow{m_{eps} \beta_6} \mathrm{P_{Eps}} \\ \varnothing &\xrightarrow{\frac{\alpha P_{Spo0A\_P}^{n}}{P_{Spo0A\_P}^{n} + K^{n}}} \mathrm{m_{abbA}} \\ \varnothing &\xrightarrow{\frac{\alpha P_{Spo0A\_P}^{n}}{P_{Spo0A\_P}^{n} + K^{n}}} \mathrm{m_{sinI}} \\ \varnothing &\xrightarrow{\frac{\alpha P_{AbrB}^{n}}{P_{AbrB}^{n} + K^{n}}} \mathrm{m_{eps}} \\ \varnothing &\xrightarrow{\frac{\alpha P_{SinR}^{n}}{P_{SinR}^{n} + K^{n}}} \mathrm{m_{eps}} \\ \varnothing &\xrightarrow{\frac{\alpha K^{n}}{K^{n} + P_{Spo0A\_P}^{n}}} \mathrm{m_{abrB}} \\ \mathrm{P_{SinI}} + \mathrm{P_{SinR}} &\xrightarrow{\tau} \varnothing \\ \mathrm{P_{AbbA}} + \mathrm{P_{AbrB}} &\xrightarrow{\tau} \varnothing \end{align*}
Equation 2: All the chemical reaction in the modelled system

Input signal: We incorporate an input signal, which can be adjusted based on the specific needs of the project. This input signal can be a chemical inducer, a change in temperature, or any other stimulus that affects the behaviour of the circuit. The input signal is motivated by the fact that it allows us to control the behaviour of the circuit and study its response to different stimuli.


Implementation: Our modelling is implemented in Julia, using packages such as Catalyst, Plots, DifferentialEquations, Latexify, GraphViz, and Random. Julia is a high-level programming language developed by MIT originally for educational purposes. Here we use it because of its outstanding ability to create, model, and analyse specific biochemical reaction networks.

ODE Model Results
Figure 3: ODE model results graph
SDE Model Results
Figure 4: SDE model results graph

Molecular Dynamics : An Introduction

The biosensor we designed used a transcription factor, QseB, from E. coli. We wanted to put this in our B. subtilis strain, in which we needed QseB to bind to the B. subtilis form of RNA polymerase. This brings up the interesting question of “how do we know the genes from one organism will work in a different organism?”. This presented the opportunity to use Dry Lab to extend what we were doing in Wet Lab, as given our time constraints we were unsure if we'd be able to transform into B. subtilis. We decided to simulate the binding of QseB to each polymerase (E. coli and B. subtilis), and the subsequent binding of DNA, to be able to visualise if QseB should work in B. subtilis.

Molecular Dynamics : Assembling the E. coli Polymerase

Online structures of RNA Polymerases (RNAP’s) in the Protein Databank 2 had missing parts, and so the first step was to assemble both of the RNA polymerases, which both take the form of α2ββ’ω, which we initially decided to try using HDock 3 to assemble the individual subunits. It had been observed that firstly the 2 α-subunits bind, then the β and β' and ω subunits, respectively4, and so we began docking them in order. This lead us quickly to our first problem: HDock uses rigid, solid structures to dock, whereas proteins are able to wiggle around whilst binding, thus HDock isn’t always able to bind proteins in the correct orientation. We were quickly able to bind together α2 and ββ’; however, we then struggled to bind these together.

AA & BB' Structure

We explored other software options for docking, such as Hex, AutoDock and HADDOCK5; however, we came across a different solution to the restrictions of rigid structure docking - the use of Gromacs6, the molecular dynamics software we’d been getting to know, to wobble the proteins into a slightly different configuration and try HDock again.

AA Structure in GMX

After which, we re-tried running HDock on the wobbled proteins, and in doing so were able to get the alpha2 structure to bind to the correct region of the bbp structure.

AABB' Structure

Molecular Dynamics : Assembling the B. subtilis Polymerase

Learning from our attempt to assemble the RNAP of E. coli, we wanted to try another route with B. subtilis’ RNAP utilising the ‘align’ feature of Pymol. It felt a waste that we had online structures in the PDB, but couldn’t use them in GMX due to the missing parts. Whilst we were unable to add the missing amino acids, we figured we could take the subunits from the AlphaFold databank and position them in the same place as the equivalent subunits in the PDB structure using the inbuilt ‘align’ function in Pymol. With this, we were quickly able to put together the subunits, except there was quite a significant problem with this method: as the subunits we were using were slightly different than the ones in the PDB one (a cryo-EM structure 4), when we aligned them, we found that there was overlap between the amino acids. This creates more problems for GMX, and is difficult to correct as the structures were quite interlinked and parts had to slot together nicely which we couldn’t untangle.

AA & BB' Structure of <i> B. subtilis </i>

To combat this, we first used HDock to put together the a2 and bbp parts, which it can do somewhat successfully. We used the align function to put these together, which did create a few problems, due to the presence of a TF in the online structure that caused the subunits to bind more openly (slightly differently).

Unfortunately, here we ran out of time, but desribed below was the continuation of this model.

Once we’d assembled the polymerases, we would check the possible binding positions of the phosphorylated QseB dimer with the polymerases, as a first step to confirm that QseB should be able to bind to the B. subtilis RNAP. With the E. coli RNAP, we would first use HDock to suggest the most probable binding sites, and then check these if they were possible with the B. subtilis RNAP. For the ones that would bind in analogous positions, we would run them in Gromacs to assess if they were stable (i.e. if they wouldn't dissociate).