Model | Evry-Paris-Saclay

1. Modeling on our EWOD platform

Extensive research has focused on examining the influence of diverse biochemical parameters on reaction yields, establishing a robust foundation within the scientific community. Nevertheless, it is essential to recognize that these reactions are contingent on the experimental setup (standard laboratory glassware), which is continually evolving towards smaller, safer, and more innovative devices. One class of such devices are the electrowetting-on-dielectric (EWOD) devices, of which we have used a custom-built prototype in our project. Our prototype of such cutting-edge devices holds immense promise as a pivotal component in shaping the future of wet lab experiments, and our primary objective here is to integrate this novel experimental environment into our model to explore how its unique characteristics can be effectively harnessed to further enhance reaction yields.

Our work is centered around bridging two distinct worlds: the physical realm, enriched by the uniqueness of the reactant shape in the form of moving droplets, and the biochemical parameters governing the reactions we induce. We aim to study and understand how this combination can lead to the enhancement of any reaction. While we will apply these findings to our specific reactions, we have diligently worked to design our model with global applicability, intending it to serve as a foundational framework for diverse types of reactions. By doing so, we want to further generalize our model and contribute to the broader scientific community.

The Story of Two Droplets

For that, let’s follow the story of two droplets, each of them carrying a reactant (A and B). When the two droplets merge, they undergo a physical process known as coalescence, where surface tension forces drive the fusion of the individual droplets into a single, larger droplet. This phenomenon is governed by the minimization of surface energy, resulting in the formation of a more stable and cohesive entity.

To enhance the mixing of both components prior to any reaction, while macroscopic experimental setups can utilize mechanical means (e.g., vortex, magnetic stirrer), microfluidic systems face limitations due to their small size. Consequently, alternative approaches must be explored to accelerate the mixing in these droplets.

Certainly, in the absence of any intervention, mixing in the microfluidic system mainly relies on passive diffusion, where convection arises counterbalance the significant concentration gradient (Figure 1). To ensure the efficacy and time efficiency of our wet-lab practices, we investigated whether passive diffusion alone is adequate to achieve satisfactory mixing.

Figure 1. Example of a droplet at rest resulting from mixing of two different components. Diffusion is the only mechanism that can help mixing in this case.

Quantification of Droplet Mixing

As a crucial step in guiding our wet-lab practices, the first challenge we faced was the rigorous quantification of the notion of "droplet mixing": how can we tell that the two drops are well mixed? We had to choose between a discrete approach, counting individual molecules in the entire system, or a continuous method, quantifying molecules in elementary volume units. While probabilistic methods like Monte-Carlo simulations were suitable for the discrete approach, today's tools offer superior efficiency in processing and computation speed to provide satisfying solutions of non-linear flow mechanics equations. As a result, we opted for a continuous approach using computational fluid dynamics (CFD) that captures a broader range of parameters, including droplet movements, velocity, and geometries of our problem.

Metric Construction

To construct the metric, our first consideration was to define an elementary volume unit. We aimed for the best-case scenario, where the concentration of each component within every volume unit is uniform and thus equal to the mean concentration over the droplet. For instance, for component A, it is represented as \( CA = \frac{n_{tot,A}}{V_{tot}} \) where \( V_{tot} \) is the volume of the merged droplets and \( n_{tot,A} \) the mole of component A brought by the first droplet. To quantify the difference over the volume between this ideal uniform concentration and the local concentration, we focused on the following quantity:

\[ \int(C_A(x, y, z, t) -\overline{C_A})^2\mathrm{d}V \]

in an example of spherical coordinates. This quantity is minimal (equal to zero) when \( C_A(r, \theta, \phi, t) = \overline{C_A} \) for all volume units, indicating perfect mixing, and is maximal when the all A molecules are in a \( \mathrm{d}V \) elementary volume unit (inexistent mixing), i.e. when \( C_A(r, \theta, \phi, t) \) is a Dirac (\( C_A(r, \theta, \phi, t) = \overline{C_A} \) if \( r, \theta, \phi = r_0, \theta_0, \phi_0 \), 0 otherwise). In that case, this quantity is equal to \( C_A^2 \times V_{tot} \).

Based on this observation and to simplify the interpretation, we normalized this approach into a mixing index, ranging from 0 (no mixing) to 1 (perfect mixing), for both component A and component B from that quantity:

\[ m_i(t) = \int (C_i(r, \theta, \phi, t) -\overline{C_i})^2\mathrm{d}V \]

where \( i \) is either \( A \) or \( B \), then normalized (from 0% to 100%):

\[ \bar{m}_i(t) = \frac{m_i(t)-\min(m_i(t))}{\max(m_i(t))-\min(m_i(t))} \in [0,1] \]

Thus, if the concentration of A has reached its ideal value \( C_A \) in 60% of the droplet, then the mixing index of A will be equal to 60%. Likewise, if the maximum value of the mixing index is reached everywhere in our volume units, then the mixing index will be close to one, while a poor mixing will reflect with a value of \( m_i \) close to zero. To compute this index, we need \( C_i(r, \theta, \phi, t) \) for both A and B components, which means we need to compute the concentration of each component in every volume unit, at every time point. Such precise information is complex, but can be obtained from fluid mechanics equations using appropriate powerful computational tools, such as the CFD Ansys software we used. Let’s see how Ansys helps us understand the merge of two droplets containing particles A and B respectively. We will compare two mixing processes: one involving only passive diffusion, and another one that moves the droplet back and forth on a minimal space (a 2x1 rectangle on the chip’s grid) as represented in Figure 2.

Figure 2. Representation of two options to ensure optimal droplet mixing.

Hypotheses

The fluid mechanics equations used are those of Navier-Stokes, as viscosity is considered here. Viscosity affects the quality and speed of mixing when the droplet is set in motion. The Navier Stokes equations for incompressible fluids are the following:

\[ \rho \underbrace{\left( \frac{\partial \vec{u}}{\partial t} + \vec{u} \cdot \nabla \vec{u}\right)}_{\text{Acceleration}} = \underbrace{- \nabla p}_{\text{Pressure}} + \underbrace{\nu \Delta \vec{u}}_{Viscosity} \] \[ \underbrace{\nabla \cdot \vec{u} = 0}_{\text{Continuity equation}} \] \[ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} + \frac{\partial (\rho v)}{\partial y} + \frac{\partial (\rho w)}{\partial z} = 0 \]

The simulations will be carried out using RANS modeling, as this is the least costly model in terms of computing time. However, its results remain less accurate than those of other types of modeling, such as LES (large eddy simulation) or DNS (direct numerical simulation). In this case, the emphasis is on computation time, which is considerably reduced with this method.

The reconstruction of the interface between the different phases employed (a gaseous air phase and a liquid mixing phase) will be carried out using the Volume Of Fluid (VOF) method coupled with the level set method. This coupling of the two methods offers greater modeling accuracy.

Figure 3. Multiphase model.

Calculation domain

The aim is to compare the homogenization of two drops made up of species A and B respectively. To do this, we first model the fusion of two drops, and then add a translational motion to the drop just after fusion, and analyze the effect of this motion on homogenization.

Geometry and boundary conditions

The study area is represented by a 50 mm square. The top and bottom sides are considered as walls, and the left and right sides as velocity inlet and pressure outlet respectively. In both cases, the pressure outlet will be set to 0. Choosing a pressure outlet value of 0 means that the same pressure will be imposed both inside and outside the study area, so that no air is drawn in or out.

The two drops will be represented by two tangent circles of radius 5 mm.

Mesh

The mesh used is a homogeneous structured mesh made up of square elements with sides of 0.5 mm (Figure 4).

Figure 4. Mesh considered for the simulation.

Simulation parameters

Phases

Two phases are used in the model. The gaseous phase corresponds to inert air and the second corresponds to the fluid phase of which the drops are composed. To do this, it is necessary to create a Mixture phase on Ansys Fluent, composed of three species: water, species A and species B. The mass fraction ratios will be specified when the simulation is initialized, and will be used to characterize the components of the two drops.

Thus, a mass fraction of 0.5 of water is imposed for drop 1, and a mass fraction of 0.5 of species A, and of 0.95 of water and 0.05 of species B for drop 2. This corresponds to a modelization of a bacterial particle with a size 100 times bigger than for the plasmid.

Figure 5. Initial configuration: drop 1 on the left, composed of 50% of water and 50% of A, and drop 2 on the right composed of 95% of water and 5% of B.

Phases properties and interactions

The surface tension between the gas phase and the mixture is taken to be 73 mN/m (value for water/air interaction).

The diffusion coefficients of A and B are taken to be identical to those of water. The time step was imposed to verify the \( CFL<1 \) condition.

Figure 6. Computational Fluid Dynamics (CFD) Animation on Ansys Studio (professional fluid dynamics software)

Analysis and conclusion

It can be noticed that numerical errors appear due to the numerical schemes and the approach used for the simulation. For instance the VOF+level set method is accurate but not conservative which could be a problem for mixture mass conservation.

Retrieving the concentration data from the simulation, we plotted the normalized mixing index as a function of time for both passive diffusion and back-and-forth moves (Figure 7).

Figure 7. Normalized mixing index as a function of time for both passive diffusion and back-and-forth moves

This shows the evolution of the mixing index of both plasmids and bacteria in the merged two droplets, under different experiments (with only passive diffusion vs with active back-and-forth moves). We can make few comments about those results:

the mixing index is lower, at steady-state, for the diffusion-based system by -20%
the time-to-steady-state is lower for droplets moving to mix of 1 minute per merged drop, which represent a non-negligeable gain given the number of droplets our device uses
One interesting (and quite unexpected) observation is that, when setting the conditions we modeled particles such that the volume difference reflects the size difference between bacteria and plasmid (we set this ratio to a factor x100). After simulation, we observe that there is no significant difference between plasmid and bacteria respective mixing index: they mix in the droplet at the same speed, which enables us to consider whichever index is representative of the whole system.

We conclude from these first modeling results that moving the droplet at a speed of 1.5m/s is a reliable technique to enhance component mixing in the merged droplets while using the minimum space in the lab-on-chip grid, which was a requirement from the software team in charge of grid rules.

Once we forwarded these results to the wet lab team and software team to fine-tune their lab-on-chip protocols, we focused on one application to this quantitative model: the simulation of the transformation of E. coli cells with a pSEVA plasmid expressing mutated opsin proteins prior to light exposure. Let’s explore how this enhanced mixing index helps with this crucial biological step for our wet lab team. It relies on a three-step process (Figure 8):

Mixing droplets of bacteria and plasmids on the electrowetting-on-device chip
Bacterial transformation with the opsin-encoding pSEVA plasmids
Waiting for opsin expression in transformed (plasmid-containing) bacteria

Figure 8. Lab experiment plan.

To understand how the opsin production can be maximized, we had to model the dynamics of plasmid-carrying bacteria and see what parameters influence it. The interplay between plasmids and bacterial hosts has been a longstanding focus of research. Diverse models have been developed, differing primarily in their level of complexity. In one approach, the dynamics of this interaction can be described by a three-reaction model. When plasmids and bacteria coexist in the same environment, three possibilities emerge: they may not interact at all, the plasmid may attach to the bacterial cell, or the plasmid may detach from the bacterial cell. This dynamics unfolds within a single time frame, but over an extended experimental duration, bacterial populations can proliferate, introducing further intricacies to the model. Indeed, plasmid-bearing bacteria may or may not pass on their plasmids to their daughter bacteria, adding another layer of complexity.

To capture as much biological complexity as possible and to focus on the population of plasmid-bearing E. coli as a function of time once the plasmids are inside the bacteria, we used one of the main and first models of plasmid-bacteria dynamics developed by Neill S. Cooper et al. in 1987 [1]. This model introduces the following assumption:

Let’s denote \( p+ \) is the population of plasmid-bearing E. coli, and \( p- \) the plasmid free bacteria population with the condition:

\[ \forall t, P_+(t) + P_-(t) = 1 \]

\( R \) is the probability of a plasmid-bearing E. coli to become plasmid-free prior to any replication. It’s an indicator about how unstable the bacterium becomes once it carries the plasmid, regardless of proliferation.
\( d\mu \) is a segregation factor that competes the growth factors between plasmid-bearing and plasmid-free bacteria: it tells us how disadvantaged plasmid-bearing bacteria are to replicate with a plasmid
\( D \) the dilution rate of the media

\[ \mathrm{d}\mu = \mu_- - \mu_+ \]

Under such hypothesis, the following system of differential equations emerges:

\[ \begin{cases} \frac{\mathrm{d}P_+}{\mathrm{d}t} = - DP_+ + \mu_+ P_+ - RP_+ \\ \frac{\mathrm{d}P_-}{\mathrm{d}t} = -DP_- + \mu_- P_- + RP_+ \end{cases} \]

Interpretation of Each Term of the Variation Over a Time Unit of \( P+ \) and \( P- \)

Equation:

Term	Description
1st term	\( \frac{\mathrm{d}P_+}{\mathrm{d}t} = -DP_+ \) : proportion of plasmid-bearing bacteria lost due to dilution in the media
2nd term	\( +(\mu_+)*P_+ \) : proportion of replicated plasmid-bearing bacteria at growth rate \( u+ \)
3rd term	\( -R*P_+ \): proportion of plasmid-bearing bacteria becoming plasmid free with probability R
	\( \frac{\mathrm{d}P_-}{\mathrm{d}t} = -DP_- \) : proportion of plasmid-free bacteria lost due to dilution in the media
	\( +(\mu_-) * P_- \) : proportion of replicated plasmid-free bacteria at growth rate \( u- \)

After rearrangement, we obtain the following Bernoulli differential equation for \( P_+ \):

\[ \frac{\mathrm{d}P_+}{\mathrm{d}t} = P_+^2 \mathrm{d}\mu - P_+ (\mathrm{d}\mu + R) \]

Leading to the general solution:

\[ \begin{cases} P_-(t) = \frac{(P_-(t=0) \cdot \mathrm{d}\mu + R) e^{(\mathrm{d}\mu + R)t} - R(1-P_-(t=0))}{(P_-(t=0) \cdot \mathrm{d}\mu + R)e^{(\mathrm{d}\mu + R)t} + (1-P_-(t=0))} \\ P_+(t) = 1 - P_-(t) \end{cases} \]

While this formula is exhaustive, it needs some further simplification based on hypothesis and experiment values to get a more easy to understand and interpret the population model. What we can see is that the general shape of the solution is a function of a competition between two different phenomena as shown in the exponential function: \( dμ \) (growth rate disadvantage of plasmid bearing E. coli vs plasmid free E. coli) and R (probability of a plasmid-bearing bacteria to lose its plasmid prior to replication). In other words, the model can be simplified based on which is the fastest way for a plasmid-carrying E. coli to lose its plasmid between bacteria instability or replication.

Exploration of Techniques

Let’s now explore techniques we used to balance with the “\( \mathrm{d}\mu + R\)" factor and to see how it affects the \( P_+ \) population we want to maximize to synthesize as many opsin proteins as possible over time.

The first step is to consider a poor mixing index based only on diffusion. Under such conditions, the initial proportion of plasmid-free bacterias at the beginning is almost 1, which means:

\[ P_-(t=0) \approx 1 \]

In that case, the system becomes:

\[ \begin{cases} P_-(t) \approx \frac{(P_-(t=0) \cdot \mathrm{d}\mu + R) e^{(\mathrm{d}\mu + R)t}}{(P_-(t=0) \cdot \mathrm{d}\mu + R)e^{(\mathrm{d}\mu + R)t}} = 1\\ P_+(t) = 1 - P_-(t) \approx 0 \end{cases} \]

Indicating what we were expecting, which is that the plasmid-free population proportion will stay constant equal to 1 and nothing will happen. This further emphasizes why mixing the droplets remains crucial to ensure a high-yield plasmid-bacteria bearing process by setting optimal initial conditions.

On the contrary, when the droplets are mixed with a back-and-forth motion, the proportion of the initial population of plasmid-bearing bacteria is non-zero, and experiments on current lab practices show that the maximum this proportion can be after lab work is 30%. Thus, we assumed that we can achieve such a target with our enhanced mixing process which sets \( P_{(t=0)} \) to 100%-30% = 70%.

Without any other experimental techniques but only our mixing enhancement step, the above functions can be plotted using experimental data available in research papers [2]. The value of the \( \mathrm{d}\mu \) growth rate difference was found equal to -0.002 per hour. Data was lacking for the value of R for our specific plasmid, and thus we assumed it to be the mean of values found for plasmids used with E. coli [3]: 0.01 per hour, which means that each hour, 10% of the bacteria carrying plasmids lose them without replication loss. Using E. coli and pSEVA plasmids, the plots presented in Figure 9 obtained for the population of plasmid-carrying bacteria.

Figure 9. Mixing index evolution.

The interpretation of this plot is that even if we use an optimized mixing process that can ensure 30% of bacteria carrying plasmids after heat shock, this is not enough to ensure bacteria to keep bearing target-opsin-synthesizing plasmids. Indeed, after only 10 minutes, less than half of initial bacterial hosts of opsin-producing plasmid have survived, leading to a very low yield of opsin production.

While the growth rate difference \( \mathrm{d}\mu \) is fixed by the plasmid/bacteria combination, we can still try to see and investigate the influence of \( R \), the probability of a bacteria already carrying plasmid to lose it (regardless of replication). For that, in collaboration with the wet lab team, we searched for lab techniques that could help us with this parameter. That’s when we went through a widely employed strategy that involves utilizing a medium enriched with antibiotic agents encoded by plasmids, conferring drug resistance to plasmid-carrying bacteria compared to plasmid-free E. coli, thereby diminishing the likelihood of plasmid loss, i.e. reducing our parameter \( R \).

Such a process is helpful to reduce this probability, with an effect on the value of \( P_+ \) over time. Let’s explore how reducing \( R \) enhances the number of bacteria bearing plasmids after heat shock when using antibiotic enriched media (Ampicilin, Kanamycin):

Figure 10. Mixing index evolution.

In Figure 10, we only plotted \( P_+(t) \). Note here how reducing \( R \) drastically reduces plasmid loss amidst plasmid-bearing bacteria during opsin production: for \( R = 0.001 \) per hour, after 10 min waiting, only -7.5% of useful bacteria have lost their plasmid, compared to the -50% loss before. Indeed, thanks to plasmids producing antibiotic resistance proteins, bacteria need them to survive. The limit case is when \( R \) tends to zero (\( 1e-7 \) per hour), which actually shows a slow increase of plasmid-bearing bacteria proportion over time thanks to replication conserving plasmids. This can be implemented on our lab-on-chip device using an appropriate media.

Thus, conferring resistance for bacteria hosting plasmids can help keep a better opsin production yield, as shown in Figure 11.

Figure 11. Resistance in Bacteria Hosting Plasmids.

Measures Taken by the Wet Lab Team

As a result of this model, the following measures have been taken by the wet lab team, and helped us improve the yield of our opsin production rate:

While lab-on-chip devices reveal promising findings in the development of automated reactions, it is important to make sure mixing enhancement processes are adapted for such a small device to keep high-yield reactions. This was proved to be achievable by active mixing such as back-and-forth movement that creates internal microfluidic currents helping with the mixing of components (Figure 12). We made sure our findings are purposefully generalizable for any reaction between two components.

Figure 12. Results of the modeling on our EWOD platform

In the context of our project, we have demonstrated that the mixing process, while important, is not solely sufficient to ensure optimal opsin production within plasmid-carrying bacteria. To maintain the presence of plasmids in these bacteria, which is necessary for sustained opsin production, it is helpful to reduce the probability of plasmid loss. This can be achieved through the application of laboratory techniques, such as antibiotic resistance.

2. Bioinformatics Analysis

Searching for New Bacterial Opsins

As an attempt to get new, potentially more sensitive bacterial opsins, we searched for opsin sequence from the Tara Oceans database Ocean Gene Atlas (OGA) [4,5]. We used the Hidden Markov Model method with HMMer from EBI InterPro server on the opsin peptide sequences we already got for CatCh, NpHR, ChrimsonR and Jaws bacterial opsins. This method returned several hundreds of matches, with associated sequence in FASTA format.

Prokaryotic Rhodopsins Phylogenetic Tree

We constructed a phylogenetic tree of prokaryotic and viral rhodoopsins (Figures 13 and 14). We gathered opsins from iGEM registry, Tara Ocean’s OM-RGCv2+G library, which contains only prokaryotic sequences, from the high confidence high quality fractions of the IMG/VR library, which contains only viral sequences (mainly phages) and from SWISSPROT. We also included the five opsin sequences we used in our project.

We collected 3002 sequences homologous to the PF01036 pfam HMM with a maximum of 90% similarity between them. We aligned the sequences with MAFFT. Spurious alignments were removed using TrimAI. On the rooted tree (Figure 13), we marked the origin of the sequence in a colored ring. This tree illustrates the large amount of sequences yet to be explored: only a narrow proportion of all detected clades are represented in iGEM registry or in SWISSPROT. Other poorly studied opsins, in ocean prokaryotes and phages, could offer better sensitivity or interesting properties that could be of particular interest for future works on opsins.

Figure 13. Rooted phylogenetic tree of prokaryotic and viral rhodoopsins.

Figure 14. Unrooted phylogenetic tree of prokaryotic and viral rhodoopsins.

Opsins suitable for cloning in E. coli

To select an opsin sequence suitable for engineering in E. coli chassis, the protein should not have a disulfide bond. So we tried to verify whether the listed protein sequence have these bonds. To do so, we used two alternative methods. First, we used ChimeraX on protein PDB structures to check for disulfide bond using `` command. When PDB where not available, we ran AlphaFold2/ColabFold on amino-acid sequences to get a predicted PDB. Secondly, we used a home-made Python script to query the UniProt database for disulfide bond annotations absence in protein metadata REST API. The later method proves fast, but is prone to error due to potential lacks of annotations in UniProt database.

Opsin Structure Prediction with AlphaFold2/ColabFold

A modification of the ColabFold google colab notebook was done allowing to predict peptid 3D structure of a batch of amino acid sequence fasta files. Using this script, a prediction of 3D structure of several amino-acid sequences of opsins from the iGEM registry was performed.

Figure 15. AlphaFold2/ColabFold structure model of NpSRII rhodopsin (BBa_K317000), rendered with ChimeraX.

Querying the iGEM registry relational database

To retrieve the sequence of parts from the registry having metadata of interest, we first explored the web interface, which provide thorough information in natural language for some parts, but less complete information on others. Fortunately, a database SQL (or XML) dump from 2022 is available on the registry website, allowing faster and more complex queries using SQL language. For instance, a single SQL query allows extracting all sequence fragments of composite parts being labelled as “OmpR binding site”, which may be converted easily from CSV file to FASTA file using awk. It has to be noted, however, that the parts metadata of similar parts may vary from parts entry to other parts entry, rendering the database data extraction more difficult.

Conclusion on Bioinformatics Work

Our bioinformatics analyses allowed us to glimpse the potential diversity of wild prokaryotic rhodopsins, especially from sea organisms and phages. During this iGEM session, we did not manage to test whether these other opsins were suitable for culture in our chassis, nor if their physical characteristics were indeed particularly interesting. However, we hope our preliminary bioinformatics analyses can bring inspiration to future iGEM teams to study opsins from more diverse clades than it is done in iGEM for now.

References

[1] Cooper NS, Brown ME, Caulcott CA. A mathematical method for analysing plasmid stability in micro-organisms. Journal of General Microbiology (1987) 133: 1871–1880.
[2] Fedorec AJH, Robinson CM, Wen KY, Barnes CP. FlopR: an open source software package for calibration and normalization of plate reader and flow cytometry data. ACS synthetic biology (2020) 9: 2258–2266.
[3] Popov M, Petrov S, Nacheva G, Ivanov I, Reichl U. Effects of a recombinant gene expression on ColE1-like plasmid segregation in Escherichia coli. BMC biotechnology (2011) 11: 18.
[4] Villar E, Vannier T, Vernette C, Lescot M, Cuenca M, Alexandre A, Bachelerie P, Rosnet T, Pelletier E, Sunagawa S, Hingamp P. The Ocean Gene Atlas: exploring the biogeography of plankton genes online. Nucleic Acids Research (2018) 46: W289–W295.
[5] Vernette C, Lecubin J, Sánchez P, Tara Oceans Coordinators, Sunagawa S, Delmont TO, Acinas SG, Pelletier E, Hingamp P, Lescot M. The Ocean Gene Atlas v2.0: online exploration of the biogeography and phylogeny of plankton genes. Nucleic Acids Research (2022) 50: W516–W526.