In our project, we modify yeast Saccharomyces cerevisiae to produce short interfering RNAs (siRNAs), which can block the expression of bee pathogen genes. Our engineered yeast can serve as a dietary supplement for bees, enhancing their ability to defend against viruses by equipping them with siRNAs even before they encounter the virus. Furthermore, it can help bees combat existing Deformed Wing Virus (DWV) infections by further stimulating the bee's innate RNA interference (RNAi) machinery. This ultimately leads to the suppression of viral protein expression.
To enhance our efforts in achieving the goal of engineering yeast to produce shRNAs for enhancing antiviral defense in bees, we employed modeling techniques to strategically plan and optimize our project.
Modeling can offer valuable insights into the viability of utilizing yeast to boost the antiviral response in bees by pinpointing potential obstacles, identifying the bottlenecks, and proposing strategies to improve efficiency. We divided our project into four stages, enabling us to model each stage individually and conduct a comprehensive evaluation to ensure its success:
Modeling shRNA production in yeast is essential for optimizing various aspects of the process, including the choice of promoters for shRNA and Dicer expression, potential concentration, efficiency of shRNA folding and its stability. Through mathematical and computational models, we can determine the optimal conditions for shRNA production to achieve the desired results.
After the expression and folding process, shRNA is processed by the Dicer protein, yielding siRNA. Since S. cerevisiae lacks native RNAi machinery, heterologous expression of Dicer and Argonaute proteins was established in our engineered yeast. shRNA cleavage by Dicer is a critical step in the RNAi pathway. Computational modeling of this step offers the potential to fine-tune the expression of the Dicer. This involves the selection of promoters and manipulation of plasmid copy numbers, which can significantly enhance the efficiency of siRNA production in our engineered yeast.
We propose that our siRNA-producing yeast can be used as a food supplement for bees. This raises the questions about the stability of siRNA within the bee's digestive system and the capacity of surrounding cells to take up siRNA. Consequently, in the next phase, we developed a model to study both siRNA uptake by cells and the siRNA degradation process. It allows us to analyze the effectiveness of the proposed siRNA-based dietary supplement for bees, to consider possible side effects, and to optimize siRNA delivery system if necessary.
Once siRNA enters bee cells, it undergoes further processing by the Argonaute protein, an integral component of the RISC (RNA-Induced Silencing Complex). In the final stage, we combine the available information with the model describing siRNA degradation. It allows us to estimate active concentrations of siRNA and Argonaute for efficiently fighting the DWV infection. The detailed description of the modeling process is provided below.
We performed a computational search based on the sequence determinants that make an effective siRNA against the DWV genome. Since we initially express these siRNAs as small hairpin RNAs (shRNAs), our model encompasses the processes of shRNA production, degradation, and the folding of our ten shRNAs into a double-stranded hairpin structure within yeast.
Our model combines ordinary differential equations (ODEs) to describe shRNA production with the stochastic Gillespie algorithm to simulate the folding events of single-stranded RNA (ssRNA) into shRNA.
Determining the production and degradation rates, as well as shRNA concentration;
The rate of change of shRNA concentration ([R]) over time is described by:
Where [R(t)] is the concentration of shRNA at time t, kprod is the
production rate constant, Y(t) is the yeast population at time t, kdeg is the degradation rate of shRNA
estimating the folding events based on stochastic nature and the speed of folding of individual molecules;
To investigate the folding events, we employ the Gillespie algorithm (Dykeman, 2015) that allows us:
We focus on the folding of single-stranded RNA (ssRNA) into shRNA, and the propensity (a) for this reaction is calculated as follows:
where kfold is the rate constant for ssRNA folding into shRNA, and R is the concentration of shRNA.
Determining the time of the next folding event in a random manner, based on the kfold constant.
The time to the next folding event (τ) is determined using the Gillespie algorithm:
where r1 is a random number between 0 and 1. When an event occurs, it represents the folding of an ssRNA molecule into dsRNA.
For the equations and the final model, we introduced the set of constants (k_prod, k_deg and k_fold) for each siRNA version. All constants available in the table:
Table 1. The values of K_prod, k_deg and k_fold for each siRNA version
Constant | shRNA 1 | shRNA 2 | shRNA 3 | shRNA 4 | shRNA 5 | shRNA 6 | shRNA 7 | shRNA 8 | shRNA 9 | shRNA 10 | |
---|---|---|---|---|---|---|---|---|---|---|---|
k_prod (ng/(ml*min)) | 0.00756 | 0.00756 | 0.00189 | 0.002835 | 0.00756 | 0.004725 | 0.00756 | 0.00756 | 0.004725 | 0.00756 | |
k_deg (min^-1) | 0.084 | 0.084 | 0.021 | 0.0315 | 0.084 | 0.0525 | 0.084 | 0.084 | 0.0525 | 0.084 | |
k_fold (min^-1) | 64.65405566 | 37.69417324 | 149.5186658 | 78.90681233 | 42.69218927 | 47.16369143 | 46.77382026 | 78.25454168 | 19.24303282 | 37.38258033 |
The degradation constant k_deg was derived by analyzing the dinucleotide sequence distribution within four siRNA stability categories, as outlined in the study of Hong et al 2010 (Fig. 1).
Figure 1. The distribution of di-nucleotide sequence frequencies in siRNAs grouped into different stability categories. Figure from Hong et al 2010.
We used the constant k_deg from the iGEM registry (Team:Edinburgh UG/Modelling Collaboration - 2018.Igem.Org, n.d.) , which was equal to 0.042. This was used as the standard k_deg for RNA. Based on Fig. 1, we created a Python script to categorize each of our 10 shRNAs into one of the four stability groups. This classification was performed by multiplying the standard k_deg constant by the respective factor associated with the stability group (x0.5, x0.75, x1.25, x2).
In our engineered yeast, the shRNAs are produced under the control of the pGAL1 promoter. Due to the absence of experimental constants specific to shRNA production in yeast, we utilized the expression levels of GFP from the pGAL1 promoter as a closest empirical constant (Štagoj et al, 2005). It's important to note that this approach is not ideal, as GFP fluorescence requires both transcription and translation processes, whereas shRNA production depends only on transcription. In our extrapolation, we made the assumption that the ideal scenario occurs, where 100% of the transcripts are translated into GFP protein, followed by successful maturation and folding.
We applied the steady-state approximation method to determine the production rate constants (k_prod) for our modified yeast strains.The steady-state approximation is a widely adopted simplification in kinetics, especially when analyzing the dynamics of intermediate components within a reaction pathway. In a given system, when the rate of production of an intermediate (or species) equals the rate of its consumption or degradation, that species is said to be in a "steady-state". Within a specific system, if the rate of production of an intermediate or species equals the rate of its consumption or degradation, this species is described as being in a "steady-state." Despite significant flux passing through it, the concentration of this species remains relatively constant over time.
In a steady-state, rate of siRNA Production equals rate of siRNA Degradation
Given that:
Equating the two gives:
k_fold constants were found using thermodynamic equation:
, where
R - universal gas constant
T - temperature in Kelvins
ΔG - free energy of folding event for each shRNA determined by RNAup WebServer. At 30 C°:
k_fold = EXP((delta_G/2*1000)/(1.987*303.15))
In our combined model, we adjust the folding parameter from the Gillespie algorithm at each time step of the ODE simulation. This integration allows us to observe how stochastic folding events influence dsRNA concentration over time in the presence of ODE-based production and degradation processes.
Yeast population growth
Yeast culture growth rate was estimated using the model
where N represents the number of cells at any time (t), and N0 represents the number of cells at the beginning of the interval being analyzed, growth constant k is estimated in terms of the doubling time of the culture. In this rendering,
(T = the doubling time of the culture).
shRNA production model predicts considerable variation in the accumulation of different shRNAs
The output of the model shows yeast culture density and the concentration of unfolded (Fig. 2A) or folded shRNA (Fig. 2B) over a 24-hour period, starting from initial OD of 0.1. Interestingly, 24 hours after inducing shRNA expression, the model predicts an over three-fold variance in the accumulation of different folded shRNAs, with the highest levels observed in shRNAs V1, V2, V5, V7, V8, and V10, and the lowest in V3. This indicates that substantial variations in shRNA production can be attributed to sequence-specific characteristics of shRNAs that impact their production, degradation, and folding. Importantly, the model suggests that shRNA folding does not represent a limiting factor in shRNA production, as the concentration of unfolded shRNA remains approximately three orders of magnitude lower than that of folded shRNA throughout the experiment. The major determinant of shRNA accumulation is the shRNA stability, as seen by Hong et al 2010. The pronounced distinctions among shRNAs observed in this study underscore the necessity of incorporating such modeling when designing shRNAs.
Figure 2. Model illustrating shRNA production in yeast. Plots showing the growth of yeast culture density and the concentration of unfolded (panel A) or folded (panel B) shRNA over a 24h period after shRNA expression induction. Time 0 on the X axis indicates the moment of pGAL1 promoter induction with galactose.
Following shRNA synthesis, the next key molecular process in our project is processing of the shRNA into the functional siRNA. This is catalyzed by the enzyme Dicer that cleaves the hairpin loop from the shRNA molecules. Dicer is not present in wild-type S. cerevisiae, so we had to decide either to introduce Dicer to S. cerevisiae or to have yeast only produce shRNA, in which case the processing of shRNA would have to occur in the bee cells. siRNA processing could be established in S. cerevisiae by introducing Dicer enzyme from S. castellii. To investigate the potential of this approach, we modeled the siRNA processing in S. cerevisiae cells that express Dicer from pTEF1 promoter.
We assume that the rate of cleavage of shRNA by Dicer is proportional to both the concentration of Dicer and the concentration of shRNA. Thus, the reaction is pseudo-first order with respect to each reactant.
The rate of change of shRNA concentration over time can be described by equation:
, where
, is the rate of change of shRNA concentration over time.
k is the rate constant for the Dicer cleavage reaction.
[Dicer] is the concentration of Dicer.
[shRNA] is the concentration of shRNA.
In order to obtain the concentration of shRNA after a given time, the equation was integrated over the desired time period.
Equation was solved for [shRNA] by separation of the variables and integration:
, Upon integration, we get:
where C is the integration constant which can be found using the initial concentration of shRNA.
Now, using the initial condition that at t = 0, [shRNA] = [shRNA]0
After substitution of C into the equation, it looks like this:
After taking the exponential of both sides:
, where:
[shRNA]0 is the initial concentration of shRNA.
t is time.
k is the rate constant which would need to be experimentally determined or provided.
The amount of Dicer equals 749032 molecules in a yeast cell (Ho et al 2018), which can be converted to concentration based on the volume of the yeast cell.
The rate at which shRNA is cleaved by Dicer can be given by the Michaelis-Menten kinetics equation:
, where:
, is the rate of cleavage of shRNA.
k is the catalytic rate constant for Dicer.
Km is the Michaelis constant for Dicer acting on shRNA. It represents the concentration of shRNA at which the rate of cleavage is half its maximum value.
[Dicer] is the concentration of Dicer.
[shRNA] is the concentration of shRNA.
The above equation becomes a first-order rate equation in the limit of [shRNA] << Km (that is, when the shRNA concentration is much smaller than the Michaelis constant):
Introducing Dicer activity to the yeast shRNA synthesis model revealed that Dicer, expressed from pTEF1 is able to cleave most of the shRNA as it is being synthesized (Fig. 3). The folded shRNA levels reach a steady-state at mere 0.0014 ng/ml for the highest-expressing shRNAs, whereas the cleaved siRNA accumulates to 700 ng/ml, to 5 orders of magnitude higher level. This model suggests that Dicer activity is not a limiting factor in the efficiency of this project. Further, the high activity of Dicer here supports carrying out the siRNA processing in yeast. Driven by this, we engineered our yeast strains to express S. castellii Dicer from pTEF1 promoter.
Figure 3. Dicer-catalyzed processing of shRNA to siRNA. An ODE model based on Michaelis-Menten equation was used to model shRNA processing, plotted by the decrease of shRNA concentration in time. Time 0 on the X axis indicates the moment of galactose addition to initiate the expression of shRNA from the pGAL1 promoter.
Concentrations of folded and unfolded shRNAs after 24h from the start of production predicted by the model are listed in the Table
Table 2. Concentrations of folded and unfolded shRNAs after 24h from the start of production predicted by the model.
Following the siRNA production, our yeast is set to be fed to the bees. Thus, the next critical event is the uptake of siRNAs by the bee cell from the lumen of the bee stomach. During this step, the ingested siRNA is converted into effective siRNA within the cell, being bound to the RISC complex.
The first organism where proteins involved in the uptake of siRNA were identified is Caenorhabditis elegans>. In this model species, two proteins associated with the passive transport of the siRNA through the membrane were discovered: systemic RNA interference defective 1 and 2, known as Sid1 and Sid2, respectively (Winston et al., 2002; Winston et al., 2007). However, no Sid homologs have been identified in insects; only SID-like genes have been found (Cappelle et al., 2016). Additionally, it has been shown that, along with Sid-like proteins, clathrin-mediated endocytosis plays an important role in siRNA uptake by insect cells. (Cappelle et al., 2016). Considering the lack of empirical constants for non-model bee cells and the fact that the overall mechanism of siRNA uptake by bee cells is not fully understood, we calculated kinetic constants for RNA internalization and degradation based on the work of Roh et al., 2022. The single aggregated internalization rate constant k_int was used to simplify the simulation of the process of siRNA internalization into bee cells.
Modeling siRNA uptake and degradation in bee cells:
We derived M(t) from the differential equation using the method of integrating factors.
Given:
where M is the concentration of siRNA in bee cells after internalization, kint is the RNA internalization constant, S0 the extracellular siRNA concentration, kdeg is the RNA degradation constant. The constants were obtained from Roh et al., 2022.
Standard form for a first-order linear ODE:
Using an integrating factor:
Multiplication through by the integrating factor:
Notice that the left-hand side is the derivative of M(t) times e^{k_{deg} t} . This means:
Now, integrate both sides with respect to t:
Using the boundary condition M(0) = 0, we find:
C = 0
So, the solution is:
The siRNA uptake was modeled using the output of Dicer-mediated siRNA production as the value for the initial concentration of different siRNAs (Fig. 3). The intracellular siRNA concentration peaked at 20 minutes from the start of the process and all of the siRNA was internalized and degraded within 100 minutes (Fig. 4). This simulation indicates that siRNA uptake happens relatively quickly, and it suggests that a continuous supply of siRNA-containing yeast might be required to maintain a sufficiently high intracellular siRNA concentration.
It's worth noting that our model does not account for the amplification of the siRNA by RNA-dependent RNA polymerases, a process observed in RNAi responses that can amplify and even drive a systemic response (Tsai et al, 2015). Therefore, it's possible that intracellular concentrations of siRNA are higher than the predictions made by our model.
This modeling indicates that the best siRNAs among our 10 candidates could reach 8 nM concentration in the bee cell, while the more unstable siRNAs peak at 2 nM (Fig. 4). Importantly, siRNA concentrations in the lower nanomolar range have been found to be sufficient for effective RNAi, while higher concentrations are more prone to cause off-target effects (Chiu et al, 2004, Ki et al, 2010). Therefore, the simulation suggests that we are in the correct range of siRNA concentrations using this design.
Figure 4. siRNA uptake from bee stomach to bee cells. Plot shows the intracellular concentration of siRNA in bee cells. In this simulation, at time 0, the bee cell encounters extracellular siRNA at the concentration based on the simulations above (Fig. 2 and 3), followed by internalization of the siRNA and its degradation. The nM concentration denoted in the legend indicates the concentration achieved in yeast (Fig. 3).
The Argonaute protein is an integral component of the RISC (RNA-Induced Silencing Complex), which plays a crucial role in the further processing of double-stranded siRNA molecules, leading to the production of single-stranded guide siRNA. The guide strand remains associated with the Argonaute in the RISC and serves as a targeting molecule to direct it toward viral RNA. To simulate the function of the Argonaute protein, we combined two models: Michaelis-Menten-based model describing Argonaute activity and a model for estimation of the fluctuations in concentration of viral RNA in the cell over time.
Given the initial conditions:
Given that our initial [Argonaute - siRNA] is zero, at the very beginning:
is maximal, because [Argonaute] = [Argonaute]initial and [siRNA] = [siRNA]initial.
However, as [Argonaute - siRNA] increases, [Argonaute] and [siRNA] both decrease, slowing the rate of formation of the complex.
The issue arises from my previous simplification where I directly calculated the rate using only initial values of [Argonaute] and [siRNA], rather than solving for [Argonaute - siRNA] over time.
The ODE integration is meant to give us the value of [Argonaute - siRNA] at each time point, including 12 minutes, by considering the continuous changes in [Argonaute] and [siRNA] as [Argonaute - siRNA] forms.
The constants k1+, k1-and k2 were obtained from Martin et al 2021.
The model would be incomplete without considering the continuous replication of the viral genome within infected cells. To account for this, we calculated the rate constant for DWV viral genome replication based on the data presented in Hamiduzzaman et al 2015. This was done by analyzing the slope of virus quantification in bees over time.
The concentration of viral RNA in the cell was calculated by formula:
, where
RNAinitial - initial amount of viral RNA (equals 1)
mviral RNA - mass of viral RNA
Vcell - average volume of the bee larvae cell
, where
mnt - mass of each specific nucleotide in RNA sequence
We simulated the effect of siRNA concentration on the rate of viral RNA cleavage by RISC with the aim to determine what concentration of siRNA would be sufficient to inhibit the virus. This highlights the importance of siRNA concentration on the strength of RNAi response. In the simulation, subnanomolar siRNA concentrations were slow in directing degradation of DWV RNA, while with 2 nM siRNA, the majority of DWV RNA is degraded within a few minutes (Fig. 5). Importantly, the siRNA production and uptake modeled above suggests that our most highly expressed siRNAs accumulate in bees at 8 nM concentration, indicating that they would drive strong antiviral response.
Since the antiviral response is strongly dependent on siRNA concentration, maintaining a sufficient level of anti-DWV siRNAs in the bee cells could be crucial to ensure constant suppression of DWV. This supports the setup of our project, as even though bees are capable of producing antiviral siRNAs themselves, the continuous supply of optimal siRNAs is expected to have a strong beneficial effect. Further, as achieving high levels of supplemented siRNAs in bees is not trivial, it highlights the need to also consider siRNA stability and processing when designing the siRNAs.
Figure 5. The effect of siRNA concentration on RNAi-mediated DWV RNA degradation. The rate of DWV RNA suppression by the RISC complex was simulated at different siRNA concentrations. Time 0 on the X-axis indicates the starting point of siRNA internalization within the infected bee cells.
Figure 6. Suppression of DWV RNA by the 10 designed siRNAs. Time 0 on the X-axis indicates the starting point of siRNA internalization within the infected bee cells.
To visualize the suppression of DWV RNA by our 10 candidate siRNAs, we conducted the same simulation as above, using the siRNA concentrations obtained from the siRNA uptake model in Fig. 4. We note that all siRNA achieve complete silencing of the DWV RNA within 4 minutes of the siRNAs entering the cell. For further optimization of the process, it would be important to test the off-target effects of these siRNA. As off-target effects are more likely at higher siRNA concentrations, a minimal concentration ensuring sufficient on-target RNAi would be desirable.
Computational simulations played a driving role in our project, allowing us to predict the outcomes of modifying yeast in specific ways to produce shRNA and its potential impact on bee antiviral response. In order to facilitate the modeling of a complex process like siRNA expression and production in one organism, followed by the uptake and response in another organism, we divided our project into four distinct steps. Each of these steps was modeled separately, with the output from one model serving as the input for the subsequent model. This modular approach allowed us to break down the process into manageable components and ensure the flow of information between them. The simulations led us to the decision to establish siRNA processing in our engineered yeast enabling us to directly supply siRNAs, rather than shRNAs, to bees. Furthermore, they emphasized the importance of achieving sufficient siRNA concentrations for a strong antiviral response and provided information about the effect of shRNA stability on siRNA accumulation. Also, the modeling revealed a strong concentration-dependence in RNAi in bees, supporting our approach of providing bees with extra siRNAs in order to achieve a strong response against pathogens.
All code and visualizations are available at our modeling GitHub:
Supplementary table.The full list of constants used for computational modeling.
Supplementary table.