SZU-China - iGEM 2023

Overview

Cost, environmental impact and convenience are problems that traditional pesticides cannot avoid, and they are also the main obstacles to the adoption of biological pesticides. We built several models to address the difficulties our products face in these three respects. First, to characterize how shRNAs function during RNAi, we built an RNAi-DDE model based on previous studies. On the basis of this model, we then proposed a model to optimize the application rate of shRNA pesticides. We also studied plant fungal epidemics and developed mechanistic models to help prevent them. Finally, we modeled the dispersal of biocontrol bacteria to guide their application and ensure the effectiveness and cost performance of the product.

shRNA inhibiting fungal growth -- Delay Differential Equation (DDE) Model

Figure 1. The process by which shRNA acts

Ordinary differential equations (ODEs)

First, we apply ordinary differential equations to describe the process by which shRNA acts in vivo.

[mRNA]:

\frac{d X_{m}}{d t} = k_{m} - d_{m} X_{m} - δ (X_{m}, X_{s})

k_{m} : promoter transcription rate;

d_{m} : basal mRNA degradation rate;

δ (X_{m}, X_{s}) : rate of mRNA degradation by RNAi

[protein]:

\frac{d X_{p}}{d t} = k_{p} X_{m} - d_{p} X_{p}

k_{p} : protein translation rate;

d_{p} : basal protein degradation rate

Determination of the δ parameter

Natural mRNA degradation pathways in vivo:

[mRNA] \overset{\deg}{⟶} [degmRNA]

Binding of mRNA to shRNA complexes:

[mRNA] + [shRNA] \overset{λ}{⟶} [mRNA-shRNA]

Cleavage occurs in the mRNA-shRNA complex:

[mRNA-shRNA] \overset{k}{⟶} [cleavedmRNA] + [shRNA]

Law of mass action:

\frac{d [degmRNA]}{d t} = \deg [mRNA]

(1)

\frac{d [mRNA]}{d t} = - \deg [mRNA] - λ [shRNA] [mRNA]

(2)

\frac{d [cleavedmRNA]}{d t} = k [mRNA-shRNA]

(3)

Conservation of mass gives:

0 = \frac{d [mRNA]}{d t} + \frac{d [cleavedmRNA]}{d t} + \frac{d [degmRNA]}{d t} + \frac{d [mRNA-shRNA]}{d t}

(4)

Since the recycling of shRNA complexes is thought to be extremely rapid (Filipowicz, 2005; Haley and Zamore, 2004; Holen et al., 2002; Hutvagner and Zamore, 2002; Martinez and Tuschl, 2004; Tang et al., 2003; Tomari and Zamore, 2005), the concentration of the mRNA-shRNA complex can be treated as being at quasi-steady state, that is:

\frac{d [mRNA-shRNA]}{d t} = 0

(5)

Equation (4) can be rewritten as follows.

- \frac{d [mRNA]}{d t} = \frac{d [cleavedmRNA]}{d t} + \frac{d [degmRNA]}{d t}

Substituting (1), (2) and (3) gives:

\deg [mRNA] + λ [shRNA] [mRNA] = \frac{d [cleavedmRNA]}{d t} + \frac{d [degmRNA]}{d t}

λ [shRNA] [mRNA] = \frac{d [cleavedmRNA]}{d t}

λ [shRNA] [mRNA] = k [mRNA-shRNA]

(6)

Note that:

[shRNA_{total}] = [shRNA] + [mRNA-shRNA]

That is:

λ ([shRNA_{total}] - [mRNA-shRNA]) [mRNA] = k [mRNA-shRNA]

[mRNA-shRNA] = \frac{λ [shRNA_{total}] [mRNA]}{k + λ [mRNA]}

(7)

δ (X_{m}, X_{s}) = \frac{d [cleavedmRNA]}{d t} = k [mRNA-shRNA] = \frac{k λ [shRNA_{total}] [mRNA]}{k + λ [mRNA]}

Considering the degradation of shRNA, we have:

\frac{d [shRNA_{total}]}{d t} = - α [shRNA_{total}], [shRNA_{total}] |_{t = 0} = [shRNA_{used}]

so that [shRNA_{total}] = [shRNA_{used}] e^{- α t}. That is:

\frac{d X_{m}}{d t} = k_{m} - d_{m} X_{m} - δ (X_{m}, X_{s}) = k_{m} - d_{m} X_{m} - \frac{k λ e^{- α t} [shRNA_{used}] X_{m}}{k + λ X_{m}}

(8)

\frac{d X_{p}}{d t} = k_{p} X_{m} - d_{p} X_{p}

(9)
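As a rough illustration of equations (8) and (9), the sketch below integrates the two ODEs with a fixed-step fourth-order Runge-Kutta scheme in plain Python. The rate constants are taken from the parameter table at the end of this page; the shRNA decay rate `alpha`, the applied amount `s_used`, the step size and the function names are assumptions made for this example and are not the team's attached code.

```python
import math

# Rate constants from the parameter table; alpha and s_used are hypothetical.
PARAMS = {
    "km": 0.00648, "dm": 0.00000648,
    "kp": 0.0064, "dp": 0.00548,
    "lam": 0.0001, "k": 0.0001,
    "alpha": 0.001,   # assumed shRNA decay rate
    "s_used": 0.1,    # assumed applied shRNA amount
}

def rnai_rhs(t, xm, xp, p):
    """Right-hand side of equations (8) and (9)."""
    s_total = p["s_used"] * math.exp(-p["alpha"] * t)  # decaying shRNA pool
    delta = p["k"] * p["lam"] * s_total * xm / (p["k"] + p["lam"] * xm)
    dxm = p["km"] - p["dm"] * xm - delta
    dxp = p["kp"] * xm - p["dp"] * xp
    return dxm, dxp

def simulate(p, xm0, xp0, t_end, dt):
    """Integrate with classical RK4; returns a list of (t, X_m, X_p)."""
    t, xm, xp = 0.0, xm0, xp0
    traj = [(t, xm, xp)]
    for _ in range(int(round(t_end / dt))):
        k1 = rnai_rhs(t, xm, xp, p)
        k2 = rnai_rhs(t + dt / 2, xm + dt / 2 * k1[0], xp + dt / 2 * k1[1], p)
        k3 = rnai_rhs(t + dt / 2, xm + dt / 2 * k2[0], xp + dt / 2 * k2[1], p)
        k4 = rnai_rhs(t + dt, xm + dt * k3[0], xp + dt * k3[1], p)
        xm += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        xp += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        traj.append((t, xm, xp))
    return traj

# Start from the shRNA-free steady state and apply shRNA at t = 0.
xm_start = PARAMS["km"] / PARAMS["dm"]
xp_start = PARAMS["kp"] * xm_start / PARAMS["dp"]
t_last, xm_last, xp_last = simulate(PARAMS, xm_start, xp_start, 1000.0, 1.0)[-1]
print(t_last, xm_last, xp_last)
```

Starting from the shRNA-free steady state makes the effect of the RNAi term easy to isolate: any drop in X_m and X_p is caused by δ alone.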

DDEs (Delay Differential Equations) model:

Eukaryotic mRNA transcription and translation take place in two different locations. Therefore, the time required for mRNA to be transported out of the nucleus and the time required for RNAi-induced mRNA degradation need to be considered. We introduce two delay times:

τ_{1} : time required for RNAi-induced mRNA degradation

τ_{2} : time required to transport mRNA from the nucleus to the cytoplasm

The system of equations then becomes:

\frac{d X_{m}}{d t} = k_{m} - d_{m} X_{m} (t) - \frac{k λ e^{- α t} [shRNA_{used}] X_{m} (t - τ_{1})}{k + λ X_{m} (t - τ_{1})}

(10)

\frac{d X_{p}}{d t} = k_{p} X_{m} (t - τ_{2}) - d_{p} X_{p}

(11)
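A minimal way to see how the two delays enter equations (10) and (11) is a fixed-step Euler scheme that looks up past values of X_m in its own stored trajectory. The sketch below assumes a constant history (X_m equal to its initial value for t ≤ 0), and the values of `tau1`, `tau2`, `alpha` and `s_used` are hypothetical, chosen only so that each delay spans a whole number of steps. A dedicated DDE solver would be preferable for real fits.

```python
import math

PARAMS = {
    "km": 0.00648, "dm": 0.00000648,
    "kp": 0.0064, "dp": 0.00548,
    "lam": 0.0001, "k": 0.0001,
    "alpha": 0.001, "s_used": 0.1,   # hypothetical values
    "tau1": 3.0, "tau2": 2.0,        # hypothetical delays, multiples of dt
}

def simulate_dde(p, xm0, xp0, t_end, dt):
    """Euler integration of equations (10) and (11) with constant history."""
    n1 = int(round(p["tau1"] / dt))   # tau1 expressed in steps
    n2 = int(round(p["tau2"] / dt))   # tau2 expressed in steps
    xm, xp = [xm0], [xp0]
    for i in range(int(round(t_end / dt))):
        t = i * dt
        # Delayed mRNA levels; before the delay has elapsed use the history.
        xm_d1 = xm[i - n1] if i >= n1 else xm0
        xm_d2 = xm[i - n2] if i >= n2 else xm0
        s_total = p["s_used"] * math.exp(-p["alpha"] * t)
        delta = p["k"] * p["lam"] * s_total * xm_d1 / (p["k"] + p["lam"] * xm_d1)
        xm.append(xm[i] + dt * (p["km"] - p["dm"] * xm[i] - delta))
        xp.append(xp[i] + dt * (p["kp"] * xm_d2 - p["dp"] * xp[i]))
    return xm, xp

xm_start = PARAMS["km"] / PARAMS["dm"]
xp_start = PARAMS["kp"] * xm_start / PARAMS["dp"]
xm_traj, xp_traj = simulate_dde(PARAMS, xm_start, xp_start, 100.0, 0.5)
print(xm_traj[-1], xp_traj[-1])
```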

Simulation results:

Figure 2. Simulation and experimental results

The predictions fit the experimental results well, which both supports the accuracy of the model and can further guide our experiments. The mean square error of the model was 13%; the parameters and the code are in the attachment.

DDE-based shRNA dosage optimization:

Cost is often a primary concern when applying RNA pesticides in agricultural production. If the applied RNA concentration is too high, pathogenic fungi may be eradicated effectively, but at too high a cost. If the concentration is too low, the cost falls, but the treatment no longer prevents or controls plant disease.

Define the cost function:

J ([shRNA_{total}]) = P_{shRNA} [shRNA_{total}] T + \int_{0}^{T} X_{p} (t) d t

P_{shRNA} : cost per unit of shRNA

T : time of observation

P (t) = X_{p} (t) : the current protein concentration

The first term is the cost of applying the shRNA (terminal indicator), and the second term is the integral of the protein concentration (process indicator).

Using the Hamiltonian function:

H (t) = P (t) + λ_{1} (k_{m} - d_{m} X_{m} - \frac{k λ e^{- α t} [shRNA_{total}] X_{m} (t - τ_{1})}{k + λ X_{m} (t - τ_{1})} - \frac{d X_{m} (t)}{d t}) + λ_{2} (k_{p} X_{m} (t - τ_{2}) - d_{p} X_{p} - \frac{d X_{p} (t)}{d t})

(13)

Note that, for the delayed terms, shifting the integration variable under the time integral gives:

λ (t) \dot{M} (t - τ) = λ (t + τ) \dot{M} (t)

The costate equation is as follows.

\frac{d λ_{1}}{d t} = - \frac{\partial H}{\partial X_{m}} = λ_{1} d_{m} + \frac{λ_{1} (t + τ_{1}) λ e^{- α t} k^{2} [shRNA_{total}]}{{(k + λ X_{m} (t - τ_{1}))}^{2}} - λ_{2} (t + τ_{2}) k_{p}

(14)

\frac{d λ_{2}}{d t} = - \frac{\partial H}{\partial X_{p}} = λ_{2} (t) d_{p} - 1

(15)

Setting boundary conditions:

λ_{1} (t) = 0, λ_{2} (t) = 0, t > T

The gradient of the cost function for [shRNA_total] is:

\frac{\partial J ([shRNA_{total}])}{\partial [shRNA_{total}]} = P_{shRNA} T - \int_{0}^{T} \frac{λ_{1} λ k e^{- α t} X_{m} (t - τ_{1})}{k + λ X_{m} (t - τ_{1})} d t

A MATLAB optimizer was used to optimize the amount of shRNA applied:

Figure 3. Comparison of effects before and after optimization (the cost per unit shRNA was set to 0.1)

The optimization results indicate that, under the set cost, the optimal shRNA application amount is 2.6 times the experimental amount. In actual production and application, the best application rate under different process and transportation costs can be considered further.
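The trade-off expressed by the cost function J can also be explored without the adjoint equations, by simulating the model for a list of candidate doses and keeping the cheapest one. The sketch below does exactly that. It is a plain grid search, not the gradient-based MATLAB optimization described above, it ignores the two delays for brevity, and all parameter values except the unit cost of 0.1 are hypothetical, dimensionless numbers chosen only to make the trade-off visible.

```python
import math

# Hypothetical, dimensionless parameters (not the fitted values).
P = {"km": 1.0, "dm": 0.1, "kp": 1.0, "dp": 0.1,
     "lam": 1.0, "k": 1.0, "alpha": 0.05}
PRICE = 0.1    # cost per unit shRNA, as in Figure 3
T_OBS = 50.0   # observation time T (assumed)

def cost(dose, dt=0.02):
    """J(dose) = PRICE * dose * T + integral of X_p over [0, T] (Euler, no delays)."""
    xm = P["km"] / P["dm"]          # start at the shRNA-free steady state
    xp = P["kp"] * xm / P["dp"]
    integral = 0.0
    for i in range(int(round(T_OBS / dt))):
        t = i * dt
        s_total = dose * math.exp(-P["alpha"] * t)
        delta = P["k"] * P["lam"] * s_total * xm / (P["k"] + P["lam"] * xm)
        integral += xp * dt                       # left Riemann sum
        dxm = P["km"] - P["dm"] * xm - delta
        dxp = P["kp"] * xm - P["dp"] * xp
        xm += dt * dxm
        xp += dt * dxp
    return PRICE * dose * T_OBS + integral

def best_dose(candidates):
    """Return the candidate dose with the lowest cost."""
    return min(candidates, key=cost)

GRID = [1.0 * i for i in range(41)]   # candidate doses 0, 1, ..., 40
optimum = best_dose(GRID)
print(optimum, cost(optimum), cost(0.0))
```

A grid search needs no derivative information, which makes it a convenient cross-check for a gradient-based result, at the price of one full simulation per candidate.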

Epidemiological models of plant mycoses

Background on Epidemiology:

Gray mold gets its name from the thick gray layer of conidia and hyphae produced on the diseased parts. It is an airborne, multi-cycle disease that can damage all plant organs. Infected plants initially show watery spots, then decay and soften, accompanied by collapse of the diseased parts. Under suitable temperature and humidity conditions, the diseased parts also form a layer of gray to brown powdery conidia, which can be spread again by air currents and rain, infecting healthy seedlings, stems, leaves, flowers and fruits and causing decay.

Incidence prediction model:

The incidence of plant fungal diseases is often seasonal. More precisely, it is the seasonal differences in temperature and humidity that change the likelihood and scale of fungal outbreaks. We developed a temperature- and humidity-dependent nonlinear regression model to predict the incidence of mycosis.

Decompose the infection rate into multiple multiplicative modules:

β = β_{0} \cdot T \cdot H + ϵ, T \in [0, 1], H \in [0, 1]

β_{0} : underlying infection rate

T : temperature influence module

H : humidity influence module

ϵ : correction coefficient

Temperature module

For the temperature module, a normal distribution curve is used to describe the influence (this is reasonable, because neither too high nor too low a temperature is suitable for fungal growth):

T = \frac{1}{\sqrt{2 π} δ} \exp (- \frac{(TEM - OptimumTEM)^{2}}{2 δ^{2}})

OptimumTEM: theoretical optimum temperature

TEM: observed temperature

δ: standard deviation of the temperature response (a fitted parameter)

According to the current study, the theoretical optimal temperature was set at 20℃.

Humidity module:

In general, high humidity favors the reproduction and spread of gray mold, while low humidity limits it. The influence of humidity on disease prevalence can therefore be represented by an S-shaped curve, and a logistic function is used for the H module:

H = 1 - \frac{1}{1 + \exp (\frac{HUM - OptimumHUM}{τ})}
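The two modules and their product can be written down directly. The sketch below uses the fitted values listed in the parameter table at the end of this page as defaults (β_0, δ, OptimumHUM, τ) and the optimum temperature of 20 stated above. The correction term `eps` defaults to zero and the function names are assumptions made for this example.

```python
import math

def temperature_module(tem, optimum_tem=20.0, delta=61.58):
    """Normal-curve temperature response T."""
    return (math.exp(-(tem - optimum_tem) ** 2 / (2 * delta ** 2))
            / (math.sqrt(2 * math.pi) * delta))

def humidity_module(hum, optimum_hum=12.0052, tau=1.6743):
    """Logistic humidity response H, rising from 0 to 1 with humidity."""
    return 1.0 - 1.0 / (1.0 + math.exp((hum - optimum_hum) / tau))

def infection_rate(tem, hum, beta0=27.6284, eps=0.0):
    """beta = beta0 * T * H + eps; eps = 0 is an assumption of this sketch."""
    return beta0 * temperature_module(tem) * humidity_module(hum) + eps

# The rate peaks at the optimum temperature and grows with humidity.
print(infection_rate(20.0, 20.0), infection_rate(30.0, 20.0), infection_rate(20.0, 5.0))
```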

The parameters are fitted with the fmincon() optimizer in MATLAB.

Define the root mean square error (RMSE):

RMSE = \sqrt{\frac{1}{N} \sum_{t = 1}^{N} {({observed}_{t} - {predicted}_{t})}^{2}}

The fitting problem can then be written as:

\begin{cases} \min RMSE (X) \\ X = {[β_{0}, δ, OptimumHUM, τ, k_{a}]}^{T} \\ X \in \mathbb{R}^{5} \end{cases}

The root mean square error of the model on the training set was about 2.5%. This indicates that the model predicts the incidence probability well and can help prevent fungal diseases in agricultural production. The training set, test set and parameters are shown in the attachment.
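The fitting objective can be sketched as a function that takes a parameter vector and returns the RMSE between observed and predicted incidence. The example below is only an illustration of that objective: the observations are synthetic values generated from the model itself, the correction term is set to zero, and k_{a} is left out because its role is not defined in the text. Any optimizer, fmincon included, would minimize a function of this shape.

```python
import math

def predict(tem, hum, beta0, delta, optimum_hum, tau, optimum_tem=20.0):
    """Incidence model beta0 * T * H with the correction term omitted."""
    t_mod = (math.exp(-(tem - optimum_tem) ** 2 / (2 * delta ** 2))
             / (math.sqrt(2 * math.pi) * delta))
    h_mod = 1.0 - 1.0 / (1.0 + math.exp((hum - optimum_hum) / tau))
    return beta0 * t_mod * h_mod

def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def objective(x, conditions, observed):
    """RMSE as a function of x = [beta0, delta, optimum_hum, tau]."""
    predicted = [predict(tem, hum, *x) for tem, hum in conditions]
    return rmse(observed, predicted)

# Synthetic (temperature, humidity) conditions and matching "observations".
TRUE_X = [27.6284, 61.58, 12.0052, 1.6743]
CONDITIONS = [(10.0, 8.0), (15.0, 12.0), (20.0, 16.0), (25.0, 20.0)]
OBSERVED = [predict(tem, hum, *TRUE_X) for tem, hum in CONDITIONS]

print(objective(TRUE_X, CONDITIONS, OBSERVED))
print(objective([20.0, 61.58, 12.0052, 1.6743], CONDITIONS, OBSERVED))
```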

Field simulation with a spore cellular automaton:

Once the timing of pesticide application is determined, the key remaining variable is the range of application. If the range is too large, it not only increases the cost but also affects the local soil and raises biosecurity issues. If the range is too small, the expected disease prevention and treatment effect is not achieved.

To determine the extent of pesticide application, we need to use models to simulate the spread of plant mycoses in real fields in relation to the specific location of the plant. According to the epidemiological model of general infectious diseases, in a field, the plants are divided into four types:

S: Susceptible plant

E: Exposed plant

I: Infected plant

R: The plant is recovered or removed

Assumption 1: Since gray mold is a fungal disease with both spore-borne and soil-borne modes of transmission, and spores travel over a large range, there are no susceptible plants S, only exposed plants E.

Assumption 2: All plants are physiologically identical at any given time, so the infection rate depends only on the spore concentration of Botrytis cinerea.

Assumption 3: The spores of gray mold are released uniformly in all directions and uniformly over time.

General infectious disease models do not take the relative location of individuals into account. For plant diseases, however, location is a very important consideration, because plants do not move as animals do, and it also determines the range over which we apply pesticide. If this positional relationship can be described effectively, both the amount of pesticide used and soil pesticide residues can be reduced. We propose a plant epidemic cellular automaton model to solve this problem.

Set the rules of the cellular automaton:

Infected cells release spores to the surrounding cells, and the spore concentration depends on the distance from the infected cell. Infected cells die after a survival time D and stop releasing spores. The spore concentration in each cell is the sum of the concentrations released by all infected cells. New cells are infected every day; infection is a random event whose probability depends on the local spore concentration.

Spore transmission function:

Assume that in every small time interval δt an infected cell produces spores at a constant concentration α, and that the concentration gradually decreases as the spores spread outward. If the limit on air propagation speed is not taken into account, the spore concentration is a function of distance only:

S (x) = \frac{α}{x}

Where x is the Euclidean distance with respect to the infected plant.

Infection probability function:

Since the probability of plant infection lies in (0, 1) for any spore concentration, the spore concentration must be mapped to the interval (0, 1). We use a sigmoid curve:

β (S) = 1 - \frac{1}{1 + \exp (\frac{\sum_{n} S - OptimumS}{τ})}

\sum_{n} S : the sum of spore concentrations at the current cell position
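The rules above translate almost line by line into a small simulation. The sketch below runs them on a square grid with one initially infected plant in the center. The values of α, OptimumS and τ are the cellular-automaton entries of the parameter table; the grid size, the one-day time step, the survival time of 12 days and all function names are assumptions made for this example, not the team's attached code.

```python
import math
import random

EXPOSED, INFECTED, REMOVED = 0, 1, 2

def infection_probability(total_spores, optimum_s=11.61, tau=1.25):
    """Sigmoid map from summed spore concentration to a probability in (0, 1)."""
    z = (total_spores - optimum_s) / tau
    return 1.0 / (1.0 + math.exp(-z))   # algebraically equal to 1 - 1/(1 + exp(z))

def step(grid, age, rng, alpha=6.875, lifetime=12):
    """Advance the field by one day; `age` counts days since infection."""
    rows, cols = len(grid), len(grid[0])
    sources = [(r, c) for r in range(rows) for c in range(cols)
               if grid[r][c] == INFECTED]
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != EXPOSED:
                continue
            # Superimpose S(x) = alpha / x over all currently infected cells.
            spores = sum(alpha / math.hypot(r - sr, c - sc) for sr, sc in sources)
            if rng.random() < infection_probability(spores):
                new_grid[r][c] = INFECTED
    for sr, sc in sources:
        age[sr][sc] += 1
        if age[sr][sc] >= lifetime:      # infected cells die after `lifetime` days
            new_grid[sr][sc] = REMOVED
    return new_grid

def run(size=9, days=12, seed=0):
    """Simulate `days` days starting from a single infected center plant."""
    rng = random.Random(seed)
    grid = [[EXPOSED] * size for _ in range(size)]
    age = [[0] * size for _ in range(size)]
    grid[size // 2][size // 2] = INFECTED
    for _ in range(days):
        grid = step(grid, age, rng)
    return grid

final = run()
print(sum(cell != EXPOSED for row in final for cell in row), "cells infected or removed")
```

All infections for a given day are decided from the grid as it stood at the start of that day, so the order in which cells are visited does not affect the outcome.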

The parameters were estimated using the pattern search algorithm in MATLAB.

The root mean square error of the model was approximately 11%. The parameters and the model code are in the attachment.

Figure 4. Model predictions and experimental results

Calculation of pesticide application radius:

Consider the case where one plant is found to be diseased: within what radius should the neighboring plants be sprayed with pesticide? Under this model, the probability P that a cell at distance x has become infected within a given number of days is:

P (x, days) = 1 - (\frac{1}{1 + \exp (\frac{\frac{α}{x} - OptimumS}{τ})})^{days}

Figure 5. 3D image of probability of morbidity -distance -days of onset

Based on this model, it is only necessary to choose an agriculturally acceptable incidence and onset time to obtain the radius of pesticide application.
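Because P(x, days) decreases monotonically with distance, the radius for a chosen acceptable incidence can be obtained in closed form by inverting the expression for P. The sketch below shows that inversion under the single-source model, using the α, OptimumS and τ values from the parameter table. The function names and the example threshold (5% within 7 days) are assumptions made for this illustration.

```python
import math

ALPHA, OPTIMUM_S, TAU = 6.875, 11.61, 1.25   # cellular-automaton parameters

def incidence_probability(x, days):
    """P(x, days): probability that a cell at distance x is infected within `days`."""
    z = (ALPHA / x - OPTIMUM_S) / TAU
    daily = 1.0 / (1.0 + math.exp(-z))        # daily infection probability
    return 1.0 - (1.0 - daily) ** days

def application_radius(max_incidence, days):
    """Smallest distance at which P(x, days) <= max_incidence.

    Returns math.inf when even very distant plants exceed the threshold.
    """
    q = (1.0 - max_incidence) ** (1.0 / days)  # required daily escape probability
    z = math.log(1.0 / q - 1.0)                # invert the sigmoid
    denom = OPTIMUM_S + TAU * z                # required spore level alpha / x
    if denom <= 0.0:
        return math.inf
    return ALPHA / denom

radius = application_radius(0.05, 7)
print(radius, incidence_probability(radius, 7))
```

Plants closer than the returned radius have an incidence above the threshold and would be sprayed; plants farther away stay below it.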

Dispersal model in soil after biocontrol application

For the same purpose, we developed a guidance model for the application of biocontrol bacteria.

Assumption 1: Biocontrol strains will not die due to suicide genes before spreading and colonizing on plant roots.

Assumption 2: The individual biocontrol bacteria moved randomly in the diffusion process, and there was no bias in a certain direction. In fact, the movement of bacteria in soil is affected by many factors.

Consider the case in two dimensions. Assuming that the biocontrol bacteria move a distance d in each time interval δt, the number of moves in time T is

N = \frac{T}{δ t}

The direction of each move is random along one of two mutually perpendicular axes. Let X_{i} and Y_{i} (i = 1, 2) be the numbers of moves in the positive and negative directions along axis i, so that:

X_{i}, Y_{i} \sim Poisson (\frac{N}{4})

When N is large enough:

Poisson (\frac{N}{4}) \sim \mathcal{N} (\frac{N}{4}, \frac{N}{4})

Define:

Z_{i} = d (X_{i} - Y_{i})

Since the variances of X_{i} and Y_{i} add:

Z_{i} \sim \mathcal{N} (0, \frac{d^{2} N}{2})

Then, after diffusing for time T, the average distance of biocontrol bacteria from the application origin is:

⟨ R ⟩ = ⟨ \sqrt{Z_{1}^{2} + Z_{2}^{2}} ⟩ = d \sqrt{\frac{N}{2}} ⟨ \sqrt{\frac{Z_{1}^{2} + Z_{2}^{2}}{d^{2} N / 2}} ⟩ = d \sqrt{\frac{N}{2}} \cdot \sqrt{2} Γ (\frac{3}{2}) = d \sqrt{N} Γ (\frac{3}{2}) = d \sqrt{\frac{T}{δ t}} Γ (\frac{3}{2})

Once the time T and step length d are determined, we can obtain the approximate diffusion range of the biocontrol bacteria, which can guide the application of our beads to save cost and improve efficiency.
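The random-walk picture can be checked numerically with a small Monte Carlo simulation. The sketch below moves each walker a distance d per step in one of four equally likely directions and averages the final distance from the origin. For such a walk the variance along each axis is N d²/2, so for large N the mean distance approaches d √N Γ(3/2), roughly 0.886 d √N. The step count, walker count and seed are arbitrary choices made for this example.

```python
import math
import random

def mean_distance(n_steps, d, n_walkers, seed=0):
    """Monte Carlo estimate of the mean distance from the origin after n_steps."""
    rng = random.Random(seed)
    moves = [(d, 0.0), (-d, 0.0), (0.0, d), (0.0, -d)]   # four equally likely directions
    total = 0.0
    for _ in range(n_walkers):
        x = y = 0.0
        for _ in range(n_steps):
            dx, dy = rng.choice(moves)
            x += dx
            y += dy
        total += math.hypot(x, y)
    return total / n_walkers

def rayleigh_limit(n_steps, d):
    """Large-N mean distance for the four-direction walk: d * sqrt(N) * Gamma(3/2)."""
    return d * math.sqrt(n_steps) * math.gamma(1.5)

simulated = mean_distance(n_steps=400, d=1.0, n_walkers=2000)
print(simulated, rayleigh_limit(400, 1.0))
```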

The parameters in models:

RNAi-DDEs:

Parameters	Value
$k_{m}$	0.00648
$d_{m}$	0.00000648
$k_{p}$	0.0064
$d_{p}$	0.00548
$λ$	0.0001
$k$	0.0001
$τ_{1}$	3.51e-05(estimate)
$τ_{2}$	2.34e-05(estimate)
$S$	0.1(estimate)

The parameters in the model (except those estimated) were derived from previous teams and papers.

Epidemiological models:

Parameters	Value
$β_{0}$	27.6284
$δ$	61.58
$OptimumHUM$	12.0052
$τ$	1.6743
$k_{a}$	9.0711
$RMSE (Training set)$	6.923%
$RMSE (test set)$	2.5125%
$OptimumS$	11.61
$τ (Cellular automata)$	1.25
$α$	6.875
$T$	12
$RMSE(Cellular automata)$	11.35%

The parameters of the model are all derived from estimates.

All the model application code has been uploaded to GitLab.

