Skip to main content

Simulation (Modeling)

info

This page is mainly talking about how we simulate the data from wet lab. If you're interested in the way we find the genes and weights for our models using Pyhon & Jupyter Notebook, please check πŸ”—Software for more details.

Kinetic Simulation of Molecular Computational Processes​

Modeling Purpose and Ideas​

Establishing a molecular kinetic model holds immense significance:

  1. Validating the rationality and feasibility of wet experiments.

  2. Optimizing the protocol for wet experiments.

  3. Conserving human and material resources.

  4. Investigating reaction mechanisms and offering guidance for their future analysis and design.

To validate the feasibility of our designed reaction system, we developed a chemical reaction dynamics model to investigate its reaction kinetics and mechanism. In our detection system, the central reaction in the wet experiment is the polymerase-mediated strand displacement reaction. Thus, during the modeling process, we initially adjusted the rate constants associated with each reaction process based on various experimental data. By comparing this curve with the experimental data, we identified the most appropriate parameters and derived a simulation curve that closely aligns with the experimental results.

Modeling Proceduresstrand​

Chemical Kinetics Modeling​

We simplified the entire DNA reaction process and employed chemical reaction kinetics equations, drawing parallels with the steady-state approximation method, to describe each reaction process. Adhering to the principle of substance concentration changes, we formulated concentration change equations for all reactants, intermediates, and products within the reaction system. These equations were further utilized to derive the necessary system of differential equations for subsequent modeling.

We condensed the entire DNA strand displacement process into a corresponding group of chemical reaction equations. As an illustration, considering the molecular calculation process of the target with a weight of 3, the specific set of differential equations is presented below:

The first step is the binding of Target (weight=3, A) and probe (BD).

A+BD→k1ABDA + BD\overset{k_{1}}{\rightarrow}ABD

img

The second step is the binding of a large fragment of Bst DNA polymerase (C) to the substrate (a complex of Target and probe, ABD)

ABD+C→k2ABCDABD + C\overset{k_{2}}{\rightarrow}ABCD

img

The third step involves a strand displacement reaction catalyzed by Bst DNA polymerase, wherein it utilizes the long strand of the Probe as a template for synthesizing the daughter strand while displacing the weighted strand (D).

ABCD→k3ABC+3DABCD\overset{k_{3}}{\rightarrow}ABC + 3D

img

The fourth step is a toe-mediated strand displacement reaction, in which the weighted strand (D) displaced in the previous step reacts with a fluorescent probe (FQ), leading to the release of the fluorescent strand (F) and emission of fluorescence.

D+FQ→k4F+DQD + FQ\overset{k_{4}}{\rightarrow}F + DQ

img

In summary, the reaction equation and its corresponding rate equation are shown below:

Reaction Equation​

  1. A+BD→k1ABDA + BD\overset{k_{1}}{\rightarrow}ABD

  2. ABD+C→k2ABCDABD + C\overset{k_{2}}{\rightarrow}ABCD

  3. ABCD→k3ABC+3DABCD\overset{k_{3}}{\rightarrow}ABC + 3D

  4. D+FQ→k4F+DQD + FQ\overset{k_{4}}{\rightarrow}F + DQ

Differential Equation​

d[A]dt=βˆ’k1[A][BD]d[BD]dt=βˆ’k1[A][BD]d[ABD]dt=k1[A][BD]βˆ’k2[ABD][C]d[C]dt=βˆ’k2[ABD][C]d[ABCD]dt=k2[ABD][C]βˆ’k3[ABCD]d[ABC]dt=k3[ABCD]d[D]dt=k3[ABCD]β‹…3βˆ’k4[D][FQ]d[FQ]dt=βˆ’k4[D][FQ]d[F]dt=k4[D][FQ]d[DQ]dt=k4[D][FQ]\begin{align*} \frac{d[A]}{dt} &= -k1[A][BD] \\ \frac{d[BD]}{dt} &= -k1[A][BD] \\ \frac{d[ABD]}{dt} &= k1[A][BD] - k2[ABD][C] \\ \frac{d[C]}{dt} &= -k2[ABD][C] \\ \frac{d[ABCD]}{dt} &= k2[ABD][C] - k3[ABCD] \\ \frac{d[ABC]}{dt} &= k3[ABCD] \\ \frac{d[D]}{dt} &= k3[ABCD] \cdot 3 - k4[D][FQ] \\ \frac{d[FQ]}{dt} &= -k4[D][FQ] \\ \frac{d[F]}{dt} &= k4[D][FQ] \\ \frac{d[DQ]}{dt} &= k4[D][FQ] \end{align*}

Experimental Data Processing​

In our reaction system, DNA molecules were modified with fluorescent dyes, and their fluorescence intensity (FI) exhibited a positive correlation with the concentration of the DNA molecules. We established the positive relationship between FI and DNA concentration and derived the corresponding linear equation through fitting: y=ffmax⁑×nmax⁑y = \frac{f}{f_{\max}} \times n_{\max}, Where f represents the value of the fluorescence curve corresponding to a gene of a certain weight; fmax⁑f_{\max} represents the maximum fluorescence value of FI in the image; and nmax⁑n_{\max} represents the highest molar concentration value in the reaction. Based on this equation, we can convert the FI data to the molar concentration of DNA molecules. For example, take the following chart as an example, the molar concentration of the DNA molecule (weights=3) could be calculated using the equation fΓ·50Γ—30.

The graph on the left shows the fluorescence curve and the graph on the right shows the molar concentration curve. To facilitate comparison with the fitted curves, we also subtracted the values of the control group when performing the transformation.

imgimg

Solving for parameter values in chemical reaction equations​

The rate constants k1​ for target-probe binding, k2 for enzyme-substrate binding, and k4 for TMSD are determined empirically or fitted to experimental values. Typically, the rate constants for TMSD fall within the range of 0.001-0.009(nMβˆ’1sβˆ’1nM^{βˆ’1}s^{βˆ’1}). The rate equations for the polymerase-mediated strand substitution reaction must be fitted based on the experimental curves. The Ode45 function can be employed to solve the differential equations and plot the solution curves. All conditions, including the rate constants of the reactions and the concentrations of each reactant, intermediate, and product, are known, except for the rate constant of the polymerase-mediated strand substitution reaction, denoted as krealk_{real} which is unknown and needs to be determined through fitting.

MATLAB was employed to read the experimental data of molar concentration over time. The fminsearch function was utilized to minimize the difference between the solutions of the differential equations solved with different krealk_{real} values and the experimental curves. Furthermore, the fminsearch function was utilized to output the optimal value of krealk_{real} when the difference between the theoretical and experimental curves is minimized, representing the rate constant of the polymerase-mediated strand substitution reaction.

By repeating the above process, the rate constants of the reactions corresponding to the targets with different weights can be obtained, and the corresponding reaction curves can be plotted.

img

img

img

Solving for the kreal value that fits the target relatively well for all weights​

The krealk_{real} values obtained from the experimental curves are input into the ode45 function to generate the corresponding plots. By comparing the disparities between these simulation curves and the experimental data, the krealk_{real} value that best aligns with the final rate constant for the polymerase-mediated strand substitution reaction can be selected. The first step involved aligning the experimental curves with the curves obtained from the ode45 function, followed by calculating the differences between them. The MATLAB interpolation function interp1 was employed to align the two curves. Subsequently, the squares of the differences between the values of the experimental curve and the curve derived from the differential equation were calculated for corresponding time points. Finally, the sum of the squares of these differences for each time point was calculated.

The specific modeling steps are shown below:

First, we need to interpolate the time t of the experimental data and the molar concentration value (y1) to the time t of the solution of the differential equation to obtain the interpolated y1_interp.

% interpolate
[t_ode,y_ode] = ode45(@diss, t, y0);
y1_interp = interp1(t, y1, t_ode);

We can then directly compute the difference between y1_interp and the solution of the differential equation, and subsequently we square and sum this value to get the final difference.

% calculation of interpolation
difference = (y_ode(:, 10) - y1_interp).^2;
dd = sum(difference(:));
% output difference results
disp(['diff: ', num2str(dd)]);

A smaller final difference indicates a closer alignment between the two curves. During the interpolation process, there might be instances where the experimental data's timet does not perfectly match that of the solution from the differential equation, leading to NaN (Not a Number) values. To address this issue, the isnan function is employed to identify these NaN values. Subsequently, the interp1 function is used for interpolation, filling in these NaN values. This approach effectively handles NaN values, ensuring accurate calculations for subsequent analyses.

% processing of NaN values
nan_index = isnan(y1_interp);
y1_interp(nan_index) = interp1(t(~nan_index), y1_interp(~nan_index), t_ode(nan_index));

Following the aforementioned steps, the finalized krealk_{real} value is 5.7072 (nMβˆ’1sβˆ’1nM^{βˆ’1}s^{βˆ’1}). Taking Weight=3 as an example, the simulated curve closely matches the experimental curve (fitted curve in blue, experimental curve in orange).

img

Plotting of Polygenic Curves​

From the calculations in the previous step, a suitable rate constant for the polymerase-mediated strand displacement reaction can be derived for each weight of the target gene. By incorporating these values into the ode45 function, which solves the differential equation for each weighted target gene, reaction rate curves can be plotted for different weights. To display all rate curves in the same image, the "hold on" code can be used

% plot
t = data05(:,1)
y1 = data05(:,2)
plot(t,y1)

img