Math
Our mathematical model describes the decomposition of curcumin under light (the change in content over time) and the dependence of curcumin concentration on external conditions, with the aim of providing data to support further experimental improvement.
Curcumin decomposition data
Curcumin decomposition data processing based on machine learning
To predict the decay of curcumin, we built a prediction model from the experimental decomposition data we obtained. We first plotted the data in Python to observe the approximate trend. Replicate measurements taken at the same time point should fall within a certain confidence interval; values outside it are treated as invalid. Because the true value can lie anywhere within this interval, the underlying relationship is difficult to read directly from the raw data. We therefore used machine learning (polynomial regression) to model it. We set y as the concentration of curcumin and x as the decomposition time. The model we built is:
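Based on the polynomial fit in the appendix code (`PolynomialFeatures` followed by `LinearRegression`), a degree-n model of this kind can be written as below. The symbols w_0 (intercept) and w_i (coefficients) are notation assumed here for illustration:

```latex
\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i \, x^{i}
```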
The coefficient can be expressed as:
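For an ordinary least-squares fit such as the one in the appendix, the coefficient vector has the standard closed form below, assuming X^T X is invertible. The notation is an assumption made here: X is the design matrix whose j-th row is (1, x_j, x_j^2, ..., x_j^n), y is the vector of measured concentrations, and w = (w_0, ..., w_n) collects the intercept and coefficients:

```latex
\mathbf{w} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{y}
```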
Finally, we calculate the mean squared error (MSE):
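The `MSE` function in the appendix divides the sum of squared residuals by the number of samples, which corresponds to the expression below. Here m is the number of measurements (30 in the appendix data), y_j is the measured concentration, and \hat{y}_j is the model prediction at the same time point; these symbol names are assumed for illustration:

```latex
\mathrm{MSE} = \frac{1}{m} \sum_{j=1}^{m} \left( \hat{y}_j - y_j \right)^{2}
```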
From the above results, it can be seen that the mean squared error of the model decreases as n (the degree of the polynomial) increases and reaches its minimum at n=5. However, when the degree increases from 4 to 5, the decrease in MSE is not significant. Therefore, to limit the oscillations associated with Runge's phenomenon in high-degree polynomial fits, we selected the result with n=4. The final prediction outcome is shown below:
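The degree comparison described above can be reproduced with a short script. This is a minimal sketch, not the exact procedure used for the reported results: it uses the measurement data from the appendix, but fits with NumPy's `Polynomial.fit` rather than scikit-learn, and the helper name `training_mse` is introduced here for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Measurement data copied from the appendix: 5 replicates at each of 6 time points.
x = np.repeat([0, 20, 40, 60, 80, 100], 5).astype(float)
y = np.array([
    1.051, 1.047, 1.039, 1.042, 1.045,
    0.752, 0.775, 0.740, 0.780, 0.765,
    0.544, 0.582, 0.543, 0.582, 0.574,
    0.447, 0.480, 0.456, 0.489, 0.471,
    0.407, 0.417, 0.410, 0.441, 0.442,
    0.375, 0.379, 0.383, 0.410, 0.383,
])


def training_mse(degree):
    """Fit a least-squares polynomial of the given degree; return its training MSE."""
    # Polynomial.fit rescales x internally, which keeps high powers well conditioned.
    poly = Polynomial.fit(x, y, deg=degree)
    residuals = poly(x) - y
    return float(np.mean(residuals ** 2))


# Training MSE for polynomial degrees 1 through 5.
mse_by_degree = {n: training_mse(n) for n in range(1, 6)}
for n, mse in mse_by_degree.items():
    print(f"degree {n}: MSE = {mse:.6f}")
```

Because the data contain only six distinct time points, a degree-5 polynomial passes through the mean of the replicates at every time point, so its training MSE is the lowest any polynomial can reach. Training MSE alone therefore always favors the highest degree, which is why the size of the improvement from n=4 to n=5 matters more than the minimum itself.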
that is:
ODE based on curcumin yield analysis
To determine the relationship between the concentration of ferulic acid added to the medium of the engineered bacteria and the concentration of curcumin finally produced, we established an ODE model to analyze how the two change together. We used the MATLAB SimBiology Model Builder to construct a mathematical pathway based on the metabolic pathway map. First, we defined two compartments: the environment in which the bacterial strain lives and the bacterial strain itself. The variable is the bacterial strain, and the unit was set to milliliters for convenience. Next, we defined the properties of the small molecules in the metabolic pathway. The concentration of ferulic acid is determined by the external environment (that is, it is set directly by the experimenter), in units of mmol/mL. We then defined each reaction in the metabolic pathway by whether it is reversible, by a rate law following mass action kinetics, and by its rate constant K. Finally, we used the initial assignment method to define the algebraic equations, added the overall constraints where appropriate, refined the metabolic pathway map, verified the reliability of the model, and output the ODE model.
The following is a demonstration of the results:
The reaction rate constants used in the model can be determined by consulting the literature. Given the concentration of ferulic acid, the model successfully provides the change in the final curcumin concentration, and the specific concentration of curcumin can be obtained by integrating the ODEs.
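The mass-action construction described in this section can be illustrated with a small stand-alone sketch. It is not the SimBiology model itself: the two-step pathway (ferulic acid to an intermediate to curcumin), the rate constants `K1` and `K2`, the time unit, and all function names are hypothetical placeholders chosen only to show how a given ferulic acid concentration is integrated to a curcumin concentration.

```python
# Hypothetical irreversible pathway with mass-action kinetics:
#   ferulic acid (F) -> intermediate (I) -> curcumin (C)
K1 = 0.5  # assumed rate constant for F -> I, per hour (placeholder value)
K2 = 0.2  # assumed rate constant for I -> C, per hour (placeholder value)


def rates(state):
    """Right-hand side of the ODE system: returns (dF/dt, dI/dt, dC/dt)."""
    f, i, c = state
    v1 = K1 * f  # mass-action rate of F -> I
    v2 = K2 * i  # mass-action rate of I -> C
    return (-v1, v1 - v2, v2)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = rates(state)
    k2 = rates(shifted(state, k1, dt / 2))
    k3 = rates(shifted(state, k2, dt / 2))
    k4 = rates(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(f0, t_end=10.0, dt=0.01):
    """Integrate from t=0 to t_end, starting with ferulic acid f0 (mmol/mL) only."""
    state = (f0, 0.0, 0.0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
    return state


# Scan the ferulic acid input and report the resulting curcumin concentration.
for f0 in (0.5, 1.0, 2.0):
    f, i, c = simulate(f0)
    print(f"ferulic acid {f0:.1f} mmol/mL -> curcumin {c:.4f} mmol/mL after 10 h")
```

In this simplified linear system the curcumin concentration scales proportionally with the ferulic acid input; a real pathway with reversible or multi-substrate reactions would generally not have this property, and its rate constants would come from the literature.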
Appendix
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Decomposition time (x) and measured curcumin concentration (y), five replicates per time point
x = np.array([0,0,0,0,0,20,20,20,20,20,40,40,40,40,40,60,60,60,60,60,80,80,80,80,80,100,100,100,100,100])
y = np.array([1.051,1.047,1.039,1.042,1.045,0.752,0.775,0.740,0.780,0.765,0.544,0.582,0.543,0.582,0.574,
              0.447,0.480,0.456,0.489,0.471,0.407,0.417,0.410,0.441,0.442,0.375,0.379,0.383,0.410,0.383])

# Scatter plot of the raw measurements
plt.plot(x, y, 'b.')
plt.xlabel('time')
plt.ylabel('concentration')
plt.axis([0, 100, 0.3, 1.2])
plt.show()

# Expand x into the polynomial features x, x^2, x^3, x^4 and fit a linear regression on them
poly_features = PolynomialFeatures(degree=4, include_bias=False)
x_poly = poly_features.fit_transform(x.reshape(-1, 1))
# print(x_poly)
lin_reg = LinearRegression()
lin_reg.fit(x_poly, y)
print(lin_reg.coef_)
print(lin_reg.intercept_)

# Predictions on a dense grid (for the fitted curve) and at the measured time points (for the MSE)
x_new = np.linspace(0, 100, 100).reshape(100, 1)
x_new_poly = poly_features.transform(x_new)
y_new = lin_reg.predict(x_new_poly)
y_predict = lin_reg.predict(x_poly)

# Measurements together with the fitted curve
plt.plot(x, y, 'b.')
plt.plot(x_new, y_new, 'r--', label='prediction')
plt.axis([0, 100, 0.3, 1.2])
plt.legend()
plt.show()

# 5-fold cross-validation; for a regressor the default score is R^2
print(cross_val_score(lin_reg, x_poly, y, cv=5))

def MSE(y, y_predicted):
    """Mean squared error between measured and predicted values."""
    sq_error = (y_predicted - y) ** 2
    sum_sq_error = np.sum(sq_error)
    mse = sum_sq_error / y.size
    return mse

print(MSE(y, y_predict))