# Model

## Overview

Three saffron-derived compounds (crocetin, crocin, and picrocrocin) are reported to act as antidepressants. However, these compounds have so far been obtained only by extraction from saffron flowers, and the supply is unstable because of variation in cultivation environment and genetic factors. We therefore used modeling to understand their antidepressant effects at the molecular scale and their production in *E. coli*.

## Our approach

We modeled two main aspects: docking simulation and material production.

### Docking Simulation

#### Introduction

Docking simulation predicts the interaction between a protein and a ligand from their 3D structures. Here, we used it to predict, at the molecular scale, how the saffron compounds (crocetin, crocin, and picrocrocin) inhibit the reuptake of norepinephrine (NE).

#### Material and Method

First, we obtained the structure of the target protein, the norepinephrine transporter (NET, UniProt ID: P23975), from AlphaFold DB [1] in PDB format. The structures of crocin, crocetin, and picrocrocin, the ligands expected to inhibit NE reuptake, were obtained from Chem DB. For crocin, the type I structure was used because it was the only one of types I to V available in the database.

We first predicted candidate docking sites in NET from these open data using the FPOCKET WEB SERVER, which screens for ligand docking sites by retaining only the alpha spheres located in tightly packed atomic zones and discarding the rest. The candidate sites were then narrowed down to the vicinity of residues F72, D75, A145, and Y152, based on a previous study of NET-NE interactions [2], and docking simulations were performed using AutoDock Vina. AutoDock Vina scores the binding energy between a ligand and a protein and reports the lowest-energy poses as candidate binding conformations. The search grid was centered at (X, Y, Z) = (1, -3, 0) with a size of (X, Y, Z) = (15, 15, 15).
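
The search box described above can be written down as an AutoDock Vina configuration file. The following is a minimal sketch rather than a record of the actual run: the receptor and ligand file names are hypothetical placeholders, and `num_modes = 5` is an assumption chosen to match the five conformations reported in the Results.

```python
def vina_config(receptor, ligand, center, size, num_modes=5):
    """Build the text of an AutoDock Vina config file for one docking run."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    # Center of the search box
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:g}")
    # Edge lengths of the search box
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:g}")
    # Number of binding modes to report (assumed value, see lead-in)
    lines.append(f"num_modes = {num_modes}")
    return "\n".join(lines) + "\n"


# Hypothetical file names: Vina reads PDBQT files, so the PDB structures
# would first have to be converted.
config_text = vina_config(
    receptor="NET_P23975.pdbqt",
    ligand="crocetin.pdbqt",
    center=(1, -3, 0),
    size=(15, 15, 15),
)
print(config_text)
```

Saved to a file, this text can be passed to Vina through its `--config` option; swapping the ligand file name gives the corresponding runs for crocin and picrocrocin with the same grid.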

#### Results

The results are presented in the order crocetin, crocin, and picrocrocin. Crocetin is an apocarotenoid dicarboxylic acid found in saffron and is the precursor that is glycosylated to form crocin [Fig. 1-A][3]. To assess the possibility of a crocetin-NET interaction, we first calculated the similarity in 3D structure between crocetin and NE and found it to be low, with an RMSD of 4.2360 Å. We then enclosed the vicinity of the NE binding site of NET in the AutoDock Vina docking grid and obtained the five best-scoring conformations. Among them, the conformation in Fig. 2-C lies closest to the binding-site residues, with a distance of 2.6 Å to A145. Although this value is just under 2.72 Å, and therefore within hydrogen-bonding range, the distances to the other binding-site residues were large, suggesting that crocetin may not bind to NET [Fig. 2].
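
The RMSD used above as a similarity measure is the square root of the mean squared distance between paired atoms. The sketch below assumes that the two structures have already been superimposed and that their atoms have been paired one-to-one; how the atoms of a saffron compound and NE were paired is not specified here, and the coordinates are made-up illustration values, not data from this analysis.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates in angstroms, for illustration only
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 3.0, y + 4.0, z) for x, y, z in reference]

rmsd_same = rmsd(reference, reference)   # identical structures
rmsd_shifted = rmsd(reference, shifted)  # every atom displaced by 5 angstroms
print(rmsd_same, rmsd_shifted)
```

A smaller RMSD means the paired atoms lie closer together, which is why the value for picrocrocin is read as higher similarity to NE than the value for crocetin.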

**Fig. 1) Two-dimensional structures of the three saffron-derived compounds**

**A is crocetin, B is crocin and C is picrocrocin. The crocin used in this analysis is type I, so B is the structure of type I crocin.**

**Fig. 2) Results of the crocetin-NET docking simulation by AutoDock Vina.**

**The five most likely binding conformations predicted by AutoDock Vina are shown. Cyan is NET, green is crocetin, and red is the NE binding site. The score is the predicted binding free energy (kcal/mol); the more negative the value, the stronger the predicted binding. As a positive control, the score between NE and NET is -6.1. Note that a better score does not necessarily indicate the correct binding conformation.**

As described above, crocin is obtained by glycosylating crocetin, and there are five types of crocin structure (I-V) depending on the mode of glycosylation [Fig. 1-B][3]. However, since Chem DB contained a 3D structure only for type I crocin, type I crocin was used for the docking simulation in this analysis. Because the 3D structure of crocin is expected to resemble that of crocetin, the similarity analysis was not repeated for crocin. Surprisingly, the docking results suggested that the glucose moieties at both ends of crocin can bind at the NE binding site [Fig. 3]. The best Vina score was -5.1, indicating that crocin, like crocetin, tends to have a weak predicted binding affinity.

**Fig. 3) Results of the crocin-NET docking simulation by AutoDock Vina.**

**The five most likely binding conformations predicted by AutoDock Vina are shown. Cyan is NET, green is crocin, and red is the NE binding site. The score is the predicted binding free energy; the more negative the value, the stronger the predicted binding.**

Picrocrocin is a saffron compound that partially differs from crocetin and crocin in its modification process. It is derived from the ring end groups of zeaxanthin, an intermediate common to all saffron compounds, and consists of a safranal-like moiety with a glucose molecule attached [Fig. 1-C]. Picrocrocin has a 3D structure highly similar to that of NE, with an RMSD of 1.8303 Å. As with the other compounds, we performed docking simulations for picrocrocin using AutoDock Vina. In three of the five conformations, the glucose residue is bound to the NE binding site, as with crocin, but in the remaining two the safranal residue may be bound to the NE binding site [Fig. 4]. Specifically, the safranal residue of picrocrocin may interact with the aromatic rings of Y152 and F72 at the NE binding site via π-π stacking [Fig. 4][4].

**Fig. 4) Results of the picrocrocin-NET docking simulation by AutoDock Vina.**

**The five most likely binding conformations predicted by AutoDock Vina are shown. Cyan is NET, green is picrocrocin, and red is the NE binding site. The score is the predicted binding free energy; the more negative the value, the stronger the predicted binding.**

#### Engineering Cycle

The predicted binding conformations have not yet been examined experimentally, for example by cryo-EM. Further clarification of the molecular mechanism of action is expected to accelerate neuroscientific research into the causes of depression and the development of effective therapeutics.

#### Discussion

These results suggest that crocetin by itself is unlikely to interact with the NE binding site, whereas glycosylation to crocin allows the glucose moieties at both ends of the molecule to bind at the NE binding site on NET. In addition, glucose has been associated with decreased expression of NET itself [5]; since crocin carries four glucose units per molecule, it is expected to contribute to an overall decrease in NET expression. Crocin is thus predicted to inhibit NE reuptake both directly and indirectly when it interacts with NET. We attribute this to the large molecular size of crocin, which increases the probability of contact with NET, rather than to potent antagonism of the kind shown by cocaine. Picrocrocin is suggested to act as an antagonist of NE through direct interaction of its safranal residue with NET. Picrocrocin also carries glucose, which may itself interact with the NE binding site and reduce NET expression, although to a lesser extent than crocin. For these reasons, crocin and picrocrocin are thought to exert antidepressant effects and to be particularly effective in inhibiting NE reuptake.

### Material Production Model

#### Introduction

The material production model predicts how the quantity of a product changes over time using ordinary differential equations. Its advantage is that time-course data can be analyzed with only simple mathematical functions. In our wet lab, the production of crocetin was confirmed, but the production of crocin was not, so the time course of crocin production and the time at which the maximum amount of crocin is obtained were unknown. We therefore fitted a simple material production model to the type I-V crocin production data from previous research [3] and analyzed crocin production over time.

Because there is no prior research on picrocrocin production quantities and our team was unable to perform time-dependent quantification, we did not model picrocrocin production.

#### Material and Method

First, we set up a simple ordinary differential equation model with two compartments, crocetin and crocin (types I-V), as the dependent variables with respect to time, initially parameterized with arbitrary values for normal cells only [Formula 1]. We chose this model because it captures changes in the crocetin-crocin relationship over time and is easy to understand without complex functions. Production of crocin types I-V is governed by two genes, UGT74F8 and UGT94E13. In the previous study, crocin levels were measured for three expression patterns (UGT74F8 only, UGT74F8-UGT94E13, and UGT94E13-UGT74F8), so we also analyzed them separately. We estimated the parameters that fit the measured data from the previous study by the least-squares method, starting from initial values of a = 0.1 and b = 0.1. The solution curves of the differential equations with the optimized parameters were then plotted over the measured data. Here, “a” represents the rate at which crocin is synthesized from crocetin, and “b” represents the spontaneous loss rate of crocin per unit time.
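
The fitting procedure can be sketched as follows. This is an illustration under stated assumptions, not the analysis code itself: it is written in plain Python rather than R, it assumes the two-compartment form d(crocetin)/dt = -a·crocetin and d(crocin)/dt = a·crocetin - b·crocin, it uses a hand-written simplified Nelder-Mead search in place of `optim`, and it fits synthetic data generated from arbitrary parameters (a = 0.8, b = 0.3), not the measurements from [3]. The function names are hypothetical.

```python
def simulate(a, b, hours, y0=(4.0, 0.0), dt=0.05):
    """Integrate the assumed model with classical RK4; return crocin at each whole hour."""
    def rhs(crocetin, crocin):
        return -a * crocetin, a * crocetin - b * crocin

    crocetin, crocin = y0
    steps_per_hour = round(1.0 / dt)
    crocin_hourly = []
    for _ in range(hours):
        for _ in range(steps_per_hour):
            k1c, k1n = rhs(crocetin, crocin)
            k2c, k2n = rhs(crocetin + 0.5 * dt * k1c, crocin + 0.5 * dt * k1n)
            k3c, k3n = rhs(crocetin + 0.5 * dt * k2c, crocin + 0.5 * dt * k2n)
            k4c, k4n = rhs(crocetin + dt * k3c, crocin + dt * k3n)
            crocetin += dt * (k1c + 2 * k2c + 2 * k3c + k4c) / 6
            crocin += dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6
        crocin_hourly.append(crocin)
    return crocin_hourly


def sum_of_squares(params, observed):
    """Least-squares objective: squared error between model and hourly data."""
    a, b = params
    if a <= 0 or b <= 0:
        return float("inf")  # keep both rates positive
    model = simulate(a, b, len(observed))
    return sum((m - o) * (m - o) for m, o in zip(model, observed))


def nelder_mead(f, x0, step=0.05, iters=400):
    """Simplified Nelder-Mead (reflect, expand, inside-contract, shrink)."""
    n = len(x0)
    points = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    simplex = [(f(p), p) for p in points]
    for _ in range(iters):
        simplex.sort(key=lambda pair: pair[0])
        f_best, best = simplex[0]
        f_second_worst = simplex[-2][0]
        f_worst, worst = simplex[-1]
        centroid = [sum(p[j] for _, p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point at centroid + t * (centroid - worst)
            return [centroid[j] + t * (centroid[j] - worst[j]) for j in range(n)]

        reflected = along(1.0)
        f_reflected = f(reflected)
        if f_reflected < f_best:
            expanded = along(2.0)
            f_expanded = f(expanded)
            if f_expanded < f_reflected:
                simplex[-1] = (f_expanded, expanded)
            else:
                simplex[-1] = (f_reflected, reflected)
        elif f_reflected < f_second_worst:
            simplex[-1] = (f_reflected, reflected)
        else:
            contracted = along(-0.5)
            f_contracted = f(contracted)
            if f_contracted < f_worst:
                simplex[-1] = (f_contracted, contracted)
            else:
                shrunk = [
                    [best[j] + 0.5 * (p[j] - best[j]) for j in range(n)]
                    for _, p in simplex[1:]
                ]
                simplex = [simplex[0]] + [(f(p), p) for p in shrunk]
    return min(simplex, key=lambda pair: pair[0])


# Synthetic hourly "measurements" from arbitrary illustration parameters
synthetic = simulate(0.8, 0.3, hours=6)
error, (fit_a, fit_b) = nelder_mead(lambda p: sum_of_squares(p, synthetic), [0.1, 0.1])
print(f"a = {fit_a:.3f}, b = {fit_b:.3f}, SSE = {error:.2e}")
```

Starting the search from (a, b) = (0.1, 0.1) mirrors the initial values stated above. With real data, each crocin type and each expression pattern would be fitted separately with its own pair of parameters.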

Furthermore, we analyzed the relationship between crocetin and crocin using dynamical systems theory. Because Formula 1 is a linear ordinary differential equation, we applied a linear transformation and drew the phase space (vector field) by focusing on the coefficient matrix. Dynamical systems theory offers a simple way to describe where the solution of a differential equation ends up from given initial conditions, which makes it well suited to understanding how much crocetin and crocin will eventually remain after infinite time. In this study, all numerical integration of the ordinary differential equations was performed using the fourth-order Runge-Kutta method of the deSolve library in R. Parameter estimation was performed by least squares using the Nelder-Mead method of the optim function.
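
As a hedged illustration of the coefficient-matrix analysis, the sketch below assumes the linear form d(crocetin)/dt = -a·crocetin, d(crocin)/dt = a·crocetin - b·crocin, whose coefficient matrix is [[-a, 0], [a, -b]]. The parameter values are arbitrary and the function names are hypothetical. It computes the eigenvalues, the slope of the crocin nullcline, and the direction of the vector field at the initial point (4, 0).

```python
import math


def analyze_linear_system(a, b):
    """Eigenvalues and crocin-nullcline slope for the assumed matrix [[-a, 0], [a, -b]]."""
    m11, m12, m21, m22 = -a, 0.0, a, -b
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    # For this matrix the discriminant equals (a - b)^2; clamp rounding noise at 0
    disc = max(trace * trace - 4.0 * det, 0.0)
    root = math.sqrt(disc)
    eigenvalues = ((trace - root) / 2.0, (trace + root) / 2.0)
    # Crocin nullcline: a * crocetin - b * crocin = 0, a line through the origin
    nullcline_slope = a / b
    return eigenvalues, nullcline_slope


def velocity(crocetin, crocin, a, b):
    """Vector-field arrow (d crocetin/dt, d crocin/dt) at one point of the phase space."""
    return -a * crocetin, a * crocetin - b * crocin


# Arbitrary illustration parameters, not fitted values
eigenvalues, slope = analyze_linear_system(0.8, 0.3)
arrow_at_start = velocity(4.0, 0.0, 0.8, 0.3)
print(eigenvalues, slope, arrow_at_start)
```

When both eigenvalues are negative, every trajectory converges to the equilibrium at (0, 0), where the two nullclines intersect. A smaller b gives a steeper crocin nullcline and a slower final approach, which corresponds to a longer detour in the phase space.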

**Formula 1: Ordinary differential equations for the crocetin-crocin production relationship.**
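
Based on the parameter definitions given above (a: synthesis of crocin from crocetin, b: spontaneous loss of crocin), one form of the two-compartment model can be sketched as follows. The symbols $C$ (crocetin) and $N$ (crocin), and the assumption that crocetin is consumed only by conversion to crocin, are introduced here for illustration and may differ in detail from Formula 1.

$$
\begin{aligned}
\frac{dC}{dt} &= -aC \\
\frac{dN}{dt} &= aC - bN
\end{aligned}
\qquad\Longleftrightarrow\qquad
\frac{d}{dt}\begin{pmatrix} C \\ N \end{pmatrix}
= \begin{pmatrix} -a & 0 \\ a & -b \end{pmatrix}
\begin{pmatrix} C \\ N \end{pmatrix}
$$

Under this form, with $C(0) = C_0$, $N(0) = 0$, and $a \neq b$, the solution is $N(t) = \frac{a C_0}{b - a}\left(e^{-at} - e^{-bt}\right)$, and crocin reaches its maximum at $t^{*} = \frac{\ln(a/b)}{a - b}$.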

#### Results

First, we will discuss the results obtained with the vector expressing only UGT74F8. It is known that UGT74F8 is a gene mainly required for the production of type III crocin and type V crocin and is not deeply involved in the production of other types of crocin [3].

Fig. 5-A shows the modeled changes in type III and type V crocin production over time. Under expression of UGT74F8, type III crocin increased monotonically from 1 h to 6 h. Type V crocin increased monotonically for the first 2 h, reached its maximum production level (approximately 27 μmol/L) at 3-4 h, and then declined gradually. The decline was shallow, suggesting that type V crocin may accumulate within the vector for a certain duration [Fig. 5-A].

Next, we drew the phase space of crocetin with type III and type V crocin. Fig. 5-B shows the relationship between crocetin and type III crocin. If crocetin is abundant and type III crocin is scarce initially, the initial coordinates are approximately (crocetin, crocin) = (4, 0). Following the vectors from that point, crocetin converges to 0 while crocin is abundantly produced. Similarly, in Fig. 5-C, which shows the relationship between crocetin and type V crocin, the same coordinate (4, 0) is taken as the initial point. Tracing the arrows from this point, crocetin again converges to 0, and the trajectory takes a circuitous path to the intersection of the blue and orange nullclines at (0, 0). This suggests that crocin (V) tends to remain within the vector for an extended period and decomposes gradually [Fig. 5-B, C].

**Fig. 5) Time course of crocin and the crocetin-crocin relationship in vectors expressing only UGT74F8.**

**A plots time on the horizontal axis and the molar concentration of crocin on the vertical axis. Blue represents crocin (III) and purple represents crocin (V). B is the phase space of crocetin-crocin (III). The trajectory starts from the initial value (crocetin, crocin) = (4, 0) and moves in the direction of the arrows. C is the phase space of crocetin-crocin (V). As in B, the trajectory follows the arrows from (4, 0), but unlike B it takes a detour before finally converging to the intersection of the blue and orange nullclines (the equilibrium solution).**

Next, we present the crocetin and crocin levels in vectors expressing UGT74F8-UGT94E13 in sequence. Fig. 6-A shows the changes in crocin (I-V) production over time, as in the case where only UGT74F8 was expressed. Crocin (I) and crocin (V) reached their maximum levels (5.5 μmol/L and 4.5 μmol/L, respectively) approximately one hour after the start of quantification and then decreased gradually, as in Fig. 5-A. Crocin (II) reached its maximum production level (approximately 15 μmol/L) around two hours after the start of quantification, followed by a gradual decrease similar to the other forms of crocin. Interestingly, crocin (III) was produced at about twice the maximum level of crocin (II) and showed a broadly monotonic increase over the 6-hour quantification period. Crocin (IV) was barely produced.

Next, we present the phase spaces for crocetin and crocin (I-V) in Fig. 6-B to Fig. 6-F, respectively. The crocetin-crocin (I&V) trajectories converged to (0, 0) from the initial value, as in the case where only UGT74F8 was expressed. This agrees with the behavior of crocin (I&V) in Fig. 6-A, where production gradually decreased after its peak and eventually reached zero. The crocetin-crocin (II) relationship was generally similar, but the path from the initial point (4, 0) to (0, 0) was longer than for crocin (I&V). This is consistent with the behavior of crocin (II) in Fig. 6-A and suggests a gradual convergence of crocin production to zero.

The phase space of crocetin-crocin (III) showed a longer path from the initial point (4, 0) to (0, 0) than the other forms, indicating that production continued beyond 6 hours, as observed in Fig. 6-A, and then gradually decreased. In addition, the phase space of crocetin-crocin (III) showed a steeper slope near (4, 0) than the other forms of crocin, indicating a higher production level.

Finally, the phase space of crocetin-crocin (IV) remained near (4, 0) with little movement, showing that production levels stayed close to zero.

**Fig. 6) Time course of crocin and the crocetin-crocin relationship in the vector expressing UGT74F8-UGT94E13.**

**A plots time on the horizontal axis and the molar concentration of crocin on the vertical axis. Yellow represents crocin (I), red represents crocin (II), blue represents crocin (III), gray represents crocin (IV), and purple represents crocin (V). B-F are the phase spaces of crocetin-crocin (I-V). The trajectory starts from the initial value (crocetin, crocin) = (4, 0) and moves in the direction of the arrows.**

Finally, we present the crocetin and crocin levels for a vector expressing UGT94E13-UGT74F8 in sequence, that is, with the expression order swapped relative to Fig. 6. Fig. 7-A shows the changes over time in all crocin (I-V) levels, as in the previous data. Crocin (I) production is almost unchanged from Fig. 6: it peaks (approximately 5.4 μmol/L) at about 1 hour and then gradually decreases. Crocin (IV) likewise shows no significant increase in production, and crocin (II) and crocin (III) are generally consistent with the previous data. Surprisingly, however, crocin (V) is produced at a level similar to crocin (II), unlike in Fig. 6.

Focusing on crocin (V) and examining the phase space in Fig. 7-F, we observe a slight increase in the slope of the orange nullcline, and the vectors near (4, 0) have a steeper slope. This gives an intuitive confirmation of the increase in production level.

**Fig. 7) Time course of crocin and the crocetin-crocin relationship in vectors expressing UGT94E13-UGT74F8.**

**As in the previous figures, A plots time on the horizontal axis and the molar concentration of crocin on the vertical axis. Blue represents crocin (III) and purple represents crocin (V). B-F are the phase spaces of crocetin-crocin (I-V). The trajectory starts from the initial value (crocetin, crocin) = (4, 0) and moves in the direction of the arrows.**

#### Engineering Cycle

In this analysis, using data from previous studies, we devised an efficient way to obtain crocin in our team's wet lab and were able to predict production with a simple model and few parameters, which improves reproducibility.

#### Discussion

These results first show that UGT74F8 is involved in the production of crocin (III) and crocin (V), and in particular gives a significant increase in crocin (V) production. Expressing UGT74F8 alone was the most efficient condition for obtaining crocin (V), with a maximum yield of approximately 27 μmol/L. The co-expression of UGT94E13 and UGT74F8 is known to produce crocin (I), which is in turn converted to crocin (II) and crocin (III) [3]. Crocin (I) is therefore expected to show a lower production level than crocin (II&III), since it is largely consumed by this conversion. Furthermore, with the expression order UGT74F8-UGT94E13, the production of crocin (V) was lower than that of crocin (I), whereas with the order UGT94E13-UGT74F8, crocin (V) levels appeared to increase. This may imply that crocin (V) production is inherently suppressed by the other crocin forms, apart from crocin (III), and that the degree of suppression depends on the expression order.

It is noteworthy that all crocin forms (I-V) showed a relatively slow degradation rate in this study, suggesting that crocin is resistant to degradation. In conclusion, crocin production can vary significantly depending on the order of UGT expression, while some forms such as crocin (III) are produced stably regardless of the order.

## References

- [1] Varadi, Mihaly, et al. “AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models.” Nucleic acids research 50.D1 (2022): D439-D444.
- [2] Schlessinger, Avner, et al. “Structure-based discovery of prescription drugs that interact with the norepinephrine transporter, NET.” Proceedings of the National Academy of Sciences 108.38 (2011): 15810-15815.
- [3] Pu, Xiangdong, et al. “In vivo production of five crocins in the engineered Escherichia coli.” ACS Synthetic Biology 9.5 (2020): 1160-1168.
- [4] Ali, Mohd Sajid, and Hamad A. Al-Lohedan. “Spectroscopic and molecular docking investigation on the noncovalent interaction of lysozyme with saffron constituent “Safranal”.” ACS omega 5.16 (2020): 9131-9141.
- [5] Straznicky, N. E., et al. “Norepinephrine transporter expression is inversely associated with glycaemic indices: a pilot study in metabolically diverse persons with overweight and obesity.” Obesity science & practice 2.1 (2016): 13-23.