Before drafting the project, we researched how the CBD synthesis pathway functions in the Cannabis sativa plant. This research gave us a solid foundation: the metabolic pathway required for CBD production, together with an understanding of each stage and of the mechanism of each enzyme involved, as shown below.
Figure 1: Biosynthetic pathway of cannabinoids in C. sativa.
Source: 2017, Zirpel et al.
Note. Image from “Engineering yeasts as platform organisms for cannabinoid biosynthesis”, by B. Zirpel, F. Degenhardt, C. Martin, O. Kayser, and F. Stehle, 2017, Journal of Biotechnology, 259, p. 205 (https://doi.org/10.1016/j.jbiotec.2017.07.008).
Although various fungi or plants could have been considered for the project, the yeast Saccharomyces cerevisiae was the obvious choice for us. Not only does it have the machinery needed to synthesize and modify the required enzymes, it also already carries out the mevalonate pathway, one of the two key pathways necessary for CBD production, which gave our construction a substantial head start (Zirpel et al. 2017). In addition, we already had easy access to a very competent strain of S. cerevisiae, with several auxotrophies we could exploit, and iGEM teams from previous editions had deposited sequences for all of the needed enzymes, optimized for S. cerevisiae (see: iGEM22 Waterloo). Even though the iGEM22 Waterloo team had some success with their project, we felt there was still room for improvement and for a simpler approach to synthesizing cannabidiol in yeast cells. What we ended up with is an elegant, modifiable and easy-to-use strategy for cloning and expressing the metabolic pathway needed for CBD synthesis, with two “blocks” (see Biological Circuits in this section) that can be studied individually or in concert, further improved, and exchanged into other circuits, cassettes and chassis.
With the project drafted around these metabolic pathways, the next step was to review the current literature on each stage and each enzyme of the pathway, with the aim of understanding the structure and function of each enzyme. The characteristics we looked for included the presence or absence of signal peptides, the need for cofactors, the functioning of the catalytic site, the enzyme's location in the cell, and, in general, any information that could guide the modeling of these proteins.
In this case, the following characteristics were found for each enzyme researched:
de Bruijn, W. J. C., Levisson, M., Beekwilder, J., van Berkel, W. J. H., & Vincken, J.-P. (2020). Plant Aromatic Prenyltransferases: Tools for Microbial Cell Factories. Trends in Biotechnology, 38(8), 917–934. https://doi.org/10.1016/j.tibtech.2020.02.006
Degenhardt, F., Stehle, F., & Kayser, O. (2017). The Biosynthesis of Cannabinoids. Handbook of Cannabis and Related Pathologies, 13–23. https://doi.org/10.1016/b978-0-12-800756-3.00002-8
Stout, J. M., Boubakir, Z., Ambrose, S. J., Purves, R. W., & Page, J. E. (2012). The hexanoyl-CoA precursor for cannabinoid biosynthesis is formed by an acyl-activating enzyme in Cannabis sativa trichomes. The Plant Journal, 71(3), 353–365. https://doi.org/10.1111/j.1365-313x.2012.04949.x
Tahir, M. N., Shahbazi, F., Rondeau-Gagné, S., & Trant, J. F. (2021). The biosynthesis of the cannabinoids. Journal of Cannabis Research, 3(1). https://doi.org/10.1186/s42238-021-00062-4
Taura, F., Sirikantaramas, S., Shoyama, Y., Yoshikai, K., Shoyama, Y., & Morimoto, S. (2007). Cannabidiolic-acid synthase, the chemotype-determining enzyme in the fiber-type Cannabis sativa. FEBS Letters, 581(16), 2929–2934. https://doi.org/10.1016/j.febslet.2007.05.043
Taura, F., Tanaka, S., Taguchi, C., Fukamizu, T., Tanaka, H., Shoyama, Y., & Morimoto, S. (2009). Characterization of olivetol synthase, a polyketide synthase putatively involved in cannabinoid biosynthetic pathway. FEBS Letters, 583(12), 2061–2066. https://doi.org/10.1016/j.febslet.2009.05.024
Zirpel, B., Degenhardt, F., Martin, C., Kayser, O., & Stehle, F. (2017). Engineering yeasts as platform organisms for cannabinoid biosynthesis. Journal of Biotechnology, 259, 204–212. https://doi.org/10.1016/j.jbiotec.2017.07.008
The biological circuits carrying the genes of interest can be assembled in any expression vector that has replication origins for both E. coli and yeast, as well as a strong inducible yeast promoter. With Saccharomyces cerevisiae as the chassis, the olivetolic acid and CBDA synthesis pathways must be introduced by means of biological circuits compatible with this expression system.
To ensure that the five enzymes are synthesized with the expected function in the circuit, two approaches were tested: fusion proteins and 2A sequences. The activity of the candidate proteins was assessed by structural and kinetic modeling, in order to find those most efficient at producing CBDA.
The genes for CBDA synthesis were cloned into two separate biological circuits, so that the resulting inserts and expression vectors are of a reasonable size and the encoded enzymes convert substrates and produce CBDA more efficiently.
For the fusion-protein approach, pairs of constructs containing two or three genes, in different combinations, were evaluated, each pair yielding two fusion proteins. The genes were fused into a single insert, separated by flexible 10-amino-acid linkers to preserve the activity and structure of the individual proteins. Each insert was cloned into pRS vectors.
For the 2A approach, two inserts were constructed, one with two genes and one with three, separated by the T2a sequence (gagggcaggcagtctgctgacatgcggtgacgtggaagagaatcccggccct). The genes were distributed so that the two inserts were of approximately the same size.
The SWISS-MODEL and AlphaFold2 servers were used for structural modeling of the native proteins with T2a appendages and of the fusion proteins. The three-dimensional structures and energy parameters of the proteins were evaluated using SWISS-MODEL, PyMOL, and the QMean server.
For kinetic modeling, the CHARMM-GUI NMR structure calculator was used to obtain energetically refined protein models for molecular docking with HADDOCK, ClusPro 2, and HDOCK, with the appropriate substrates and cofactors supplied to these servers (see the Structural Modeling section above).
Structural and kinetic modeling of the proteins, whether separated by 2A sequences or joined as fusion proteins, allowed us to design expression cassettes already suited to the chassis, which will be cloned into the pRS425 and pRS426 vectors using the Gibson Assembly method.
The biological circuits were constructed from these cassettes, with GAL1 promoters, ADH1 terminators that are efficient in the chassis, the insert with genes separated by 2A sequences, and selection markers.
Figure 1: Diagram of the device for CsHCS, OS, and OAC expression.
Source: 2023, Authors.
Figure 2: Diagram of the device for CBDAS and CsPT4 expression.
Source: 2023, Authors.
The expression cassettes were cloned into the pRS vectors mentioned above.
Several biological circuits were tested before we decided which ones would be best for the project. The plasmids pRS423, pRS424, pRS425, pRS426, pYES2, and pET26b were used for testing.
These vectors required expression cassettes, each consisting of a promoter, a CDS, and a terminator. The following sequences were therefore tested for the assembly of each cassette:
Table 1: Sequences tested for the cassettes’ assembly.
Promoter | CDS | Terminator |
---|---|---|
GAL1 | CsHCS, OS, OAC | ADH1 |
GAL1 | CsHCS, OS, CsaPT4 | ADH1 |
GAL1 | CBDAS, OS, OAC | ADH1 |
GAL1 | CBDAS, OS, CsaPT4 | ADH1 |
GAL1 | PT4, CBDAS, - | ADH1 |
GAL1 | PT4, OAC, - | ADH1 |
Source: 2023, Authors.
The pYES2 vectors and plasmids for E. coli do not need cassettes, since they already possess a promoter-terminator pair.
The assembled plasmids were visualized using the SnapGene Viewer software, so we could analyze their sequence, features and genetic constructs.
After all of our cassettes were constructed, the best plasmid candidates for our project were the following:
Figure 3: pRS423
Source: 2023, Authors.
Figure 4: pRS424
Source: 2023, Authors.
Figure 5: pRS425
Source: 2023, Authors.
Figure 6: pRS426
Source: 2023, Authors.
Figure 7: pYES2
Source: 2023, Authors.
Figure 8: pET-26b(+)
Source: 2023, Authors.
After we obtained these plasmids, it was possible to build a total of 72 different circuits from different combinations of the enzymes’ coding sequences. We evaluated all of them, optimized them, and chose two biological circuits, based on the pRS425 and pRS426 plasmids, shown in the following figures.
Figure 9: pRS426 biological circuit - CBDSub.
Source: 2023, Authors.
Figure 10: pRS425 biological circuit - CBDSyn.
Source: 2023, Authors.
Biological circuits designed with SnapGene software (www.snapgene.com)
To represent both the gene expression of our biological circuits and the action of the metabolic pathway, which involves five interdependent enzymes, we looked in the literature for models that could capture cell behavior. In this context, mixed-effects (ME) models are a class of statistical models introduced to describe the response of different individuals in a population to known or predicted stimuli (Llamosi et al., 2016).
Following Llamosi et al. (2016), we consider a simplified dynamic model of gene expression in which translation depends on transcription, which in turn is linked to the various factors that influence the start of transcription. Once the cell has been transformed with the plasmid built by Gibson Assembly, u(t) represents the activity of the transcriptional factors (in our case it could reflect, for example, the kinetics of RNA polymerase). The constants k_i and g_i describe the production and decay rates, respectively, of either the mRNA (m(t) in the equations) or the proteins (p(t) in the equations). Thus, we would have:
$$\frac{dm(t)}{dt} = k_m\,u(t) - g_m\,m(t)$$
$$\frac{dp(t)}{dt} = k_p\,m(t) - g_p\,p(t)$$
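As a minimal sketch of how such a model behaves, the Python snippet below integrates the two rate equations with a forward-Euler step. It assumes the two-stage form in which mRNA production is driven by u(t) and protein production is driven by m(t). The parameter values, the constant input u(t) = 1, and the function names are illustrative assumptions, not values measured or used by the team.

```python
def simulate_expression(u, k_m, g_m, k_p, g_p, t_end, dt=0.01, m0=0.0, p0=0.0):
    """Forward-Euler integration of
    dm/dt = k_m*u(t) - g_m*m   and   dp/dt = k_p*m - g_p*p.
    Returns a list of (t, m, p) tuples."""
    steps = int(round(t_end / dt))
    t, m, p = 0.0, m0, p0
    trajectory = [(t, m, p)]
    for _ in range(steps):
        dm = k_m * u(t) - g_m * m   # mRNA: production minus decay
        dp = k_p * m - g_p * p      # protein: production minus decay
        m += dt * dm
        p += dt * dp
        t += dt
        trajectory.append((t, m, p))
    return trajectory


def steady_state(u_const, k_m, g_m, k_p, g_p):
    """Steady state for a constant input: both derivatives set to zero."""
    m_ss = k_m * u_const / g_m
    p_ss = k_p * m_ss / g_p
    return m_ss, p_ss


# Hypothetical parameter values (arbitrary units), chosen only for illustration.
params = dict(k_m=2.0, g_m=0.5, k_p=1.0, g_p=0.2)

trajectory = simulate_expression(lambda t: 1.0, t_end=100.0, **params)
t_final, m_final, p_final = trajectory[-1]
m_ss, p_ss = steady_state(1.0, **params)

print(f"simulated: m = {m_final:.4f}, p = {p_final:.4f}")
print(f"analytic steady state: m = {m_ss:.4f}, p = {p_ss:.4f}")
```

With a constant input the simulated levels settle at k_m·u/g_m for the mRNA and k_p·m/g_p for the protein, which gives a quick sanity check before fitting the model to data.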
In this way, we obtained a model to follow in the experimental analysis. We can observe the different conditions at each sampling point, understand the expression profile of our yeast, and thereby identify possible bottlenecks in the metabolism of the transformed cells. From the measured sample values (m(t), mRNA concentration; p(t), protein concentration; and u(t), concentration of transcriptional factors), we can obtain the constants k and g of each equation by regression, which builds the model and helps us understand the production of the desired product (CBDA).
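The regression step described above could be sketched as follows, under the assumption that the level of interest and its driver are sampled at known times: the time derivative is approximated by central differences, and k and g are then obtained by ordinary least squares. The synthetic data and the function name `fit_rate_constants` are hypothetical and only stand in for wet-lab measurements.

```python
import math


def fit_rate_constants(times, driver, level):
    """Least-squares estimate of (k, g) in d(level)/dt = k*driver - g*level."""
    rows = []
    for i in range(1, len(times) - 1):
        # Central-difference approximation of the derivative at sample i.
        slope = (level[i + 1] - level[i - 1]) / (times[i + 1] - times[i - 1])
        rows.append((driver[i], -level[i], slope))

    # Normal equations for slope = k*x1 + g*x2 (no intercept term).
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    s1y = sum(x1 * y for x1, _, y in rows)
    s2y = sum(x2 * y for _, x2, y in rows)
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("data do not determine k and g separately")
    k = (s1y * s22 - s12 * s2y) / det
    g = (s11 * s2y - s12 * s1y) / det
    return k, g


# Synthetic mRNA time course for a constant input u(t) = 1 (illustrative only).
true_k, true_g = 2.0, 0.5
times = [0.1 * i for i in range(101)]
u_values = [1.0] * len(times)
m_values = [(true_k / true_g) * (1.0 - math.exp(-true_g * t)) for t in times]

k_m_est, g_m_est = fit_rate_constants(times, u_values, m_values)
print(f"estimated k_m = {k_m_est:.3f}, g_m = {g_m_est:.3f}")
```

The same function could be applied to the protein equation by passing the measured m(t) as the driver and p(t) as the level. Real measurements are noisy, so smoothing or a fit to the integrated solution may be preferable to raw finite differences.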
Having described the gene expression profile, we used biomolecule-based Boolean logic to model the logic gates of our metabolic pathway (Miyamoto et al., 2013). Two substrates will be added: galactose, to activate the GAL1 promoter, as seen in the Biological Circuits section, and hexanoic acid, which is already present in the metabolic pathway but whose supplementation will be evaluated. Other substrates, already supplied by the mevalonate pathway and other secondary metabolism of Saccharomyces cerevisiae, are considered to be in excess and therefore do not determine the Boolean logic gates.
For the first logic circuit, we consider a cell from the yeast population transformed with both plasmids. The only substrate added, apart from galactose, is hexanoic acid (described in the circuit as "Hexanoic acid A"). GPP (geranyl diphosphate), malonyl-CoA and other cofactors such as Mg2+ ions were considered to be in excess in the yeast intracellular medium.
Figure 1: Complete Boolean-based logic circuit for CBDA metabolic pathway.
Source: 2023, Authors (made with VisualParadigm).
It is important to remember that the intermediate substrates are produced within the reaction system itself. The added substrates are represented by the inputs: Galactose, which activates the GAL1 promoter (responsible for driving expression of the CDSs that encode our enzymes), and Hexanoic Acid A, the CsHCS substrate. We also have the alternative Hexanoic Acid B, coming from S. cerevisiae secondary metabolism, which accounts for the “OR” gate represented above. All other logic gates are “YES” gates, in which the enzyme docks to its corresponding ligand.
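One possible reading of this circuit can be written as a small truth-table sketch in Python. The wiring is an assumption rather than a transcription of the figure: galactose gates enzyme expression, the CsHCS step is the OR gate over the two hexanoic acid sources, and each later enzyme is a YES gate that passes its intermediate on, with GPP, malonyl-CoA and cofactors taken to be in excess. All function and variable names are hypothetical.

```python
from itertools import product


def yes_gate(x):
    """YES gate: the output simply follows the single input."""
    return bool(x)


def or_gate(a, b):
    """OR gate: the output is true if either input is true."""
    return bool(a) or bool(b)


def cbda_circuit(galactose, hexanoic_a, hexanoic_b):
    """Boolean sketch of the complete pathway (assumed wiring, see lead-in)."""
    enzymes = yes_gate(galactose)                    # GAL1 promoter active
    # Assumption: without expressed enzymes no reaction step can proceed.
    hexanoyl_coa = enzymes and or_gate(hexanoic_a, hexanoic_b)  # CsHCS
    tetraketide = yes_gate(hexanoyl_coa)             # OS
    olivetolic_acid = yes_gate(tetraketide)          # OAC
    cbga = yes_gate(olivetolic_acid)                 # CsPT4, GPP in excess
    cbda = yes_gate(cbga)                            # CBDAS
    return cbda


# Enumerate every combination of the three inputs as 0/1 values.
truth_table = [
    (gal, hex_a, hex_b, int(cbda_circuit(gal, hex_a, hex_b)))
    for gal, hex_a, hex_b in product((0, 1), repeat=3)
]

print("Gal HexA HexB -> CBDA")
for gal, hex_a, hex_b, out in truth_table:
    print(f" {gal}    {hex_a}    {hex_b}   ->  {out}")
```

Under these assumptions CBDA is produced only when galactose is present together with at least one source of hexanoic acid.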
We also evaluated the circuits as binary models based on this Boolean logic, assigning 0 for “false” and 1 for “true” to the inputs and outputs. Figure 2 below compiles the binary logic information for the metabolic pathway shown.
Figure 2: Binary inputs and outputs for the CBDA metabolic pathway.
Source: 2023, Authors.
Or, in compiled form:
Figure 3: Compiled binary inputs for the CBDA metabolic pathway.
Source: 2023, Authors.
Here, in the “CsHCS - OR gate”, input “A” represents the added hexanoic acid and input “B” represents the acid already present in the cell's metabolism. This allowed us to carry out a probabilistic study, assigning 0.5 to each “YES” gate and 0.75 to the “OR” gate, as seen in Figure 4.
Figure 4: Probability for each substrate.
Source: 2023, Authors.
The numbers identify each gate, in the order shown in Figure 1. This study simply shows how strongly each gate depends on the one before it, which is why the biological order of the pathway must be followed.
The second part of this study looks at each circuit individually: we consider a population “S. cerevisiae A” transformed with pRS426-CBDSub and a population “S. cerevisiae B” transformed with pRS425-CBDSyn. The same substrates discussed above are added to the first population. Here, however, we intend to extract the olivetolic acid (OA) after the successive reactions have taken place, as shown in Figure 5 below.
Figure 5: Boolean-based logic circuit for Olivetolic Acid metabolic pathway.
Source: 2023, Authors (made with VisualParadigm).
As in the previous circuit, we have the alternative Hexanoic Acid B, coming from yeast metabolism, which accounts for the “OR” gate, along with the “YES” logic gates. In this case, however, we intend the cells to secrete olivetolic acid, the determinant substrate for producing CBGA, the immediate precursor of CBDA.
In binary mode, we obtained the following results:
Figure 6: Binary inputs and outputs for the OA metabolic pathway.
Source: 2023, Authors.
Here the “OR” gate works exactly as described before. We also carried out the probabilistic study for this case, presented in Figure 7.
Figure 7: Probability for each substrate in the OA metabolic pathway.
Source: 2023, Authors.
As expected, with fewer gates, we observed a high probability of obtaining the product, in this case olivetolic acid.
The cells carrying the pRS425-CBDSyn plasmid differ in one respect: instead of hexanoic acid entering as one of the substrates, the olivetolic acid obtained from the previous population becomes the decisive substrate, as we can see in the following figure.
Figure 8: Boolean-based logic circuit from Olivetolic Acid to CBDA pathway.
Source: 2023, Authors (made with VisualParadigm).
In this case, galactose activates our promoter, which makes the subsequent protein expression possible, while olivetolic acid is a determinant substrate for CsPT4. This is why a new gate appears here, an “AND” gate, which requires both inputs.
The binary mode of this circuit is shown below:
Figure 9: Binary inputs and outputs for the metabolic pathway from the OA to CBDA.
Source: 2023, Authors.
Unlike the previous circuits, this one includes another kind of gate, an “AND”, in which input “A” stands for both promoter activation and the consequent enzyme production, and input “B” stands for olivetolic acid. Figure 10 compiles the probabilistic study, as shown below.
Figure 10: Probability for each substrate in the metabolic pathway from OA to CBDA.
Source: 2023, Authors.
It is important to note that at the “AND” gate, numbered 5, we found a probability of 0.25, since only one of the four input combinations gives a positive response.
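The gate probabilities quoted in this section (0.5 for YES, 0.75 for OR, 0.25 for AND) correspond to the fraction of input combinations that give a positive output. A short Python sketch, assuming every 0/1 input combination is equally likely and using hypothetical helper names, reproduces them by enumerating each truth table:

```python
from fractions import Fraction
from itertools import product


def yes_gate(a):
    return bool(a)


def or_gate(a, b):
    return bool(a) or bool(b)


def and_gate(a, b):
    return bool(a) and bool(b)


def gate_probability(gate, n_inputs):
    """Fraction of equally likely 0/1 input rows for which the gate outputs 1."""
    rows = list(product((0, 1), repeat=n_inputs))
    positives = sum(1 for row in rows if gate(*row))
    return Fraction(positives, len(rows))


probabilities = {
    "YES": gate_probability(yes_gate, 1),   # 1 of 2 rows is positive
    "OR": gate_probability(or_gate, 2),     # 3 of 4 rows are positive
    "AND": gate_probability(and_gate, 2),   # 1 of 4 rows is positive
}

for name, prob in probabilities.items():
    print(f"{name}: {float(prob)}")
```

Treating all input combinations as equally likely is a simplification: in the experiment, galactose and the substrates are supplied deliberately, so these values indicate how restrictive each gate is rather than predict yields.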
Altogether, this study gives us a better understanding of how expression can be measured and evaluated at the Wet Lab stage.
Llamosi, A., Gonzalez-Vargas, A. M., Versari, C., Cinquemani, E., Ferrari-Trecate, G., Hersen, P., & Batt, G. (2016). What population reveals about individual cell identity: single-cell parameter estimation of models of gene expression in yeast. PLoS Computational Biology, 12(2), e1004706.
Miyamoto, T., Razavi, S., DeRose, R., & Inoue, T. (2013). Synthesizing biomolecule-based Boolean logic gates. ACS Synthetic Biology, 2(2), 72–82.
Visual Paradigm. © 1999-2023 by Visual Paradigm. All rights reserved.