Parts design
Enzymatic PET degradation
Enzymatic biodegradation of polyethylene terephthalate (PET) as a tool to combat accumulating plastic has been an intense area of research in recent years1. Multiple PET-degrading enzymes are known, such as Thc_Cut1 from Thermobifida fusca2 and leaf-branch compost cutinase (LCC), which was found by metagenomic screening of a compost sample3. However, arguably the most promising and certainly the most studied enzyme is the PET hydrolase IsPETase (EC 3.1.1.101), discovered in the novel strain Ideonella sakaiensis, isolated from a Japanese landfill in 20164. Since its discovery, a multitude of studies have shown that its enzymatic properties can be improved by both rational5 and machine learning6 based engineering.
To complement IsPETase, I. sakaiensis also produces a second enzyme called MHETase, which allows for the degradation of the intermediate products all the way to the two constituent monomers of PET: terephthalic acid (TPA) and ethylene glycol (EG). The functions of PETase and MHETase are highly synergistic, with reports of MHETase also being active on bis(2-hydroxyethyl) terephthalate (BHET) in addition to mono(2-hydroxyethyl) terephthalate (MHET), and even on short PET oligomers7. While some research has been conducted into engineering MHETase, most research has instead focused on improving the activity of PETase, as the initial depolymerization of the PET structure is the rate-limiting step in the overall process8. The pathway is summarized in Figure 1.
Figure 1. Overview of the PET degradation pathway. (A) PETase cuts the end of a PET chain, yielding BHET, MHET, or TPA depending on which bond is cleaved. (B) PETase is mostly responsible for further degrading BHET into MHET and EG, although it has been observed that MHETase is also active against BHET, to a lesser extent than against MHET7. (C) MHETase hydrolyzes MHET into TPA and EG. Thus PETase and MHETase working together yield TPA and EG at the end of the process.
FAST-PETase
One of the most promising variants of IsPETase, named FAST-PETase, has been shown to exhibit improved PET degradation activity over previous IsPETase and LCC variants across a wide range of temperatures from 30°C to 50°C6. At sufficiently high concentrations, FAST-PETase is capable of fully degrading untreated consumer PET containers of moderate crystallinity in 48 hours. The improved properties are due to five amino acid substitutions relative to IsPETase: N233K/R224Q/S121E/D186H/R280A.
Of course, PETase needs to come into contact with PET for any degradation to occur. This part attempts to mimic I. sakaiensis, which secretes the enzyme. It was recently shown that by engineering the pelB signal sequence, IsPETase secretion from E. coli can be improved9. This is important, as the species is otherwise not particularly known for its secretory capabilities. The 2022 Uppsala iGEM team previously adopted a similar strategy. In characterizing this part, we hope to build on this work and provide a part that can be cloned into E. coli for efficient secretion of a recently discovered PET hydrolase.
Cycle 1
Insert design
The five mutated residues in FAST-PETase were introduced to the IsPETase sequence. Furthermore, the native signal peptide was replaced with the pelB signal sequence, as it has been shown to facilitate secretion better in E. coli9. To allow for efficient expression, codons of the FAST-PETase sequence were optimized with the codon optimization tool from IDT. Figure 2 shows the initial part design with flanking restriction sites for cloning.
Figure 2. The first design of the FAST-PETase insert.
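Introducing the five substitutions into a protein sequence can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the procedure we actually used: the function name `apply_substitutions` and the toy sequence are hypothetical, and positions are assumed to be 1-based and numbered on exactly the sequence that is passed in.

```python
import re

# The five FAST-PETase substitutions relative to IsPETase, as listed in the text.
FAST_PETASE_SUBSTITUTIONS = ["N233K", "R224Q", "S121E", "D186H", "R280A"]


def apply_substitutions(protein, substitutions):
    """Return a copy of `protein` with substitutions such as 'N233K' applied.

    Positions are 1-based. A ValueError is raised if the residue found at a
    position differs from the wild-type residue named in the substitution,
    which usually points to a numbering offset (e.g. a removed signal peptide).
    """
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, mutant = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: found {residues[position - 1]} instead of {wild_type}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


# Hypothetical five-residue toy sequence, used only to show the call.
print(apply_substitutions("MNAKR", ["N2K", "R5A"]))  # MKAKA
```

With the real IsPETase sequence, `FAST_PETASE_SUBSTITUTIONS` would be passed instead of the toy list; the wild-type check then guards against applying the published numbering to a sequence with a different offset.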
Cloning
The insert can be cloned into the multiple cloning site (MCS) of the pET-22b expression vector using NdeI and XhoI as shown in Figure 3. The insert is placed downstream of an IPTG inducible T7lac promoter, meaning that enzyme production can be controlled.
Figure 3. The result of cloning our initial FAST-PETase insert into pET-22b with NdeI and XhoI.
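The NdeI/XhoI double digest can be checked in silico before ordering. The sketch below is a simplified model rather than a replacement for a plasmid editor: it only looks at the top strand, assumes each enzyme cuts the insert exactly once, and uses a hypothetical helper name (`ndei_xhoi_fragment`) and a toy sequence. The recognition sites and cut positions (CA^TATG for NdeI, C^TCGAG for XhoI) are standard.

```python
# Recognition sites and top-strand cut offsets (number of bases before the cut).
NDEI_SITE, NDEI_CUT = "CATATG", 2  # NdeI cuts CA^TATG
XHOI_SITE, XHOI_CUT = "CTCGAG", 1  # XhoI cuts C^TCGAG


def find_all(sequence, site):
    """Return the 0-based start of every (possibly overlapping) occurrence of `site`."""
    hits = []
    start = sequence.find(site)
    while start != -1:
        hits.append(start)
        start = sequence.find(site, start + 1)
    return hits


def ndei_xhoi_fragment(sequence):
    """Return the top-strand fragment released by an NdeI/XhoI double digest.

    Both sites are palindromic, so searching one strand is sufficient.
    Raises ValueError unless each site occurs exactly once, NdeI first.
    """
    sequence = sequence.upper()
    nde = find_all(sequence, NDEI_SITE)
    xho = find_all(sequence, XHOI_SITE)
    if len(nde) != 1 or len(xho) != 1:
        raise ValueError(f"expected one site each, found NdeI={nde}, XhoI={xho}")
    if nde[0] >= xho[0]:
        raise ValueError("NdeI site must lie upstream of the XhoI site")
    return sequence[nde[0] + NDEI_CUT : xho[0] + XHOI_CUT]


# Hypothetical toy insert: 2 nt padding, NdeI site, 6 nt payload, XhoI site, 2 nt padding.
toy_insert = "GG" + "CATATG" + "AAACCC" + "CTCGAG" + "GG"
print(ndei_xhoi_fragment(toy_insert))  # TATGAAACCCC
```

The error raised for extra sites is the useful part in practice: an internal NdeI or XhoI site in the coding sequence would split the insert during digestion.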
Cycle 2
Insert design
Due to problems with cloning, we had to change the target vector. We were supplied with pRSFDuet-1 vectors. While the sequence upstream of the MCS is very similar to pET-22b, the downstream 6xHis-tag is missing. As multiple cloning attempts had depleted our stocks of inserts, we decided to tweak the existing design and reorder from IDT.
A 6xHis-tag was added to the C-terminus, separated by a short two amino acid Gly/Ser-linker. A stop codon was introduced immediately downstream of the 6xHis-tag yielding the final construct as shown in Figure 4.
Figure 4. Second design of the FAST-PETase insert. The sequence without the flanking restriction sites has been submitted to the parts registry as a new composite part BBa_K4701300.
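Appending the tag to a coding sequence is mostly a matter of keeping the reading frame intact. The following sketch shows the idea under stated assumptions: the codons chosen for Gly, Ser, and His and the TAA stop codon are illustrative picks, not necessarily the ones in the ordered construct, and `add_c_terminal_his_tag` is a hypothetical helper.

```python
LINKER_CODONS = ["GGC", "AGC"]  # Gly-Ser linker (assumed codon choice)
HIS_CODON = "CAC"               # histidine (assumed codon choice)
STOP_CODON = "TAA"              # stop codon (assumed choice)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(cds, n_his=6):
    """Append a Gly/Ser linker, an n_his x His tag, and a stop codon to `cds`.

    An existing stop codon at the end of `cds` is removed first so that the
    tag is translated in frame with the enzyme.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds + "".join(LINKER_CODONS) + HIS_CODON * n_his + STOP_CODON


# Hypothetical toy CDS: Met-Lys-stop.
tagged = add_c_terminal_his_tag("ATGAAATAA")
print(tagged)
```

For the toy input, the result is the two original codons followed by two linker codons, six histidine codons, and one stop codon, i.e. 33 nucleotides in total.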
Cloning
The insert can be cloned into the secondary MCS of the pRSFDuet-1 expression vector using NdeI and XhoI as shown in Figure 5. Once again, the insert is placed downstream of an IPTG inducible T7lac promoter.
Figure 5. The result of cloning our FAST-PETase insert into pRSFDuet-1 with NdeI and XhoI.
Before ordering the insert, the translation rate was predicted with the RBS calculator from the Salis lab11. As shown in Figure 6, the translation rate for the desired ORF is about 12060, which is about 172-fold higher than that of the next most translated ORF.
Figure 6. Predicted translation rates on the relative arbitrary unit scale for the FAST-PETase insert (BBa_K4701300) cloned into pRSFDuet-1 with NdeI and XhoI. Indexing begins from the first nucleotide after the T7lac promoter, with position 43 being the desired start codon.
The final sequence has been added as part BBa_K4701300.
The part is compatible with all the commonly used BioBrick assembly standards.
MHETase
While PETase is responsible for breaking down the structure of PET, the secondary enzyme MHETase is needed to further degrade the intermediate products to complete the PET degradation pathway.
As in the case of FAST-PETase, our aim was to design a part that would allow us to secrete MHETase out of E. coli. Several signal peptides for the secretion of MHETase out of E. coli were tested in a recent study7. While the pelB signal sequence failed to secrete MHETase, the lamB signal sequence seemed promising. For this reason, we adopted lamB as the signal sequence. Generally speaking, the overall design in both cycles is very similar to the FAST-PETase parts, as they were designed in tandem both times.
Cycle 1
Insert design
As the activity of PETase forms the rate-limiting step in the PET degradation process, we used the wild-type protein sequence for MHETase. The native signal sequence was swapped for the lamB signal sequence as shown in Figure 7. Furthermore, codons were optimized with a tool from IDT, and BioBrick restriction sites were removed. As the synthesis of this part initially failed due to complexities introduced by repeats, the sequence was further manually tweaked and synthesis by IDT succeeded on the second attempt.
Figure 7. The first design of the MHETase insert.
Cloning
The insert can be cloned into the MCS of the pET-22b expression vector with NdeI and XhoI as shown in Figure 8. The insert is placed downstream of an IPTG inducible T7lac promoter, meaning that enzyme production can be controlled.
Figure 8. The result of cloning our initial MHETase insert into pET-22b with NdeI and XhoI.
Cycle 2
Insert design
Due to the issues with cloning, the MHETase part was also updated to fit the pRSFDuet-1 vector by adding a 6xHis-tag as shown in Figure 9.
Figure 9. The second design of the MHETase insert. The sequence without the flanking restriction sites has been added to the parts registry as a new composite part BBa_K4701301.
Cloning
The updated insert can be cloned into the secondary MCS of pRSFDuet-1 using NdeI and XhoI as shown in Figure 10. Once again, the insert is placed downstream of an IPTG inducible T7lac promoter.
Figure 10. The result of cloning our MHETase insert into pRSFDuet-1 with NdeI and XhoI.
Again, translation rates were checked with the RBS calculator prior to ordering as shown in Figure 11.
Figure 11. Predicted translation rates on the relative arbitrary unit scale for the MHETase insert (BBa_K4701301) cloned into pRSFDuet-1. Indexing begins from the first nucleotide after the T7lac promoter, with positions 43 and 46 being the desired start sites. The translation rates are about 6810 and 9910 respectively with all other start codons being translated with a rate under 5.
The final sequence has been added as part BBa_K4701301.
The part is compatible with all the commonly used BioBrick assembly standards.
SpyCatcher-SpyTag fusion
As PETase and MHETase work synergistically to depolymerize PET, combining them into a fusion protein has successfully been shown to increase PET depolymerization8. Inspired by this, we also designed FAST-PETase and MHETase inserts containing the SpyCatcher/SpyTag system12. The general idea is that by prefixing FAST-PETase and MHETase with the SpyCatcher and SpyTag sequences respectively, the following protein products will spontaneously form an isopeptide bond, fusing the two enzymes together.
Briefly, we removed the signal peptides from both sequences, and further removed some unstructured sequence from the N-terminus of MHETase. Next, we prefixed FAST-PETase with a codon-optimized SpyCatcher00213 sequence, and prefixed MHETase with a SpyTag sequence designed for use in N-terminus fusions expressed from T7-based systems, followed by a short linker with the amino acid sequence GSGESGSG as recommended by the inventors of the system14. Figure 11 shows the parts. As a cloning strategy, we intended to use NdeI/XhoI restriction enzymes like with the other inserts described here.
Figure 11. Designed SpyCatcher-FAST-PETase fusion (BBa_K4701302), and SpyTag-MHETase fusion (BBa_K4701303) parts.
As mentioned before, additional N-terminus sequence was removed from MHETase. This is because as we were investigating the formation of the fusion protein with ColabFold in multimer mode15, we noticed an unstructured region. Figure 12 shows predicted conformations before and after the removal. Note that the predicted fold has the two enzymes sticking together, but this interaction is likely weak, and the structure would be more dynamic in any realistic environment. Unfortunately, due to time constraints, we were not able to test the inserts in the wet lab or conduct further dry lab modeling of the fusion protein.
Figure 12. ColabFold prediction of the SpyCatcher/SpyTag fusion of FAST-PETase and MHETase. Left: MHETase with unstructured N-terminus. Right: unstructured MHETase N-terminus removed. (cyan: MHETase, magenta: linkers, yellow: SpyCatcher002, red: SpyTag, orange: FAST-PETase, gray: His-tags)
References
- Qi X, Yan W, Cao Z, Ding M, Yuan Y. Current Advances in the Biodegradation and Bioconversion of Polyethylene Terephthalate. Microorganisms. 2021;10(1):39. doi:10.3390/microorganisms10010039
- Herrero Acero E, Ribitsch D, Steinkellner G, Gruber K, Greimel K, Eiteljoerg I, Trotscha E, Wei R, Zimmermann W, Zinn M, et al. Enzymatic Surface Hydrolysis of PET: Effect of Structural Diversity on Kinetic Properties of Cutinases from Thermobifida. Macromolecules. 2011;44(12):4632–4640. doi:10.1021/ma200949p
- Sulaiman S, Yamato S, Kanaya E, Kim J-J, Koga Y, Takano K, Kanaya S. Isolation of a Novel Cutinase Homolog with Polyethylene Terephthalate-Degrading Activity from Leaf-Branch Compost by Using a Metagenomic Approach. Applied and Environmental Microbiology. 2012;78(5):1556–1562. doi:10.1128/AEM.06725-11
- Yoshida S, Hiraga K, Takehana T, Taniguchi I, Yamaji H, Maeda Y, Toyohara K, Miyamoto K, Kimura Y, Oda K. A bacterium that degrades and assimilates poly(ethylene terephthalate). Science. 2016;351(6278):1196–1199. doi:10.1126/science.aad6359
- Son HF, Cho IJ, Joo S, Seo H, Sagong H-Y, Choi SY, Lee SY, Kim K-J. Rational Protein Engineering of Thermo-Stable PETase from Ideonella sakaiensis for Highly Efficient PET Degradation. ACS Catalysis. 2019;9(4):3519–3526. doi:10.1021/acscatal.9b00568
- Lu H, Diaz DJ, Czarnecki NJ, Zhu C, Kim W, Shroff R, Acosta DJ, Alexander BR, Cole HO, Zhang Y, et al. Machine learning-aided engineering of hydrolases for PET depolymerization. Nature. 2022;604(7907):662–667. doi:10.1038/s41586-022-04599-z
- Sagong H-Y, Seo H, Kim T, Son HF, Joo S, Lee SH, Kim S, Woo J-S, Hwang SY, Kim K-J. Decomposition of the PET Film by MHETase Using Exo-PETase Function. ACS Catalysis. 2020;10(8):4805–4812. doi:10.1021/acscatal.9b05604
- Knott BC, Erickson E, Allen MD, Gado JE, Graham R, Kearns FL, Pardo I, Topuzlu E, Anderson JJ, Austin HP, et al. Characterization and engineering of a two-enzyme system for plastics depolymerization. Proceedings of the National Academy of Sciences. 2020;117(41):25476–25485. doi:10.1073/pnas.2006753117
- Shi L, Liu H, Gao S, Weng Y, Zhu L. Enhanced Extracellular Production of IsPETase in Escherichia coli via Engineering of the pelB Signal Peptide. Journal of Agricultural and Food Chemistry. 2021;69(7):2245–2252. doi:10.1021/acs.jafc.0c07469
- Chen H, Bjerknes M, Kumar R, Jay E. Determination of the optimal aligned spacing between the Shine–Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Research. 1994;22(23):4953–4957. doi:10.1093/nar/22.23.4953
- Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology. 2009;27(10):946–950. doi:10.1038/nbt.1568
- Zakeri B, Fierer JO, Celik E, Chittock EC, Schwarz-Linek U, Moy VT, Howarth M. Peptide tag forming a rapid covalent bond to a protein, through engineering a bacterial adhesin. Proceedings of the National Academy of Sciences. 2012;109(12). doi:10.1073/pnas.1115485109
- Keeble AH, Banerjee A, Ferla MP, Reddington SC, Anuar INAK, Howarth M. Evolving Accelerated Amidation by SpyTag/SpyCatcher to Analyze Membrane Dynamics. Angewandte Chemie. 2017;129(52):16748–16752. doi:10.1002/ange.201707623
- Keeble AH, Howarth M. Insider information on successful covalent protein coupling with help from SpyBank. In: Methods in Enzymology. Vol. 617. Elsevier; 2019. p. 443–461. doi:10.1016/bs.mie.2018.12.010
- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nature Methods. 2022;19(6):679–682. doi:10.1038/s41592-022-01488-1
Figure 1 was created with BioRender.com.
Plasmid maps were created with SnapGene.
EG assimilation
Assimilation pathway for ethylene glycol
Apart from exploring wild-type strains capable of utilizing our PET monomers as carbon sources, we also investigated the possibility of engineering assimilation pathways into hosts unable to naturally metabolize them. For ethylene glycol (EG), we decided to pursue engineering an assimilation pathway by overexpressing two genes, fucO and aldA, in E. coli, an approach that has been shown to work in two recent studies1,2.
fucO codes for a propanediol oxidoreductase, which has also been shown to exhibit promiscuous activity toward ethylene glycol, converting it into glycolaldehyde. aldA codes for an aldehyde dehydrogenase, which further oxidizes glycolaldehyde to glycolate. From there, the pathway continues via glyoxylate through several steps to 2-phosphoglyceric acid (2PG), which is an intermediate in the standard glycolysis pathway. The idea is summarized in Figure 1.
Figure 1. The EG assimilation pathway is enabled by the overexpression of fucO and aldA, which bridge the gap from ethylene glycol to glycolate. From there, the pathway continues to glyoxylate and through some additional steps to 2-phosphoglycerate, which is an intermediate in standard glycolysis. Thus the carbon flux leads from ethylene glycol to pyruvate, allowing EG to be used as a carbon source2.
Insert design
For EG to assimilate into central carbon metabolism, the two genes fucO, and aldA need to be intracellularly expressed. The general strategy is to clone the two sequences into the dual-expression vector pRSFDuet-1.
For fucO, we use the variant with mutations I7L/L8V, as it has been shown to exhibit improved activity2. Codons were optimized with the tool from IDT, and restriction sites used in BioBrick assembly standards were removed.
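Checking a codon-optimized sequence for leftover assembly sites is easy to automate. The sketch below is a minimal example that assumes the sites of interest are those of the original BioBrick standard (RFC 10: EcoRI, XbaI, SpeI, PstI, and NotI); other standards forbid additional enzymes, which could be added to the dictionary. The function name `find_forbidden_sites` and the toy sequence are hypothetical.

```python
# Recognition sites of the RFC 10 BioBrick enzymes. All are palindromic,
# so scanning the top strand also covers the reverse strand.
RFC10_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}


def find_forbidden_sites(sequence, sites=RFC10_SITES):
    """Return {enzyme: [0-based start positions]} for every site found."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Hypothetical toy sequence containing one EcoRI and one PstI site.
print(find_forbidden_sites("AAAGAATTCAAACTGCAGAAA"))
# {'EcoRI': [3], 'PstI': [12]}
```

An empty dictionary means the sequence is free of the listed sites; any hit marks a position where a silent mutation would be needed.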
Cloning
Both genes are cloned into pRSFDuet-1 one after the other. fucO is cloned into the first MCS with NcoI and EcoRI, while aldA is cloned into the second MCS with NdeI and XhoI as shown in Figure 2. Both genes are placed under the control of IPTG inducible T7 promoters allowing us to control the intracellular expression levels.
Figure 2. The result of cloning both fucO and aldA into the dual expression vector pRSFDuet-1.
Before ordering the inserts, we needed to make sure that both products would be expressed in similar quantities. This is important as each enzyme is needed once in the assimilation pathway: neither bottlenecking the flux by expressing too little aldA nor causing metabolic burden by overexpressing fucO is a desired outcome. The first scenario might be especially problematic, as the accumulation of the intermediate glycolaldehyde could potentially be toxic to the E. coli cells3.
The promoter sequences upstream of the two genes are identical. This means that, at least on paper, the transcription rates should be identical for both genes. However, expression levels are also affected by translation initiation rates. Even though the two Shine-Dalgarno sequences upstream of our genes are the same, the differences in the standby, spacing, and 5’ ends of the coding sequences affect mRNA folding. This is important as stable secondary structures, or the lack of them, affect the total ΔG of ribosome binding, affecting translation rates4.
We used the RBS calculator from the Salis lab4 to predict the translation rates and received predictions of about 81600 and 3000 for fucO and aldA respectively on the relative arbitrary unit scale. To balance out the translation rates as well as possible, we used Vienna RNAfold5 to analyze ΔG of the mRNA folding around the ribosome binding sites and introduced silent mutations to the 5’ ends of the coding sequences. After many rounds of trial and error, we managed to decrease the predicted translation rate of fucO to around 29400, and increase the rate for aldA to around 10600. As the upstream sequence on the backbone is fixed, we weren’t able to fully balance out the predicted rates as the number of available silent mutations is also rather limited. On the other hand, the translation rates were now at least roughly of the same magnitude.
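The effect of the tuning is easiest to see as a fold difference between the two predicted rates. The short calculation below uses only the numbers quoted above; the helper name `fold_imbalance` is ours and purely illustrative.

```python
def fold_imbalance(rate_a, rate_b):
    """Return how many times larger the bigger rate is than the smaller one."""
    if rate_a <= 0 or rate_b <= 0:
        raise ValueError("rates must be positive")
    return max(rate_a, rate_b) / min(rate_a, rate_b)


# Predicted translation rates (relative arbitrary units) for fucO and aldA.
before = fold_imbalance(81600, 3000)
after = fold_imbalance(29400, 10600)

print(f"before tuning: {before:.1f}-fold")  # before tuning: 27.2-fold
print(f"after tuning:  {after:.1f}-fold")   # after tuning:  2.8-fold
```

In other words, the silent mutations reduced the predicted imbalance from roughly 27-fold to under 3-fold, which is what "roughly of the same magnitude" refers to.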
The final sequences for fucO and aldA have been added as basic parts BBa_K4701210 and BBa_K4701211 respectively.
Due to time constraints, we were unfortunately unable to attempt the assembly in the wet lab.
References
- Panda S, Fung VYK, Zhou JFJ, Liang H, Zhou K. Improving ethylene glycol utilization in Escherichia coli fermentation. Biochemical Engineering Journal. 2021;168:107957. doi:10.1016/j.bej.2021.107957
- Pandit AV, Harrison E, Mahadevan R. Engineering Escherichia coli for the utilization of ethylene glycol. Microbial Cell Factories. 2021;20(1):22. doi:10.1186/s12934-021-01509-2
- Tiso T, Winter B, Wei R, Hee J, De Witt J, Wierckx N, Quicker P, Bornscheuer UT, Bardow A, Nogales J, et al. The metabolic potential of plastics as biotechnological carbon sources – Review and targets for the future. Metabolic Engineering. 2022;71:77–98. doi:10.1016/j.ymben.2021.12.006
- Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology. 2009;27(10):946–950. doi:10.1038/nbt.1568
- Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. ViennaRNA Package 2.0. Algorithms for Molecular Biology. 2011;6(1):26. doi:10.1186/1748-7188-6-26
Figure 1 was created with BioRender.com.
Plasmid maps were created with SnapGene.
TPA assimilation
Assimilation pathway for terephthalic acid
Apart from exploring wild-type strains capable of utilizing our PET monomers as carbon sources, we also wanted to look into engineering assimilation pathways into hosts unable to naturally metabolize them. For terephthalic acid (TPA), we decided to design a synthetic operon expressing five different genes for Pseudomonas putida KT2440, which is generally regarded as a robust host for metabolic engineering1. The designed synthetic operon contains the tpaK TPA transporter from Rhodococcus jostii and the genes tphA2IIA3IIBIIA1II from Comamonas sp. E6, and is based on the wild-type operon, previously used designs, and biophysical modeling.
The tph operon
One of the most thoroughly studied assimilation pathways for TPA is found in various strains of Comamonas that can utilize TPA as the sole carbon source2. Two similar operons, tphI and tphII, code for enzymes enabling the uptake of TPA and its further metabolism to protocatechuic acid. For the rest of this text, we will specifically focus on the tphII operon.
The tphII operon, which can also be written as tphRIICIIA2IIA3IIBIIA1II has a total of six genes. tphRII serves a regulatory function, allowing Comamonas sp. E6 to express the catabolic operon tphCIIA2IIA3IIBIIA1II which is transcribed separately3. As we aim to engineer P. putida to grow in an environment where TPA is always abundant, we will not go further into the regulatory function of tphRII. We will instead focus on the catabolic operon, which is sufficient to allow the utilization of TPA, given that it is transcribed.
As TPA has a molecular mass of around 166 Da, it is too large to enter the cell by free diffusion. The first gene in the catabolic operon tphCII codes for a TPA transporter. We will, however, omit this gene from our operon as we will utilize tpaK for TPA transport as described later. After TPA has been transported across the cellular membrane, the four other enzymes will connect TPA to existing pathways within P. putida KT2440 as shown in Figure 1.
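The quoted molecular mass can be reproduced from the molecular formula of TPA, C8H6O4, which is not stated above but is standard chemistry. The snippet below is a small sanity check using rounded standard atomic masses; the helper `molar_mass` is illustrative.

```python
# Rounded standard atomic masses in g/mol (equivalently Da per atom).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Terephthalic acid, C8H6O4.
TPA_FORMULA = {"C": 8, "H": 6, "O": 4}


def molar_mass(formula):
    """Sum the atomic masses of a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


print(f"{molar_mass(TPA_FORMULA):.2f} Da")  # 166.13 Da
```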
Briefly, TPA is metabolized into 1,2-dihydroxy-3,5-cyclohexadiene-1,4-dicarboxylic acid (DCD), by TPA 1,2-dioxygenase (TPADO)2, which is a protein complex encoded by tphA1IIA2IIA3II. DCD is further metabolized into protocatechuic acid (PCA) by a DCD dehydrogenase encoded by tphBII. PCA is a common intermediate utilized by various organisms, and in P. putida, the pathway continues towards central metabolism via the PCA-3,4-dioxygenase pathway4.
Figure 1. Sketch of the assimilation pathway. Structures for tpaK, tphB, and tphA1 are predicted by AlphaFold. The oxygenase subunit of TPADO formed by tphA2 and tphA3 is shown as a heterohexamer as determined from a crystal structure5.
In addition to literature, the feasibility of the pathway was also assessed with metabolic modeling. You can read more about those results on our modeling page.
tphCII:
tphCII encodes a TPA transporter, which allows TPA to enter the cell (UniProt: Q3C1D6). In the original paper discussing the tphII operon, it was found that disrupting this gene caused Comamonas sp. E6 to lose its ability to grow on TPA2.
tphA1IIA2IIA3II:
The genes tphA1IIA2IIA3II are responsible for the first step in the assimilation pathway. Together they form a protein complex where tphA1II (UniProt: Q3C1D2) codes for the reductase component, while tphA2II (UniProt: Q3C1D5), and tphA3II (UniProt: Q3C1D4) code for the large and small oxygenase subunits respectively.
tphBII:
tphBII (UniProt: Q3C1D3) is responsible for coding for a DCD dehydrogenase, which catalyzes the second reaction in the pathway from DCD to PCA.
The catabolic tph operon in Comamonas sp. E6:
The total length of the catabolic operon in Comamonas sp. E6 is 4651 bp, with the genes appearing in the order tphCIIA2IIA3IIBIIA1II as shown in Figure 2. The genes follow one another directly, and indeed they are transcribed together3. We suspect that the order of the genes is important for the correct co-folding of the TPADO complex, as the order of the genes seems to be conserved in other species capable of TPA utilization. For example, we analyzed a similar operon in a strain of Rhodococcus opacus as described on our modeling page. Thus we will conserve the gene order in the following synthetic operon.
Figure 2. The catabolic tph operon from Comamonas sp. E6 (GenBank: AB238679). The 3’ end of tphA2II overlaps with the 5’ end of tphA3II, while the 3’ end of tphA3II overlaps with the 5’ end of tphBII. tphBII is separated from tphA1II by a 9 bp gap.
tpaK transporter
Engineering the assimilation pathway enabled by the tph operon into P. putida KT2440 is not an entirely novel idea. A recent paper found that P. putida KT2440 was not able to utilize TPA as a sole carbon source with the tphC transporter6. Instead, another TPA transporter called tpaK (UniProt: Q0RWE8) from R. jostii was utilized. We thus took the same approach, but instead of transcribing tpaK separately from the other genes, we opted to simply replace tphC with tpaK in the same transcript giving us the final sequence of genes for our synthetic operon: tpaK-tphA2IIA3IIBIIA1II.
Expression cassette with tpaK-tphA2IIA3IIBIIA1II
Three expression cassettes with alternative promoter sequences were designed. The constructs have been added to the registry as composite parts BBa_K4701306, BBa_K4701307, and BBa_K4701308. As the design process includes many steps, discussions about different parts of the cassettes have been split into sections below.
Briefly, the cassettes were designed with the stability of the mRNA transcript in mind, as all of the five open reading frames need to be translated for the desired function. Furthermore, predicted translational rates of the ORFs were analyzed with biophysical models to ensure that no obvious mistakes prevent all the ORFs from being translated. Lastly, as the final length of the cassette is 5472bp, we adopt Gibson assembly as the method for plasmid construction. The choice of pSEVA231 as the backbone is also justified below.
Nucleotide sequences for tphA2IIA3IIBIIA1II were acquired from Comamonas sp. E6 (GenBank: AB238679). The sequence for tpaK was acquired from the full sequence of the pRHL2 plasmid from R. jostii RHA1 (GenBank: CP000433). Codons were optimized with the IDT codon optimization tool using the Pseudomonas putida codon usage tables such that all rare codons (frequency < 0.1) were eliminated. Furthermore, some manual tweaking was done to balance translation rates (see below), to weaken alternative transcription start sites, and to remove restriction sites used in some of the BioBrick assembly standards.
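The rare-codon criterion can be verified on the optimized sequence with a short script. The sketch below assumes that "frequency" means the fraction of each codon among its synonymous codons and that a usage table is available as a dictionary; the values in `TOY_USAGE` are made up for illustration and are not the P. putida table, and `find_rare_codons` is a hypothetical helper.

```python
def find_rare_codons(cds, usage, threshold=0.1):
    """Return (codon_index, codon, frequency) for every codon below `threshold`.

    `usage` maps each codon to its usage frequency in the host. Codons
    missing from the table raise a KeyError instead of being skipped silently.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    rare = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        frequency = usage[codon]
        if frequency < threshold:
            rare.append((start // 3, codon, frequency))
    return rare


# Hypothetical frequencies for a handful of codons (NOT real P. putida data).
TOY_USAGE = {"ATG": 1.0, "CTG": 0.70, "CTA": 0.03, "AAA": 0.25, "TAA": 0.30}

print(find_rare_codons("ATGCTACTGAAATAA", TOY_USAGE))
# [(1, 'CTA', 0.03)]
```

An empty list indicates that no codon in the sequence falls below the threshold, which is the state the optimization aims for.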
As the intention is to clone the operon into bacteria that will live with an abundant supply of TPA, we opted for a constitutive promoter to reduce the complexity of the already quite complicated construct. The promoter used should enable the generation of an optimal amount of transcripts as too low transcription rates will result in a small flux through the TPA assimilation pathway, while too high transcription rates result in unnecessary metabolic burden and possibly an evolutionary disadvantage. As choosing the correct promoter beforehand is essentially impossible, we chose to create three alternative constructs with promoters of different strengths.
The Anderson promoter collection is often used in iGEM projects to find a suitable promoter. Recently, the collection was benchmarked in P. putida KT24407. Based on these results, we picked three promoters covering a wide range of expression levels. J23102 serves as the strong promoter with an expression rate of around 24100 (arbitrary units), while J23105 and J23114 act as the medium and weak promoters with expression rates of around 10200 and 1300 respectively.
As stated before, one design goal for the cassette is mRNA stability. This is important for two reasons. Firstly, we want mRNA to remain stable long enough to allow the translation of all ORFs. Secondly, we want to find the optimal expression levels by controlling transcription rates, and thus we want to minimize the amount of variability on the mRNA level. As there is a need for a spacer sequence between the promoter and the first ribosome binding site regardless, we attempted to rationally design the sequence instead of opting for a random sequence.
Many bacterial pathways for mRNA degradation are dependent on the 5’ UTR sequence of the transcript8. It has also been observed that the leading trinucleotide affects the rate at which the 5’ end of an mRNA transcript is hydrolyzed, a step that acts as the starting point for many mRNA degradation pathways. ATG was observed to be the most stable leading trinucleotide, so we follow the promoter with ATGATG.
This sequence is followed by CGAC, forming the SalI restriction site GTCGAC. This is useful as it gives us the ability to remove the promoter from our fragments down the line if the need arises to swap the promoter for something else, perhaps an inducible one. Furthermore, SalI is not a part of the BioBrick assembly standards, so including it is not a problem from the perspective of iGEM either.
Before the first ribosome binding site, we also introduce an insulating hairpin. The goal here is to keep the secondary structure around the first RBS predictable and robust to degradation of the upstream sequence. Generally speaking, secondary structures around the RBS are important as they affect the total ΔG of ribosome binding, which subsequently affects translation rates9. Furthermore, the translation rate of the first ORF is especially important as translation rates of downstream ORFs are also affected via translational coupling10 (see below). The secondary structure around the first RBS was predicted with Vienna RNAfold11 and is shown in Figure 3. Furthermore, we used the Promoter calculator12 to check that the insulating hairpin is predicted to be included in our transcripts as shown in Figure 4.
Figure 3. Predicted centroid secondary structure of the mRNA transcript around the first RBS. The insulating hairpin forms upstream of the first RBS which itself is predicted to stay in a linear form allowing for easy ribosome binding. Different parts of the RBS are annotated as used in the biophysical model behind the RBS calculator9.
Figure 4. Transcription rates by base as predicted by the Promoter calculator12. Note that currently the model is only available for E. coli, but as it relies on modeling the biophysics between the sequence and the RNAP/σ70 complex, the predictions should in theory apply to P. putida KT2440 too.
Another important factor to consider is the choice of the transcription terminator. This is important as leaky terminators can affect mRNA stability13. Furthermore, there is not much data on the efficiency of the T0 terminator, which would be located downstream of our insert on pSEVA231 (see the choice of plasmid below). A recent study benchmarked various terminator sequences in P. putida KT244014, and we opted to use the most effective one. The sequence is presented in the paper as “T1”, and has also been previously added to the iGEM registry as part BBa_K3675003.
The design of the ribosome binding sites and the flanking sequences is perhaps the most important part of the overall operon design, as we want all ORFs to be translated at a sufficient level. RBS sequences in P. putida are generally not as well characterized as in model organisms like E. coli13. However, biophysical models have been developed to design RBS sequences and to predict translation rates9. Briefly, the RBS calculator predicts the mRNA secondary structure around the RBS sequence and calculates the total ΔG of ribosome binding. The general idea is that the larger the decrease in free energy upon ribosome binding, the more spontaneously binding occurs, which directly affects translation initiation rates.
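The relationship between ΔG and the predicted rate can be made concrete with a small calculation. In the thermodynamic model behind the RBS calculator, the translation initiation rate is taken to be proportional to exp(−β·ΔG_total). The sketch below assumes β ≈ 0.45 mol/kcal, the apparent value reported in the original publication9; both the constant and the helper names should be treated as assumptions for illustration, and the ΔG values used are arbitrary.

```python
import math

# Apparent Boltzmann factor in mol/kcal (assumed value, see lead-in).
BETA = 0.45


def fold_change(delta_g_a, delta_g_b, beta=BETA):
    """Predicted initiation rate of site a divided by that of site b.

    Both arguments are total ribosome binding free energies in kcal/mol;
    a more negative value means more favorable binding and a higher rate.
    """
    return math.exp(-beta * (delta_g_a - delta_g_b))


def delta_delta_g(fold, beta=BETA):
    """Free energy difference (kcal/mol) that corresponds to a fold change in rate."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return -math.log(fold) / beta


# Arbitrary example: site a binds 2 kcal/mol more favorably than site b.
print(f"{fold_change(-7.0, -5.0):.2f}-fold higher predicted rate")

# Free energy change needed for a tenfold increase in rate.
print(f"{delta_delta_g(10):.2f} kcal/mol")
```

Because the dependence is exponential, a change of a few kcal/mol in mRNA folding around the RBS is enough to shift the predicted rate by an order of magnitude, which is why silent mutations near the start codon can have such a large effect.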
Initially, we designed RBS sequences for all five ORFs separately using the RBS calculator in design mode with P. putida KT2440 as the host organism. However, the translation rates of ORFs within an operon cannot be predicted reliably in isolation. This is because the standby sequence upstream of the Shine-Dalgarno sequence affects mRNA structure, meaning that the upstream gene can affect the folding of the next RBS10. Additionally, the translation rate of downstream ORFs is partially controlled by the translation rates of upstream ORFs in the way of translational coupling. Briefly, stable mRNA structures that do not spontaneously unfold due to ΔG of ribosome binding being positive, or not favorable enough, can be unfolded by ribosomes translating the upstream gene as they have additional energy from GTP hydrolysis. The unfolding allows for new ribosomes to bind, or the upstream ribosome to reattach after disengaging.
The operon calculator is a biophysical model that predicts translation rates in operons and takes translational coupling into account10. Furthermore, it gives other useful information about the sequence, such as predicted internal transcription start sites, repeated regions, intrinsic terminators, and RNase sites. Its predictive power has also been demonstrated empirically15.
We ran many design rounds with the operon calculator, examining the output and adjusting the sequence before running the algorithm again. Figures 5 and 6 show the suboptimal results given by the initial sequence. As can be seen, there are large differences in the translation initiation rates (TIR) between the genes: tpaK has a predicted TIR of around 9300, while the TIR of tphA3II is only around 35. Furthermore, the sequence contained many transcription start sites, on both strands, with higher predicted rates than the desired start site. The low translation rates also mean that the transcript is classified as highly unstable, as highly translated mRNA gets covered by elongating ribosomes that protect the transcript from RNases10.
Figure 5. Predicted translation initiation rates of genes tpaK-tphA2IIA3IIBIIA1II in the first design cycle.
Figure 6. Predicted transcription initiation rates within the operon by start position in the first design cycle.
Figures 7 and 8 show the predicted translation and transcription rates of the final construct, respectively. As can be seen, all five ORFs are now predicted to be translated at a high level, with alternative ORFs having significantly lower translation rates. Furthermore, alternative transcription start sites have been eliminated for the most part. Overall, the transcript is classified as moderately stable, with the only issue being an RNase E site within the first RBS. However, this issue cannot really be solved, as the site exists because the insulating hairpin keeps the sequence of the first RBS highly accessible by design to allow high translation rates.
Figure 7. Predicted translation rates of the final construct. The rates range from 27750 for tpaK to around 8070 for tphA1II.
Figure 8. Predicted transcription rates of the final construct, with promoter J23102.
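To put the improvement in numbers, the spread between the highest and lowest predicted TIR can be compared between the first and the final design cycle. The short sketch below uses only the approximate values quoted in the text and in the caption of Figure 7, and assumes that they are the extremes of each design; the helper name `fold_spread` is hypothetical.

```python
def fold_spread(tirs):
    """Ratio between the highest and the lowest predicted TIR in a design."""
    values = list(tirs.values())
    return max(values) / min(values)


# Approximate values quoted in the text (assumed to be the extremes).
first_cycle = {"tpaK": 9300, "tphA3II": 35}
final_design = {"tpaK": 27750, "tphA1II": 8070}

if __name__ == "__main__":
    print(f"first cycle:  {fold_spread(first_cycle):.0f}-fold spread")
    print(f"final design: {fold_spread(final_design):.1f}-fold spread")
```

With these figures the spread shrinks from a few hundred-fold to under four-fold, while the lowest predicted rate rises by more than two orders of magnitude.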
All in all, many modifications were made to the sequence over the ten or so design cycles. Briefly, intrinsic terminators called by the operon calculator were removed. Alternative Shine-Dalgarno-like sequences were removed in both strands to minimize the translation of alternative ORFs. Sequences resembling the -35 promoter element or the Pribnow box were eliminated to minimize alternative transcripts in both directions. Repeats were minimized to make sure fragment synthesis would be successful. The RBS sequences and their flanking sequences were tweaked many times to make the predicted translation rates as uniform as possible. All of this was achieved using silent mutations, while taking care to keep the codons moderately optimized and the sequence compliant with the assembly standards RFC 10 and RFC 23. Figure 9 shows a sketch of the final design.
Figure 9. Sketch of the final design with the strong J23102 promoter. All RBS sequences are 25 bp long making the construct 5432 bp long. The SalI site can be used to change the promoter sequence if needed.
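A rough version of the motif screening described above can be sketched with plain string matching. This is only an illustration under stated assumptions: the operon calculator uses biophysical models rather than exact consensus matches, and the consensus sequences below (E. coli σ70 -35 element TTGACA, Pribnow box TATAAT, Shine-Dalgarno core AGGAGG) are textbook values chosen by us for the example. The restriction sites are the ones that RFC 10 does not allow inside a part. All function names and the toy sequence are hypothetical.

```python
# Restriction sites that RFC 10 does not allow inside a part.
RFC10_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}

# Assumption: textbook consensus motifs, used here only as a crude screen.
EXPRESSION_MOTIFS = {
    "-35 element": "TTGACA",
    "Pribnow box": "TATAAT",
    "Shine-Dalgarno core": "AGGAGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_all(seq, motif):
    """0-based start positions of every (possibly overlapping) exact match."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def scan_both_strands(seq, motifs):
    """Report exact motif matches on the forward and the reverse strand.

    Reverse-strand hits are converted back to forward-strand coordinates.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    report = {}
    for name, motif in motifs.items():
        fwd = find_all(seq, motif)
        rev = sorted(len(seq) - pos - len(motif) for pos in find_all(rc, motif))
        if fwd or rev:
            report[name] = {"forward": fwd, "reverse": rev}
    return report


if __name__ == "__main__":
    toy = "CCGAATTCAATATAATGG"  # toy sequence, not part of our construct
    print(scan_both_strands(toy, {**RFC10_SITES, **EXPRESSION_MOTIFS}))
```

Because the restriction sites are palindromic, they are reported on both strands at the same position, whereas promoter-like and Shine-Dalgarno-like motifs have to be checked on each strand separately.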
Apart from the insert design, the backbone choice is also important as the plasmid copy number (PCN) will have a large effect on overall expression levels. As the goal is not to produce copious amounts of a single product for downstream purposes, but instead augment existing metabolism, low to medium copy numbers would be preferable.
Unfortunately, many replicons that are considered low-copy in E. coli have been shown16 to produce a higher PCN in P. putida KT2440. Among the benchmarked replicons, the lowest copy numbers were achieved with BBR1 and RK2, both having a copy number of around 30.
We opted to use BBR1 out of necessity. We acquired P. putida KT2440 from the Microbial Domain Biological Resource Centre HAMBI at the University of Helsinki (HAMBI: 3694). This strain has been modified with plasmids containing the RK2 replicon in a previous study17, so using RK2 in our design would cause problems. As BBR1 and RK2 do not belong to the same incompatibility group, choosing BBR1 as the replicon should not cause problems16. As for the antimicrobial resistance marker, we chose kanamycin, as it is widely accessible and ampicillin is generally needed in high concentrations to be effective against P. putida13.
SEVA plasmids (Standard European Vector Architecture) are commonly used when engineering P. putida13. The pSEVA231 vector has the BBR1 replicon with the kanamycin resistance gene along with a single multiple cloning site, into which our cassette can be inserted. Furthermore, the consortium behind SEVA distributes the standardized backbones for free, making them an overall good resource for iGEM teams.
As the final length of the synthetic operon is 5472 bp, it is too long to be synthesized in one piece. Combining multiple fragments by restriction cloning is rather cumbersome, and as we had run into issues with the technique with our other inserts, we planned to use a more suitable assembly method, Gibson assembly.
As the DNA fragments would be ordered from Twist Bioscience, which at the time of writing could synthesize fragments up to 1700 bp long, we needed to split our operon into four fragments. We planned to use the Gibson Assembly HiFi Cloning Kit from our sponsor ThermoFisher Scientific for assembly, and thus we turned to their manuals when designing the assembly fragments. Generally speaking, the kit can be used to combine the four fragments with the linearized pSEVA231 vector in one reaction, given that the fragments share appropriate 40 bp overlaps. The overlaps should contain no GC-extremities, tandem repeats, or strong secondary structures. As Gibson assembly with our kit is done at 50 °C, secondary structures in the ssDNA overlaps with melting temperatures below this should not cause issues. Secondary structures formed by the overlaps were analyzed with the IDT-provided interface to UNAFold18. Potential self-dimers were screened for with the IDT OligoAnalyzer tool.
Finally, a total of six fragments were ordered from Twist Bioscience as listed in Table 1. The first fragment was ordered with an upstream overlap with pSEVA231, and the last fragment was ordered with a downstream overlap with pSEVA231. Furthermore, three variants of the first fragment with different promoters were ordered. The overlaps are further listed in Table 2, with a sketch of the cloning strategy shown in Figure 10.
Table 1. Final Gibson fragments ordered from Twist Bioscience
Name | Description | Length |
---|---|---|
Fragment 1a | Fragment 1 with J23102 promoter / tpaK | 988 bp |
Fragment 1b | Fragment 1 with J23105 promoter / tpaK | 988 bp |
Fragment 1c | Fragment 1 with J23114 promoter / tpaK | 988 bp |
Fragment 2 | Fragment 2 with tpaK/tphA2 | 1481 bp |
Fragment 3 | Fragment 3 with tphA2/tphA3/tphB | 1688 bp |
Fragment 4 | Fragment 4 with tphB/tphA1/pT1 terminator | 1435 bp |
Table 2. Gibson overlaps chosen for the assembly
Overlap | Sequence | Range | GC-content | Tandem repeats | Tm |
---|---|---|---|---|---|
PacI/1 | TCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTTTAAT | 1-40 | 35% | No | 30.3°C |
1/2 | AGTTCTACGCCCTGCAAAGCTGGTTGCCGTCCATCATGAC | 949-988 | 55% | No | 43.5°C |
2/3 | TAACCCTCCAGATCCTCTCGGTATTCCCCGGTTTCGTCCT | 2390-2429 | 55% | No | 43.6°C |
3/4 | TACGTCGCAATGCTGCACGATCAGGGTCACATTCCTATCA | 4038-4077 | 50% | No | 39.1°C |
4/SpeI | CTAGTCTTGGACTCCTGTTGATAGATCCAGTAATGACCTC | 5433-5472 | 45% | No | 44.8°C |
Figure 10. Correspondence between the coding sequences and Gibson fragments. The first fragment can be changed to fragment 1b or 1c to reduce the transcription rate. All the Gibson overlaps are 40 bp in length.
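The values in Tables 1 and 2 can be cross-checked with a few lines of code. The sketch below recomputes the GC content of each overlap, derives the fragment lengths from the overlap ranges, and checks them against the 1700 bp synthesis limit and the 50 °C assembly temperature mentioned in the text. It assumes that the Tm column of Table 2 is the predicted melting temperature of the secondary structure; the variable and function names are hypothetical.

```python
TWIST_MAX_BP = 1700      # synthesis limit quoted in the text
ASSEMBLY_TEMP_C = 50.0   # Gibson reaction temperature quoted in the text

# (name, sequence, start, end, Tm in degrees C) copied from Table 2.
# Assumption: Tm is the predicted secondary-structure melting temperature.
OVERLAPS = [
    ("PacI/1", "TCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTTTAAT", 1, 40, 30.3),
    ("1/2", "AGTTCTACGCCCTGCAAAGCTGGTTGCCGTCCATCATGAC", 949, 988, 43.5),
    ("2/3", "TAACCCTCCAGATCCTCTCGGTATTCCCCGGTTTCGTCCT", 2390, 2429, 43.6),
    ("3/4", "TACGTCGCAATGCTGCACGATCAGGGTCACATTCCTATCA", 4038, 4077, 39.1),
    ("4/SpeI", "CTAGTCTTGGACTCCTGTTGATAGATCCAGTAATGACCTC", 5433, 5472, 44.8),
]


def gc_percent(seq):
    """GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def fragment_lengths(overlaps):
    """Each fragment runs from the start of one overlap to the end of the next."""
    return [nxt[3] - cur[2] + 1 for cur, nxt in zip(overlaps, overlaps[1:])]


if __name__ == "__main__":
    for name, seq, start, end, tm in OVERLAPS:
        ok = tm < ASSEMBLY_TEMP_C
        print(f"{name:7s} {len(seq)} bp  GC {gc_percent(seq):.0f}%  "
              f"Tm {tm} C  below assembly temperature: {ok}")
    for number, length in enumerate(fragment_lengths(OVERLAPS), start=1):
        ok = length <= TWIST_MAX_BP
        print(f"Fragment {number}: {length} bp  within synthesis limit: {ok}")
```

The derived lengths (988, 1481, 1688, and 1435 bp) match Table 1, and the third fragment sits only 12 bp below the synthesis limit, which leaves little room for moving the 2/3 or 3/4 overlaps further apart.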
After pSEVA231 is linearized with PacI and SpeI, successful Gibson assembly yields the construct shown in Figure 11.
Figure 11. Final design of the construct assembled with Gibson assembly. High transcription variant.
The composite parts have been added to the parts registry as BBa_K4701306, BBa_K4701307, and BBa_K4701308 for high, medium, and low transcription rates respectively.
See our parts list for the basic parts, including the synthetic RBS sequences used.
Due to time constraints, we were unfortunately unable to attempt the assembly in the wet lab.
References
- Nikel PI, De Lorenzo V. Pseudomonas putida as a functional chassis for industrial biocatalysis: From native biochemistry to trans-metabolism. Metabolic Engineering. 2018;50:142–155. doi:10.1016/j.ymben.2018.05.005
- Sasoh M, Masai E, Ishibashi S, Hara H, Kamimura N, Miyauchi K, Fukuda M. Characterization of the Terephthalate Degradation Genes of Comamonas sp. Strain E6. Applied and Environmental Microbiology. 2006;72(3):1825–1832. doi:10.1128/AEM.72.3.1825-1832.2006
- Kasai D, Kitajima M, Fukuda M, Masai E. Transcriptional Regulation of the Terephthalate Catabolism Operon in Comamonas sp. Strain E6. Applied and Environmental Microbiology. 2010;76(18):6047–6055. doi:10.1128/AEM.00742-10
- Salvador M, Abdulmutalib U, Gonzalez J, Kim J, Smith AA, Faulon J-L, Wei R, Zimmermann W, Jimenez JI. Microbial Genes for a Circular and Sustainable Bio-PET Economy. Genes. 2019;10(5):373. doi:10.3390/genes10050373
- Kincannon WM, Zahn M, Clare R, Lusty Beech J, Romberg A, Larson J, Bothner B, Beckham GT, McGeehan JE, DuBois JL. Biochemical and structural characterization of an aromatic ring–hydroxylating dioxygenase for terephthalic acid catabolism. Proceedings of the National Academy of Sciences. 2022;119(13):e2121426119. doi:10.1073/pnas.2121426119
- Werner AZ, Clare R, Mand TD, Pardo I, Ramirez KJ, Haugen SJ, Bratti F, Dexter GN, Elmore JR, Huenemann JD, et al. Tandem chemical deconstruction and biological upcycling of poly(ethylene terephthalate) to β-ketoadipic acid by Pseudomonas putida KT2440. Metabolic Engineering. 2021;67:250–261. doi:10.1016/j.ymben.2021.07.005
- Pearson AN, Thompson MG, Kirkpatrick LD, Ho C, Vuu KM, Waldburger LM, Keasling JD, Shih PM. The pGinger Family of Expression Plasmids. Bond DR, editor. Microbiology Spectrum. 2023;11(3):e00373-23. doi:10.1128/spectrum.00373-23
- Cetnar DP, Salis HM. Systematic Quantification of Sequence and Structural Determinants Controlling mRNA stability in Bacterial Operons. ACS Synthetic Biology. 2021;10(2):318–332. doi:10.1021/acssynbio.0c00471
- Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology. 2009;27(10):946–950. doi:10.1038/nbt.1568
- Tian T, Salis HM. A predictive biophysical model of translational coupling to coordinate and control protein expression in bacterial operons. Nucleic Acids Research. 2015;43(14):7137–7151. doi:10.1093/nar/gkv635
- Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. ViennaRNA Package 2.0. Algorithms for Molecular Biology. 2011;6(1):26. doi:10.1186/1748-7188-6-26
- LaFleur TL, Hossain A, Salis HM. Automated model-predictive design of synthetic promoters to control transcriptional profiles in bacteria. Nature Communications. 2022;13(1):5159. doi:10.1038/s41467-022-32829-5
- Martin-Pascual M, Batianis C, Bruinsma L, Asin-Garcia E, Garcia-Morales L, Weusthuis RA, Van Kranenburg R, Martins Dos Santos VAP. A navigation guide of synthetic biology tools for Pseudomonas putida. Biotechnology Advances. 2021;49:107732. doi:10.1016/j.biotechadv.2021.107732
- Amarelle V, Sanches-Medeiros A, Silva-Rocha R, Guazzaroni M-E. Expanding the Toolbox of Broad Host-Range Transcriptional Terminators for Proteobacteria through Metagenomics. ACS Synthetic Biology. 2019;8(4):647–654. doi:10.1021/acssynbio.8b00507
- Ng CY, Farasat I, Maranas CD, Salis HM. Rational design of a synthetic Entner–Doudoroff pathway for improved and controllable NADPH regeneration. Metabolic Engineering. 2015;29:86–96. doi:10.1016/j.ymben.2015.03.001
- Cook TB, Rand JM, Nurani W, Courtney DK, Liu SA, Pfleger BF. Genetic tools for reliable gene expression and recombineering in Pseudomonas putida. Journal of Industrial Microbiology and Biotechnology. 2018;45(7):517–527. doi:10.1007/s10295-017-2001-5
- Leedjärv A, Ivask A, Virta M. Interplay of Different Transporters in the Mediation of Divalent Heavy Metal Resistance in Pseudomonas putida KT2440. Journal of Bacteriology. 2008;190(8):2680–2689. doi:10.1128/JB.01494-07
- Markham NR, Zuker M. UNAFold. In: Keith JM, editor. Bioinformatics. Vol. 453. Totowa, NJ: Humana Press; 2008. p. 3–31. (Walker JM, editor. Methods in Molecular Biology). http://link.springer.com/10.1007/978-1-60327-429-6_1. doi:10.1007/978-1-60327-429-6_1
Figures apart from plasmid maps and those generated by the RBS calculator were created with BioRender.com