Parts design

Enzymatic PET degradation

Enzymatic biodegradation of polyethylene terephthalate (PET) as a tool to combat accumulating plastic waste has been an intense area of research in recent years¹. Multiple PET-degrading enzymes are known, such as Thc_Cut1 from Thermobifida fusca² and leaf-branch compost cutinase (LCC), which was found by metagenomic screening of a compost sample³. However, arguably the most promising and certainly the most studied enzyme is the PET hydrolase IsPETase (EC 3.1.1.101) from Ideonella sakaiensis, a novel bacterium isolated from a PET bottle recycling site in Japan and first described in 2016⁴. Since its discovery, a multitude of studies have shown that its enzymatic properties can be improved by both rational⁵ and machine-learning-based⁶ engineering.

To complement IsPETase, I. sakaiensis also produces a second enzyme called MHETase, which allows the intermediate products to be degraded all the way to the two constituent monomers of PET: terephthalic acid (TPA) and ethylene glycol (EG). PETase and MHETase act highly synergistically, and MHETase has been reported to be active on bis(2-hydroxyethyl) terephthalate (BHET) and even short PET oligomers in addition to mono(2-hydroxyethyl) terephthalate (MHET)⁷. While some research has been conducted into engineering MHETase, most research has instead focused on improving the activity of PETase, as the initial depolymerization of the PET structure is the rate-limiting step in the overall process⁸. The pathway is summarized in Figure 1.

Figure 1. Overview of the PET degradation pathway. (A) PETase cuts the end of a PET chain, yielding BHET, MHET, or TPA depending on which bond is cleaved. (B) PETase is mostly responsible for further degrading BHET into MHET and EG, although it has been observed that MHETase is also active against BHET, to a lesser extent than against MHET⁷. (C) MHETase hydrolyzes MHET into TPA and EG. Thus PETase and MHETase working together yield TPA and EG at the end of the process.

FAST-PETase

One of the most promising variants of IsPETase, named FAST-PETase, has been shown to exhibit improved PET degradation activity over previous IsPETase and LCC variants across a wide range of temperatures from 30°C to 50°C⁶. At sufficiently high concentrations, FAST-PETase is capable of fully degrading untreated consumer PET containers of moderate crystallinity in 48 hours. The improved properties are due to five substitutions relative to IsPETase: N233K/R224Q/S121E/D186H/R280A.

Of course, PETase needs to come into contact with PET for any degradation to occur. This part attempts to mimic I. sakaiensis, which secretes the enzyme. It was recently shown that IsPETase secretion from E. coli can be improved by engineering the pelB signal sequence⁹. This is important, as E. coli is otherwise not particularly known for its secretory capabilities. The 2022 Uppsala iGEM team previously adopted a similar strategy. In characterizing this part, we hope to build on this work and provide a part that can be cloned into E. coli for efficient secretion of a recently engineered PET hydrolase.

Cycle 1

Insert design

The five FAST-PETase substitutions were introduced into the IsPETase sequence. Furthermore, the native signal peptide was replaced with the pelB signal sequence, as it has been shown to improve secretion from E. coli⁹. To allow for efficient expression, the codons of the FAST-PETase sequence were optimized with the codon optimization tool from IDT. Figure 2 shows the initial part design with flanking restriction sites for cloning.

Figure 2. The first design of the FAST-PETase insert.
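Introducing the substitutions into a protein sequence can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the procedure the team used: the helper name apply_mutations and the toy sequence are hypothetical, and the sketch assumes the residue numbering of the input sequence matches the numbering used in the mutation names (otherwise an offset has to be supplied).

```python
import re

# The five FAST-PETase substitutions as listed in the text.
FAST_PETASE_MUTATIONS = ["N233K", "R224Q", "S121E", "D186H", "R280A"]


def apply_mutations(protein: str, mutations: list[str], offset: int = 0) -> str:
    """Apply substitutions written as <wild type><position><new>, e.g. 'N233K'.

    Positions are 1-based. `offset` (hypothetical parameter) is the number of
    residues missing from the start of `protein` relative to the numbering
    used in the mutation names.
    """
    residues = list(protein)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"Cannot parse mutation {mutation!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - 1 - offset
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation}: position outside the sequence")
        # Refuse to mutate if the expected wild-type residue is not there;
        # this catches numbering mismatches early.
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type}, found {residues[index]}"
            )
        residues[index] = new
    return "".join(residues)


# Toy example (not the real IsPETase sequence).
print(apply_mutations("MKNSRD", ["N3K", "R5Q"]))  # MKKSQD
```

The wild-type check is the useful part in practice: if the sequence at hand lacks the signal peptide or uses a different numbering, the function fails loudly instead of silently mutating the wrong residue.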

Cloning

The insert can be cloned into the multiple cloning site (MCS) of the pET-22b expression vector using NdeI and XhoI, as shown in Figure 3. The insert is placed downstream of an IPTG-inducible T7lac promoter, meaning that enzyme production can be controlled.

Figure 3. The result of cloning our initial FAST-PETase insert into pET-22b with NdeI and XhoI.

Cycle 2

Insert design

Due to problems with cloning, we had to change the target vector. We were supplied with pRSFDuet-1 vectors. While the sequence upstream of the MCS is very similar to pET-22b, the downstream 6xHis-tag is missing. As multiple cloning attempts had depleted our stocks of inserts, we decided to tweak the existing design and reorder from IDT.

A 6xHis-tag was added to the C-terminus, separated from the enzyme by a short two-residue Gly-Ser linker. A stop codon was introduced immediately downstream of the 6xHis-tag, yielding the final construct shown in Figure 4.

Figure 4. Second design of the FAST-PETase insert. The sequence without the flanking restriction sites has been submitted to the parts registry as a new composite part, BBa_K4701300.
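The layout of the second design (NdeI site, ORF, Gly-Ser linker, 6xHis-tag, stop codon, XhoI site) can be sketched as simple string assembly. The codons chosen here for the linker and the His-tag, the toy ORF, and the function names are assumptions made for illustration; they are not the sequences that were ordered. The sketch also checks that each cloning site occurs exactly once, since an internal NdeI or XhoI site would cut the insert during cloning.

```python
NDEI = "CATATG"  # NdeI recognition site; its ATG doubles as the start codon
XHOI = "CTCGAG"  # XhoI recognition site
GS_LINKER = "GGTAGC"  # Gly-Ser (assumed codons)
HIS6 = "CATCAC" * 3  # six histidine codons (assumed codons)
STOP = "TAA"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_sites(sequence: str, site: str) -> int:
    """Count occurrences of `site`, including overlapping ones.

    Both sites used here are palindromic, so scanning one strand is enough.
    """
    count, start = 0, sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def build_insert(orf: str) -> str:
    """Assemble NdeI-ORF-linker-6xHis-stop-XhoI from an ORF without a stop codon."""
    orf = orf.upper()
    if not orf.startswith("ATG") or len(orf) % 3 != 0:
        raise ValueError("ORF must start with ATG and be a multiple of 3 long")
    if orf[-3:] in STOP_CODONS:
        raise ValueError("ORF must not end with a stop codon")
    # 'CAT' + the ORF's own ATG reconstitutes the NdeI site CATATG.
    insert = "CAT" + orf + GS_LINKER + HIS6 + STOP + XHOI
    for name, site in (("NdeI", NDEI), ("XhoI", XHOI)):
        if count_sites(insert, site) != 1:
            raise ValueError(f"{name} site must occur exactly once")
    return insert


# Toy ORF encoding Met-Lys-Arg-Ala (not the real FAST-PETase ORF).
print(build_insert("ATGAAACGTGCT"))
```

A real insert would typically also carry a few extra bases outside the restriction sites so that the enzymes cut efficiently near the fragment ends; those are left out here.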

Cloning

The insert can be cloned into the second MCS of the pRSFDuet-1 expression vector using NdeI and XhoI, as shown in Figure 5. Once again, the insert is placed downstream of an IPTG-inducible T7lac promoter.

Figure 5. The result of cloning our FAST-PETase insert into pRSFDuet-1 with NdeI and XhoI.

Before ordering the insert, the translation rate was predicted with the RBS calculator from the Salis lab¹¹. As shown in Figure 6, the translation rate for the desired ORF is about 12060, which is about 172-fold higher than that of the next most translated ORF.

Figure 6. Predicted translation rates on the relative arbitrary unit scale for the FAST-PETase insert (BBa_K4701300) cloned into pRSFDuet-1 with NdeI and XhoI. Indexing begins from the first nucleotide after the T7lac promoter, with position 43 being the desired start codon.

The final sequence has been added as part BBa_K4701300.

The part is compatible with all the commonly used BioBrick assembly standards.

MHETase

While PETase is responsible for breaking down the structure of PET, the secondary enzyme MHETase is needed to further degrade the intermediate products to complete the PET degradation pathway.

As with FAST-PETase, our aim was to design a part that would allow us to secrete MHETase from E. coli. Several signal peptides for the secretion of MHETase from E. coli were tested in a recent study⁷. While the pelB signal sequence failed to secrete MHETase, the lamB signal sequence seemed promising. For this reason, we adopted lamB as the signal sequence. Overall, the design in both cycles is very similar to that of the FAST-PETase parts, as the two were designed in tandem both times.

Cycle 1

Insert design

As the activity of PETase forms the rate-limiting step in the PET degradation process, we used the wild-type protein sequence for MHETase. The native signal sequence was swapped for the lamB signal sequence as shown in Figure 7. Furthermore, codons were optimized with a tool from IDT, and BioBrick restriction sites were removed. As the synthesis of this part initially failed due to complexities introduced by repeats, the sequence was further tweaked manually, and synthesis by IDT succeeded on the second attempt.

Figure 7. The first design of the MHETase insert.
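Checking a codon-optimized sequence for restriction sites that are not allowed in BioBrick parts is easy to automate. The following is a minimal sketch that assumes the RFC 10 set of illegal sites (EcoRI, XbaI, SpeI, PstI, NotI); the function name and the toy sequence are hypothetical. All five sites are palindromic, so scanning a single strand is sufficient.

```python
# Restriction sites that must not occur inside an RFC 10 compatible part.
RFC10_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}


def find_sites(sequence: str, sites: dict[str, str] = RFC10_SITES) -> dict[str, list[int]]:
    """Return 1-based start positions of every listed site found in `sequence`."""
    sequence = sequence.upper()
    hits: dict[str, list[int]] = {}
    for name, site in sites.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start + 1)
            start = sequence.find(site, start + 1)  # allow overlapping hits
        if positions:
            hits[name] = positions
    return hits


# Toy sequence with one EcoRI and one PstI site.
print(find_sites("ATGGAATTCAAACTGCAGTAA"))  # {'EcoRI': [4], 'PstI': [13]}
```

Any hit inside a coding sequence can then be removed with a silent mutation in one of the codons that overlap the site.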

Cloning

The insert can be cloned into the MCS of the pET-22b expression vector with NdeI and XhoI, as shown in Figure 8. The insert is placed downstream of an IPTG-inducible T7lac promoter, meaning that enzyme production can be controlled.

Figure 8. The result of cloning our initial MHETase insert into pET-22b with NdeI and XhoI.

Cycle 2

Insert design

Due to the issues with cloning, the MHETase part was also updated to fit the pRSFDuet-1 vector by adding a 6xHis-tag, as shown in Figure 9.

Figure 9. The second design of the MHETase insert. The sequence without the flanking restriction sites has been added to the parts registry as a new composite part, BBa_K4701301.

Cloning

The updated insert can be cloned into the second MCS of pRSFDuet-1 using NdeI and XhoI, as shown in Figure 10. Once again, the insert is placed downstream of an IPTG-inducible T7lac promoter.

Figure 10. The result of cloning our MHETase insert into pRSFDuet-1 with NdeI and XhoI.

Again, translation rates were checked with the RBS calculator prior to ordering as shown in Figure 11.

Figure 11. Predicted translation rates on the relative arbitrary unit scale for the MHETase insert (BBa_K4701301) cloned into pRSFDuet-1. Indexing begins from the first nucleotide after the T7lac promoter, with positions 43 and 46 being the desired start sites. The translation rates are about 6810 and 9910 respectively with all other start codons being translated with a rate under 5.

The final sequence has been added as part BBa_K4701301.

The part is compatible with all the commonly used BioBrick assembly standards.

SpyCatcher-SpyTag fusion

As PETase and MHETase work synergistically to depolymerize PET, combining them into a fusion protein has been shown to increase PET depolymerization⁸. Inspired by this, we also designed FAST-PETase and MHETase inserts containing the SpyCatcher/SpyTag system¹². The general idea is that by prefixing FAST-PETase and MHETase with the SpyCatcher and SpyTag sequences respectively, the resulting protein products will spontaneously form an isopeptide bond, fusing the two enzymes together.

Briefly, we removed the signal peptides from both sequences, and further removed some unstructured sequence from the N-terminus of MHETase. Next, we prefixed FAST-PETase with a codon-optimized SpyCatcher002¹³ sequence, and prefixed MHETase with a SpyTag sequence designed for use in N-terminus fusions expressed from T7-based systems, followed by a short linker with the amino acid sequence GSGESGSG as recommended by the inventors of the system¹⁴. Figure 11 shows the parts. As a cloning strategy, we intended to use NdeI/XhoI restriction enzymes like with the other inserts described here.

Figure 11. Designed SpyCatcher-FAST-PETase fusion (BBa_K4701302), and SpyTag-MHETase fusion (BBa_K4701303) parts.

As mentioned above, additional N-terminal sequence was removed from MHETase. While investigating the formation of the fusion protein with ColabFold in multimer mode¹⁵, we noticed an unstructured region at the N-terminus of MHETase. Figure 12 shows predicted conformations before and after the removal. Note that the predicted fold has the two enzymes sticking together, but this interaction is likely weak, and the structure would be more dynamic in any realistic environment. Unfortunately, due to time constraints, we were not able to test the inserts in the wet lab or conduct further dry lab modeling of the fusion protein.

Figure 12. ColabFold prediction of the SpyCatcher/SpyTag fusion of FAST-PETase and MHETase. Left: MHETase with unstructured N-terminus. Right: unstructured MHETase N-terminus removed. (cyan: MHETase, magenta: linkers, yellow: SpyCatcher002, red: SpyTag, orange: FAST-PETase, gray: His-tags)

References

Qi X, Yan W, Cao Z, Ding M, Yuan Y. Current Advances in the Biodegradation and Bioconversion of Polyethylene Terephthalate. Microorganisms. 2021;10(1):39. doi:10.3390/microorganisms10010039
Herrero Acero E, Ribitsch D, Steinkellner G, Gruber K, Greimel K, Eiteljoerg I, Trotscha E, Wei R, Zimmermann W, Zinn M, et al. Enzymatic Surface Hydrolysis of PET: Effect of Structural Diversity on Kinetic Properties of Cutinases from Thermobifida. Macromolecules. 2011;44(12):4632–4640. doi:10.1021/ma200949p
Sulaiman S, Yamato S, Kanaya E, Kim J-J, Koga Y, Takano K, Kanaya S. Isolation of a Novel Cutinase Homolog with Polyethylene Terephthalate-Degrading Activity from Leaf-Branch Compost by Using a Metagenomic Approach. Applied and Environmental Microbiology. 2012;78(5):1556–1562. doi:10.1128/AEM.06725-11
Yoshida S, Hiraga K, Takehana T, Taniguchi I, Yamaji H, Maeda Y, Toyohara K, Miyamoto K, Kimura Y, Oda K. A bacterium that degrades and assimilates poly(ethylene terephthalate). Science. 2016;351(6278):1196–1199. doi:10.1126/science.aad6359
Son HF, Cho IJ, Joo S, Seo H, Sagong H-Y, Choi SY, Lee SY, Kim K-J. Rational Protein Engineering of Thermo-Stable PETase from Ideonella sakaiensis for Highly Efficient PET Degradation. ACS Catalysis. 2019;9(4):3519–3526. doi:10.1021/acscatal.9b00568
Lu H, Diaz DJ, Czarnecki NJ, Zhu C, Kim W, Shroff R, Acosta DJ, Alexander BR, Cole HO, Zhang Y, et al. Machine learning-aided engineering of hydrolases for PET depolymerization. Nature. 2022;604(7907):662–667. doi:10.1038/s41586-022-04599-z
Sagong H-Y, Seo H, Kim T, Son HF, Joo S, Lee SH, Kim S, Woo J-S, Hwang SY, Kim K-J. Decomposition of the PET Film by MHETase Using Exo-PETase Function. ACS Catalysis. 2020;10(8):4805–4812. doi:10.1021/acscatal.9b05604
Knott BC, Erickson E, Allen MD, Gado JE, Graham R, Kearns FL, Pardo I, Topuzlu E, Anderson JJ, Austin HP, et al. Characterization and engineering of a two-enzyme system for plastics depolymerization. Proceedings of the National Academy of Sciences. 2020;117(41):25476–25485. doi:10.1073/pnas.2006753117
Shi L, Liu H, Gao S, Weng Y, Zhu L. Enhanced Extracellular Production of IsPETase in Escherichia coli via Engineering of the pelB Signal Peptide. Journal of Agricultural and Food Chemistry. 2021;69(7):2245–2252. doi:10.1021/acs.jafc.0c07469
Chen H, Bjerknes M, Kumar R, Jay E. Determination of the optimal aligned spacing between the Shine-Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Research. 1994;22(23):4953–4957. doi:10.1093/nar/22.23.4953
Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology. 2009;27(10):946–950. doi:10.1038/nbt.1568
Zakeri B, Fierer JO, Celik E, Chittock EC, Schwarz-Linek U, Moy VT, Howarth M. Peptide tag forming a rapid covalent bond to a protein, through engineering a bacterial adhesin. Proceedings of the National Academy of Sciences. 2012;109(12). doi:10.1073/pnas.1115485109
Keeble AH, Banerjee A, Ferla MP, Reddington SC, Anuar INAK, Howarth M. Evolving Accelerated Amidation by SpyTag/SpyCatcher to Analyze Membrane Dynamics. Angewandte Chemie. 2017;129(52):16748–16752. doi:10.1002/ange.201707623
Keeble AH, Howarth M. Insider information on successful covalent protein coupling with help from SpyBank. In: Methods in Enzymology. Vol. 617. Elsevier; 2019. p. 443–461. doi:10.1016/bs.mie.2018.12.010
Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nature Methods. 2022;19(6):679–682. doi:10.1038/s41592-022-01488-1

Figure 1 was created with BioRender.com.

Plasmid maps were created with SnapGene

EG assimilation

Assimilation pathway for ethylene glycol

Apart from exploring wild-type strains capable of utilizing our PET monomers as carbon sources, we also investigated the possibility of engineering assimilation pathways into hosts unable to naturally metabolize them. For ethylene glycol (EG), we decided to pursue engineering an assimilation pathway by overexpressing two genes, fucO and aldA, in E. coli, an approach that has been shown to work in two recent studies¹,².

fucO codes for a propanediol oxidoreductase, which has also been shown to exhibit promiscuous activity toward ethylene glycol, converting it into glycolaldehyde. aldA codes for an aldehyde dehydrogenase, which further oxidizes glycolaldehyde to glycolate. From here on, the pathway continues via glyoxylate through various steps to 2-phosphoglyceric acid (2PG), which is an intermediate in the standard glycolysis pathway. The idea is summarized in Figure 1.

Figure 1. The EG assimilation pathway is enabled by the overexpression of fucO and aldA, which bridge the gap from ethylene glycol to glycolate. From there, the pathway continues to glyoxylate and through some additional steps to 2-phosphoglycerate, which is an intermediate in standard glycolysis. Thus the carbon flux leads from ethylene glycol to pyruvate, allowing EG to be used as a carbon source².

Insert design

For EG to be assimilated into central carbon metabolism, the two genes fucO and aldA need to be expressed intracellularly. The general strategy is to clone the two sequences into the dual-expression vector pRSFDuet-1.

For fucO, we use the variant with mutations I7L/L8V, as it has been shown to exhibit improved activity². Codons were optimized with the tool from IDT, and restriction sites used in BioBrick assembly standards were removed.

Cloning

Both genes are cloned into pRSFDuet-1 one after the other. fucO is cloned into the first MCS with NcoI and EcoRI, while aldA is cloned into the second MCS with NdeI and XhoI as shown in Figure 2. Both genes are placed under the control of IPTG inducible T7 promoters allowing us to control the intracellular expression levels.

Figure 2. The result of cloning both fucO and aldA into the dual expression vector pRSFDuet-1.

Before ordering the inserts, we needed to make sure that both products would be expressed in similar quantities. This is important as each enzyme is needed once in the assimilation pathway, and neither bottlenecking the flux by expressing too little aldA nor causing metabolic burden by overexpressing fucO is a desirable outcome. The first scenario might be especially problematic, as the accumulation of the intermediate glycolaldehyde could potentially be toxic to the E. coli cells³.

The promoter sequences upstream of the two genes are identical. This means that, at least on paper, the transcription rates should be identical for both genes. However, expression levels are also affected by translation initiation rates. Even though the two Shine-Dalgarno sequences upstream of our genes are the same, differences in the standby and spacer regions and in the 5’ ends of the coding sequences affect mRNA folding. This is important as stable secondary structures, or the lack of them, affect the total ΔG of ribosome binding and thereby the translation rates⁴.

We used the RBS calculator from the Salis lab⁴ to predict the translation rates and obtained predictions of about 81600 and 3000 for fucO and aldA respectively on the relative arbitrary unit scale, a roughly 27-fold difference. To balance out the translation rates as well as possible, we used Vienna RNAfold⁵ to analyze the ΔG of mRNA folding around the ribosome binding sites and introduced silent mutations into the 5’ ends of the coding sequences. After many rounds of trial and error, we managed to decrease the predicted translation rate of fucO to around 29400 and increase the rate for aldA to around 10600. As the upstream sequence on the backbone is fixed and the number of available silent mutations is rather limited, we were not able to fully balance the predicted rates. Nevertheless, the translation rates were now at least of the same order of magnitude, differing by a factor of about 2.8.
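The search space for this kind of tuning can be illustrated with a short sketch that enumerates all single-codon silent variants near the 5’ end of a coding sequence. It uses the standard genetic code; the function names, the default window of ten codons, and the toy sequence are assumptions for illustration. Scoring each variant (for example with the RBS calculator or RNAfold) is not included, and this is not necessarily how the team generated its candidates.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def silent_variants(cds: str, n_codons: int = 10) -> list[str]:
    """Return variants differing in exactly one synonymous codon.

    Only the first `n_codons` codons are considered, and the start codon
    itself is left untouched.
    """
    cds = cds.upper()
    variants = []
    last = min(n_codons, len(cds) // 3)
    for i in range(1, last):  # index 0 is the start codon
        codon = cds[3 * i:3 * i + 3]
        for alternative, aa in CODON_TABLE.items():
            if aa == CODON_TABLE[codon] and alternative != codon:
                variants.append(cds[:3 * i] + alternative + cds[3 * i + 3:])
    return variants


# Toy CDS encoding Met-Lys-Asp-stop.
for variant in silent_variants("ATGAAAGATTAA", n_codons=3):
    print(variant, translate(variant))
```

Because each candidate still has to be re-scored and combinations of mutations interact through mRNA folding, the number of useful options shrinks quickly, which is consistent with the limited room for balancing described above.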

The final sequences for fucO and aldA have been added as basic parts BBa_K4701210 and BBa_K4701211 respectively.

Due to time constraints, we were unfortunately unable to attempt the assembly in the wet lab.

References

Panda S, Fung VYK, Zhou JFJ, Liang H, Zhou K. Improving ethylene glycol utilization in Escherichia coli fermentation. Biochemical Engineering Journal. 2021;168:107957. doi:10.1016/j.bej.2021.107957
Pandit AV, Harrison E, Mahadevan R. Engineering Escherichia coli for the utilization of ethylene glycol. Microbial Cell Factories. 2021;20(1):22. doi:10.1186/s12934-021-01509-2
Tiso T, Winter B, Wei R, Hee J, De Witt J, Wierckx N, Quicker P, Bornscheuer UT, Bardow A, Nogales J, et al. The metabolic potential of plastics as biotechnological carbon sources – Review and targets for the future. Metabolic Engineering. 2022;71:77–98. doi:10.1016/j.ymben.2021.12.006
Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology. 2009;27(10):946–950. doi:10.1038/nbt.1568
Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. ViennaRNA Package 2.0. Algorithms for Molecular Biology. 2011;6(1):26. doi:10.1186/1748-7188-6-26

Figure 1 was created with BioRender.com.

Plasmid maps were created with SnapGene

TPA assimilation

Assimilation pathway for terephthalic acid

Apart from exploring wild-type strains capable of utilizing our PET monomers as carbon sources, we also wanted to look into engineering assimilation pathways into hosts unable to naturally metabolize them. For terephthalic acid (TPA), we decided to design a synthetic operon expressing five different genes for Pseudomonas putida KT2440, which is generally regarded as a robust host for metabolic engineering¹. The designed synthetic operon contains the TPA transporter gene tpaK from Rhodococcus jostii and the genes tphA2_IIA3_IIB_IIA1_II from Comamonas sp. E6, and is based on the wild-type operon, previously used designs, and biophysical modeling.

The tph operon

One of the most thoroughly studied assimilation pathways for TPA is found in various strains of Comamonas that can utilize TPA as the sole carbon source². Two similar operons, tph_I and tph_II, code for enzymes enabling the uptake of TPA and its further metabolism to protocatechuic acid. For the rest of this text, we will specifically focus on the tph_II operon.

The tph_II operon, which can also be written as tphR_IIC_IIA2_IIA3_IIB_IIA1_II, has a total of six genes. tphR_II serves a regulatory function, allowing Comamonas sp. E6 to express the catabolic operon tphC_IIA2_IIA3_IIB_IIA1_II, which is transcribed separately³. As we aim to engineer P. putida to grow in an environment where TPA is always abundant, we will not go further into the regulatory function of tphR_II. We will instead focus on the catabolic operon, which is sufficient to allow the utilization of TPA, given that it is transcribed.

As TPA has a molecular mass of around 166 Da, it is too large to enter the cell by free diffusion. The first gene in the catabolic operon tphC_II codes for a TPA transporter. We will, however, omit this gene from our operon as we will utilize tpaK for TPA transport as described later. After TPA has been transported across the cellular membrane, the four other enzymes will connect TPA to existing pathways within P. putida KT2440 as shown in Figure 1.
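The quoted molecular mass can be checked from the molecular formula of terephthalic acid, C8H6O4. The snippet below is a small worked calculation using standard atomic masses rounded to three decimals; the variable names are arbitrary.

```python
# Standard atomic masses in Da, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Terephthalic acid: C8H6O4.
TPA_FORMULA = {"C": 8, "H": 6, "O": 4}

tpa_mass = sum(ATOMIC_MASS[element] * count for element, count in TPA_FORMULA.items())
print(f"{tpa_mass:.2f} Da")  # 166.13 Da
```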

Briefly, TPA is metabolized into 1,2-dihydroxy-3,5-cyclohexadiene-1,4-dicarboxylic acid (DCD), by TPA 1,2-dioxygenase (TPADO)², which is a protein complex encoded by tphA1_IIA2_IIA3_II. DCD is further metabolized into protocatechuic acid (PCA) by a DCD dehydrogenase encoded by tphB_II. PCA is a common intermediate utilized by various organisms, and in P. putida, the pathway continues towards central metabolism via the PCA-3,4-dioxygenase pathway⁴.

Figure 1. Sketch of the assimilation pathway. Structures for tpaK, tphB, and tphA1 are predicted by AlphaFold. The oxygenase subunit of TPADO formed by tphA2 and tphA3 is shown as a heterohexamer as determined from a crystal structure⁵.

In addition to literature, the feasibility of the pathway was also assessed with metabolic modeling. You can read more about those results on our modeling page.

Sequences

tphC_II:

tphC_II encodes a TPA transporter, which allows TPA to enter the cell (UniProt: Q3C1D6). In the original paper discussing the tph_II operon, it was found that disrupting this gene caused Comamonas sp. E6 to lose its ability to grow on TPA².

tphA1_IIA2_IIA3_II:

The genes tphA1_IIA2_IIA3_II are responsible for the first step in the assimilation pathway. Together they form a protein complex where tphA1_II (UniProt: Q3C1D2) codes for the reductase component, while tphA2_II (UniProt: Q3C1D5), and tphA3_II (UniProt: Q3C1D4) code for the large and small oxygenase subunits respectively.

tphB_II:

tphB_II (UniProt: Q3C1D3) is responsible for coding for a DCD dehydrogenase, which catalyzes the second reaction in the pathway from DCD to PCA.

The catabolic tph operon in Comamonas sp. E6:

The total length of the catabolic operon in Comamonas sp. E6 is 4651 bp, with the genes appearing in the order tphC_IIA2_IIA3_IIB_IIA1_II as shown in Figure 2. The genes appear one after another, and indeed they are transcribed together³. We suspect that the order of the genes is important for the correct co-folding of the TPADO complex, as the gene order seems to be conserved in other species capable of TPA utilization. For example, we analyzed a similar operon in a strain of Rhodococcus opacus, as described on our modeling page. Thus we conserve the gene order in the following synthetic operon.

Figure 2. The catabolic tph operon from Comamonas sp. E6 (GenBank: AB238679). The 3’ end of tphA2_II overlaps with the 5’ end of tphA3_II, whose 3’ end in turn overlaps with the 5’ end of tphB_II. tphB_II is separated from tphA1_II by a 9 bp gap.

tpaK transporter

Engineering the assimilation pathway enabled by the tph operon into P. putida KT2440 is not an entirely novel idea. A recent paper found that P. putida KT2440 was not able to utilize TPA as a sole carbon source with the tphC transporter⁶. Instead, another TPA transporter called tpaK (UniProt: Q0RWE8) from R. jostii was utilized. We thus took the same approach, but instead of transcribing tpaK separately from the other genes, we opted to simply replace tphC with tpaK in the same transcript giving us the final sequence of genes for our synthetic operon: tpaK-tphA2_IIA3_IIB_IIA1_II.

Expression cassette with tpaK-tphA2_IIA3_IIB_IIA1_II

Three expression cassettes with alternative promoter sequences were designed. The constructs have been added to the registry as composite parts BBa_K4701306, BBa_K4701307, and BBa_K4701308. As the design process includes many steps, discussions about different parts of the cassettes have been split into sections below.

Briefly, the cassettes were designed with the stability of the mRNA transcript in mind, as all five open reading frames need to be translated for the desired function. Furthermore, predicted translation rates of the ORFs were analyzed with biophysical models to ensure that no obvious mistakes prevent all the ORFs from being translated. Lastly, as the final length of the cassette is 5472 bp, we adopt Gibson assembly as the method for plasmid construction. The choice of pSEVA231 as the backbone is also justified below.

SalI spacer/Insulating hairpin

As stated before, one design goal for the cassette is mRNA stability. This is important for two reasons. Firstly, we want mRNA to remain stable long enough to allow the translation of all ORFs. Secondly, we want to find the optimal expression levels by controlling transcription rates, and thus we want to minimize the amount of variability on the mRNA level. As there is a need for a spacer sequence between the promoter and the first ribosome binding site regardless, we attempted to rationally design the sequence instead of opting for a random sequence.

Many bacterial pathways for mRNA degradation are dependent on the 5’ UTR sequence of the transcript⁸. It has also been observed that the leading trinucleotide affects the rate at which the 5’ end of an mRNA transcript is hydrolyzed, a step that acts as a starting point for many mRNA degradation pathways. ATG was observed to be the most stable leading trinucleotide, so we follow the promoter with ATGATG.

This sequence is followed by CGAC, forming the SalI restriction site GTCGAC. This is useful as it gives us the ability to remove the promoter from our fragments down the line if the need arises to swap the promoter for something else, perhaps an inducible one. Furthermore, SalI is not a part of the BioBrick assembly standards, so including it is not a problem from the perspective of iGEM either.

Before the first ribosome binding site, we also introduce an insulating hairpin. The goal is to keep the secondary structure around the first RBS predictable and robust to degradation of the upstream sequence. Generally speaking, secondary structures around the RBS are important as they affect the total ΔG of ribosome binding, which in turn affects translation rates⁹. The translation rate of the first ORF is especially important, as translation rates of downstream ORFs are also affected via translational coupling¹⁰ (see below). The secondary structure around the first RBS was predicted with Vienna RNAfold¹¹ and is shown in Figure 3. Furthermore, we used the Promoter calculator¹² to check that the insulating hairpin is predicted to be included in our transcripts, as shown in Figure 4.

Figure 3. Predicted centroid secondary structure of the mRNA transcript around the first RBS. The insulating hairpin forms upstream of the first RBS which itself is predicted to stay in a linear form allowing for easy ribosome binding. Different parts of the RBS are annotated as used in the biophysical model behind the RBS calculator⁹.

Figure 4. Transcription rates by base as predicted by the Promoter calculator¹². Note that currently the model is only available for E. coli, but as it relies on modeling the biophysics between the sequence and the RNAP/σ⁷⁰ complex, the predictions should in theory apply to P. putida KT2440 too.

Synthetic ribosome binding sites and overall operon optimization

The design of the ribosome binding sites and the flanking sequences is perhaps the most important part of the overall operon design, as we want all ORFs to be translated at a sufficient level. RBS sequences in P. putida are generally not as well characterized as in model organisms like E. coli¹³. However, biophysical models have been developed to design RBS sequences and to predict translation rates⁹. Briefly, the RBS calculator predicts the mRNA secondary structure around the RBS sequence and calculates the total ΔG of ribosome binding. The general idea is that the more negative the total ΔG of ribosome binding is, the more spontaneously binding happens, which directly increases the translation initiation rate.
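The link between ΔG and translation initiation can be made concrete with the Boltzmann-type relation used in the thermodynamic model behind the RBS calculator⁹, where the rate is proportional to exp(-β·ΔG_total). The sketch below only illustrates this scaling. The value of β (0.45 mol/kcal, the apparent value reported for E. coli) is treated here as an assumption, the function names are hypothetical, and the proportionality constant is omitted, so only ratios between designs are meaningful.

```python
import math

# Apparent Boltzmann factor in mol/kcal (assumed value, see lead-in).
BETA = 0.45


def relative_rate(delta_g_total: float, beta: float = BETA) -> float:
    """Relative translation initiation rate for a total binding ΔG in kcal/mol."""
    return math.exp(-beta * delta_g_total)


def fold_change(dg_before: float, dg_after: float, beta: float = BETA) -> float:
    """How many times faster initiation becomes when ΔG changes as given."""
    return relative_rate(dg_after, beta) / relative_rate(dg_before, beta)


# Making ΔG_total 3 kcal/mol more favorable (from -2 to -5 kcal/mol)
# gives a bit less than a 4-fold predicted increase.
print(f"{fold_change(-2.0, -5.0):.2f}-fold")
```

The exponential dependence explains why small sequence changes that alter mRNA folding by only a few kcal/mol can shift predicted translation rates several-fold.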

Initially, we designed RBS sequences for all five ORFs separately using the RBS calculator in design mode with P. putida KT2440 as the host organism. However, the translation rates of ORFs within an operon cannot be predicted reliably in isolation. This is because the standby sequence upstream of the Shine-Dalgarno sequence affects mRNA structure, meaning that the upstream gene can affect the folding of the next RBS¹⁰. Additionally, the translation rate of downstream ORFs is partially controlled by the translation rates of upstream ORFs through translational coupling. Briefly, stable mRNA structures that do not spontaneously unfold, because the ΔG of ribosome binding is positive or not favorable enough, can be unfolded by ribosomes translating the upstream gene, as these have additional energy from GTP hydrolysis. The unfolding allows new ribosomes to bind, or the upstream ribosome to reattach after disengaging.

The operon calculator is a biophysical model that predicts translation rates in operons and takes translational coupling into account¹⁰. Furthermore, it gives other useful information about the sequence, such as predicted internal transcription start sites, repeated regions, intrinsic terminators, and RNase sites. Its predictive power has also been demonstrated empirically¹⁵.

Many design rounds were done with the operon calculator. The output was examined and adjustments were made before running the algorithm again. Figures 5 and 6 show the suboptimal results given by the initial sequence. As can be seen, there are large differences in the translation initiation rates (TIR) between the genes: tpaK has a predicted TIR of around 9300, while for tphA3_II the TIR is only around 35. Furthermore, the sequence contained many transcription start sites, on both strands, with higher predicted rates than the desired start site. The low translation rates also mean that the transcript is classified as highly unstable, as highly translated mRNA gets covered by elongating ribosomes that protect the transcript from RNases¹⁰.

Figure 5. Predicted translation initiation rates of genes tpaK-tphA2_IIA3_IIB_IIA1_II in the first design cycle.

Figure 6. Predicted transcription initiation rates within the operon by start position in the first design cycle.

Figures 7 and 8 show the predicted translation and transcription rates of the final construct, respectively. All five ORFs are now predicted to be translated at a high level, with alternative ORFs having significantly lower translation rates. Furthermore, alternative transcription start sites have been eliminated for the most part. Overall, the transcript is classified as moderately stable, with the only issue being an RNase E site within the first RBS. However, this issue cannot really be solved, as the site exists because the insulating hairpin keeps the sequence of the first RBS highly accessible by design to allow high translation rates.

Figure 7. Predicted translation rates of the final construct. The rates range from 27750 for tpaK to around 8070 for tphA1_II.

Figure 8. Predicted transcription rates of the final construct, with promoter J23102.
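The improvement in uniformity can be put into numbers with a small calculation. The sketch below uses only the approximate predicted rates quoted in the text; for the first design cycle it assumes that the two quoted values are the highest and lowest of the five genes. The dictionary and function names are illustrative.

```python
# Approximate predicted translation initiation rates quoted in the text
first_cycle = {"tpaK": 9300, "tphA3_II": 35}
final_design = {"tpaK": 27750, "tphA1_II": 8070}


def fold_range(tirs):
    """Ratio between the highest and the lowest predicted rate."""
    return max(tirs.values()) / min(tirs.values())


print(f"first cycle:  {fold_range(first_cycle):.0f}-fold spread")
print(f"final design: {fold_range(final_design):.1f}-fold spread")
```

With these values the spread between genes shrinks from roughly 266-fold to roughly 3.4-fold.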

All in all, many modifications were made to the sequence over the ten or so design cycles. Briefly, intrinsic terminators called by the operon calculator were removed. Alternative Shine-Dalgarno-like sequences were removed in both strands to minimize the translation of alternative ORFs. Sequences resembling the -35 promoter element or the Pribnow box were eliminated to minimize alternative transcripts in both directions. Repeats were minimized to make sure fragment synthesis would be successful. The RBS sequences and their flanking sequences were tweaked many times to make the predicted translation rates as uniform as possible. All of this was achieved using silent mutations, while taking care to keep codons moderately optimized and the sequence compliant with the assembly standards RFC 10 and RFC 23. Figure 9 shows a sketch of the final design.

Figure 9. Sketch of the final design with the strong J23102 promoter. All RBS sequences are 25 bp long making the construct 5432 bp long. The SalI site can be used to change the promoter sequence if needed.
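Compliance with the assembly standards can be checked mechanically after each round of silent mutations. The following is a minimal sketch, not the script we used: it scans a sequence for the five restriction sites that RFC 10 forbids inside a part (to our understanding RFC 23 relies on the same enzymes). The function names and the toy sequence are hypothetical.

```python
# Restriction sites that must not occur inside an RFC 10 part
ILLEGAL_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_illegal_sites(seq):
    """Return (enzyme, 0-based start) for every illegal site, ordered by position."""
    seq = seq.upper()
    hits = []
    for name, site in ILLEGAL_SITES.items():
        # All five sites are palindromic, so scanning one strand covers both
        assert site == reverse_complement(site)
        start = seq.find(site)
        while start != -1:
            hits.append((name, start))
            start = seq.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence for illustration only, not part of our construct
print(find_illegal_sites("ATGGAATTCAAACTGCAGTAA"))
```

An empty list means the checked sequence contains none of the five sites; any hit marks a position where a further silent mutation would be needed.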

Backbone choice

Apart from the insert design, the backbone choice is also important, as the plasmid copy number (PCN) has a large effect on overall expression levels. As the goal is not to produce copious amounts of a single product for downstream purposes, but instead to augment existing metabolism, a low to medium copy number is preferable.

Unfortunately, many replicons that are considered low-copy in E. coli have been shown to produce higher PCN in P. putida KT2440¹⁶. The lowest copy numbers among the benchmarked replicons were achieved with the BBR1 and RK2 replicons, both having a copy number of around 30.

We opted to use BBR1 out of necessity. We acquired P. putida KT2440 from the Microbial Domain Biological Resource Centre HAMBI at the University of Helsinki (HAMBI: 3694). This strain has been modified with plasmids containing the RK2 replicon in a previous study¹⁷. Thus, using RK2 in our design would cause problems. As BBR1 and RK2 do not belong to the same incompatibility group, choosing BBR1 as the replicon should not cause problems¹⁶. As for the antimicrobial resistance marker, we chose kanamycin, as it is widely accessible, whereas ampicillin is generally needed in high concentrations to be effective against P. putida¹³.

SEVA plasmids (Standard European Vector Architecture) are commonly used when engineering P. putida¹³. The pSEVA231 vector has the BBR1 replicon with the kanamycin resistance gene along with a single multiple cloning site, into which our cassette can be inserted. Furthermore, the consortium behind SEVA distributes the standardized backbones for free, making them an overall good resource for iGEM teams.

Cloning strategy

As the final length of the synthetic operon is 5472 bp, it is too long to be synthesized as a single fragment. Using restriction cloning to combine multiple fragments is rather cumbersome, and as we had had issues with the technique with our other inserts, we planned to use a more suitable assembly method, Gibson assembly.

As the DNA fragments would be ordered from Twist Bioscience, which at the time of writing could synthesize fragments up to 1700 bp long, we needed to split our operon into four fragments. We planned to use the Gibson Assembly HiFi Cloning Kit from our sponsor Thermo Fisher Scientific for assembly, and thus we turned to their manuals when designing the assembly fragments. Generally speaking, the kit can be used to combine the four fragments with the linearized pSEVA231 vector in one reaction, given that the fragments share appropriate 40 bp overlaps. The overlaps should contain no extremes in GC content, tandem repeats, or strong secondary structures. As Gibson assembly with our kit is done at 50 °C, secondary structures in the single-stranded overlaps should not cause issues as long as their melting temperatures stay below this. Secondary structures formed by the overlaps were analyzed with the IDT-provided interface to UNAFold¹⁸. Potential self-dimers were screened for with the IDT OligoAnalyzer tool.

Finally, a total of six fragments were ordered from Twist Bioscience as listed in Table 1. The first fragment was ordered with an upstream overlap with pSEVA231, and the last fragment was ordered with a downstream overlap with pSEVA231. Furthermore, three variants of the first fragment with different promoters were ordered. The overlaps are further listed in Table 2, with a sketch of the cloning strategy shown in Figure 10.

Table 1. Final Gibson fragments ordered from Twist Bioscience

Name	Description	Length
Fragment 1a	Fragment 1 with J23102 promoter / tpaK	988 bp
Fragment 1b	Fragment 1 with J23105 promoter / tpaK	988 bp
Fragment 1c	Fragment 1 with J23114 promoter / tpaK	988 bp
Fragment 2	Fragment 2 with tpaK/tphA2	1481 bp
Fragment 3	Fragment 3 with tphA2/tphA3/tphB	1688 bp
Fragment 4	Fragment 4 with tphB/tphA1/pT1 terminator	1435 bp

Table 2. Gibson overlaps chosen for the assembly

Overlap	Sequence	Range	GC-content	Tandem repeats	T_m
PacI/1	TCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTTTAAT	1-40	35%	No	30.3°C
1/2	AGTTCTACGCCCTGCAAAGCTGGTTGCCGTCCATCATGAC	949-988	55%	No	43.5°C
2/3	TAACCCTCCAGATCCTCTCGGTATTCCCCGGTTTCGTCCT	2390-2429	55%	No	43.6°C
3/4	TACGTCGCAATGCTGCACGATCAGGGTCACATTCCTATCA	4038-4077	50%	No	39.1°C
4/SpeI	CTAGTCTTGGACTCCTGTTGATAGATCCAGTAATGACCTC	5433-5472	45%	No	44.8°C
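The overlap criteria are easy to re-check programmatically. The sketch below recomputes the length and GC content of each overlap from the sequences in Table 2 and compares the listed T_m values with the 50 °C assembly temperature. The T_m values are copied from the table rather than recalculated, and the variable and function names are illustrative.

```python
# Overlap sequences and T_m values (°C) copied from Table 2
OVERLAPS = {
    "PacI/1": ("TCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTTTAAT", 30.3),
    "1/2": ("AGTTCTACGCCCTGCAAAGCTGGTTGCCGTCCATCATGAC", 43.5),
    "2/3": ("TAACCCTCCAGATCCTCTCGGTATTCCCCGGTTTCGTCCT", 43.6),
    "3/4": ("TACGTCGCAATGCTGCACGATCAGGGTCACATTCCTATCA", 39.1),
    "4/SpeI": ("CTAGTCTTGGACTCCTGTTGATAGATCCAGTAATGACCTC", 44.8),
}
ASSEMBLY_TEMP_C = 50.0  # reaction temperature stated for our kit


def gc_percent(seq):
    """GC content of a DNA sequence in percent."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


for name, (seq, tm) in OVERLAPS.items():
    status = "ok" if tm < ASSEMBLY_TEMP_C else "check"
    print(f"{name:7s} {len(seq)} bp  GC {gc_percent(seq):.0f}%  T_m {tm} °C  {status}")
```

All five overlaps are 40 bp long, have GC contents between 35% and 55%, and have listed T_m values below the assembly temperature.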

Figure 10. Correspondence between the coding sequences and Gibson fragments. The first fragment can be changed to fragments 1b or 1c to reduce transcription rate. All the Gibson overlaps are 40 bp in length.
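The fragment boundaries follow directly from the overlap positions: each fragment runs from the start of one overlap to the end of the next. As a consistency check, the sketch below derives the fragment lengths from the ranges in Table 2 and compares them with Table 1 and with the 1700 bp synthesis limit. The names used are illustrative.

```python
# Overlap ranges (1-based, inclusive) from Table 2
OVERLAP_RANGES = [(1, 40), (949, 988), (2390, 2429), (4038, 4077), (5433, 5472)]
# Fragment lengths from Table 1 (fragments 1a-c share the same length)
TABLE1_LENGTHS = [988, 1481, 1688, 1435]
MAX_SYNTHESIS_BP = 1700


def fragment_spans(overlaps):
    """Pair each overlap with the next one to get (start, end) of each fragment."""
    return [(left[0], right[1]) for left, right in zip(overlaps, overlaps[1:])]


spans = fragment_spans(OVERLAP_RANGES)
lengths = [end - start + 1 for start, end in spans]

# Overlaps between neighbouring fragments are counted twice when lengths are summed
internal = OVERLAP_RANGES[1:-1]
shared = sum(end - start + 1 for start, end in internal)
assembled_length = sum(lengths) - shared

print(spans)
print(lengths, assembled_length)
```

The derived lengths match Table 1, every fragment stays below the synthesis limit, and the four fragments together cover 5472 bp.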

After pSEVA231 is linearized with PacI and SpeI, successful Gibson assembly yields the construct shown in Figure 11.

Figure 11. Final design of the construct assembled with Gibson assembly. High transcription variant.

The composite parts have been added to the parts registry as BBa_K4701306, BBa_K4701307, and BBa_K4701308 for high, medium, and low transcription rates respectively.

See our parts list for the basic parts, including the synthetic RBS sequences used.

Due to time constraints, we were unfortunately unable to attempt the assembly in the wet lab.

References

Nikel PI, De Lorenzo V. Pseudomonas putida as a functional chassis for industrial biocatalysis: From native biochemistry to trans-metabolism. Metabolic Engineering. 2018;50:142–155. doi:10.1016/j.ymben.2018.05.005
Sasoh M, Masai E, Ishibashi S, Hara H, Kamimura N, Miyauchi K, Fukuda M. Characterization of the Terephthalate Degradation Genes of Comamonas sp. Strain E6. Applied and Environmental Microbiology. 2006;72(3):1825–1832. doi:10.1128/AEM.72.3.1825-1832.2006
Kasai D, Kitajima M, Fukuda M, Masai E. Transcriptional Regulation of the Terephthalate Catabolism Operon in Comamonas sp. Strain E6. Applied and Environmental Microbiology. 2010;76(18):6047–6055. doi:10.1128/AEM.00742-10
Salvador M, Abdulmutalib U, Gonzalez J, Kim J, Smith AA, Faulon J-L, Wei R, Zimmermann W, Jimenez JI. Microbial Genes for a Circular and Sustainable Bio-PET Economy. Genes. 2019;10(5):373. doi:10.3390/genes10050373
Kincannon WM, Zahn M, Clare R, Lusty Beech J, Romberg A, Larson J, Bothner B, Beckham GT, McGeehan JE, DuBois JL. Biochemical and structural characterization of an aromatic ring–hydroxylating dioxygenase for terephthalic acid catabolism. Proceedings of the National Academy of Sciences. 2022;119(13):e2121426119. doi:10.1073/pnas.2121426119
Werner AZ, Clare R, Mand TD, Pardo I, Ramirez KJ, Haugen SJ, Bratti F, Dexter GN, Elmore JR, Huenemann JD, et al. Tandem chemical deconstruction and biological upcycling of poly(ethylene terephthalate) to β-ketoadipic acid by Pseudomonas putida KT2440. Metabolic Engineering. 2021;67:250–261. doi:10.1016/j.ymben.2021.07.005
Pearson AN, Thompson MG, Kirkpatrick LD, Ho C, Vuu KM, Waldburger LM, Keasling JD, Shih PM. The pGinger Family of Expression Plasmids. Bond DR, editor. Microbiology Spectrum. 2023;11(3):e00373-23. doi:10.1128/spectrum.00373-23
Cetnar DP, Salis HM. Systematic Quantification of Sequence and Structural Determinants Controlling mRNA stability in Bacterial Operons. ACS Synthetic Biology. 2021;10(2):318–332. doi:10.1021/acssynbio.0c00471
Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology. 2009;27(10):946–950. doi:10.1038/nbt.1568
Tian T, Salis HM. A predictive biophysical model of translational coupling to coordinate and control protein expression in bacterial operons. Nucleic Acids Research. 2015;43(14):7137–7151. doi:10.1093/nar/gkv635
Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. ViennaRNA Package 2.0. Algorithms for Molecular Biology. 2011;6(1):26. doi:10.1186/1748-7188-6-26
LaFleur TL, Hossain A, Salis HM. Automated model-predictive design of synthetic promoters to control transcriptional profiles in bacteria. Nature Communications. 2022;13(1):5159. doi:10.1038/s41467-022-32829-5
Martin-Pascual M, Batianis C, Bruinsma L, Asin-Garcia E, Garcia-Morales L, Weusthuis RA, Van Kranenburg R, Martins Dos Santos VAP. A navigation guide of synthetic biology tools for Pseudomonas putida. Biotechnology Advances. 2021;49:107732. doi:10.1016/j.biotechadv.2021.107732
Amarelle V, Sanches-Medeiros A, Silva-Rocha R, Guazzaroni M-E. Expanding the Toolbox of Broad Host-Range Transcriptional Terminators for Proteobacteria through Metagenomics. ACS Synthetic Biology. 2019;8(4):647–654. doi:10.1021/acssynbio.8b00507
Ng CY, Farasat I, Maranas CD, Salis HM. Rational design of a synthetic Entner–Doudoroff pathway for improved and controllable NADPH regeneration. Metabolic Engineering. 2015;29:86–96. doi:10.1016/j.ymben.2015.03.001
Cook TB, Rand JM, Nurani W, Courtney DK, Liu SA, Pfleger BF. Genetic tools for reliable gene expression and recombineering in Pseudomonas putida. Journal of Industrial Microbiology and Biotechnology. 2018;45(7):517–527. doi:10.1007/s10295-017-2001-5
Leedjärv A, Ivask A, Virta M. Interplay of Different Transporters in the Mediation of Divalent Heavy Metal Resistance in Pseudomonas putida KT2440. Journal of Bacteriology. 2008;190(8):2680–2689. doi:10.1128/JB.01494-07
Markham NR, Zuker M. UNAFold. In: Keith JM, editor. Bioinformatics. Vol. 453. Totowa, NJ: Humana Press; 2008. p. 3–31. (Walker JM, editor. Methods in Molecular Biology). http://link.springer.com/10.1007/978-1-60327-429-6_1. doi:10.1007/978-1-60327-429-6_1

Figures apart from plasmid maps and those generated by the RBS calculator were created with BioRender.com
