Results iGEM 2023 Tübingen

Facing the blood shortage crisis, we were able to contribute in the following ways. Through events and social media, we educated people about the blood shortage the world is currently facing. We presented not only solutions that could address this problem on a large scale, such as converting blood types to the universal O-antigen blood, but also showed how every single person can contribute to easing the shortage. We set up a bioinformatics-based workflow to screen for and assist in designing novel enzymes that, in theory, can specifically modify ligands, such as saccharides, on antigens or other proteins. We demonstrated this by designing candidate enzymes that specifically convert a galactose/fucose residue to a keto-galactose residue on the type B blood antigen, using publicly available software and scripts. This workflow has the potential to support researchers in designing substrate-modifying enzymes for pharmaceutical purposes. However, we feel obliged to note that the workflow currently lacks a proper wetlab-based proof of concept. To prove or dispute it, we challenge ourselves as well as other scientists to use the toolchain and computational foundation we have provided and to report on its accuracy and usefulness.

Drylab

We chose the AA5 family, with galactose oxidase (GOX) as its type enzyme, to search for an enzyme that could be capable of catalyzing the oxidation of blood antigen B, because of its number of well-characterized members, its solved structures, its known reaction mechanism, and its simple active site.

GOX Mechanism
Fig.1 - mechanism of GOX, from https://www.cazypedia.org/index.php/File:CRO_mechanism.png (CC BY 4.0)

The GOX mechanism suggests that the substrate and the active-site copper share a covalent bond in an intermediate. This makes a minimal distance to the copper ion a suitable parameter for predicting whether an enzyme will be active on a given docked ligand. The mechanism also suggests that any sugar hydroxy group can be oxidized, provided that it is capable of forming a covalent bond with the copper.
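The distance criterion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `closest_hydroxy_oxygen`, the atom labels and all coordinates are hypothetical and are not taken from our docking runs; in practice the coordinates would be read from the docked pose.

```python
import math

def closest_hydroxy_oxygen(copper_xyz, hydroxy_oxygens):
    """Return (label, distance) of the hydroxy oxygen nearest to the copper ion.

    hydroxy_oxygens maps an atom label (e.g. "O6") to (x, y, z) coordinates in Å.
    """
    label = min(
        hydroxy_oxygens,
        key=lambda name: math.dist(copper_xyz, hydroxy_oxygens[name]),
    )
    return label, math.dist(copper_xyz, hydroxy_oxygens[label])

# Hypothetical coordinates for illustration only (not from our docking results).
copper = (0.0, 0.0, 0.0)
ligand_oxygens = {
    "O2": (4.0, 3.0, 0.0),
    "O3": (0.0, 3.5, 0.0),
    "O6": (1.0, 2.0, 2.0),
}

label, dist = closest_hydroxy_oxygen(copper, ligand_oxygens)
print(f"{label}: {dist:.2f} Å from the copper ion")
```

A smaller distance is read here as a better candidate for bond formation with the copper; the sketch does not model the chemistry itself.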

To assess whether molecular docking is a meaningful model for finding enzymes that could potentially act on specific ligands, we docked the known substrate galactose to an experimentally solved structure of GOX, as well as glucose, which is known not to be a substrate of GOX. Docking was performed using Glide (part of Maestro).

GOX docked with Galactose
Fig.2 - Galactose docked to GOX

Glide docked galactose inside the active site, suggesting that docking is potentially a biologically meaningful model.

GOX docked with Glucose
Fig.3 - Glucose docked to GOX

Glide also docked glucose inside the active site, but approximately 0.4 Å further away from the copper ion than galactose, suggesting a worse fit into the active site for glucose.

We also investigated whether modelled ligand binding strength is a suitable predictor of ligand activity for GOX, using molecular mechanics with generalised Born and surface area solvation (MMGBSA) in Maestro.

MMGBSA Simulation results on Galactose and Glucose bound to GOX
Fig.4 - MMGBSA Simulation results on Galactose and Glucose bound to GOX

Using MMGBSA simulations, we were unable to show a difference in binding energy between glucose and galactose bound to GOX, although the binding energy of glucose varied strongly between docked poses. We ran 10 simulations per ligand: ligands were first docked using Glide, and the docked poses were then scored on binding energy using MMGBSA.
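The comparison of per-pose binding energies can be summarized as a mean and a variance per ligand. The following is a minimal sketch, assuming the energies have been exported as plain numbers; the values below are placeholders invented for illustration (five per ligand rather than ten) and are not our measured results, and the unit kcal/mol is an assumption.

```python
import statistics

def summarize_poses(scores_by_ligand):
    """Return {ligand: (mean, sample variance)} for per-pose binding energies."""
    return {
        ligand: (statistics.mean(scores), statistics.variance(scores))
        for ligand, scores in scores_by_ligand.items()
    }

# Placeholder energies (assumed unit: kcal/mol), invented for illustration.
# These are NOT the values behind Fig.4.
placeholder_scores = {
    "galactose": [-30.0, -31.0, -29.0, -30.5, -29.5],
    "glucose": [-20.0, -40.0, -30.0, -35.0, -25.0],
}

summary = summarize_poses(placeholder_scores)
for ligand, (mean, var) in summary.items():
    print(f"{ligand}: mean = {mean:.2f}, variance = {var:.2f}")
```

The placeholders are chosen so that both ligands share the same mean while one has a much larger spread, which is the kind of pattern that makes the mean binding energy alone uninformative.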

As GOX is effectively inactive with glucose as the substrate but turns over galactose [1], this suggests that either molecular mechanics prediction of ligand binding strength is not a biologically meaningful model, or binding strength is less relevant to predicting ligand activity than active site fit. We therefore consider active site fit and distance to the copper ion the more meaningful metric for judging how likely a ligand is to be a substrate of an AA5 family enzyme.

To analyze the structural and presumed functional diversity of our enzyme database, which lets us search the structural space more easily by excluding similar enzymes, we integrated data from structural alignments of generated AlphaFold structures with a phylogenetic tree (PhyML) and with data on conserved domains found in the enzyme sequences (CDD/SPARCLE).

Hierarchical Clustering Diagram of Root Mean Square Deviation of Alphafold structures computed from pairwise RMSD
Fig.5 - Hierarchical Clustering Diagram of Root Mean Square Deviation of Alphafold structures computed from pairwise RMSD
Phylogenetic Tree of AA5
Fig.6 - Phylogenetic tree of AA5 bacterial homologues with an added schema of conserved domains: blue (kelch motif), yellow (C-terminal Early set domain associated with the catalytic domain of galactose oxidase), brown (carbohydrate binding module families). The tree is also colored according to the groups derived by hierarchical clustering as described above.
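Grouping structures by pairwise RMSD can be sketched as a simple agglomerative clustering. This is an illustration only: average linkage and the cutoff value are assumptions (the linkage method we used is not specified here), and the structure names and RMSD values are hypothetical placeholders.

```python
def average_linkage_clusters(names, rmsd, cutoff):
    """Merge the two closest clusters until the smallest average
    inter-cluster RMSD exceeds `cutoff`.

    names  -- list of structure identifiers
    rmsd   -- dict mapping frozenset({a, b}) to the pairwise RMSD in Å
    cutoff -- stop merging once the closest clusters are further apart (Å)
    """
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average of all pairwise RMSD values between the two clusters.
        values = [rmsd[frozenset((a, b))] for a in c1 for b in c2]
        return sum(values) / len(values)

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical structures and RMSD values (Å), for illustration only.
names = ["A", "B", "C", "D"]
rmsd = {
    frozenset(("A", "B")): 0.5,
    frozenset(("C", "D")): 0.8,
    frozenset(("A", "C")): 4.0,
    frozenset(("A", "D")): 4.2,
    frozenset(("B", "C")): 3.9,
    frozenset(("B", "D")): 4.1,
}

groups = average_linkage_clusters(names, rmsd, cutoff=2.0)
print(groups)
```

Keeping one representative per group is one way to reduce the number of similar structures that have to be docked.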

The phylogenetic tree of AA5 shows distinct clades. These clades comprise enzymes with different conserved domains, and the AlphaFold structures of their members have different active site geometries; the groups derived from the structures match the clades observed in the phylogenetic tree. This suggests a difference in function: the differing conserved domains most likely reflect differing functional interactions, roles or substrates of the enzymes, and together with the differing active site geometry they suggest a difference in the ligands the enzymes can potentially act on. Enzymes A0A0A1TMB5 (TMB5) and A0A178E5X4 (E5X4), belonging to clade 1 (dark green), gave docking results for the tetrasaccharide model of antigen B with a distance of < 3 Å to the copper ion (OH 3 for TMB5), which in theory means that a covalent bond might be possible.

TMB5 active site with Antigen B
Fig.7 - Active site of TMB5 with docked antigen B tetrasaccharide.

In TMB5, the second (2.91 Å) and third (2.89 Å) hydroxy groups of the terminal galactose are close to the active site. We wanted to see whether there was any activity and, if so, how selective it was for each of these groups.

E5X4 active site with Antigen B
Fig.8 - E5X4 active site with docked antigen B tetrasaccharide.

In E5X4, the fifth hydroxy group of the fucose is close (2.69 Å) to the copper ion. This is also an interesting modification that we wanted to explore alongside TMB5, for other potential modifications of fucose-containing blood antigens, for example glycan tagging.
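The selection of candidates by their distance to the copper ion can be written as a small filter. The sketch below uses the three distances quoted above and the 3 Å threshold; the function name `candidates`, the record layout and the extra non-hit entry are hypothetical and only serve the illustration.

```python
CUTOFF = 3.0  # Å, the distance threshold used in the text

# (enzyme, residue, hydroxy group, distance to copper in Å), as quoted above.
docked = [
    ("TMB5", "galactose", "OH2", 2.91),
    ("TMB5", "galactose", "OH3", 2.89),
    ("E5X4", "fucose", "OH5", 2.69),
]

def candidates(records, cutoff=CUTOFF):
    """Keep records closer than `cutoff` to the copper ion, nearest first."""
    hits = [record for record in records if record[3] < cutoff]
    return sorted(hits, key=lambda record: record[3])

# Hypothetical non-hit, added only to show that the filter removes it.
example_records = docked + [("hypothetical enzyme", "galactose", "OH4", 4.50)]

ranked = candidates(example_records)
for enzyme, residue, group, distance in ranked:
    print(f"{enzyme}: {residue} {group} at {distance:.2f} Å")
```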

Wetlab

In our experiments we aimed to assemble our final plasmid via Gibson assembly, transform the plasmid into E. coli, and induce and purify the protein for use in a peroxidase assay, in which we wanted to measure peroxide production on different substrates. Peroxide is a side product of the reaction these enzymes catalyze and can therefore be used to test for enzyme activity.

DNA fragment amplification (PCR)

The gene fragments were synthesized by commercial suppliers (Twist Bioscience and IDT). Gene fragments were first amplified by PCR using Taq polymerase. The plasmid we used for cloning was pXZ11, a gift from Dr. Xiaobo Zhong. This plasmid was also amplified by PCR, using Q5 polymerase, to introduce Gibson overhangs and to linearize the plasmid for Gibson assembly.
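Adding Gibson overhangs by PCR means that each backbone primer carries a 5' tail homologous to the neighbouring insert end. The following is a minimal sketch of that idea, not the primer design we actually used: the function name `backbone_primers`, the default lengths and the toy sequences are hypothetical, and real primers would additionally be checked for melting temperature and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def backbone_primers(backbone, insert, anneal=20, overhang=20):
    """Primers that amplify a linearized backbone and add insert-homologous ends.

    The backbone is given in the orientation in which the insert will sit
    between its last and its first base in the assembled circular plasmid.
    """
    # Forward primer: tail matching the insert's end, then the backbone start.
    forward = insert[-overhang:] + backbone[:anneal]
    # Reverse primer: reverse complement of the backbone end followed by the
    # insert start, so its 5' tail is homologous to the insert start.
    reverse = reverse_complement(backbone[-anneal:] + insert[:overhang])
    return forward, reverse

# Toy sequences for illustration only (not pXZ11 or our inserts).
toy_backbone = "ATGCATGCAAGGCCTT"
toy_insert = "GGGAAACCCTTT"

forward, reverse = backbone_primers(toy_backbone, toy_insert, anneal=4, overhang=3)
print(forward, reverse)
```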

Gel PCR Inserts
Fig.9 - Amplified DNA fragments of our inserts received from Twist Bioscience. Both fragments (TMB5 left, E5X4 right) were successfully amplified and are approximately 1.7 kb long.
Gel PCR Inserts
Fig.10 - Amplified linearized plasmid with Gibson overhangs added by PCR. All lanes contain the same product, amplified plasmid pXZ11.

Plasmid assembly

The plasmids were assembled using Gibson assembly. Two plasmids were assembled, one with TMB5 as the insert and one with E5X4 as the insert. Both sequences contained an added C-terminal His6-tag.

Transformation into E. coli and screening

The assembled plasmids were transformed into Stellar E. coli, due to their outstanding transformation efficiency. The colonies were screened for inserts using colony PCR.
Gel Colony PCR TMB5
Fig.11 - Colony PCR of 9 colonies of insert TMB5, which is approximately 1.5 kb long. All colonies except 3, 5 and 7 contained the insert.
Gel Colony PCR TMB5
Fig.12 - Colony PCR of 9 colonies of insert E5X4, which is approximately 1.5 kb long. All colonies except 7 contained the insert.

The plasmids were isolated via miniprep and sequenced using Sanger sequencing.

Gel Colony PCR TMB5
Fig.13 - SDS-PAGE of samples taken during the protein purification test. vI: before induction with IPTG; nI: after induction with IPTG; Lysate: after sonication; FT: flow-through; F1&2: fractions 1 and 2. 10 and 20 refer to the GFP-tagged fusion proteins synthesized by IDT; T 1,9 refers to Tmb5-1, the plasmid that contains our target protein. The target protein has a molecular weight of 52.8 kDa (marked yellow).

The target protein has a molecular weight of 52.8 kDa and should be visible in the area marked yellow in the picture. The gel shows that a protein with the right molecular mass was present until the elution, but disappears there. Furthermore, too many proteins are visible in the elution, leading to the conclusion that the purification failed. There is no protein present in the other fractions. It is unknown why the purification failed, and due to time constraints it was not possible to repeat the experiment, leaving the result inconclusive. The next step would be to troubleshoot the protein purification and attempt it again.

Conclusion

We have developed an easy-to-use approach for searching a large space of protein structures for the purposes of molecular docking, which in future might prove helpful for other projects and teams looking to search a wide space of possible proteins. We have also successfully employed molecular docking to find plausible candidate enzymes, which allowed us to formulate a hypothesis about their reactivity and ligand specificity. However, we were unable to fully test this hypothesis in the wetlab in time, although we were able to produce all the materials needed for such a test. We hope to perform the test a few weeks after wiki freeze. Its outcome will either allow us to fine-tune and expand our molecular docking based modelling approaches, or lead us to start anew in the next design cycle and come up with a modelling approach that better fits biological reality.

References

[1] Lianhong Sun, Thomas Bulter, Miguel Alcalde, Ioanna P. Petrounia and Frances H. Arnold. Modification of Galactose Oxidase to Introduce Glucose 6-Oxidase Activity. ChemBioChem 2002, 3 (8). WILEY-VCH Verlag GmbH, ISSN 1439-4227.