Model | Stockholm - iGEM 2023

Structure prediction

The scope of our project extended beyond the confines of the laboratory, emphasising the indispensable role of computational modelling in synthetic biology. In engineering projects like ours, modelling serves as a crucial tool, primarily employed to facilitate design, inform decision-making, and, at times, to interpret experimental data. While time constraints compelled us to initiate our experimental pipeline without exhaustive modelling, relying largely on our prior knowledge and iterative experimentation, we nonetheless engaged in essential modelling activities across various facets of our project.

After a contribution to the plasmid design phase, our foremost computational undertaking was to establish a precise computational representation of the fusion protein construct we aimed to engineer. This was a critical step, especially considering that anchor peptides, which constituted a pivotal element of our design, were only available as peptidic sequences without structural data. Given the potential for unforeseen epistatic effects in fusion proteins, obtaining an atom-level three-dimensional structure that aligned with our engineered sequences was paramount. This structural representation would play a pivotal role in downstream analyses and decision-making processes.

In the absence of experimental structures, deep-learning models have become the state-of-the-art tools for predicting protein structures from sequences. One prominent model, AlphaFold, is widely recognised for its accuracy and reliability. To obtain a computational structure for our fusion protein, we chose OmegaFold, a model inspired by AlphaFold, for several reasons. Notably, OmegaFold predicts protein structures without relying on upstream multiple sequence alignments, a particularly advantageous feature when working with engineered proteins that might lack homologous sequences. Skipping the alignment step also lowers its computational demands, which suited our project's resource and time constraints (Fig. 1).

Figure 1: The structural prediction of our NLuc fusion protein. The confidence is shown as a B-factor gradient, with red being the highest confidence and dark blue the lowest.
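
The confidence colouring in Figure 1 relies on the prediction tool writing a per-atom confidence value into the B-factor column of the PDB file. As a minimal sketch, the per-residue confidence could be read back as follows. It assumes a standard fixed-column PDB file; the function names and the cut-off passed to `low_confidence_residues` are illustrative and not part of our pipeline, and a sensible cut-off depends on the scale the tool uses.

```python
def residue_confidence(pdb_text):
    """Return {(chain, residue_number): mean B-factor} for ATOM records.

    Uses the fixed PDB columns: chain ID in column 22, residue number in
    columns 23-26 and B-factor in columns 61-66 (1-based).
    """
    sums = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK and other records
        key = (line[21], int(line[22:26]))
        bfactor = float(line[60:66])
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + bfactor, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


def low_confidence_residues(confidence, threshold):
    """List residues whose mean confidence is below a user-chosen threshold."""
    return [key for key, value in confidence.items() if value < threshold]
```

Listing the residues below a chosen cut-off gives a quick way to check whether a low-confidence stretch coincides with the linker or with a folded domain.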

Using OmegaFold, we predicted structures for a total of seven distinct protein configurations, accounting for the various states of our engineered constructs. Throughout this process, we checked for misfolded predictions that could affect the functionality of our fusion peptide. Of these predictions, the structure we used for most of our downstream analyses combined the primary luciferase domain with the anchoring domain and excluded the His-tag (refer to our experimental workflow for more details), mirroring the configuration intended for use in a hypothetical commercialised testing kit.

Visualisation

Visualising protein structures is essential for interpreting them. We used PyMOL, a platform for rendering, exploring, and analysing protein structures as interactive three-dimensional models. PyMOL can depict structures in various representations, including cartoons and surfaces, with customisable colouring. For our figures we used: (i) cartoon representations to visualise the secondary structures, (ii) surface representations to identify binding sites and interactions with other molecules, (iii) transparency gradients to combine the two previous options, and (iv) adapted colour schemes to mark specific regions. We worked with two constructs, each comprising an anchor peptide, a linker, and either NanoLuc luciferase (NLuc) or eGFP (Fig. 2). Below we describe the structure of both signal-emitting proteins.

Our first construct used NLuc, a bioluminescent enzyme. Structurally, it is a relatively compact protein, predominantly composed of beta sheets that together form a beta-barrel. The hydrophobic core of NLuc consists of nonpolar amino acid residues that cluster together, away from water. In contrast, the hydrophilic surface is rich in polar and charged amino acids, which readily interact with water molecules. This arrangement of hydrophobic and hydrophilic regions within the protein's structure promotes solubility.

Our second construct used eGFP, which emits green fluorescence when excited by specific wavelengths of light. Its structure is similar to that of NLuc, presenting a compact beta-barrel conformation. At the centre of the barrel lies the chromophore, which is responsible for the fluorescence.

Both constructs have an alpha-helical linker that connects the signal-emitting protein to the anchor peptide. It is composed of a central alpha-helical sequence followed by a TEV cleavage site and an Avi-Tag.

The anchor peptide itself is based on an antimicrobial peptide from Bacillus subtilis. The anchor peptide (LCI) has 47 residues and forms a beta sheet structure (Gong et al., 2011). Previous research has shown the anchor peptide to be stable in solution and to show maximum binding affinity for polypropylene surfaces (Rübsam et al., 2017).

Figure 2: (A) Our NLuc construct, with the anchor peptide in yellow, the linker in cyan and NLuc in copper. (B) Our eGFP construct, with the anchor peptide in yellow, the linker in cyan and eGFP in indigo.
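
The combination of cartoon, semi-transparent surface and region-specific colours described above can be scripted rather than set by hand. The sketch below builds a list of PyMOL commands (`load`, `hide`, `show`, `set transparency`, `set_color`, `select`, `color`) as plain text that could be saved as a `.pml` file. The helper name `pymol_script`, the object name `construct`, the file name and all residue ranges other than the 47-residue length of LCI are hypothetical placeholders, as are the RGB values and the assumption that the anchor sits at the N-terminus.

```python
def pymol_script(pdb_path, regions, surface_transparency=0.6):
    """Build PyMOL command lines that colour named residue ranges.

    regions maps a region name to (first_residue, last_residue, (r, g, b)).
    """
    lines = [
        f"load {pdb_path}, construct",
        "hide everything, construct",
        "show cartoon, construct",    # secondary structure
        "show surface, construct",    # molecular surface on top of the cartoon
        f"set transparency, {surface_transparency}",  # keep the cartoon visible
    ]
    for name, (first, last, rgb) in regions.items():
        r, g, b = rgb
        lines.append(f"set_color {name}_colour, [{r}, {g}, {b}]")
        lines.append(f"select {name}, construct and resi {first}-{last}")
        lines.append(f"color {name}_colour, {name}")
    lines.append("deselect")
    return "\n".join(lines)


# Hypothetical layout: only the anchor length (47 residues) comes from the text.
example_regions = {
    "anchor": (1, 47, (1.0, 1.0, 0.0)),        # yellow
    "linker": (48, 80, (0.0, 1.0, 1.0)),       # cyan, placeholder range
    "reporter": (81, 250, (0.72, 0.45, 0.2)),  # copper-like, placeholder range
}
print(pymol_script("fusion_prediction.pdb", example_regions))
```

Defining the colours with `set_color` keeps the scheme identical across both constructs, so only the reporter colour needs to change between panels.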

Simulation building

Our protein characterisation began with measurements based on the computational protein structure generated by OmegaFold. However, we were aware that relying solely on a conformation derived from a deep-learning model carried inherent risks. Various biases, such as homology bias, resolution bias, template bias, and conformational bias, could skew our predictions and reduce the accuracy and applicability of our findings.

To mitigate these concerns and attain a more realistic representation of the protein's conformation within its natural environment, we set up a two-step molecular dynamics (MD) simulation process. In the first step, we subjected the structure to an initial round of MD simulation using the NAMD engine. Prior to simulation, we pre-processed the structure using CHARMM-GUI, mainly to solvate it. This initial relaxation step aimed to resolve any abrupt conformational shifts and to bring the protein structure closer to its natural state (Fig. 3).

Figure 3: The RMSD (A) and RMSF (B) of our NLuc fusion protein.
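
For reference, the two quantities in Figure 3 can be computed from trajectory coordinates as sketched below. This is an illustrative pure-Python version and not the analysis code we ran: it assumes the frames have already been superposed onto the reference (no fitting is performed here), and it represents each frame as a list of (x, y, z) tuples. The RMSD of a frame is the square root of the mean squared displacement of its atoms from the reference, and the RMSF of an atom is the square root of its mean squared deviation from its own time-averaged position.

```python
import math


def rmsd_per_frame(frames, reference):
    """RMSD of every frame against a reference frame (frames pre-superposed)."""
    values = []
    for frame in frames:
        squared = sum(
            (a - b) ** 2
            for atom, ref_atom in zip(frame, reference)
            for a, b in zip(atom, ref_atom)
        )
        values.append(math.sqrt(squared / len(reference)))
    return values


def rmsf_per_atom(frames):
    """RMSF of every atom around its mean position over all frames."""
    n_frames = len(frames)
    result = []
    for i in range(len(frames[0])):
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        squared = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in frames for k in range(3)
        )
        result.append(math.sqrt(squared / n_frames))
    return result
```

A plateau in the RMSD over time suggests that the structure has relaxed, while peaks in the per-atom RMSF point to flexible regions.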

Building upon the stabilised structure obtained from the initial simulation, we undertook a more extensive and precise second simulation run. This step allowed us to delve deeper into characterising critical attributes of our engineered constructs, including factors such as flexibility and stability. The wealth of data gathered through this extended simulation informed our understanding of how the fusion peptide behaves under dynamic conditions, yielding valuable insights that would guide our subsequent experimental efforts (Fig. 4).

Figure 4: Our solvated protein.

Our ultimate aspiration was to computationally evaluate the binding interactions between our fusion peptide and an all-atom plastic molecule structure. This endeavour, however, proved to be our most ambitious and challenging undertaking. Regrettably, we were unable to fulfil this objective within the constraints of our limited time and computational resources. Generating a realistic, amorphous, polymer structure to simulate the plastic molecule, a task further complicated by the intricacies of CHARMM-GUI, posed substantial difficulties. Despite our best efforts, this aspect of our computational exploration remained unattained, underscoring the complexities involved in modelling such intricate biological interactions.

Read more about our construct design at experiments.

References

Gong, W., Wang, J., Chen, Z., Xia, B., & Lu, G. (2011). Solution structure of LCI, a novel antimicrobial peptide from Bacillus subtilis. Biochemistry, 50(18), 3621-3627.

Rübsam, K., Stomps, B., Böker, A., Jakob, F., & Schwaneberg, U. (2017). Anchor peptides: A green and versatile method for polypropylene functionalization. Polymer, 116, 124-132.