Emerging contaminants in water pose a significant risk to long-term health (Pereira et al., 2015; Stuart et al., 2012). Current detection methods require specialized laboratory equipment and trained personnel; we therefore aimed to develop a quick and easy fluorescence-based detection method for erythromycin. To generate a prototype, we went through iterations of the engineering design process to test and improve different aspects of the project, such as transformation efficiency and protein overexpression yield.
For FRET to occur, the donor and acceptor chromophores must be close enough in space, typically within about 10 nm, so that energy transferred from the excited donor produces acceptor fluorescence. With our ECFP-EryK-mVENUS fusion construct, if erythromycin is present, EryK will ideally undergo a conformational change that alters the distance between the donor, ECFP, and the acceptor, mVENUS (Figure 1).
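The distance dependence described above follows the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the pair. The sketch below is illustrative only: the function name is hypothetical, and the default R0 of 5.0 nm is an assumed placeholder rather than a measured value for ECFP and mVENUS.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """Return the FRET efficiency for a donor-acceptor distance r_nm.

    r0_nm is the Foerster radius (distance at which efficiency is 50 %).
    The default of 5.0 nm is an ASSUMED placeholder, not a measured value.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distances must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Efficiency falls steeply as the fluorophores move apart.
    for r in (3.0, 5.0, 7.0):
        print(f"r = {r:.1f} nm -> E = {fret_efficiency(r):.3f}")
```

Because efficiency depends on the sixth power of the distance, even a modest ligand-induced change in fluorophore separation near R0 should shift the signal measurably, which is the premise of the sensor design.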
To achieve this, we first transformed Escherichia coli TOP10, as a maintenance strain, and then E. coli BL21 with a plasmid encoding the 6xHis-tagged ECFP-EryK-mVENUS fusion construct (Chan et al., 2013; Jeong et al., 2015). After expression, the construct will be purified; the purified protein should produce a detectable change in fluorescent signal when bound to erythromycin (Figure 2).
To assess the feasibility of our approach, protein structure predictions and molecular dynamics simulations were carried out to ensure the fluorophores of the fusion construct were within a functional FRET distance.
Using the ab initio structure prediction workflow and ColabFold (Mirdita et al., 2022), a predicted structure of the ECFP-EryK-mVENUS fusion construct was obtained (Figure 3). ColabFold produces five predictions; only the one with the best predicted aligned error (PAE) was considered, as we are mainly interested in the 3D arrangement of the three constituent proteins.
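The model selection step can be sketched as follows. The text does not specify how the "best" PAE was scored, so this example assumes the lowest mean PAE over the whole matrix; the function names and the toy matrices are hypothetical, and real PAE matrices would be read from the prediction output.

```python
from statistics import mean


def mean_pae(pae):
    """Average every entry of a square PAE matrix (list of rows, in angstroms)."""
    return mean(value for row in pae for value in row)


def best_model_by_pae(pae_by_model):
    """Return the name of the model with the lowest mean PAE.

    ASSUMPTION: 'best PAE' is interpreted as lowest mean PAE.
    """
    if not pae_by_model:
        raise ValueError("no models supplied")
    return min(pae_by_model, key=lambda name: mean_pae(pae_by_model[name]))


if __name__ == "__main__":
    # Toy 2x2 matrices standing in for full residue-by-residue PAE maps.
    toy = {
        "model_1": [[1.0, 12.0], [12.0, 1.0]],
        "model_2": [[1.0, 8.0], [8.0, 1.0]],
    }
    print(best_model_by_pae(toy))
```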
While the 3D fold of each constituent protein matches previously characterized structures (Montemiglio et al., 2013; Pletnev et al., 2012; Park et al., 2016), the connector regions and the geometrical relationship between the proteins are predicted with low confidence, as indicated by the PAE values (Figure 4).
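One way to quantify this pattern is to average the PAE within each domain and between each pair of domains: confidently folded domains give low within-domain values, while an uncertain relative arrangement gives high between-domain values. The sketch below uses a toy matrix, and the domain boundaries and function names are hypothetical placeholders, not the residue ranges of the actual construct.

```python
from statistics import mean


def block_mean_pae(pae, rows, cols):
    """Mean PAE over the block defined by two residue index ranges."""
    return mean(pae[i][j] for i in rows for j in cols)


def domain_pae_table(pae, domains):
    """Return {(domain_a, domain_b): mean PAE} for every ordered domain pair."""
    return {
        (a, b): block_mean_pae(pae, domains[a], domains[b])
        for a in domains
        for b in domains
    }


if __name__ == "__main__":
    # Toy 4-residue example: two 2-residue "domains" (HYPOTHETICAL boundaries).
    toy_pae = [
        [1.0, 1.0, 20.0, 20.0],
        [1.0, 1.0, 20.0, 20.0],
        [20.0, 20.0, 2.0, 2.0],
        [20.0, 20.0, 2.0, 2.0],
    ]
    toy_domains = {"A": range(0, 2), "B": range(2, 4)}
    for pair, value in domain_pae_table(toy_pae, toy_domains).items():
        print(pair, value)
```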
It should be noted that ColabFold uses multiple sequence alignments and neural networks to predict a 3D structure; it does not consider real-world biophysics such as interactions with solvent or spontaneous changes between open and closed conformations, among other possible variables.
To increase our confidence in the predicted structure of our construct, molecular dynamics (MD) simulations of the ColabFold model were performed to assess if the predicted structure obtained was maintained after biophysical interactions were introduced to the system.
To achieve this, a preliminary simulation was run with GROMACS using model 5 from ColabFold to assess whether the structure converges over the course of a 10 ns simulation when biophysical potentials are included. First, the topology was generated from the ColabFold model with the AMBER99SB-ILDN force field (Lindorff-Larsen et al., 2010) and the TIP3P water model (Jorgensen et al., 1983). The model was placed in a simulation box with a distance to the edge of 1.2 nm. The system was solvated in water and the charges were neutralized with Na+ ions. The system was then energy minimized to avoid problems arising from disagreement between ColabFold's predicted structure and the energy minimum of our system according to the AMBER99SB-ILDN force field. To equilibrate the system, an NVT simulation was run for 200 ps, followed by an NPT simulation for 1 ns, both with 2 fs timesteps. The production MD was run for a total of 10 ns with 2 fs timesteps under NPT conditions (Figure 5). Simulations were performed according to previous work (Lemkul, 2019) and available documentation (Hess et al., 2008).
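The phase lengths above translate into integration step counts as shown in this short sketch. GROMACS run parameter files express the timestep `dt` in picoseconds and the run length as `nsteps`; the helper names and the dictionary layout here are hypothetical conveniences, not part of the original protocol.

```python
def n_steps(duration_ps: float, dt_fs: float = 2.0) -> int:
    """Number of integration steps needed to cover duration_ps at dt_fs."""
    steps = duration_ps * 1000.0 / dt_fs  # 1 ps = 1000 fs
    if abs(steps - round(steps)) > 1e-9:
        raise ValueError("duration is not a whole number of timesteps")
    return round(steps)


# Phase durations taken from the protocol described in the text.
PHASES_PS = {"nvt": 200.0, "npt": 1000.0, "production": 10000.0}


def timing_parameters(dt_fs: float = 2.0) -> dict:
    """Return dt (in ps) and nsteps for each phase."""
    return {
        phase: {"dt": dt_fs / 1000.0, "nsteps": n_steps(duration, dt_fs)}
        for phase, duration in PHASES_PS.items()
    }


if __name__ == "__main__":
    for phase, params in timing_parameters().items():
        print(f"{phase}: dt = {params['dt']} ps, nsteps = {params['nsteps']}")
```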
To ensure the reliability of the simulation, thermodynamic variables such as the temperature, pressure, density, and total energy of the system were monitored for convergence. It should also be noted that after 4 ns the root-mean-square deviation (RMSD) of the simulation stopped diverging from the energy-minimized ColabFold prediction. Interestingly, the most dynamic regions of our system (Figure 5) correspond to the regions that display the greatest conformational change when comparing X-ray structures of the open and closed conformations of EryK (Figure 6). A potential limitation of our in silico system is that the ionic strength may not match in vitro conditions, which would alter the strength of electrostatic interactions (Zhou & Pang, 2018). Nevertheless, the simulation was promising and gives greater confidence in the predicted structure.
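A simple way to put a number on when an RMSD trace stops drifting is sketched below. This is not the analysis used in the project, only an illustrative heuristic: it reports the earliest time after which every remaining value stays within a tolerance of the mean of the remaining values. The function name, tolerance, and synthetic data are all assumptions.

```python
from statistics import mean


def plateau_start(times_ns, rmsd_nm, tol_nm=0.05, min_points=3):
    """Return the first time after which the RMSD stays within tol_nm of its
    own tail mean, or None if no such plateau of at least min_points exists."""
    if len(times_ns) != len(rmsd_nm):
        raise ValueError("times and RMSD values must have the same length")
    for i in range(len(rmsd_nm) - min_points + 1):
        tail = rmsd_nm[i:]
        centre = mean(tail)
        if all(abs(value - centre) <= tol_nm for value in tail):
            return times_ns[i]
    return None


if __name__ == "__main__":
    # SYNTHETIC trace: rises for 4 ns, then stays flat.
    times = [float(t) for t in range(11)]
    rmsd = [0.0, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]
    print(plateau_start(times, rmsd))
```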
The original plasmid was made up of a pUC57 backbone and the insert for ECFP-EryK-mVENUS. As an initial step, we digested the insert and ligated it into pBAD/HisB and pBAD/HisC backbones using XhoI and NcoI (New England Biolabs). Both the vector and the insert were digested with the enzymes (Figure 7A) and, after gel purification and ligation, transformed into E. coli TOP10 as a maintenance strain (Figure 7B).
Colony polymerase chain reaction (PCR) was performed to confirm successful transformation (Figure 8). Following this, minipreps were performed on 5 mL liquid cultures inoculated from the transformation plate; these were grown in LB with carbenicillin and incubated overnight at 37 °C with agitation at 200 rpm. The Wizard Plus SV Minipreps DNA Purification System (Promega) was used for the minipreps.
The subsequent PCR step was performed several times, varying the annealing temperature, number of cycles, and elongation step duration. The backbone (Figure 9A) and the insert (Figure 9B) were successfully amplified. The fragments were then ligated together using T4 DNA Ligase Reaction Buffer and T4 DNA Ligase (New England Biolabs).
The next step was to transform pET28b(+)/ECFP-EryK-mVENUS into E. coli BL21, an expression strain. This proved difficult: the process was repeated several times, varying the amount of plasmid added and the incubation times, with no success. Mistakes made included not using enough plasmid during the transformation, not letting the transformation reaction recover for long enough, and using too high a temperature for heat shock. Once optimized, the transformation was successful (Figure 10).
Once the plasmid with the whole construct was successfully transformed into E. coli BL21, the cultures were induced with 0.4 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) to stimulate protein overexpression (Studier, 2014). This was repeated several times at different temperatures and for different periods of time. The last attempt, at 16 °C for 12 hours, yielded a faint band at the approximate expected molecular weight of our construct (~83 kDa) after sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) (Figure 11). Despite not seeing a clear band from the expression, we proceeded with the purification.
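An expected molecular weight such as the one quoted above is usually estimated by summing average residue masses over the amino acid sequence and adding one water for the free termini. The sketch below shows that calculation; the construct sequence is not reproduced in this text, so the example uses a short placeholder peptide, and the function name is hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_DA = 18.0153


def molecular_weight_kda(sequence: str) -> float:
    """Estimate the average molecular weight of a protein sequence in kDa."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_MASS_DA)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    total_da = sum(RESIDUE_MASS_DA[residue] for residue in seq) + WATER_DA
    return total_da / 1000.0


if __name__ == "__main__":
    # PLACEHOLDER peptide (a 6xHis tag), not the construct sequence.
    print(f"{molecular_weight_kda('HHHHHH'):.3f} kDa")
```

The estimate ignores post-translational changes such as chromophore maturation in the fluorescent proteins, which shift the mass only slightly relative to SDS-PAGE resolution.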
Purification was done through a Ni-NTA resin (Sigma-Aldrich) and all elutions were run on an SDS-PAGE gel to confirm the presence of our protein. Unfortunately, no protein was purified (Figure 12).
As we could not purify the construct, different tests were performed to assess the viability of transformed E. coli in the presence of erythromycin. LB plates were made with different concentrations of erythromycin: 0.2 μg mL⁻¹, 2.5 μg mL⁻¹, and 10 μg mL⁻¹. Liquid cultures of transformed E. coli BL21 were prepared and plated onto these plates to determine whether E. coli remained viable after exposure to the antibiotic. Colony growth and fluorescence were observed (Figure 13), showing a detectable level of fluorescence in the presence of erythromycin. However, it was not established whether this fluorescence was directly related to erythromycin or was basal.
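Preparing plates at these concentrations comes down to the dilution relation C1·V1 = C2·V2. The sketch below computes the stock volume needed per batch of medium; the stock concentration (10 mg/mL) and the medium volume (25 mL) are assumed placeholder values that are not given in the text, and the function name is hypothetical.

```python
def stock_volume_ul(target_ug_per_ml: float,
                    final_volume_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Volume of stock (in microlitres) giving target_ug_per_ml in final_volume_ml.

    Derived from C1 * V1 = C2 * V2; 1 mg/mL equals 1 ug/uL.
    """
    if min(target_ug_per_ml, final_volume_ml) < 0 or stock_mg_per_ml <= 0:
        raise ValueError("concentrations and volumes must be positive")
    return target_ug_per_ml * final_volume_ml / stock_mg_per_ml


if __name__ == "__main__":
    ASSUMED_STOCK_MG_PER_ML = 10.0   # placeholder, not from the text
    ASSUMED_MEDIUM_ML = 25.0         # placeholder, not from the text
    for target in (0.2, 2.5, 10.0):  # concentrations used for the plates
        volume = stock_volume_ul(target, ASSUMED_MEDIUM_ML, ASSUMED_STOCK_MG_PER_ML)
        print(f"{target} ug/mL -> {volume:.2f} uL of stock")
```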
Optimizing overexpression for subsequent protein purification is the next step. This will be done in the hopes of adding the purified protein directly to the sample with erythromycin to ensure proper detection and remove the risk of unwanted mutations and genetic variability caused by evolutionary pressure. Additionally, this will help calibrate the baseline of fluorescence with different concentrations of antibiotic to improve read accuracy and effectiveness.
In the future, and provided that ligand binding induces conformational changes in the sensor, EryK could be switched out for a different protein. This would allow detection of different emerging contaminants such as other antibiotics, per- and polyfluorinated substances, and heavy metals.
The current FRET system design, incorporating biopart BBa_K4447001, would allow us to determine the presence of erythromycin in water bodies, which represents a first step towards the detection of antibiotics, emerging contaminants of significant importance. An advantage of our FRET system is the potential to replace biopart BBa_K4447001 with other genes whose products respond to different compounds, such as other antibiotics or heavy metals.
The aforementioned reasons led us to search for an enzyme that catalyzes a reaction involving a heavy metal, with the aim of generating a new biopart and incorporating it into the biosensor as a step towards detecting heavy metals in water bodies. The enzyme chosen for the generation of the biopart was phytochelatin synthase (PCS) (EC 2.3.2.15), which catalyzes the synthesis of glutathione (GSH) polymers called phytochelatins (PCs), with cadmium being the main inducer in plants and other organisms (García-García et al., 2014; 2020).
As with the structure for ECFP-EryK-mVENUS, a ColabFold model was obtained using the provided sequence for AtPCS (UniProt: Q9S7Z3) (Figure 14).
The sequence corresponds to a monomer, which ColabFold modeled with high confidence according to the PAE and pLDDT values (Figure 15).
MD simulations were performed to assess whether the predicted structure was conserved after including biophysical potentials. A preliminary simulation was run with GROMACS using model 5 from ColabFold. The topology was generated with the same force field (AMBER99SB-ILDN) and water model (TIP3P) that were used for the simulation of the other construct (Lindorff-Larsen et al., 2010; Jorgensen et al., 1983). The simulation was run in the same way and with the same parameters as the one for ECFP-EryK-mVENUS (Figure 16).
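Since both simulations share the same setup, the preparation steps could be scripted along the lines below. This is a sketch, not the commands actually used: it only builds the GROMACS command lines (it does not run them), all file names are hypothetical, the 1.2 nm box margin is the value reported for the ECFP-EryK-mVENUS simulation, and the box shape is left at the tool default because it is not stated.

```python
import shlex


def setup_commands(model_pdb: str, box_margin_nm: float = 1.2):
    """Build GROMACS command lines for topology, box, solvation and ions.

    All file names are HYPOTHETICAL. 'ions.mdp' is assumed to be a minimal
    parameter file supplied by the user.
    """
    return [
        # Topology with the force field and water model named in the text.
        ["gmx", "pdb2gmx", "-f", model_pdb, "-o", "processed.gro",
         "-p", "topol.top", "-ff", "amber99sb-ildn", "-water", "tip3p"],
        # Centre the protein and set the solute-to-edge distance.
        ["gmx", "editconf", "-f", "processed.gro", "-o", "boxed.gro",
         "-c", "-d", str(box_margin_nm)],
        # Fill the box with water (spc216.gro is the generic 3-site water box).
        ["gmx", "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
         "-o", "solvated.gro", "-p", "topol.top"],
        # Assemble a run input so genion can read charges.
        ["gmx", "grompp", "-f", "ions.mdp", "-c", "solvated.gro",
         "-p", "topol.top", "-o", "ions.tpr"],
        # Neutralise the net charge; genion asks interactively which group
        # (normally the solvent) to replace with ions.
        ["gmx", "genion", "-s", "ions.tpr", "-o", "neutral.gro",
         "-p", "topol.top", "-pname", "NA", "-nname", "CL", "-neutral"],
    ]


if __name__ == "__main__":
    for command in setup_commands("model_5.pdb"):
        print(shlex.join(command))
```

Keeping the commands as argument lists makes it straightforward to reuse the same setup for each construct by changing only the input model.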
Unlike the ECFP-EryK-mVENUS model, there are no published experimental structures of AtPCS1. The simulation shows that the protein structure does not significantly diverge from the ColabFold model and, while it does not encompass our entire construct, it is still informative as the fluorophores are not expected to induce conformational changes. However, a simulation in the presence of the metal ion would be necessary to determine whether the enzyme undergoes a conformational change when catalyzing the reaction.
The goal of this was to construct the vector that would allow the expression of AtPCS in E. coli. In order to achieve this, the following workflow was necessary:
This goal was successfully achieved as shown in Figure 17, where a band corresponding to the length of the AtPCS gene was identified in sample D1 from colony 1 after a digestion assay with enzymes NdeI and EcoRI HF.
The goal of this step was to confirm that E. coli was able to synthesize the AtPCS enzyme. To achieve this, the following workflow was carried out:
This goal was partially achieved, as several attempts were made to induce expression of AtPCS. Induction was attempted at different temperatures and for different periods of time. The last attempt, at 16 °C overnight, produced a very faint, smear-like band on the SDS-PAGE gel, as shown in Figure 18.
Multiple attempts were carried out to overexpress the protein, varying the induction conditions (incubation time and temperature) in an attempt to optimize them; however, no band was observed on the gel after performing the purification process using the colony 1 culture. After setting a 5 mL pre-culture from colony 4, we proceeded with inoculation, cell harvesting and sonication. This was done to continue with a final attempt of purification through a Ni-NTA resin (ThermoFisher). An SDS-PAGE was run to confirm the presence of our protein. Unfortunately, no protein was purified (Figure 19).
Optimization of induction conditions is needed to confirm AtPCS enzyme production. To complete the incorporation of the enzyme into the FRET system, the plans for future activities are as follows:
This is meant to indicate whether AtPCS produced by E. coli is able to interact with cadmium, which is necessary for its incorporation into the FRET system, as it would allow the detection of this heavy metal. Furthermore, it would help determine whether a conformational change occurs.
Linking AtPCS with ECFP and mVENUS will allow the generation of a fluorescent signal once the enzyme interacts with cadmium. This step is meant to assemble the components of the FRET system so it can be produced by E. coli.
Once the protein construct is overexpressed it will need to be tested. To do so, the construct will be purified and then tested with different amounts of cadmium. We expect to be able to detect a change in FRET signal with an increasing intensity as the concentration of cadmium becomes higher.
Experiments undertaken and future endeavors are summarized and displayed in Figure 20.