Throughout the development of our systems, we have utilized iGEM's engineering cycle, particularly in the following aspects:
To validate the findings of a previous iGEM team and investigate the genotoxic effects of carcinogens, we conducted dosage-dependent tests on TOP10 E. coli. We used the RecA(K3) promoter (BBa_K3020001, K3) in conjunction with EGFP (BBa_K3020002) as a bioreporter to detect DNA-damaging agents such as UV, H2O2 and nalidixic acid. The RecA(K3) promoter is an optimized version of BBa_K629001 (K6), a design by team SYSU 2011 aimed at driving the motor system of E. coli in a radioactive environment. To compare the DNA-damage detection ability of the RecA(K3) and RecA(K6) promoters, we designed the constructs RecA(BBa_K3020001)-EGFP (BBa_K4814015) and RecA(BBa_K629001)-EGFP (BBa_K4814014), and measured the GFP emission from these two promoter designs when exposed to carcinogens. Furthermore, we added new composite parts using different RBSs: Strong (BBa_K4814013) and B0032 (BBa_K4814002).
As the efficiency of FRET depends largely on the molecular separation and overall spatial arrangement of the two fluorophores involved, further verification is needed to ensure the reliability of the sequences we designed for our FRET system and thereby validate the data acquired from FRET imaging. To accomplish this, we developed a protein model. The model involves sequence-based structure prediction, RMSD calculations, protein-protein docking, and distance measurements. We aim to predict and elucidate the interaction between ATRIP-eGFP and RPA1-mCherry, shedding light on the behavior of the two fluorescent proteins as they interact. The model also helps rule out the possibility that unexpected or failed experimental outcomes are due to infeasible protein binding. As our experiments progressed and the model advanced to measure the separation between the fluorescent proteins, it provided further support that the increased FRET observed after UV treatment was most likely due to the close proximity of the proteins.
To test and investigate the promoters RecA(K3) and RecA(K6), we designed two composite parts, RecA(K6) (BBa_K4814014) and RecA(K3) (BBa_K4814015).
We transformed our designed plasmids into E. coli TOP10. The bacterial culture was incubated overnight for 18 hours and diluted to an OD of 0.23 before the experiments.
Initially, the bacteria were treated and transferred into 96-well plates to measure fluorescence and optical density (absorbance at 600 nm). The dosage references were obtained from team BIT 2019. Additionally, we tested aspartame at a concentration of 1 g/L based on the study by Yılmaz and Uçar (2014).
During the initial three experiments, a plate reader was employed to perform fluorescence scans on 96-well plates. However, as demonstrated by the graphical representations, the observed fluorescence levels do not exhibit an increase with the escalating quantities of UV and H2O2. On the contrary, a decrease in fluorescence is observed. Our hypothesis postulates that the utilization of the peripheral wells of the 96-well plate may result in the evaporation of the well contents during the incubation process, thereby inducing an edge effect. Consequently, this phenomenon may introduce a potential bias in the obtained results.
We calculated the standard error (SE), with n=3:
$$SE = \frac{\sigma}{\sqrt{n}}$$
where σ is the standard deviation, and n is the number of trials.
1. After conducting the tests, as the fluorescence decreased when the cells were exposed to higher concentrations of carcinogen, we recognized the need to account for potential fluorescence emitted by dead cells. We improved the experiment by (1) normalizing the fluorescence using SFU (Test 1.2) and (2) using cell imaging to obtain the mean fluorescence of each bacterium (Cycle 2).
2. We observed that our previous configuration, which involved utilizing all the wells of the plate, resulted in the occurrence of the "edge effect." This effect has the potential to significantly impact the standardization of the data (Mansoury, M. et al, 2021). The edge effect arises when there is a discrepancy in the rate of evaporation and aeration among the cultures within the multi-well plate (Ross, D., Tonner, P. D., & Vasilyeva, O. B., 2022).
Consequently, we restructured the arrangement of our samples within the 96-well plates to mitigate this issue.
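One common way to reduce the edge effect is to leave the outermost wells unused and place samples only in the inner wells. The sketch below illustrates that idea; it is an assumption about the layout, since the exact arrangement used is not stated here, and the helper names `inner_wells` and `assign` as well as the sample labels are hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:8]  # rows A-H of a 96-well plate
COLS = list(range(1, 13))          # columns 1-12


def inner_wells():
    """Return the 60 wells that do not touch the plate edge (B2 to G11)."""
    return [f"{row}{col}" for row in ROWS[1:-1] for col in COLS[1:-1]]


def assign(samples, replicates=3):
    """Map each sample name to `replicates` consecutive inner wells."""
    wells = inner_wells()
    needed = len(samples) * replicates
    if needed > len(wells):
        raise ValueError(f"{needed} wells needed, only {len(wells)} inner wells")
    layout = {}
    for i, sample in enumerate(samples):
        layout[sample] = wells[i * replicates:(i + 1) * replicates]
    return layout


if __name__ == "__main__":
    # Hypothetical sample labels, three technical replicates each.
    plan = assign(["K3 no treatment", "K3 H2O2 5 mM", "K6 no treatment"])
    for sample, wells in plan.items():
        print(sample, wells)
```

The unused outer wells can be filled with sterile medium or water so that evaporation affects them rather than the samples.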
We conducted empirical investigations to further confirm the requisite conditions for the experiment.
After our first improved dosage-dependent experiment with RecA(BBa_K3020001)-EGFP and RecA(BBa_K629001)-EGFP, we plotted the change in the SFU of RecA(BBa_K3020001)-EGFP over several time points: 1, 2, 3, 6, and 24 hours after the treatment.
As can be seen from the graphs, the SFU rises considerably when the dosage increases at the third hour (red points), which also confirms the data collecting time (2-3 hours after treatment) stated by team BIT 2019.
As per the methodology described by team BIT 2019 (https://2019.igem.org/Team:BIT/Bio), we used SFU=(RFU/OD600) to normalize fluorescence. We compared the EGFP signal among different groups at the third hour. After comparing the data from three independent experiments, we observed a direct correlation between SFU and the intensity/concentration of the carcinogens, specifically UV and H2O2. In the aspartame group, both K3 and K6 exhibit a marginal increase in specific fluorescence units (SFU). This phenomenon can be attributed to the complex and indirect mechanism by which aspartame may potentially contribute to cancer development. Moreover, the performance of the K3 promoter was superior to that of the K6 promoter, indicating that the optimized K3 promoter was able to reduce background noise.
We calculated the standard error (SE), with n=3.
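The normalization and error calculation described above can be sketched as follows. This is a minimal illustration rather than the actual analysis script: the function names `sfu` and `mean_and_se` and the replicate readings are hypothetical, and the sample standard deviation (n-1 denominator) is assumed for σ, since the text does not say which form was used.

```python
import math
import statistics


def sfu(rfu, od600):
    """Specific fluorescence units: fluorescence normalized to cell density."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return rfu / od600


def mean_and_se(values):
    """Return (mean, standard error) where SE = sigma / sqrt(n)."""
    n = len(values)
    sigma = statistics.stdev(values)  # assumption: sample standard deviation
    return statistics.mean(values), sigma / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical plate-reader readings for one treatment group (n = 3).
    rfu_readings = [5200.0, 5050.0, 5400.0]
    od_readings = [0.41, 0.40, 0.43]
    values = [sfu(r, o) for r, o in zip(rfu_readings, od_readings)]
    mean, se = mean_and_se(values)
    print(f"SFU = {mean:.0f} +/- {se:.0f} (n = {len(values)})")
```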
The results collected at the third hour indicate that K3 performs better than K6 at reporting DNA damage.
We sought guidance from our advisors, Prof. Leo Tsz On Lee and Prof. Tzu-Ming Liu from the University of Macau, to obtain a more robust approach for quantifying the fluorescence emission of each bacterium. They recommended employing cell imaging techniques utilizing microscopes.
Upon further investigation, we observed that bacteria containing the RecA-EGFP plasmid exhibited green fluorescence even without any treatment. Consequently, to obtain accurate and objective measurements of the GFP levels, we employed image processing software such as ImageJ. This enabled us to analyze the images and derive reliable results rather than relying on subjective assessments to determine the GFP quantities.
During our initial attempt to capture images of the bacteria, we used a direct placement method by placing droplets of bacteria onto a glass slide. However, the resulting images were blurry and cloudy, as depicted in Figure 1. This undesirable outcome was attributed to the lack of lubrication oil application after each slide capture. Furthermore, due to the rapid movement of the bacteria within the culture, obtaining a clear and stable image of E. coli proved challenging. Consequently, calculating fluorescence measurements became difficult (July 14th).
During our subsequent attempt, we implemented several improvements to enhance the quality of our bacterial imaging. Firstly, we ensured the consistent application of lubricating oil and adjusted the microscope focus accordingly. Additionally, we introduced the use of agarose gel (0.1 g/ml) to stabilize the bacteria, yielding favorable outcomes with the bacteria ceasing their movement (Figure 3).
We optimized our familiarity with the experimental procedure, leading to a reduction in the treatment duration and precise control of the post-treatment time, which was set at three hours.
Moreover, we discovered that bacteria carrying the RecA-EGFP plasmid exhibited inherent green fluorescence without any additional treatment. Consequently, we employed software such as ImageJ to process the images, enabling us to obtain reliable results rather than relying on subjective determination of GFP levels (July 20th).
Later, we conducted the first of our three independent experiments using the confirmed conditions. We also used ImageJ to analyze our data.
It is evident that the disparity between group (c) and (d) (RecA K6 without H2O2 and with H2O2) is not highly significant. However, individual bacteria within group (b) (RecA(BBa_K3020001)-EGFP treated with H2O2) exhibit noticeably higher fluorescence intensity compared to group (a) (no treatment). Nevertheless, due to the absence of a readily discernible difference, we opted to employ ImageJ software for calculating the mean values. (August 8th)
Image panels: (a) RecA(BBa_K3020001)-EGFP, no treatment; (b) RecA(BBa_K3020001)-EGFP, H2O2 5 mM; (c) RecA(BBa_K629001)-EGFP, no treatment; (d) RecA(BBa_K629001)-EGFP, H2O2 5 mM.
In order to ensure a consistent and standardized approach to handling the images, we employed image processing techniques. Specifically, we utilized ImageJ software to convert the images into an 8-bit format and applied the RenyiEntropy model to modify the threshold. This particular model, known for preserving additivity in independent events and aligning with the axioms of probability due to its multilevel thresholding property, was selected after testing various models.
By implementing this method, we were able to effectively encircle all bacteria, including those with lower fluorescence levels. This approach prevents biased selection solely based on brighter E. coli and ensures fairness and high-quality results in our analysis.
Threshold methods compared (image panels): Default, Mean, RenyiEntropy.
In certain instances, E. coli tends to aggregate and form clusters of fluorescent units. To ensure accurate analysis, we exclusively considered objects whose area ranged from 100 to 3000 square units. After restricting the selection to this specific range, we generated box plots using the resulting data points. The raw images were captured using a 100x confocal microscope, utilizing the GFP channel (488). Subsequently, the images were processed using ImageJ, applying an 8-bit format and RenyiEntropy threshold auto-adjustment.
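The object selection step can be illustrated with a small pure-Python sketch. It is not the ImageJ workflow itself: a fixed intensity cutoff stands in for the RenyiEntropy auto-threshold, objects are found with a simple 4-connected flood fill, and the function names `find_objects` and `filter_by_area` are hypothetical. Only the 100 to 3000 area window is taken from the text.

```python
from collections import deque


def find_objects(image, threshold):
    """Group pixels with intensity >= threshold into 4-connected objects.

    `image` is a list of rows of 8-bit intensities. Returns one dict per
    object with its pixel area and mean intensity.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or image[y][x] < threshold:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append(image[cy][cx])
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and image[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append({"area": len(pixels), "mean": sum(pixels) / len(pixels)})
    return objects


def filter_by_area(objects, min_area=100, max_area=3000):
    """Keep objects inside the area window, dropping debris and large clusters."""
    return [obj for obj in objects if min_area <= obj["area"] <= max_area]


if __name__ == "__main__":
    # Synthetic 8-bit image: one cell-sized object and one speck of debris.
    img = [[0] * 30 for _ in range(30)]
    for row in range(2, 14):
        for col in range(3, 13):
            img[row][col] = 200
    img[20][20] = 150
    kept = filter_by_area(find_objects(img, threshold=100))
    print([obj["mean"] for obj in kept])
```

The mean intensities of the retained objects are the data points that would go into a box plot for each treatment group.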
After confirming the imaging method, we decided to present both the SFU and cell imaging data. The imaging data improve accuracy and help us judge the performance of the K3 and K6 promoters.
As depicted in Figure 6, the fluorescence intensity of RecA(BBa_K3020001)-EGFP bacteria exhibits an upward trend with increasing duration of UVB exposure. This observation suggests that UVB radiation induces the SOS response in bacteria. A similar pattern is observed in the other treatment groups, including H2O2, nalidixic acid, and aspartame. Through this experiment, we have successfully validated the efficacy of utilizing the RecA promoter and EGFP fluorescent protein combination for evaluating genotoxicity based on the extent of DNA damage.
The imaging results (Fig. 7) of the K6 group (E. coli with RecA(BBa_K629001)-EGFP) indicate a lack of significant dosage-dependent response. This can be attributed to the enhanced performance of RecA(BBa_K3020001)-EGFP, which has undergone sequence optimization. The fluorescence intensity in the UVB exposure groups ranging from 6 to 18 minutes remained relatively unchanged. Similarly, in the aspartame and nalidixic acid groups, the fluorescence exhibited minimal variations. Conversely, the H2O2 group displayed a slight decrease, but the overall changes were not significant.
All figures above have one-way ANOVA significance p < 0.001, with significance comparison (with No Treatment) indicated by: ns = no significance; * = p < 0.05; ** = p < 0.01; *** = p < 0.001; **** = p < 0.0001.
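The star notation in the legend above corresponds to a simple set of p-value cutoffs. The helper below is only a sketch of that mapping; the name `significance_label` is hypothetical, and the p-values themselves would come from whatever statistics software performed the ANOVA and the comparisons against No Treatment.

```python
def significance_label(p_value):
    """Translate a p-value into the star notation used in the figure legend."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"


if __name__ == "__main__":
    # Hypothetical p-values for comparisons against the untreated group.
    for p in (0.2, 0.03, 0.004, 0.0002, 0.00001):
        print(p, significance_label(p))
```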
We aimed to further improve the reporter by replacing RBS B0034 (used in our composite part design) with RBS B0032 and a new RBS, Strong. To evaluate different compositions of RecA-RBS-EGFP and enhance the design of RecA-EGFP by substituting the RBS region, we conducted tests comparing the different RBSs.
The provided charts display the SFU of the DNA damage reporter with various RBS sequences, namely Strong (BBa_K4814013), B0032 (BBa_K4814002), and B0034 (BBa_K4814015). The original design of RecA(BBa_K3020001)-EGFP (K4814015) incorporated the B0034 RBS. In our study, we replaced B0034 with B0032 and Strong, a new RBS generated through machine learning analysis by Zhang, M. et al (2022). In that study, a machine learning model was developed to predict the translation initiation rate of different RBS sequences. The designed sequences were then synthesized and experimentally tested for their translation initiation rates.
We compared the fluorescence normalized to OD600 when exposed to genotoxic agents such as UV, H2O2, Aspartame, and Nalidixic acid.
We calculated the standard error (SE), with n=3.
In the case of aspartame, the SFU of all engineered bacteria decreases as the concentration of Aspartame increases. On the other hand, in the nalidixic acid group, the SFU of both the Strong and B0034 designs initially increases from 0 to 10 μg/ml, followed by a sharp decline between 10 and 100 μg/ml.
Zhang, M., Holowko, M. B., Zumpe, H. H., & Ong, C. S. (2022). Machine Learning Guided Batched Design of a Bacterial Ribosome Binding Site. ACS Synthetic Biology, 11(7), 2314-2326. https://doi.org/10.1021/acssynbio.2c00015
Yılmaz, S., & Uçar, A. (2014). A review of the genotoxic and carcinogenic effects of aspartame: does it safe or not?. Cytotechnology, 66(6), 875–881. https://doi.org/10.1007/s10616-013-9681-0
Mansoury, M., Hamed, M., Karmustaji, R., Al Hannan, F., & Safrany, S. T. (2021). The edge effect: A global problem. The trouble with culturing cells in 96-well plates. Biochemistry and Biophysics Reports, 26, 100987. ISSN 2405-5808. https://doi.org/10.1016/j.bbrep.2021.100987.
Ross, D., Tonner, P. D., & Vasilyeva, O. B. (2022). Method for reproducible automated bacterial cell culture and measurement. Synthetic Biology, 7(1), ysac013. https://doi.org/10.1093/synbio/ysac013
We obtained the structures of the fluorophores (eGFP, mCherry, eCFP, YFP) and their respective DNA damage response (DDR) proteins (ATRIP, RPA1) from Protein Data Bank (PDB). By docking them together, we plan to gain insight into their arrangements, their respective feasibilities, and their binding configurations, and then dock the bound structures together to further our understanding.
At first, we planned to use Discovery Studio’s function ZDOCK to predict the protein conformations. ZDOCK is an algorithm that predicts the structures of protein complexes by leveraging individual protein structures. It also takes into account critical factors such as Pairwise Shape Complementarity (PSC), Desolvation (DE), and Electrostatics (ELEC) to calculate and rank the docking[1].
Below are the equations of the scoring function[1]:
The results of our attempt are represented in the table below:
Receptor | Ligand | ZDock Score |
---|---|---|
ATRIP | EGFP | 25.02 |
RPA | mCherry | 19.74 |
ATRIP | ECFP | 27.54 |
RPA | EYFP | 17.62 |
We then docked RPA-1 with ATR-ATRIP, intending to combine the result with the fluorophore-receptor docking result for further analysis:
Figure 5, 6. Left: ZDock results of ATR-ATRIP complex-RPA. Right: Graph of cluster, density, and ZDock score, where each point represents a proposed structural configuration generated by ZDock.
In this approach, six structures from PDB were used:
The results consisted of clusters, cluster sizes, and ZDock scores. ZDock scores are generated by a scoring algorithm that ranks each pose based on the scoring function mentioned above; the higher the score, the more feasible the pose. In the results for RPA1 with ATR-ATRIP, there is a general downward trend in the number of clusters as ZDock scores increase. However, this correlation was not conclusive enough to show that our system was effective and efficient. Also, after extensive reading in the relevant field, we recognized flaws in this approach: modeling with the whole existing complex is not realistic, since it does not match our vector design, which only includes ATRIP, and the fluorescent proteins were not attaching to the sites we had intended. Simply docking the PDB files to form our engineered protein, instead of predicting our own proteins’ structures, neglects possible structural deformation, as FRET constructs and fluorophore-tagged proteins are not naturally occurring.
We then consulted Bioinformatics Scientist Mr. Edison Ong from Moderna, who advised us to take an alternative approach with a different set of software tools. The recommended method consisted of predicting the protein structure, validating it against existing proteins, docking the structures, and measuring the molecular separation in the docked poses. At the same time, we found that the PDB file of RPA-1 we used in Cycle 1.0 does not contain the protein-interacting domain (N-terminal) but only the DNA-binding domains (dbd-A, dbd-B, dbd-C). After correcting the sequence, we began by predicting how the peptide chains from our designed sequences would fold into proteins.
In order to give an accurate representation of the proteins expressed from our vectors, we obtained the amino acid sequences of the involved proteins from Vectorbuilder, the platform where our vectors were designed. Given the lack of known protein structures for our target proteins, ATRIP and RPA1, we employed AlphaFold[2][3] for the prediction of their 3D models. To initiate the process, we utilized the modified amino acid sequence of each protein and subjected it to AlphaFold's simulation.
The sequences used for this prediction are listed below in the format of RefSeq Transcript ID-linker-Marker:
Figure 9, 10. AlphaFold predictions of RPA1-mCherry(left), ATRIP-EGFP(right)
To check the accuracy of the predicted structures, we used PyMOL to align AlphaFold’s predictions with existing structures from PDB and calculated the RMSD (Root Mean Square Deviation) of each part of the protein relative to the known structures. The align function superimposes the predicted structure onto an existing structure and quantifies their similarity. In simple terms, a large RMSD indicates that the predicted structure deviates from the reference, and an RMSD below 5 Å is considered ideal.
Below are the RMSD values of RPA1-mCherry with two structures of RPA70 and mCherry. Codes in parentheses indicate the structure obtained from PDB.
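For reference, the RMSD itself is the square root of the mean squared distance between matched atom pairs. The sketch below computes it for two coordinate lists that are assumed to be already superimposed and matched atom by atom; it does not perform the sequence alignment, superposition, or outlier rejection that PyMOL's align carries out, and the function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Both lists are assumed to be superimposed already and ordered so that
    the i-th atom in one list corresponds to the i-th atom in the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical C-alpha positions (in angstroms) for three matched residues.
    predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.1, 0.0)]
    reference = [(0.1, 0.0, 0.0), (3.7, 0.2, 0.0), (7.6, 0.0, 0.3)]
    print(f"RMSD = {rmsd(predicted, reference):.3f}")
```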
RPA1-mCherry | |
---|---|
RPA70 N-term (2b29) | 0.369 |
RPA70 dbd-A, dbd-B, dbd-C (4gnx-C sequence: dbd-A:181-290, dbd-B:301-422, dbd-C:436-616) | 0.494, 0.662, 0.958 |
mCherry (4zio) | 2.581 |
Below are the RMSD values of ATRIP-EGFP with ATRIP and EGFP. Codes in parentheses indicate the structure obtained from PDB.
ATRIP-EGFP | |
---|---|
ATRIP(5yx0-C) | 43.318 |
EGFP(6b7t-A) | 0.298 |
PyMOL’s RMSD of the predicted ATRIP-EGFP against ATRIP was 43.318, which is extraordinarily high. This is expected, however, as the reference structure of ATRIP comes from an ATR-ATRIP complex rather than from ATRIP alone. Our vector contains only ATRIP, as ATR-ATRIP would be too large to fit into a single lentiviral vector. Therefore, the structure of ATRIP on its own will differ from the one in the complex.
We used ClusPro 2.0[4][5][6][7] next, utilizing its CPU, to perform molecular docking simulations. The docking calculations were carried out with hydrophobic-favored coefficients to enhance the accuracy of the results.
$$E = 0.40E_{rep} - 0.40E_{att} + 600E_{elec} + 2.00E_{DARS}$$
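The hydrophobic-favored score is a weighted sum of four energy terms (repulsive, attractive, electrostatic, and DARS) with weights 0.40, -0.40, 600, and 2.00. The sketch below only shows how such a weighted sum is evaluated; the individual term values are hypothetical placeholders, since ClusPro computes them internally, and the name `weighted_energy` is not part of any ClusPro interface.

```python
# Weights of the hydrophobic-favored coefficient set quoted above.
HYDROPHOBIC_FAVORED = {"rep": 0.40, "att": -0.40, "elec": 600.0, "dars": 2.00}


def weighted_energy(terms, weights=HYDROPHOBIC_FAVORED):
    """Combine per-pose energy terms into a single score as a weighted sum."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing energy terms: {sorted(missing)}")
    return sum(weights[name] * terms[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical term values for a single docked pose.
    pose = {"rep": 10.0, "att": 20.0, "elec": 0.01, "dars": -5.0}
    print(weighted_energy(pose))
```

The large weight on the electrostatic term reflects its small numerical scale relative to the other terms, so the weights should not be read as a direct ranking of importance.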
A total of 30 models were generated through the docking simulations. Subsequently, we focused our analysis on the top 10 models with the highest scores. Remarkably, in all of these models the fluorophore separation fell within the 10-100 Å range over which FRET is effective, indicating that our designed bio-reporter system is feasible in terms of configuration and proximity.
In addition to ClusPro, we opted for HDOCK to dock our predicted structures in hopes that the results from these simulations would be able to co-validate each other. In the case of any disparities in the outcomes of these two algorithms, it would be meaningful to compare the different scores, parameters and configurations that they may provide.
Each HDOCK model comes with two scores, a docking score and a confidence score: The docking scores are calculated by a knowledge-based iterative scoring function. More negative docking scores indicate more likely binding models. However, since the score has not been calibrated to experimental data, it should not be interpreted as the actual binding affinity of two molecules. The confidence score is determined based on the docking score and is designed to indicate the likelihood of binding between the protein-protein/RNA/DNA complexes. Generally, when the confidence score is above 0.7, the two molecules would be very likely to bind in this pose.[8] The calculation of the confidence score is defined as follows:
$$\mathrm{Confidence\_score} = \frac{1.0}{1.0 + e^{0.02 \times (\mathrm{Docking\_Score} + 150)}}$$
Based on the literature review, FRET can be an accurate measurement of molecular proximity within the range of angstrom distances (10–100 Å)[9]. Using PyMOL, we analyzed the results of both ClusPro and HDOCK by calculating the angstrom distance between the fluorophores attached to ATRIP and RPA in the poses generated by these algorithms.
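A distance check of this kind can also be scripted outside PyMOL. The sketch below reads C-alpha atoms from PDB-format text, takes the centroid of each fluorophore, and tests whether the centroid-to-centroid distance falls in the 10-100 Å window. It assumes standard fixed-column ATOM records; the chain IDs, residue ranges, and file name in the usage comment are hypothetical, and centroid distance is only one possible definition of fluorophore separation.

```python
import math


def ca_coords(pdb_text, chain, first_res, last_res):
    """Collect C-alpha coordinates for one chain and residue range."""
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        if first_res <= int(line[22:26]) <= last_res:
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def centroid(coords):
    """Arithmetic mean position of a list of (x, y, z) points."""
    if not coords:
        raise ValueError("no atoms selected")
    n = len(coords)
    return tuple(sum(point[i] for point in coords) / n for i in range(3))


def within_fret_range(distance, low=10.0, high=100.0):
    """True if the separation (in angstroms) lies in the 10-100 range."""
    return low <= distance <= high


# Hypothetical usage with a docked pose saved from ClusPro or HDOCK:
# with open("pose_01.pdb") as handle:
#     text = handle.read()
# donor = centroid(ca_coords(text, "A", 800, 1030))     # assumed EGFP residues
# acceptor = centroid(ca_coords(text, "B", 620, 850))   # assumed mCherry residues
# separation = math.dist(donor, acceptor)               # Python 3.8+
# print(separation, within_fret_range(separation))
```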
After results were obtained, we focused our analysis on the top 10 models with the highest scores using PyMOL. Remarkably, in all of these models the fluorophore separation fell within the 10-100 Å range over which FRET is effective, indicating that our designed bio-reporter system is feasible in terms of configuration and proximity.
Below are the docking scores and confidence scores for the top ten models, which are ranked by their docking scores:
Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Docking Score | -252.39 | -239.71 | -231.76 | -228.53 | -221.35 | -217.78 | -217.14 | -215.96 | -214.32 | -213.43 |
Confidence Score | 0.8857 | 0.8574 | 0.8369 | 0.8279 | 0.8064 | 0.7950 | 0.7930 | 0.7890 | 0.7835 | 0.7805 |
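The confidence scores in the table can be recomputed from the docking scores with the HDOCK formula, in which the exponent is 0.02 × (docking score + 150). The snippet below is a small sketch of that calculation; the function name `hdock_confidence` is our own, and the docking scores are the ten values from the table.

```python
import math


def hdock_confidence(docking_score):
    """Confidence score derived from an HDOCK docking score."""
    return 1.0 / (1.0 + math.exp(0.02 * (docking_score + 150.0)))


DOCKING_SCORES = [-252.39, -239.71, -231.76, -228.53, -221.35,
                  -217.78, -217.14, -215.96, -214.32, -213.43]

if __name__ == "__main__":
    for rank, score in enumerate(DOCKING_SCORES, start=1):
        confidence = hdock_confidence(score)
        flag = "likely to bind" if confidence > 0.7 else "uncertain"
        print(f"rank {rank:2d}: score {score:8.2f} -> confidence {confidence:.4f} ({flag})")
```

For example, the rank 1 docking score of -252.39 gives a confidence of approximately 0.8857, and a docking score of -150 would give exactly 0.5.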
In comparison to ClusPro, the results in HDOCK displayed a wider spectrum, ranging from 12.7Å to 132.5Å.
It is important to emphasize that the scores given by ClusPro and HDOCK are not directly correlated with the distances determined by PyMOL; poses with higher scores do not necessarily indicate a larger or smaller degree of separation. However, these highly-ranked poses are more likely to form, so analyzing their distance is relatively meaningful as it encompasses a substantial portion of the poses generated.
In this investigation, we obtained results from ZDock, AlphaFold, then PyMOL, ClusPro, HDOCK, and finally PyMOL again. The results were generally in favor of our system design despite a few discrepancies, such as a high RMSD between ATRIP-EGFP and ATRIP(5yx0-C), and a large range of distances between EGFP and mCherry in the poses generated by HDOCK. After analyzing the results, we believe that further validation can be completed experimentally through western blotting, which can confirm the proper folding of the expressed proteins.
[1] Chen, R., Li, L., & Weng, Z. (2003). ZDOCK: an initial-stage protein-docking algorithm. Proteins, 52(1), 80–87. https://doi.org/10.1002/prot.10389
[2] Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., & Steinegger, M. (2022). ColabFold: Making protein folding accessible to all. Nature Methods, 19(6), 679-682. https://doi.org/10.1038/s41592-022-01488-1
[3] Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A., Ballard, A. J., Cowie, A., Nikolov, S., Jain, R., Adler, J., Back, T., . . . Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589. https://doi.org/10.1038/s41586-021-03819-2
[4] Desta IT, Porter KA, Xia B, Kozakov D, Vajda S. Performance and Its Limits in Rigid Body Protein-Protein Docking. Structure. 2020 Sep; 28 (9):1071-1081. https://doi.org/10.1016/j.str.2020.06.006
[5] Vajda S, Yueh C, Beglov D, Bohnuud T, Mottarella SE, Xia B, Hall DR, Kozakov D. New additions to the ClusPro server motivated by CAPRI. Proteins: Structure, Function, and Bioinformatics. 2017 Mar; 85(3):435-444. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5313348/pdf/nihms834822.pdf
[6] Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, Beglov D, Vajda S. The ClusPro web server for protein-protein docking. Nature Protocols. 2017 Feb;12(2):255-278. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5540229/pdf/nihms883869.pdf
[7] Kozakov D, Beglov D, Bohnuud T, Mottarella S, Xia B, Hall DR, Vajda, S. How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics. 2013 Dec; 81(12):2159-66. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3934018/pdf/nihms556382.pdf
[8] Yan, Y., Tao, H., He, J., & Huang, S. (2020). The HDOCK server for integrated protein–protein docking. Nature Protocols, 15(5), 1829-1852. https://doi.org/10.1038/s41596-020-0312-x
[9] Sekar, R. B., & Periasamy, A. (2003). Fluorescence resonance energy transfer (FRET) microscopy imaging of live cell protein localizations. The Journal of cell biology, 160(5), 629–633. https://doi.org/10.1083/jcb.200210140