Demonstration of the engineering design cycle.
Our lab efforts were split between two kinds of plasmid: an expression vector and storage vectors. The expression vector was used to test our gene circuit in PFAS, while the storage vectors were used to transform bacteria into “warehouses” that give easy access to our gene inserts.
Note: Full protocols for each step can be found on the Experiments and Notebook pages.
The gene inserts and their placement in the plasmid were designed by our team (see the Design page of our wiki). Preliminary protocols for our lab work were also written, covering ligation of the insert, PFAS testing, and lactose testing (as a control). These protocols included basic lab techniques such as miniprep, PCR purification columns, and transformation. In this cycle of the experiment, we wanted to make sure that the ligation procedure would produce our desired insert.
In the lab, the dried DNA delivered by Twist was resuspended, and an aliquot of each of the resuspended parts A, B, and C was taken for assembly, with the rest to be stored in plasmids (see Storage Vectors). Part A was digested with SpeI, B with XbaI, and C with XbaI overnight. A PCR cleanup column procedure was performed on all 3 parts. A and B were then ligated with T4 DNA Ligase to form part AB. AB was then digested with SpeI before going through a PCR cleanup column. After cleanup, AB was ligated with C to form ABC, which was digested with EcoRI and PstI to prepare it for insertion into a vector backbone. However, during the final PCR cleanup procedure for part ABC, Wash Buffer was added instead of Binding Buffer by accident. This resulted in a low yield for the ABC insert, necessitating multiple cycles of recovery cleanup through gradual addition of Binding Buffer and ethanol evaporation.
Due to the mistake in the last step of gene insert preparation, testing of the insert with the Bioanalyzer was postponed. It was decided that the process of creating the insert should be restarted from scratch.
In the next iteration of the engineering cycle, the goal was to recreate the composite part ABC without mistakes in the cleanup process. All other steps were correctly executed in the first cycle according to our planned procedure, so ideally the only alteration would be the avoidance of using the incorrect buffer during cleanup. We also decided to shorten digest times because ThermoFisher enzymes do not require overnight incubation and we had limited time left in the lab. Overnight incubation was only done in the first cycle because the entire assembly procedure could not be completed in one day, so digestion points were used as appropriate stopping points for lab work days.
The original steps set down in our lab notebook would be followed closely, with extra care taken during the cleanup stages. The goal of this cycle remained unchanged: to assemble A, B, and C so that a gel analysis of our insert would show a band at ~3000 bp.
Every step described above and on our Experiments page for the assembly of A, B, and C was carried out correctly, including the cleanup step that had gone wrong in the previous cycle.
The gene insert was loaded into an Agilent 2100 Bioanalyzer to compare band sizes. A dark band was found at 1000 bp, consistent with the size of A or B individually. A lighter band was present at 2000 bp, consistent with the size of the composite AB part. However, no band was detected at 3000 bp, which would’ve been representative of all 3 parts correctly ligated together.
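The band interpretation above can be sketched as a small lookup. This is only an illustration: the part sizes are assumed to be roughly 1000 bp each (inferred from the band sizes quoted above, with C's size implied by the 3000 bp target), and the function names and the 10% tolerance are hypothetical choices, not part of our protocol.

```python
# Assumed approximate part sizes in base pairs (inferred from the quoted bands).
PART_SIZES = {"A": 1000, "B": 1000, "C": 1000}

def expected_products(sizes):
    """Return expected sizes for the single parts, AB, and ABC."""
    ab = sizes["A"] + sizes["B"]
    return {**sizes, "AB": ab, "ABC": ab + sizes["C"]}

def classify_band(band_bp, products, tolerance=0.10):
    """Return the product names whose expected size is within +/- tolerance of the band."""
    return sorted(
        name for name, size in products.items()
        if abs(band_bp - size) <= tolerance * size
    )

products = expected_products(PART_SIZES)
observed_bands = [1000, 2000]  # bands reported by the Bioanalyzer in this cycle
calls = {band: classify_band(band, products) for band in observed_bands}
full_construct_seen = any("ABC" in names for names in calls.values())

print(calls)
print("ABC detected:", full_construct_seen)
```

Because the single parts are assumed to be about the same length, a 1000 bp band cannot be assigned to one specific part by size alone.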
From the Bioanalyzer results, it was evident that something had gone wrong between the assembly of part AB and the final assembly of ABC. The faintness of the 2000 bp band also indicated a problem with ligation. However, we did not have time to redo the entire procedure, so we planned to re-ligate AB and C from surplus pre-digested aliquots of AB and C.
Because of the low ligation efficiency shown by the Bioanalyzer and the absence of the full ABC fragment, ligation time was increased beyond what our ligase supplier recommended. Because we were also very short on time, we started from surplus AB and C left over from the previous cycle.
We had some surplus materials labeled as AB and C, and in our experiment, we chose to use the remaining AB along with an appropriate proportion of C. Initially, when we combined components A and B, we used a specific quantity of each. We followed the same procedure when we introduced component C during the ligation process of ABC. Although the kit supplier recommended a 5-minute ligation period with the provided enzyme quantity, we decided to extend this time significantly. Consequently, we performed a 15-minute ligation step and a 45-minute digestion step.
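One common way to choose "an appropriate proportion" of two fragments is to match them by moles rather than by mass, scaling by fragment length. The sketch below shows that arithmetic only; the 20 ng starting mass, the 1:1 ratio, and the function name are hypothetical and are not the quantities we used.

```python
def partner_mass_ng(reference_mass_ng, reference_bp, partner_bp, molar_ratio=1.0):
    """Mass of the partner fragment giving `molar_ratio` moles per mole of reference.

    Double-stranded DNA mass is proportional to length, so equal moles of a
    shorter fragment weigh proportionally less.
    """
    if min(reference_mass_ng, reference_bp, partner_bp, molar_ratio) <= 0:
        raise ValueError("all inputs must be positive")
    return reference_mass_ng * (partner_bp / reference_bp) * molar_ratio

# Hypothetical example: 20 ng of AB (~2000 bp) ligated with C (~1000 bp) at 1:1.
mass_c = partner_mass_ng(reference_mass_ng=20.0, reference_bp=2000, partner_bp=1000)
print(f"C needed: {mass_c:.1f} ng")
```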
We loaded the ABC construct created in this cycle as well as the ABC construct created in Cycle 1 into the Bioanalyzer to compare band intensities and locations.
Bioanalyzer results for the ABC construct made in Cycle 3. A small band is present at 2000 bp, but no discernible band is present at 3000 bp.
Bioanalyzer results for the ABC construct created in Cycle 1. The Bioanalyzer showed that the A, B, and C parts used in Cycle 1 did not ligate at all. In the 2nd and 3rd engineering cycles, the parts did partially ligate to form AB but did not successfully incorporate part C. This means no detectable amount of the full ABC construct was made in our lab. We decided not to pursue another cycle due to extreme time constraints.
Ligation remained a large problem with the procedure, as the parts evidently did not join together as expected. It is possible that the large size of our inserts decreased the reaction efficiency during the ligation step, so in future assemblies with large parts a longer ligation time is strongly recommended. More testing will have to be conducted to determine exactly how much longer the ligation should run. If we had had more time, we would also have miniprepped each insert to get a larger volume of each to work with, as that would have mitigated the DNA lost in each cleanup step. We were using nanogram quantities of DNA in these cycles because we didn’t have time to make more by miniprepping, so having more DNA would have increased our tolerance for errors.
The design of the inserts in this portion was similar to that of the expression portion. Rather than ligating all three inserts together, the objective of this part of the project was to create a stored stock of each part of the insert, which could then be extracted in the future for ligation and testing.
The first step was to digest and prepare the plasmids for ligation with the inserts. Plasmids were chosen for their restriction sites (EcoRI and PstI), which resulted in 43 viable plasmids. Out of these 43, the first 4 were taken for experimentation (BBa_I20270 - A11, BBa_J364000 - C11, BBa_J364001 - E11, BBa_J364002 - G11). These plasmids were aliquoted from the distribution kit and transformed into bacteria, which were streaked for overnight growth. A colony was taken from each plate for the Miniprep procedure, and the resulting supernatant fluid was used as A11, C11, E11, and G11.
The plasmid concentrations were measured on a Nanodrop, and based on the results, C11, E11, and G11 were used. Each of them was digested separately with EcoRI and PstI overnight. They were then run on the Bioanalyzer. After seeing good results, the plasmids were run on an electrophoresis gel for purification with a gel purification column. However, when the gel was examined for bands, none were present.
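A screen like the one described above can be sketched in a few lines. The recognition sequences for EcoRI (GAATTC) and PstI (CTGCAG) are standard; everything else here is an assumption for illustration, including the "exactly one site each" criterion, the function names, and the toy sequences, which are not real distribution-kit plasmids.

```python
SITES = {"EcoRI": "GAATTC", "PstI": "CTGCAG"}

def count_sites(sequence, site):
    """Count occurrences of `site` in a circular sequence.

    Both recognition sequences are palindromic, so scanning one strand is enough.
    """
    seq = sequence.upper()
    wrapped = seq + seq[: len(site) - 1]  # lets matches span the origin
    count, start = 0, wrapped.find(site)
    while start != -1:
        count += 1
        start = wrapped.find(site, start + 1)
    return count

def is_viable(sequence):
    """Assumed criterion: exactly one EcoRI site and exactly one PstI site."""
    return all(count_sites(sequence, site) == 1 for site in SITES.values())

# Toy sequences only, to exercise the function.
toy_plasmids = {
    "one_of_each": "AAGAATTCTTTTCTGCAGAA",
    "two_ecori": "AAGAATTCTTGAATTCCTGCAG",
    "site_spans_origin": "TTCAAAACTGCAGGGGAA",
}
viable = sorted(name for name, seq in toy_plasmids.items() if is_viable(seq))
print(viable)
```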
A few problems were observed in our build procedure, which we hoped to fix in the second iteration. First, we needed to follow the miniprep protocol carefully: we had accidentally skipped the Resuspension Buffer, which may have impacted our results. Second, we needed to pick a single colony before the miniprep: our plates contained many lawns, which made it hard to isolate a single colony. Third, the Nanodrop results were unreliable because water was used as the blank instead of the elution buffer.
Plasmid preparation was restarted by taking more of the already extracted A11, C11, E11, and G11. These were miniprepped while following the protocol carefully. The plasmid concentrations were measured on the Nanodrop, and A11, C11, and E11 were found to be the purest. A PCR Purification column was used to purify the plasmids, and the plasmids were digested.
The undigested and digested plasmids were run on a Bioanalyzer and sufficient concentrations were seen. However, when they were run side by side on a gel and the gel was analyzed for bands, sufficient bands weren’t seen for A11, C11, or E11, regardless of the results in the Bioanalyzer.
Although we weren’t sure why there were no bands present on the gel, we realized that the Nanodrop results shouldn’t be the only factor we consider when deciding which plasmids to use. We also realized that careful gel technique is necessary for the gel purification protocol, or the bands won’t show up.
After seeing the lack of A11, C11, and E11 on the gel, G11 was quickly digested and run through a PCR Purification Column. After purification, G11 was nanodropped. After seeing the low concentration present, G11 was eluted 2 more times in order to extract any remnant plasmid in the column. The 3 elutions of G11 were run through a Bioanalyzer for verification. After verifying the presence of enough plasmid, the digested A, B, and C parts were each inserted into separate plasmids through a ligation protocol. After ligation, bacteria were transformed and streaked for storage.
The ligated fragments were run through a bioanalyzer, which showed sufficient amounts of ligation and a high concentration of ligated plasmid. The following day, the plate growth was also verified (shown below).
Although no work has yet been done with these storage bacteria, we did adjust our lab expectations for this competition season. We originally expected to do transformations and testing for both the PFAS-sensitive and lactose-sensitive plasmids, but after the lab work described above, we realized that only the transformation of the PFAS-sensitive insert was feasible in that timeframe.
In the initial cycle of our OpenMM project, our primary objective was to conceptualize a simulation that would allow us to explore the interactions between AHL (3OC6HSL), LuxR, and PFOA. We began by investigating available modeling tools such as VCell, but quickly realized their limitations, particularly their reliance on mass action equations and kinetics for predictions. Realizing the need for more comprehensive simulations, we sought advice from Dr. Monsen, who guided us toward a more sophisticated approach. Given our time constraints and limited lab access, molecular dynamics emerged as an attractive alternative. As we mapped out the simulation, our main focus was on understanding how the ligand moved within the molecular structure after docking. This required us to find a method for docking the ligand and protein together while maintaining precise control over temperature and pressure. After a thorough examination of the OpenMM documentation, we decided that an NPT simulation would best suit our needs, offering temperature and pressure control, with a Monte Carlo barostat regulating the pressure, while letting us record vital data such as temperature and energy.
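To show what a Monte Carlo barostat does in an NPT simulation, here is a toy sketch in plain Python. It is not OpenMM's implementation and does not use OpenMM: it applies the textbook acceptance rule for a trial volume change, with the work term dW = dE + P*dV - N*kT*ln(V_new/V_old), to an ideal gas (so dE = 0). The units are arbitrary and all parameter values and names are assumptions chosen for illustration.

```python
import math
import random

def sample_volumes(n_molecules, kT, pressure, v_start, steps, max_dv, seed=0):
    """Sample box volumes of an ideal gas at constant pressure by Metropolis moves."""
    rng = random.Random(seed)
    volume = v_start
    trace = []
    for _ in range(steps):
        trial = volume + rng.uniform(-max_dv, max_dv)
        if trial > 0:  # non-positive volumes are never accepted
            # Ideal gas: no potential energy change, only the P*dV and entropy terms.
            delta_w = (pressure * (trial - volume)
                       - n_molecules * kT * math.log(trial / volume))
            if delta_w <= 0 or rng.random() < math.exp(-delta_w / kT):
                volume = trial
        trace.append(volume)
    return trace

trace = sample_volumes(n_molecules=100, kT=1.0, pressure=1.0,
                       v_start=50.0, steps=20000, max_dv=10.0)
equilibrated = trace[2000:]  # discard the initial relaxation
mean_volume = sum(equilibrated) / len(equilibrated)
print(f"mean volume after equilibration: {mean_volume:.1f}")
```

For these settings the sampled volume should settle near (N + 1) * kT / P, the mean of the ideal-gas volume distribution under this rule, while fluctuating from step to step.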
With the simulation design in place, the next step was to construct the script. Our first task was to dock the ligand and protein, which led us to discover the ROSIE server. This powerful tool generated over 200 iterations, and from this pool, we carefully selected the top 10 for inclusion in our simulation. Given our limited experience, we often defaulted to simplicity and used default settings, even opting to exclude water from the simulation.
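Picking the best poses from a docking run amounts to sorting by score. The sketch below assumes each pose has a single numeric score where lower is better (the usual convention for Rosetta-style energies); the pose names and scores are made up for illustration and are not ROSIE output.

```python
def top_poses(scores, n=10):
    """Return the names of the n best-scoring poses (lowest score first)."""
    return sorted(scores, key=scores.get)[:n]

# Hypothetical scores for 200 poses; real values would come from the docking output.
pose_scores = {f"pose_{i:03d}": float((i * 37) % 200 - 100) for i in range(200)}
best = top_poses(pose_scores, n=10)
print(best)
```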
When we attempted to execute the script, we encountered a series of errors. These ranged from minor syntax issues and deprecated libraries to a more significant problem: OpenMM's inability to recognize our ligand and assign appropriate charges to it. To name some of the errors we received: our original plan to use SwissDock did not work because it did not accept our input files, which led us to use ROSIE instead. In our input files, OpenMM could not identify the name of the ligand residue, so it was treated as unknown, causing errors. One of the most fatal errors we received was that whenever we tried to do anything involving the ligand's bonds, OpenMM could not recognize the bond or its charge.
Through these initial challenges, we learned several critical lessons. We discovered that the forcefields we initially employed lacked essential data for our ligand. While forcefields for proteins are well-documented and encompass various residues, ligands typically lack such comprehensive information. Moreover, OpenMM struggled with charge assignment for the ligand molecule, resulting in additional errors and complications. We also discovered that files produced by online servers aren’t always 100% accurate and may still need to be edited by hand. Finally, we realized that for this to work, we would need to develop a forcefield for the specific ligand, which became a challenge in its own right.
In our second cycle of OpenMM exploration, we approached the project with a more informed strategy. Our focus shifted towards designing the necessary .xml files (forcefields) while ensuring accurate partial charge assignments for the ligand. We recognized that creating forcefields was a complex task, often requiring quantum chemistry tools. In addition, we explored the possibility of incorporating water into the simulation to enhance the accuracy of our results. To streamline our code and reduce complexity, we decided to implement a single system encompassing all four forcefields: the protein forcefield, water forcefield, ligand-protein forcefield, and ligand-specific forcefield. This replaced our earlier setup, a system with two subsystems, one for the ligand and one for the protein.
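A quick pre-flight check for the "unknown residue" problem is to list the residue names in the input PDB that a protein forcefield would not know. The sketch below reads the residue name from columns 18-20 of ATOM/HETATM records, as defined by the PDB format. The ligand name LIG, the helper `pdb_line`, and the shortened demo records (coordinates omitted) are hypothetical.

```python
# The 20 standard amino acids plus water.
KNOWN_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
    "HOH",
}

def unknown_residues(pdb_text):
    """Count atoms per residue name that is not in KNOWN_RESIDUES."""
    found = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 20:
            resname = line[17:20].strip()  # PDB columns 18-20
            if resname not in KNOWN_RESIDUES:
                found[resname] = found.get(resname, 0) + 1
    return found

def pdb_line(record, serial, atom, resname, chain, resseq):
    """Build the first 26 columns of a PDB atom record (demo helper only)."""
    return f"{record:<6}{serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}"

demo_pdb = "\n".join([
    pdb_line("ATOM", 1, "N", "MET", "A", 1),
    pdb_line("ATOM", 2, "CA", "MET", "A", 1),
    pdb_line("HETATM", 3, "C1", "LIG", "B", 1),
    pdb_line("HETATM", 4, "O1", "LIG", "B", 1),
])
print(unknown_residues(demo_pdb))
```

Any name reported here needs a matching residue template in one of the loaded forcefield files before a system can be built.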
During the building phase, we relied heavily on the default models provided by OpenMM to minimize the risk of errors, as this was one of the few resources available to us. We also learned how to use antechamber to generate a file containing our charges, which could then be processed through an online server to create a forcefield file. However, fine-tuning was still necessary, particularly concerning the Coulomb scale for the non-bonded force.
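Two things worth checking in a ligand forcefield file are the Coulomb 1-4 scale and whether the partial charges add up to the intended net charge. The sketch below does both with the standard library, assuming the common OpenMM XML layout where charges sit on `<Atom>` entries inside `<NonbondedForce>`. The three-atom "LIG" residue, its atom types, and all numeric values are toy placeholders, not our ligand's parameters.

```python
import xml.etree.ElementTree as ET

# Toy forcefield fragment; every name and number here is a placeholder.
FFXML = """
<ForceField>
  <Residues>
    <Residue name="LIG">
      <Atom name="C1" type="lig-c1"/>
      <Atom name="O1" type="lig-o1"/>
      <Atom name="H1" type="lig-h1"/>
    </Residue>
  </Residues>
  <NonbondedForce coulomb14scale="0.833333" lj14scale="0.5">
    <Atom type="lig-c1" charge="0.25" sigma="0.34" epsilon="0.36"/>
    <Atom type="lig-o1" charge="-0.5" sigma="0.30" epsilon="0.88"/>
    <Atom type="lig-h1" charge="0.25" sigma="0.11" epsilon="0.07"/>
  </NonbondedForce>
</ForceField>
"""

def coulomb14_scale(root):
    """Read the 1-4 electrostatic scaling factor from the NonbondedForce tag."""
    return float(root.find("NonbondedForce").get("coulomb14scale"))

def residue_net_charge(root, residue_name):
    """Sum the partial charges of every atom in the named residue template."""
    charge_by_type = {
        atom.get("type"): float(atom.get("charge"))
        for atom in root.find("NonbondedForce").findall("Atom")
    }
    for residue in root.findall("./Residues/Residue"):
        if residue.get("name") == residue_name:
            return sum(charge_by_type[a.get("type")] for a in residue.findall("Atom"))
    raise KeyError(f"no residue template named {residue_name!r}")

root = ET.fromstring(FFXML)
scale = coulomb14_scale(root)
net_charge = residue_net_charge(root, "LIG")
print(f"coulomb14scale={scale}, net charge of LIG={net_charge:+.3f}")
```

A net charge that is far from an integer usually means a charge was mistyped or an atom type is missing from the file.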
With the modified simulation script in place, we ran the simulation once again. This time, however, we encountered new challenges, as the generated values did not align with our expectations. These discrepancies ranged from minor issues, such as temperature fluctuations, to more significant problems, including negative box volumes that raised questions about the theoretical space occupied by the simulation.
In the second cycle, our work with OpenMM led to a deeper understanding of forcefield design, partial charge assignment, and simulation optimization. We learned valuable lessons about fine-tuning parameters and the importance of meticulous adjustments to achieve meaningful and reliable results in molecular simulations. We also learned that a simulation is only as accurate as the information fed into it, in this case the forcefields. Forcefields are hard to develop and even harder to get right, so given that some of the information in our forcefield could be incorrect, most of the data returned to us made sense. One output that seemed impossible, the negative box volume, turned out to be a simple bookkeeping error: we had stored the data for that column incorrectly. We also learned that the ensemble average of a property (like temperature) calculated over multiple simulation trajectories should converge to the expected macroscopic value. In other words, if the simulated system were big enough to put a real thermometer into (like a protein in a buffer on a lab bench), the thermometer would read a single, converged macroscopic temperature. At the level of a simulation, however, a single trajectory is just one possible realization of the system's behavior over time. Fluctuations in any single trajectory are expected due to the inherent statistical nature of the system. Over multiple trajectories or a long enough time, the average of these fluctuations should converge to the expected value.
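The point about temperature fluctuations can be illustrated with a toy model that needs no simulation engine. The sketch below draws atomic velocities from a Maxwell-Boltzmann distribution in reduced units (mass and Boltzmann constant set to 1, an assumption for simplicity) and computes the instantaneous kinetic temperature of each frame. Frames are drawn independently here, unlike the correlated frames of a real trajectory, and the atom count and target value are arbitrary.

```python
import random
import statistics

def instantaneous_temperatures(n_atoms, target_t, n_frames, seed=0):
    """Kinetic temperature of each frame for velocities drawn at `target_t`."""
    rng = random.Random(seed)
    sigma = target_t ** 0.5  # velocity spread per component when m = kB = 1
    dof = 3 * n_atoms
    temps = []
    for _ in range(n_frames):
        twice_kinetic = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(dof))
        temps.append(twice_kinetic / dof)
    return temps

temps = instantaneous_temperatures(n_atoms=50, target_t=300.0, n_frames=2000)
mean_t = statistics.fmean(temps)
relative_spread = statistics.pstdev(temps) / mean_t
expected_spread = (2.0 / (3 * 50)) ** 0.5  # sqrt(2 / (3N)) for N atoms

print(f"single frames range from {min(temps):.0f} to {max(temps):.0f}")
print(f"average over all frames: {mean_t:.1f}")
print(f"relative spread: {relative_spread:.3f} (expected about {expected_spread:.3f})")
```

Individual frames scatter widely around the target, with a relative spread that shrinks as 1/sqrt(N), while the average over many frames sits close to the target value. That is why a small simulated system shows visible temperature swings that a lab-bench thermometer never would.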