Unfortunately, due to unforeseen events, the laboratory where the experiments were to be conducted was not completed on time. Nevertheless, because the team is dedicated to supporting future iGEM teams, it decided to write code that would help future teams optimize the lead sequence for the type 3 secretion system (T3SS) and relate each candidate sequence to its biological activity value.
Performing Background Research:
To better understand the problem, the team reviewed a range of papers and interviewed professionals in the field. Dr. Cammie Lesser, a professor at Harvard, explained how best to use the T3SS, which organisms could express it, its potential uses, and the challenges the team might face. She also noted that regular proteins may be too large to secrete and may fold incorrectly upon secretion, so she recommended testing nanobodies.
Designing our System:
The team planned on obtaining an E. coli strain (E. coli DH10β) that expresses the secretion system. According to Dr. Lesser, this strain would be easier to work with in the context of our project because it is easier to transform. Dr. Lesser also used this strain in her own research, so she was able to provide a great deal of information about it. She was also able to provide a plasmid that she had used to test secretion. The FLAG-tagged plasmid, pDSW206FLAG (Lynch P. et al.), contains a secretion signal sequence, a linker sequence, and a control nanobody sequence.
Building:
In the future, we plan on testing our system with the following wet-lab process:
Our team plans on using TOPO cloning methods with our donated plasmid (pDSW206 FLAG) to cut out the control nanobody sequence and replace it with the nanobody insert of our choice, using the cut plasmid as a backbone.
Next, we would transform our ligated plasmid into E. coli. The engineered system would then be ready for testing.
Testing:
After successfully building our system and transforming our ligated plasmid into the bacteria, we would test for expression and secretion of nanobodies in vitro by performing SDS-PAGE gel electrophoresis. Then, we would determine the therapeutic viability of the secreted nanobodies by performing an ELISA. In this assay, samples containing protein/nanobody complexes are treated with enzyme-linked antibodies that convert a substrate into a colored product that can be quantified.
What we learned:
Upon learning that the laboratory would not be available, the team explored alternative avenues that did not require a physical lab. This challenge led to a period of brainstorming. Eventually, the team sought advice from Dr. Yang, a professor at Ohio State University, who guided the team towards developing a program capable of modeling the optimization process for lead sequences, particularly regarding their activity value. This transition occurred after considering potential coding-based projects.
Redefining the Problem and Setting the Foundation:
The team's objective was now to establish a framework for modeling the optimization of lead sequences, primarily for enhancing the Type 3 Secretion System (T3SS). This foundation was intended to make the T3SS easier to use for future teams working on similar projects. Furthermore, the program that would be developed
Conducting In-Depth Background Research:
The OSU iGEM team collaborated with a diverse group of experts who contributed to the project's development. In the pursuit of knowledge about the T3SS, the team engaged with Dr. Yang, who provided invaluable guidance for creating a computer-based model that illustrates the lead sequence optimization process for T3SS.
Specifying Project Requirements: Researching what would go into this code
In the absence of a wet lab, the team recognized the need to replicate the lead sequence optimization process. Consequently, a proposed program was conceptualized. The requirements for this program were extensive, including the incorporation of various biological processes such as crossover, mutations, and generational progression within a stochastic, population-based, random selection model. The choice of programming language was also considered, and Java was deemed the most suitable due to its open-source nature, making it accessible to a broad audience. The team's goal was to ensure that the program could be utilized without the necessity of extensive coding expertise or additional application downloads. Although alternatives like C++ or Python were considered, they were dismissed as they would entail downloading various applications, which was not in line with the team's accessibility objectives.
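As a rough illustration of two of the requirements listed above, point mutation and crossover, the following Java sketch applies both to nucleotide strings. The class name SequenceOperators, the method names, the example sequences, and the 10% mutation rate are all assumptions made for illustration; they are not taken from the team's program.

```java
import java.util.Random;

/** Illustrative genetic operators on nucleotide strings (hypothetical names). */
final class SequenceOperators {
    private static final char[] BASES = {'A', 'C', 'G', 'T'};

    private SequenceOperators() {}

    /**
     * Point mutation: each position is redrawn from A/C/G/T with probability
     * `rate`. The redrawn base may happen to equal the original one.
     */
    static String mutate(String seq, double rate, Random rng) {
        StringBuilder out = new StringBuilder(seq);
        for (int i = 0; i < out.length(); i++) {
            if (rng.nextDouble() < rate) {
                out.setCharAt(i, BASES[rng.nextInt(BASES.length)]);
            }
        }
        return out.toString();
    }

    /**
     * Single-point crossover: the child takes a prefix of `a` and the matching
     * suffix of `b`. Both parents must have the same length.
     */
    static String crossover(String a, String b, Random rng) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("parents must have equal length");
        }
        if (a.length() < 2) {
            return a; // nothing to recombine
        }
        int cut = 1 + rng.nextInt(a.length() - 1); // cut point in [1, length - 1]
        return a.substring(0, cut) + b.substring(cut);
    }

    public static void main(String[] args) {
        Random rng = new Random(); // unseeded, so every run differs
        String parentA = "ATGGCTAGCA"; // arbitrary example sequences
        String parentB = "ATGCCCGGGT";
        String child = crossover(parentA, parentB, rng);
        System.out.println(mutate(child, 0.1, rng));
    }
}
```

Passing the Random instance in as a parameter, rather than creating it inside each method, lets a caller fix the seed when a reproducible run is needed for debugging.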
Brainstorming and Evaluating Solutions: What is the most optimal path?
To create a fully functional project, the team first set out to determine whether it was feasible to input a nucleotide sequence and develop a program capable of altering this sequence and producing a relative activity metric. Drawing on the papers it had read and the meetings it had held, the team decided on a list of requirements that the program needed to meet in order to be of good quality. The development process incorporated several additional functions, including mutations and incremental generational changes. The code should aim to represent these processes properly. The ultimate aim was to have the program closely mimic the process of optimizing lead sequences for the Type 3 Secretion System, drawing insights from the work of Craig et al. in 2010.
Testing and Compliance with Requirements:
The program's requirements necessitated the inclusion of several biological processes to enable the faithful replication of the lead sequence optimization process. The program's current iteration includes features such as crossover, population generation, random selection, and random mutations. It also calculates the activity value at each generational step, and it provides the sequence for each generation. Starting with generation 1, which consists of the initial input sequence, the program iteratively modifies the sequence to maximize the lead sequence's activity value. While the program currently accomplishes these tasks, there is room for optimization, particularly regarding memory usage. Furthermore, it is crucial to emphasize that the code should exhibit stochastic behavior. To verify this, various random nucleotide sequences (including identical sequences) were input, and the program was tested for its capacity to generate different sequences.
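The generational loop described above might be sketched as follows. This is not the team's code. The activity function here is a placeholder (fraction of G/C bases) chosen only so that the example runs, since the source does not define how the activity value is computed. The class name LeadSequenceOptimizer, the method names, the parameters, the example input, and the choice to carry the best sequence forward unchanged each generation are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of a generational, population-based search. */
final class LeadSequenceOptimizer {
    private static final char[] BASES = {'A', 'C', 'G', 'T'};

    private LeadSequenceOptimizer() {}

    /** Placeholder score (fraction of G/C bases); NOT a biological activity model. */
    static double activity(String seq) {
        if (seq.isEmpty()) return 0.0;
        int gc = 0;
        for (char c : seq.toCharArray()) {
            if (c == 'G' || c == 'C') gc++;
        }
        return (double) gc / seq.length();
    }

    /** Crossover of two equal-length parents followed by per-base random mutation. */
    static String breed(String a, String b, double rate, Random rng) {
        int cut = rng.nextInt(a.length() + 1); // bases before cut come from a, the rest from b
        StringBuilder child = new StringBuilder(a.length());
        for (int i = 0; i < a.length(); i++) {
            char base = (i < cut) ? a.charAt(i) : b.charAt(i);
            if (rng.nextDouble() < rate) base = BASES[rng.nextInt(BASES.length)];
            child.append(base);
        }
        return child.toString();
    }

    /** Returns the best sequence of each generation; generation 1 is the input itself. */
    static List<String> evolve(String input, int generations, int populationSize,
                               double rate, Random rng) {
        if (generations < 1 || populationSize < 2) {
            throw new IllegalArgumentException("need >= 1 generation and >= 2 individuals");
        }
        List<String> population = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) population.add(input);
        List<String> bestPerGeneration = new ArrayList<>();
        bestPerGeneration.add(input);
        for (int g = 2; g <= generations; g++) {
            String best = bestPerGeneration.get(bestPerGeneration.size() - 1);
            List<String> next = new ArrayList<>();
            next.add(best); // carry the current best forward so the score never drops
            while (next.size() < populationSize) {
                // random selection: both parents are drawn uniformly from the population
                String a = population.get(rng.nextInt(population.size()));
                String b = population.get(rng.nextInt(population.size()));
                String child = breed(a, b, rate, rng);
                next.add(child);
                if (activity(child) > activity(best)) best = child;
            }
            population = next;
            bestPerGeneration.add(best);
        }
        return bestPerGeneration;
    }

    public static void main(String[] args) {
        String input = "ATGAATATTACTAAAGTT"; // arbitrary example input
        List<String> run = evolve(input, 5, 20, 0.05, new Random());
        for (int g = 0; g < run.size(); g++) {
            System.out.printf("generation %d: %s activity=%.3f%n",
                    g + 1, run.get(g), activity(run.get(g)));
        }
    }
}
```

Because main uses an unseeded Random, repeated runs on the same input are expected to produce different sequences, which is the stochastic behavior the test described above looks for. Passing a seeded generator such as new Random(42) instead makes a run repeatable, which can help when tracking down a fault.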
Reiterating the Solution for Code Optimization:
To further enhance the program, it is essential to begin with a comprehensive code review that addresses any existing faults. The program currently encounters a segmentation fault and experiences issues related to sequence length, both of which demand attention. Consequently, the program will undergo further revisions to better mirror the lead sequence optimization process in the Type 3 Secretion System. The initial program merely assigned numbers at random and asked the user for too many inputs to be of any use in understanding the process. Each iteration, however, added a level of complexity that helped the code mature into a more complete and polished form. For instance, in the second iteration, genetic mutations were added to the program in place of randomly assigned numbers. Furthermore, since the program is a stochastic model, the code should not produce exactly the same values more than once.
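The source does not say what causes the sequence-length issues, so the following is only one defensive step that is commonly taken in such cases: checking the input before any optimization begins. The class name SequenceInput, the method name normalize, and the length limits passed to it are hypothetical.

```java
import java.util.Locale;

/** Hypothetical input check run before the optimization starts. */
final class SequenceInput {
    private SequenceInput() {}

    /**
     * Trims and upper-cases the raw input, then rejects it if it contains
     * anything other than A, C, G or T, or if its length falls outside
     * [minLength, maxLength]. Both limits are chosen by the caller.
     */
    static String normalize(String raw, int minLength, int maxLength) {
        if (raw == null) {
            throw new IllegalArgumentException("sequence is missing");
        }
        String seq = raw.trim().toUpperCase(Locale.ROOT);
        if (seq.length() < minLength || seq.length() > maxLength) {
            throw new IllegalArgumentException(
                    "sequence length " + seq.length() + " is outside ["
                            + minLength + ", " + maxLength + "]");
        }
        for (int i = 0; i < seq.length(); i++) {
            if ("ACGT".indexOf(seq.charAt(i)) < 0) {
                throw new IllegalArgumentException(
                        "unexpected character '" + seq.charAt(i) + "' at position " + (i + 1));
            }
        }
        return seq;
    }

    public static void main(String[] args) {
        // The limits 1 and 300 are arbitrary values for this example.
        System.out.println(normalize("  atgaatattact ", 1, 300));
    }
}
```

Failing early with a message that names the offending length or character tends to be easier to diagnose than a fault that surfaces several generations into a run.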
What we learned:
By developing this program, the team gained a deeper understanding of how lead sequence optimization can be achieved, while acknowledging the stochastic nature of this phenomenon. Despite the program's rudimentary appearance, it serves as a foundational step for future work in this area. The program's incorporation of random events such as mutations underscores the stochastic aspects of optimizing lead sequences. Furthermore, with this knowledge, the team can go back, reanalyze the proposed product, and improve upon it. Teams in future years can benefit in the same way.