Model | Ohio-State-FABE

Computational Model for Type III Secretion Systems and Lead Sequences

Our team followed a multi-step process to develop a model for optimizing the lead sequence in the Type III Secretion System (T3SS). We coded several biological processes into the model so that it represents lead sequence optimization more realistically.

First, we defined the problem: making T3SS optimization accessible to future teams. We then conducted in-depth background research on T3SS with guidance from Dr. Shang-Tian Yang, a professor at The Ohio State University involved in bioprocessing research. Dr. Yang directed us toward creating a Java-based, population-oriented random selection model. Our team brainstormed and implemented functions to change nucleotide sequences and simulate the optimization process. Several biological functions were hard-coded into the program to improve the accuracy of the optimization process. The final step involved iteration and code optimization, addressing issues such as segmentation faults and sequence length discrepancies to better model T3SS. Lead sequence optimization is performed by maximizing the biological activity value of the sequence.

A general equation is used to calculate the biological activity value (A): A = cf, where c is the number of cells expressing a protein divided by the number of cells cultured, and f is the inherent activity value. The program, like most computer programs, makes several simplifying assumptions. It assumes that the relation between A and c is always linear and that f stays constant. To graph A against c, one would normally have to determine A experimentally. The program instead takes the activity value A as an input and assigns f a random number between 0.1 and 1. The A value is then changed according to random values determined in the biological processes coded into the program: genetic crossover, random mutations, and population growth and expression within that population. The program is a stochastic model that relies mainly on population-based random assignment to calculate its values.
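The relation A = cf can be illustrated with a short Java sketch. This is not the team's code: the class and method names are hypothetical, and drawing f uniformly from the 0.1 to 1 range is an assumption about how the random assignment might be done.

```java
import java.util.Random;

// Minimal sketch of the activity relation A = c * f.
// All names are hypothetical and are not taken from the DIGI-T3 source.
class ActivitySketch {

    // Assumption: f is drawn uniformly from the range 0.1 to 1.
    static double randomInherentActivity(Random rng) {
        return 0.1 + 0.9 * rng.nextDouble();
    }

    // c is the fraction of cultured cells that express the protein.
    static double expressingFraction(int expressingCells, int culturedCells) {
        if (culturedCells <= 0 || expressingCells < 0 || expressingCells > culturedCells) {
            throw new IllegalArgumentException(
                "need cultured > 0 and 0 <= expressing <= cultured");
        }
        return (double) expressingCells / culturedCells;
    }

    // A = c * f, which is linear in c while f is held constant.
    static double activity(double c, double f) {
        return c * f;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so repeated runs match
        double f = randomInherentActivity(rng);
        // Sweep c from 0 to 1 for a single f to show the linear relation.
        for (int expressing = 0; expressing <= 100; expressing += 25) {
            double c = expressingFraction(expressing, 100);
            System.out.printf("c = %.2f, f = %.3f, A = %.3f%n", c, f, activity(c, f));
        }
    }
}
```

Because f is fixed within a run, each printed A differs from the previous one by the same amount, which is the linearity assumption stated above.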

Although the model uses hard-coded values to make the program run efficiently, it does not derive its values experimentally; instead, it randomly generates and manipulates data points based on several biological processes. Future iterations of this model would include experimental data for each run. iGEM teams that use this platform are encouraged to use experimental data for their desired biological activity factor.

Through developing and using this program, the team came to understand the complexities of optimizing a lead sequence, the factors that affect it, and the potential to produce a highly effective T3SS. For future work, this program is a key stepping stone toward a highly effective therapeutic system that uses a T3SS to secrete a nanobody. Such a system could be improved by optimizing the lead sequence of the nanobody so that many cells express the nanobody and start producing the protein needed by the person or cell.

The program is written in Java. Since Java is open source, it requires little support, though the user's computer should be able to run 64-bit Java applications. The code was mostly built by the team. The team used AI software to develop a rough framework, which was then filled in with code. The sections written by the team and Anna Bontempo were the class declaration, population generation, crossover, and mutation portions. AI software was used to complete a small portion of the prompt for the initial user input of a 20-nucleotide sequence and to generate the random point mutation section of the code.
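The pieces named above (input validation for a 20-nucleotide sequence, population generation, crossover, and random point mutation) could look roughly like the following Java sketch. It is an illustration under stated assumptions, not the DIGI-T3 source: every class and method name is hypothetical, the crossover is assumed to be single-point, and the population is assumed to be built by mutating copies of the input sequence.

```java
import java.util.Random;

// Hypothetical sketch; names and structure are assumptions, not the DIGI-T3 code.
class LeadSequenceSketch {
    static final char[] BASES = {'A', 'C', 'G', 'T'};
    static final int LENGTH = 20; // the program prompts for a 20-nucleotide sequence

    // Accept only sequences of exactly LENGTH characters drawn from A, C, G, T.
    static boolean isValid(String seq) {
        return seq != null && seq.matches("[ACGT]{" + LENGTH + "}");
    }

    // Random point mutation: replace one random position with a different base.
    static String pointMutate(String seq, Random rng) {
        int pos = rng.nextInt(seq.length());
        char replacement;
        do {
            replacement = BASES[rng.nextInt(BASES.length)];
        } while (replacement == seq.charAt(pos));
        return seq.substring(0, pos) + replacement + seq.substring(pos + 1);
    }

    // Single-point crossover (an assumption): prefix of one parent, suffix of the other.
    // Requiring equal-length parents keeps the child the same length.
    static String crossover(String a, String b, Random rng) {
        if (a.length() != b.length() || a.length() < 2) {
            throw new IllegalArgumentException("parents must share a length of at least 2");
        }
        int cut = 1 + rng.nextInt(a.length() - 1); // cut point in 1..length-1
        return a.substring(0, cut) + b.substring(cut);
    }

    // Assumption: the population is made of single-mutation variants of the input.
    static String[] generatePopulation(String seed, int size, Random rng) {
        String[] population = new String[size];
        for (int i = 0; i < size; i++) {
            population[i] = pointMutate(seed, rng);
        }
        return population;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        String seed = "ATGCATGCATGCATGCATGC"; // placeholder input, not a real lead sequence
        if (!isValid(seed)) {
            throw new IllegalArgumentException("expected 20 characters from A, C, G, T");
        }
        String[] population = generatePopulation(seed, 5, rng);
        String child = crossover(population[0], population[1], rng);
        System.out.println("child:   " + child);
        System.out.println("mutated: " + pointMutate(child, rng));
    }
}
```

Checking that both parents have the same length before crossover is one way to avoid the kind of sequence length discrepancy mentioned earlier on this page.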

So far, we have not found existing code that simulates the optimization of lead sequences. This makes the program the team completed one that is not easily found elsewhere.

Although an impressive program was made, there are still things that could be improved for future iGEM teams. The program needs to generate random values for certain generations in order to carry out its function; this limitation could be addressed in the future. Furthermore, the biological processes should be expanded, and more restrictions could be placed on certain nucleotide combinations (codons). Despite its faults, the program accomplishes what it sets out to do and is either the first of its kind or one of very few programs that serve the same purpose. Below are sample pictures of a section of the code and of its output.

[Images: sample section of the code and sample program output]

Here is a link to our program for all viewers. If you are interested, please feel free to download it!

DIGI-T3 program download here!