Proteins, composed of 20 different amino acids, play crucial roles in biological processes. Combining these 20 amino acids in different orders and at different lengths produces the diversity of proteins that underlies the richness of life. However, many wild-type proteins have drawbacks, such as low expression efficiency and a tendency to misfold, which make them difficult to use directly. Wild-type proteins therefore usually need some optimization before use.
To settle on our project topic, we held in-depth discussions with several professors and students in our college who specialize in proteins. For many iGEMers and researchers, working with a wild-type protein is almost unavoidable. Modifying a protein sequence involves many steps, each step has a plethora of web tools, and calling these tools one by one is very time-consuming. Our survey showed that although many websites are available, it is difficult to find a platform that offers a one-stop service. Therefore, after reviewing the relevant literature, we decided to integrate several well-researched models into a platform rich in design links and resources, which can also serve as a stepping stone for undergraduates learning synthetic biology.
To automate protein optimization, we employed EVmutation, an unsupervised model for predicting the effects of mutations on a protein sequence. From EVmutation's calculations, we can estimate whether each candidate substitution at each site is likely to benefit protein utilization. By automatically scoring and selecting substitutions, we derive an optimized sequence.
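The selection step described above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: it assumes the predicted effects are already available as a plain dictionary keyed by position and amino acid, and the function name `apply_best_mutations`, the `threshold`, and the `max_mutations` cap are all hypothetical. It does not call EVmutation itself.

```python
def apply_best_mutations(wild_type, scores, threshold=0.0, max_mutations=None):
    """Pick the best-scoring substitution per site and apply it.

    `scores` maps (position, amino_acid) -> predicted effect, with 1-based
    positions; higher is assumed to mean more favorable than wild type.
    Treating sites independently is a simplifying assumption.
    """
    best = {}  # position -> (amino_acid, score)
    for (pos, aa), score in scores.items():
        if not 1 <= pos <= len(wild_type):
            raise ValueError(f"position {pos} is outside the sequence")
        if aa == wild_type[pos - 1] or score <= threshold:
            continue  # skip silent or non-beneficial substitutions
        if pos not in best or score > best[pos][1]:
            best[pos] = (aa, score)

    # Rank sites by predicted benefit, optionally keeping only the top few.
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    if max_mutations is not None:
        ranked = ranked[:max_mutations]

    residues = list(wild_type)
    chosen = []
    for pos, (aa, _score) in ranked:
        chosen.append(f"{wild_type[pos - 1]}{pos}{aa}")  # e.g. "K2R"
        residues[pos - 1] = aa
    return "".join(residues), chosen


# Hypothetical scores for a toy five-residue sequence.
toy_scores = {(2, "R"): 1.5, (2, "E"): 0.3, (4, "G"): -2.0, (5, "F"): 0.8}
optimized, mutations = apply_best_mutations("MKTAY", toy_scores)
```

Capping the number of substitutions with `max_mutations` is one way to stay close to the wild type, since combining many individually favorable changes may not be additive.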
Before the resulting sequences are used to construct expression systems, we take several steps to make the results easier to interpret. To show users which sites in the sequence were mutated, we use Jalview to display the differences between the original and optimized sequences. In this way, users can obtain comprehensive information about the directed evolution process.
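Independently of the viewer, the mutated sites can also be listed as text. The sketch below assumes the optimized sequence contains substitutions only, so both sequences have the same length; the function name `mutation_report` is hypothetical and is not part of Jalview.

```python
def mutation_report(original, optimized):
    """Return mutation labels and a marker line flagging changed positions."""
    if len(original) != len(optimized):
        raise ValueError("sequences must have equal length (substitutions only)")
    labels = []
    marker = []
    for index, (old, new) in enumerate(zip(original, optimized), start=1):
        if old != new:
            labels.append(f"{old}{index}{new}")  # wild type, position, mutant
            marker.append("*")
        else:
            marker.append(" ")
    return labels, "".join(marker)


labels, marker = mutation_report("MKTAY", "MRTAF")
print("MKTAY")
print("MRTAF")
print(marker)
```

Printing the marker line under the two sequences gives a quick plain-text view of where they differ.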
The next step is the construction of the expression system. For this, we designed a complete toolkit compatible with the methods most commonly used in laboratories. We first present users with a set of vectors and their information. Once they choose a vector, we perform codon optimization to improve expression efficiency in the chosen biological chassis. We also remove the signal peptide from the sequence so that the expressed protein is retained in the cell rather than secreted. Together, these two steps can increase the intracellular protein concentration, which benefits protein purification.
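The two sequence-level steps above can be illustrated with a small sketch. It rests on several assumptions: the cleavage position comes from an external signal peptide predictor, codon optimization is reduced to choosing one preferred codon per amino acid, and the table below is an illustrative set for an E. coli-like chassis rather than measured usage data. The names `strip_signal_peptide` and `back_translate` are hypothetical.

```python
# Illustrative preferred codons (assumption: one codon per amino acid;
# replace with a codon usage table for the actual chassis).
PREFERRED_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def strip_signal_peptide(protein, cleavage_after):
    """Drop the first `cleavage_after` residues and keep a start methionine."""
    if not 0 <= cleavage_after < len(protein):
        raise ValueError("cleavage position is outside the sequence")
    mature = protein[cleavage_after:]
    # Translation still needs an initiator Met after the peptide is removed.
    return mature if mature.startswith("M") else "M" + mature


def back_translate(protein):
    """Encode a protein with the preferred codon for each residue."""
    try:
        codons = [PREFERRED_CODONS[aa] for aa in protein.upper()]
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None
    return "".join(codons) + STOP_CODON


# Toy example: a nine-residue protein with an assumed five-residue signal peptide.
mature = strip_signal_peptide("MKKLLAAQV", cleavage_after=5)
coding_sequence = back_translate(mature)
```

Always choosing the single most preferred codon is the simplest strategy; practical tools often also balance GC content and avoid restriction sites or repeats.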
Lastly, we introduced an automated primer design process based on commonly used homologous recombination vectors. Using the vector information we have stored, we can design PCR primers for both the target and vector sequences. With this, we have built a complete procedure from protein optimization to expression.
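A primer for homologous recombination cloning typically combines a short stretch matching the vector with a stretch that anneals to the insert. The sketch below shows that construction under stated assumptions: the vector is described by the top-strand sequences flanking the insertion site, and the overlap and annealing lengths are fixed defaults rather than being tuned by melting temperature. The function name `design_insert_primers` and its parameters are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_insert_primers(insert, upstream, downstream, overlap=15, anneal=20):
    """Build forward and reverse primers that add vector homology to an insert.

    `upstream` ends at the insertion site and `downstream` begins right
    after it, both given 5'->3' on the top strand of the vector.
    """
    insert, upstream, downstream = (s.upper() for s in (insert, upstream, downstream))
    if len(insert) < anneal:
        raise ValueError("insert is shorter than the annealing length")
    if len(upstream) < overlap or len(downstream) < overlap:
        raise ValueError("vector flanks are shorter than the overlap length")
    # Forward primer: vector homology followed by the start of the insert.
    forward = upstream[-overlap:] + insert[:anneal]
    # Reverse primer: reverse complement of insert end plus vector homology.
    reverse = reverse_complement(insert[-anneal:] + downstream[:overlap])
    return forward, reverse


# Toy example with very short lengths so the result is easy to check by eye.
fwd, rev = design_insert_primers("ATGGCC", "AAAC", "GGTT", overlap=2, anneal=3)
```

In practice the annealing region would usually be extended or shortened until its melting temperature falls in a target range, which this sketch leaves out.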
- Integrated Full Process & User-friendly Operation: Simply input the desired protein, and our tool can handle the entire process from protein optimization to expression system design.
- Provides Abundant Relevant Information: Our tool offers functionalities like mutation site display and structural prediction, allowing users to understand protein optimization from various angles.
- Appealing Interface & User-friendly for Various Skill Levels: Compared to pre-existing tools, our platform offers extensive prompts and a more aesthetically pleasing interface.