Model

"Use theory to cut through the fog of truth."


Abstract
Modeling played a crucial role in our project. We used computational structure prediction and molecular docking to predict the structure of NicX, locate candidate active sites, and assess the feasibility of surface display, which gave us a vital foundation for choosing our experimental approach.

We also used modeling to support the enzyme activity assays and the validation of their results, and the outcomes were delightful: the NicX mutant showed increased enzyme activity, supporting our active site prediction, and the cell surface display system showed enzyme activity comparable to the wild type while substantially reducing costs, since it eliminates the purification steps.

Structural Biology Predictive Analysis
Following an extensive review and analysis of existing data, we considered point mutations as one strategy to optimize NicX's enzymatic efficiency. However, to date no published study has determined the structure of NicX by X-ray crystallography (XRD), so no data on its active sites are available. High-throughput screening is a feasible alternative, but for maximum efficiency we chose to obtain the protein structure through protein folding simulation and to identify the active sites through molecular docking.

AlphaFold
AlphaFold is an artificial-intelligence-based protein structure prediction algorithm, and it is the one we used. In outline, it extracts features from the amino acid sequence, predicts the distributions of distances and angles between amino acids, builds an evaluation function from them, and optimizes that function to convergence by gradient descent, yielding the torsion angle pairs that define the final topological structure. In numerous practical applications, AlphaFold has achieved a level of accuracy that is generally considered reliable.

After simulation, we obtained the predicted structure of NicX:
Figure 1. NicX protein structure predicted by AlphaFold
AutoDock-Vina
AutoDock-Vina is an algorithm that simulates the interaction forces between proteins and small molecules. We initially used global docking to define the scope, and the results are as follows:


Figure 2. Result of global docking between NicX and nicotine, predicted by AutoDock-Vina

As shown in the figure above, the global docking results indicate that several low-energy poses are concentrated in one specific area. We therefore narrowed the search scope to that area and ran a new prediction to obtain more accurate results.
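The two-stage search described above can be sketched as follows. AutoDock Vina reads its search box from `center_*` and `size_*` entries in a config file; the sketch computes a box that covers a set of coordinates and writes it in that form. The file names, padding values, and coordinates are hypothetical and are not the values used in our runs.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float, float]


def bounding_box(coords: Iterable[Coord], padding: float = 0.0):
    """Return (center, size) of the axis-aligned box around coords, in angstroms."""
    xs, ys, zs = zip(*coords)
    center = tuple((max(a) + min(a)) / 2 for a in (xs, ys, zs))
    size = tuple((max(a) - min(a)) + 2 * padding for a in (xs, ys, zs))
    return center, size


def vina_config(receptor: str, ligand: str, center, size, exhaustiveness: int = 8) -> str:
    """Format a Vina-style config file body for the given search box."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"center_{axis} = {c:.3f}" for axis, c in zip("xyz", center)]
    lines += [f"size_{axis} = {s:.3f}" for axis, s in zip("xyz", size)]
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stage 1 (global docking): box around the whole protein (made-up coordinates).
    protein_atoms = [(0.0, 0.0, 0.0), (40.0, 35.0, 50.0)]
    center, size = bounding_box(protein_atoms, padding=5.0)
    print(vina_config("nicx.pdbqt", "nicotine.pdbqt", center, size))

    # Stage 2 (refined docking): box around the cluster of low-energy poses.
    pose_atoms = [(10.0, 12.0, 8.0), (14.0, 15.0, 11.0)]
    center, size = bounding_box(pose_atoms, padding=4.0)
    print(vina_config("nicx.pdbqt", "nicotine.pdbqt", center, size))
```

Shrinking the box in the second stage concentrates the sampling on the region where the global run already found low-energy poses.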


Figure 3. NicX-nicotine reaction, predicted by AutoDock-Vina

By examining the literature and analyzing the chemistry of the reaction, we found that the reaction occurs primarily at the pyrrolidine ring. Therefore, we believe that the TYR17 residue is not significantly related to enzyme activity. Combining this with the binding data from the global simulations, we identified the following potential active sites:


Unfortunately, due to budget constraints, we were only able to choose one mutation site for the final enzyme activity testing. We found that the hydrogen bond at the L48 site coincides with the position of the bond in the small molecule that is broken in the enzyme-catalyzed reaction, and that L48 is the residue closest to it in predicted distance. We therefore tested enzyme activity for a mutation at the L48 site. Furthermore, given the position of the L48 residue, we can reasonably assume that if L48 is indeed an active residue, the active site is likely to be in its vicinity.
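The "closest residue" criterion can be illustrated with a small ranking by minimum atom-to-atom distance. This is only a sketch: the residue labels and coordinates below are placeholders, not values from our docking output.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def min_distance(atoms_a: Sequence[Coord], atoms_b: Sequence[Coord]) -> float:
    """Smallest Euclidean distance between any atom in A and any atom in B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def rank_residues(residues: Dict[str, List[Coord]],
                  ligand_atoms: Sequence[Coord]) -> List[Tuple[str, float]]:
    """Sort residues by their minimum distance to the ligand atoms, closest first."""
    scored = [(name, min_distance(atoms, ligand_atoms)) for name, atoms in residues.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder coordinates in angstroms.
    residues = {
        "RES_A": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
        "RES_B": [(8.0, 0.0, 0.0)],
    }
    # Ligand atoms around the bond that is broken in the reaction (placeholder).
    ligand_atoms = [(3.0, 0.0, 0.0), (4.0, 1.0, 0.0)]
    for name, dist in rank_residues(residues, ligand_atoms):
        print(f"{name}: {dist:.2f}")
```

Restricting `ligand_atoms` to the atoms around the scissile bond, rather than the whole ligand, matches the reasoning used to pick a single site.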

SDT Feasibility Analysis
In the surface display technique, the INPNC protein must be linked to the N-terminus of the NicX protein. However, because the predicted active site of NicX is very close to the N-terminus, the INPNC protein could easily obstruct the active site or even render NicX inactive. Consequently, we ran a folding prediction for INPNC-NicX-Histag to analyze the feasibility of the E. coli surface display scheme. The results are as follows:


Through our predictions, we found areas of relatively low confidence in the INPNC protein. However, considering previous literature and the function of INPNC, we believe that INPNC should be embedded on the surface of NicX, similar to a scaffold. Therefore, we believe its impact on activity is relatively low. Furthermore, we also simulated several proteins that were not successfully constructed in this experiment, and the results all suggest that they would not impact the activity.



Analysis and Discussion of Experimental Results

Analysis of Construct Failure
During the construction process, DNA construction failed for three of the fusion proteins. Given their considerable length and the numerous hairpin structures in NicX's secondary structure, we ran a 3D structural analysis and simulation of the DNA. The simulation suggested that, because of its specific structure, the DNA would denature and break at a certain temperature during PCR, preventing the polymerase from proceeding with the reaction, which ultimately caused the construct failures.
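As a much simpler stand-in for the 3D simulation, hairpin-prone regions in a template can be flagged by scanning for inverted repeats, that is, a stem whose reverse complement appears again a short loop downstream. This is not the analysis we ran; the stem and loop lengths and the example sequence are arbitrary assumptions.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem: int = 6, min_loop: int = 3,
                  max_loop: int = 8) -> List[Tuple[int, int]]:
    """Return (start, loop_length) for every perfect stem-loop candidate.

    A candidate is a stem of `stem` bases at `start`, followed by a loop of
    min_loop..max_loop bases, followed by the stem's reverse complement.
    """
    seq = seq.upper()
    hits: List[Tuple[int, int]] = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break  # the second stem arm would run past the end
            if seq[j:j + stem] == target:
                hits.append((i, loop))
    return hits


if __name__ == "__main__":
    # Arbitrary example: stem GCGCAA, loop TTTT, reverse complement TTGCGC.
    example = "AT" + "GCGCAA" + "TTTT" + "TTGCGC" + "AT"
    print(find_hairpins(example))
```

A scan like this only finds perfect stems; dedicated folding tools also score mismatches and thermodynamic stability.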


Enzyme Activity Result Analysis
HPLC-MS Data Analysis
For details on the enzyme activity system and sample preparation, please refer to the engineering page.

Once the HPLC-MS results were generated, the instrument's analysis software automatically performed initial curve smoothing and error correction. We then integrated the curves and used triple quadrupole detection to quantify the nicotine mass, which gave us the raw data.

Since liquid chromatography parameters were already available in previous literature, we made slight adjustments to them after running several test samples to arrive at the final conditions.

Liquid Phase Conditions
Solution A: 0.1% Formic Acid (FA) in Water
Solution B: Acetonitrile (ACN) with 0.1% Formic Acid (FA)

Mass Spectrometry Conditions
MRM in Electrospray Ionization Positive Mode (ES+), time range 1-10 minutes, detecting nicotine at the transitions m/z 163.0243 → 116.9743 (pre-smoothed) and m/z 163.1 → 130.1 (after machine auto-adjustment).


We initially prepared six different concentrations of nicotine standard solutions and used the instrument for automatic standard curve calibration. After that, we fine-tuned the parameters using several test samples.

Nicotine Concentration Calibration Chart

Nicotine Concentration Calibration Curve
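The instrument performed the calibration automatically; the sketch below only illustrates the underlying idea, assuming a linear response: fit peak area against the known standard concentrations by least squares, then invert the line to read a sample concentration from its peak area. The six standard values are invented for the example and are not our measurements.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_area(area: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to get concentration from a peak area."""
    return (area - intercept) / slope


if __name__ == "__main__":
    # Invented standards: concentration (arbitrary units) vs. integrated peak area.
    standards = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
    areas = [3.1, 4.9, 11.2, 20.8, 41.3, 100.9]
    slope, intercept = fit_line(standards, areas)
    print(f"slope={slope:.3f}, intercept={intercept:.3f}")
    print(f"sample at area 30.0 -> {concentration_from_area(30.0, slope, intercept):.2f}")
```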

After calibration, we were able to obtain the concentration of nicotine in the vast majority of the samples using automated methods. The obtained raw values are as follows:



In the case of the L48Q 250 μM sample, there was a minor contamination issue during the LC-MS sample preparation. We manually corrected the baseline to obtain the values. For all other groups, values were obtained automatically as described above.

Because the HPLC-MS samples had to be diluted 10,000-fold, we corrected for this dilution during data analysis to recover the approximate raw values for the original samples.
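A minimal sketch of this correction, together with one way a reaction rate could then be derived from it, is given below. Only the 10,000-fold factor comes from our protocol; the rate formula (nicotine consumed over elapsed time), the reaction volume, and the example readings are assumptions made for illustration.

```python
DILUTION_FACTOR = 10_000  # fold dilution applied before HPLC-MS


def undiluted_concentration(measured_uM: float,
                            dilution_factor: float = DILUTION_FACTOR) -> float:
    """Concentration in the original sample, given the diluted reading."""
    return measured_uM * dilution_factor


def reaction_rate_umol_per_min(start_uM: float, end_uM: float,
                               volume_L: float, minutes: float) -> float:
    """Nicotine consumed per minute; uM times L gives umol."""
    return (start_uM - end_uM) * volume_L / minutes


if __name__ == "__main__":
    # Hypothetical diluted reading of 0.025 uM.
    end_uM = undiluted_concentration(0.025)
    # Hypothetical 1 mL reaction sampled after 30 minutes, starting at 400 uM.
    rate = reaction_rate_umol_per_min(400.0, end_uM, volume_L=0.001, minutes=30.0)
    print(f"undiluted: {end_uM:.1f} uM, rate: {rate:.4f} umol/min")
```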

Data Analysis and Discussion
First, for the wild-type NicX protein, we diluted the protein to a concentration of approximately 10 ng/μL before use. Because purification is never 100% complete, we roughly estimated that NicX makes up around 50% of the total protein concentration. We added 2 μL to the enzyme activity system, which therefore contains approximately 10 ng of NicX. Compared to the 1 μg of protein used in the literature, this is roughly 100 times less, and the reaction rate we measured is also around 100 times lower. Therefore, we can conclude that the wild-type NicX was successfully expressed and its activity reproduced.
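The estimate above is simple enough to check in a few lines. All numbers are the ones stated in the text; the 50% purity is our rough estimate, not a measured value.

```python
def protein_mass_ng(stock_ng_per_uL: float, volume_uL: float, purity: float) -> float:
    """Mass of the target protein in the added volume, corrected for purity."""
    return stock_ng_per_uL * volume_uL * purity


if __name__ == "__main__":
    nicx_ng = protein_mass_ng(stock_ng_per_uL=10.0, volume_uL=2.0, purity=0.5)
    literature_ng = 1000.0  # 1 ug of protein in the literature assay
    print(f"NicX in assay: {nicx_ng:.0f} ng")
    print(f"Literature uses {literature_ng / nicx_ng:.0f}x more protein")
```

If the measured rate scales with enzyme amount, a 100-fold lower protein mass should give a roughly 100-fold lower rate, which is the comparison made above.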

Furthermore, according to the Michaelis-Menten constant (Km) chart in the original paper, enzyme activity is not significantly affected by nicotine concentration in the range of 250-400 μM. The L48Q mutant measured at 250 μM nicotine can therefore be compared directly with wild-type NicX measured at 400 μM. Analyzing the reaction rates, we found that L48Q has a higher reaction rate than the wild type even at the slightly lower nicotine concentration. Although the exact difference in reaction rates is unknown, this is sufficient to conclude that L48 is one of the active residues. Following the reasoning in the docking analysis, this means that our active site prediction is reasonably accurate.
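Why the rate barely changes between 250 and 400 μM follows from the Michaelis-Menten equation, v = Vmax * [S] / (Km + [S]): once [S] is well above Km, the enzyme is close to saturation. The sketch below shows this for an assumed Km; the value of 25 μM is a placeholder chosen for illustration, not the Km reported in the original paper.

```python
def fractional_rate(substrate_uM: float, km_uM: float) -> float:
    """v / Vmax from the Michaelis-Menten equation."""
    return substrate_uM / (km_uM + substrate_uM)


if __name__ == "__main__":
    km_uM = 25.0  # placeholder Km, assumed for illustration only
    low = fractional_rate(250.0, km_uM)
    high = fractional_rate(400.0, km_uM)
    print(f"v/Vmax at 250 uM: {low:.3f}")
    print(f"v/Vmax at 400 uM: {high:.3f}")
    print(f"ratio: {low / high:.3f}")
```

Under this assumption the two concentrations give rates within a few percent of each other, so a clearly higher rate for L48Q at 250 μM cannot be explained by substrate concentration alone.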

For INPNC-NicX, we observed a suspicious anomalous peak in the 0.5-hour chromatogram, which we attribute to column residue, most likely caused by improper handling of the bacteria during pretreatment. Because of this residue, the true enzyme activity should be greater than the measured 0.002 μmol/min.

We measured INPNC-NicX at OD600 = 1.6. Based on previous experiments and empirical data, we estimated the cell density to be approximately 1.6 x 10^8 cells per milliliter, which becomes 1.6 x 10^5 cells per milliliter after a 1000-fold dilution. We added 2 μL of bacterial culture to the system, corresponding to 320 cells. From the previous calculation, the reaction rate for 10 ng of protein is around 0.039 μmol/min, so a rate of 0.002 μmol/min implies that about 0.5 ng of protein is effectively participating in the reaction, or approximately 0.16 ng of protein per hundred cells. Scaling this up, 6,400 cells correspond to 10 ng of protein and a reaction rate of around 0.04 μmol/min, consistent with the wild type.

Furthermore, in practical terms, when cultivating 1 L of medium to an OD600 of 1, there are approximately 10^8 cells, equivalent to about 16,000 ng of protein or a protein concentration of around 16 mg/mL, which is 32 times more than wild type. Additionally, surface display technology does not require cell disruption and purification, making it far more efficient than traditional intracellular expression methods.
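The per-cell estimate can be reproduced step by step. The inputs are the figures quoted in the text (cell density at OD600 = 1.6, the 1000-fold dilution, the 2 μL added, and the two reaction rates); the proportionality between rate and protein mass is the assumption the estimate rests on.

```python
def cells_added(cells_per_mL: float, dilution: float, volume_uL: float) -> float:
    """Number of cells in the added volume after dilution."""
    return cells_per_mL / dilution * (volume_uL / 1000.0)


def effective_protein_ng(observed_rate: float, reference_rate: float,
                         reference_ng: float) -> float:
    """Protein mass implied by a rate, assuming rate is proportional to mass."""
    return observed_rate / reference_rate * reference_ng


def cells_for_protein(target_ng: float, protein_ng: float, cells: float) -> float:
    """Cells needed to carry target_ng, given protein_ng per `cells` cells."""
    return target_ng / protein_ng * cells


if __name__ == "__main__":
    cells = cells_added(cells_per_mL=1.6e8, dilution=1000.0, volume_uL=2.0)
    protein = effective_protein_ng(observed_rate=0.002, reference_rate=0.039,
                                   reference_ng=10.0)
    print(f"cells in assay: {cells:.0f}")
    print(f"effective protein: {protein:.2f} ng")
    print(f"protein per 100 cells: {protein / cells * 100:.2f} ng")
    # The text rounds the effective protein to 0.5 ng before scaling up.
    print(f"cells for 10 ng: {cells_for_protein(10.0, 0.5, cells):.0f}")
```

Using the unrounded protein mass instead of 0.5 ng shifts the scaled-up cell count slightly, which is within the precision of an estimate of this kind.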