Why do modeling?

Modeling is the process of constructing simplified representations of intricate real-world systems. These representations provide researchers with the means to thoroughly examine, analyze, and derive valuable insights into the behavior of these complex systems. Models have a variety of applications, serving various purposes, including hypothesis testing, optimization, system design, and future outcome predictions. They stand as formidable tools that facilitate the exploration of intricate systems while also delivering the advantage of saving valuable time and resources through the execution of virtual experiments before embarking on physical ones.

Kinetics x Structural

To comprehensively address the multifaceted challenges posed by PET-type microplastic degradation, our team embarked on a dual-model approach, strategically dividing our research into two primary domains: kinetics and structural analysis. This approach allows us to unravel the dynamics of microalgae growth and PET degradation while recognizing the importance of understanding the FAST-PETase: Linker: MHETase complex at a molecular level. By seamlessly integrating these two distinct yet interconnected models, our project, Mhetyguá, significantly advances our comprehension of PET-type microplastic degradation from kinetic and structural perspectives, firmly establishing our position at the forefront of innovative solutions for this critical environmental challenge.

Our goal

The goal of our project, Mhetyguá, is to act on the degradation of PET-type microplastics in sewage treatment plants, one of the destinations of this environmental contaminant that does not yet have a mitigation method in Brazilian sanitation institutions. We address the emerging concern regarding microplastics, increasing population awareness, while also actively contributing to the iGEM community. Some iGEM teams, such as TU Kaiserslautern 2019 and HU Berlin 2019, had previously focused on PET-related projects. Their work served as a valuable source of inspiration during the early stages of our project's development. We conducted some critical analysis of their projects to gain insights into potential innovative contributions within the realm of PET degradation.

In this context, we plan to act on microplastics degradation using enzymes secreted by genetically modified microalgae Chlamydomonas reinhardtii, FAST-PETase and MHETase, interconnected by a linker. Initially, FAST-PETase binds to PET to initiate activity, transforming it into MHET, so that MHETase can act and transform MHET into terephthalic acid (TPA) and ethylene glycol (EG). Our idea would be to foresee a possible application of microalgae in stabilization ponds, one of the wastewater treatment approaches. However, in order for us to scale and validate the process mechanism in sewage treatment systems, we need to simulate and evaluate the dynamics of the microalgae growth and PET degradation processes by applying a kinetic model, giving us the opportunity to answer the following questions:

  • How does the stabilization pond medium influence microalgae growth? And, consequently, in the amount of enzymes?
  • How much microplastic can we degrade with the amount of secreted enzymes?
  • How long can it take to degrade a satisfactory amount of plastic?

Furthermore, we sought to deepen our comprehension of this protein_linker_protein complex and its prospective applications in PET degradation. Our approach involved an in-depth analysis of its 3D structure, binding affinities, and dynamic behavior through advanced computational methods to address the following inquiries:

  • How does temperature influence the stability and flexibility of the complex, and how do these dynamics impact its overall behavior?
  • How do the computational methods enhance our understanding of complex protein structures and their dynamics?
  • What insights can computational predictions provide about the 3D structure and stability of our complex?
Figure 1. General representation of microalgae growth and PET degradation by FAST-PETase and MHETase enzymes
General representation of microalgae growth and PET degradation by FAST-PETase and MHETase enzymes

Created with BioRender.com

Our perceptions

From our perspective, the kinetics simulation results suggest that the system is feasible, but its effectiveness varies with the season. FAST-PETase has the potential to degrade PET in wastewater treatment systems, even at lower enzyme concentrations. However, challenging conditions, such as colder temperatures, require system improvements. Further studies on protein production and secretion by C. reinhardtii are needed and nutrient availability appears adequate for microalgae growth under typical wastewater conditions, but it's essential to consider that temperature and solar irradiance can vary significantly. In general, the model initially prioritized simplicity and measurability but held plans for improvements, including enhancing the representation of stabilization ponds, incorporating enzyme production models, and implementing a global sensitivity analysis to assess the impact of model inputs. However, time constraints, focus on project essentials, and access limitations to certain data hindered the integration of these improvements. Nevertheless, the model's base code for stabilization ponds and the enzyme production equation provide valuable resources for future teams working within this scope.

Regarding structural analysis, which employed a groundbreaking approach, harnessing AlphaFold2, a cutting-edge prediction method, to predict the 3D structure of the FAST-PETase: Linker: MHETase complex. Our predictions demonstrated a remarkable 91.02% similarity based on the ERRAT2 score, firmly establishing the reliability of our projections. We efficiently utilized computational resources, employing the ColabFold Colab notebook and conducting detailed analyses using the PyMOL platform. Furthermore, we evaluated the binding affinity of the complex to its respective ligands, PET and MHET, through molecular docking simulations. Our findings indicated that the fusion of FAST-PETase and MHETase with a linker may enhance binding affinity, especially with PET. These results suggest a promising avenue for plastic degradation. Additionally, we ventured into molecular dynamics simulations to explore the dynamic behavior of our complex, emphasizing the role of temperature in its stability. While our simulations provided valuable insights, extended simulation times and additional temperatures are necessary for robust conclusions.

Figure 2. Predicted 3D Structure of our Complex
Predicted 3D Structure of our Complex using AlphaFold2

Created with AlphaFold2

Conclusions

In conclusion, our project represents a significant step toward addressing the issue of PET-type microplastic degradation in sewage treatment plants. Kinetic simulations have provided insights into fundamental questions, such as the influence of stabilization pond medium on microalgae growth and enzyme production, as long as conditions support the log-phase growth of C. reinhardtii, enzyme concentration remains proportional to microalgae concentration, allowing for efficient removal of PET microplastic from water. Furthermore, in an optimal scenario, our model has demonstrated the potential to degrade 400 mg/m3 of PET within 19 days, aligning with PET concentrations ranging from 24.9 mg/m3 to 680 mg/m3 detected in the influent of TIAN, Lei et al. study [4]. This marks a significant advancement in PET plastic degradation, emphasizing the crucial role of time as a key factor.

Our successful prediction of the complex's 3D structure using AlphaFold2 and the favorable binding affinity results from molecular docking simulations are encouraging signs of progress. The stability analysis via molecular dynamics simulations suggests that lower temperatures, such as 20ºC and 25ºC, are more conducive to maintaining the complex's structural integrity. However, resource limitations necessitate further investigation with extended simulation times and additional temperatures. Summing it up, our research forms a solid base for future work in plastic degradation, offering valuable insights into the structure and behavior of the FAST-PETase : Linker : MHETase complex.

References

[1] DYM, Clive. Principles of mathematical modeling. Elsevier, 2004.

[2] Rangwala, H., & Karypis, G. (Eds.). (2011). Introduction to protein structure prediction: methods and algorithms. John Wiley & Sons.

[3] Haile, J. M. (1992). Molecular dynamics simulation: elementary methods. John Wiley & Sons, Inc.

[4] Lei Tian, Ewa Skoczynska, Robert-Jan van Putten, Heather A. Leslie, Gert-Jan M. Gruter.Quantification of polyethylene terephthalate micro- and nanoplastics in domestic wastewater using a simple three-step method. Science of The Total Environment, Volume 857, Part 2, 2023, 159209, ISSN 0048-9697, https://doi.org/10.1016/j.scitotenv.2022.159209.


1. Kinetic Model

A kinetic model is a mathematical method to describe various processes' dynamic behavior. It carries a set of equations that capture the underlying mechanisms governing these processes over time. In the context of microalgae growth and enzymatic degradation, kinetic models provide a structured approach to understanding and predicting how these biological systems evolve. They enable us to quantify factors influencing growth rates, enzyme activities, and substrate conversion. This modeling approach offers advantages such as precision, scalability, and the ability to optimize conditions for specific outcomes.

a. Stabilization ponds in sewage treatment plants and microalgae cultivation

Our kinetic model's first step aims to analyze the biomass growth of the microalgae Chlamydomonas reinhardtii in stabilization ponds over time. Stabilization ponds, designed to promote biological degradation of organic matter, are a possible implementation of our chassis, so that, in addition to being a method for microplastic degradation, it will assist in the oxygenation of the pond medium through photosynthesis and will act, through consumption, in the removal of excess nutrients that are present in wastewater, such as nitrogen and phosphorus.

Do you want to know more about how stabilization ponds work? Check our implementation here.

Mhetygua’s microalgae growth model considers several key factors influencing microalgae’s proliferation in these ponds. These factors include solar irradiance, pH levels, temperature, nutrient and carbon intakes, and the oxygenation of the pond environment.

Figure 1. General representation of the microalgae stabilization pond system and its growth factors

Created with BioRender.com

b. Plastic degradation

In addition to biomass growth, the model also addresses the production of the interconnected enzymes FAST-PETase and MHETase by microalgae and the degradation of PET by the enzyme chimera over time. One of our assumptions is that the more microalgae we have, the more plastic degradation will occur as more enzymes will be secreted. Therefore, the microalgae growth and PET degradation models maintain a direct proportional relationship.

c. Integration method

Our system will be solved using the Gompertz model. In the context of dynamic systems such as microalgae growth and plastic degradation, the Gompertz model is employed to depict population growth, particularly in scenarios where growth is constrained and slows over time. Considering the interrelationship between the growth model and the degradation model mentioned before, we can simulate and analyze the results within the framework of the Gompertz equation. This model allows us to calculate how the microalgae population changes as it grows and tracks simultaneously how the plastic concentration decreases due to enzymatic degradation.

MODEL

1.Essential Background

Before we get to the equations, it is essential to understand the processes involving our microalgae and the pond. To model, paying attention to the main factors influencing the system is necessary. Therefore, a literature review must be carried out together with a critical analysis of what information will be most relevant to be used. Taking this into consideration, at the end of each model, we have separated the main processes to facilitate understanding.

a. Growth rate model

The growth rate model is based on an existing model, called ABACO, by Sánchez-Zurano et al., and with a theoretical basis from the BIOALGAE model by SOLIMENO et al. Both models bring a sewage treatment approach with a microalgae-bacteria consortium, incorporating important growth aspects and characteristics of microalgae, such as their dependence on light, endogenous respiration, and consumption of nutrients, in addition to the influence of temperature and pH. It includes the interrelationship between the different microbial populations and the balance between releasing and consuming relevant compounds such as oxygen and carbon dioxide. The two models, then, serve as great tools for developing wastewater treatment processes using microalgae.

Although the authors' modeling is focused on the microalgae-bacteria consortium, Mhetyguá’s model concentrates only on microalgae to simplify the number of variables. Considering that complex media, such as wastewater, require the assessment of several aspects that influence microalgae growth, we focused only on the chassis that would secrete the PET-degrading enzymes. However, we do not disregard the presence of bacteria when it comes to oxygen consumption and carbon dioxide release, as this interaction is essential for the balance of substances and the functioning of the metabolic mechanisms of the existing microbiota, and, of course, the bacteria occur naturally in the pond, disregarding them completely would not be ideal in an attempt to represent the reality of the pond when modeling.

In the ABACO model, the experimental part was carried out in a closed photobioreactor (FBR). In this case, as our study medium would be a stabilization pond, similar to an open reactor, some adaptations needed to be made.

  • In closed photobioreactors, high oxygen concentrations become an inhibition factor for photosynthesis, consequently being an obstacle to the growth of microalgal biomass, which required some attention from the authors regarding the maximum amount of dissolved oxygen for the specific strain of the microalgae. This was not a problem for us since, to simplify our model, we are considering dissolved oxygen equal to zero, knowing that bacteria consume the oxygen produced by photosynthetic activity and that it also ends up being released into the atmosphere as well as at night, when oxygen is consumed by both microalgae and bacteria in the respiration process.
  • Another relevant point to adapt is the daily removal of part of the microalgal biomass from inside the photobioreactor, which the authors carried out. This would not be done in our system since the treatment process in stabilization ponds requires a hydraulic retention time, in which the focus would be on the removal and degradation of unwanted components through microalgae and not on optimizing their biomass since we will not use biomass afterward, mainly because any GMO should be discarded after the process.

As said before, there is no continuous inlet and outlet flow in the pond as in the author’s continuous FBR. Microalgal biomass concentration varies according to the hydraulic retention time (HRT). The HRT is how long water remains in the stabilization pond before leaving. During this period, microalgal biomass concentration can increase or decrease depending on growth conditions.

The microalgae specific growth rate μALG is influenced by crucial factors such as average light irradiance (Iav), temperature, pH, dissolved oxygen, and CO2. Furthermore, the influence of the availability of nutrients, such as phosphate phosphorus (P − PO4), ammonia nitrogen (N − NH4) , and nitrate nitrogen (N − NO3), which are the most found in wastewater. As a decay factor, the model considers the endogenous respiration (mALG) of the microalgae.

μALG = (μALG((Iav))) · μALG(T) · μALG(pH) · μALG(DO2) · μALG(CO2) · μALG(N) · μALG([P − PO4]) - (mALG)

Having contextualized our model basis, we will now introduce the main aspects that involve the growth of our microalgae in detail:

Figure 2. Simplified representation of the growth model showing the main microalgae processes in a stabilization pond during day and night

Created with BioRender.com Components that are influenced by bacteria are marked with *

I. Growth factors

  • Light influence
  • The primary concern in the mass cultivation of photoautotrophic microalgae is the efficient utilization of intense light for both the growth of cell mass and the production of secondary metabolites. The amount of light energy these microorganisms receive depends on the photon flux density reaching the culture surface.

    \[\mu _{ALG}(I_{av})=\frac{\mu _{ALG,max}\times I_{av}^{n}}{I_{k}^{n}+ I_{av}^{n}}\]

    Where:

    \(\mu _{ALG,max}\) is the maximum growth rate of microalgae.

    \(I_{av}\) is the average irradiance within the crop, summarizing the availability of light within the system.

    \(I_{k}\) is the irradiance constant (equivalent to the irradiance required to achieve half the maximum growth rate)

    \(n\) is the form parameter

    \[I_{av}=\frac{I_{0}}{K_{a}\times X_{ALG}\times h}(1-e^{-K_{a}\times X_{ALG}\times h})\]

    Where:

    \(I_{0}\) is the irradiance at the surface of the system.

    \(K_{a}\) is the biomass extinction coefficient.

    \(h\) is the depth of culture in the system.

  • Temperature influence
  • Temperature's impact on biochemical reactions makes it one of the most crucial environmental factors influencing the biochemical composition of microalgae.

    \[\mu _{ALG}(T)=\frac{(T-T_{max,ALG}) (T-T_{min,ALG})^{2}}{(T_{opt,ALG}-T_{min,ALG})(((T_{opt,ALG}-T_{min,ALG})(T-T_{opt,ALG}))-((T_{opt,ALG}-T_{max,ALG})(T_{opt,ALG}+T_{min,ALG}-2\times T)))}\]

    Where:

    \(T\) is the cultivation temperature.

    \(T_{max,ALG}\) is the maximum temperature for microalgae.

    \(T_{min,ALG}\) is the minimum temperature for microalgae.

    \(T_{opt,ALG}\) is the optimum temperature for microalgae.

  • Carbon influence
  • When it comes to autotrophic production, the primary carbon sources are CO2 and HCO3. Unlike terrestrial plants, atmospheric CO2 alone cannot meet the carbon requirements of autotrophic algal production systems. The diffusion of CO2 from the atmosphere into open ponds supports growth to some extent. In our model, CO2 is also made available for the photosynthetic process as a product of the degradation of organic matter carried out by heterotrophic bacteria.

    \[\mu _{ALG}(CO_{2}) = \frac {X_{CO_{2}} + X_{HCO_{3}}}{K_{S, C, ALG}+ X_{CO_2}+X_{HCO_3}+\frac{X_{CO_2}^{n C,ALG}}{K_{I, C, ALG}}}\]

    Where:

    \(X_{CO_2}\) is the concentration of carbon dioxide.

    \(X_{HCO_3}\) is the concentration of bicarbonate.

    \(K_{S, C, ALG}\) is the microalgae half-saturation constant for carbon.

    \(K_{I, C, ALG}\) is the microalgae inhibition constant for carbon.

    \(n C, ALG\) is the microalgae form parameter for carbon.

  • pH influence
  • Generally, microalgae thrive in a slightly alkaline pH range. The bicarbonate-carbonate system is the primary buffer in freshwater and is the most effective method for controlling and maintaining pH levels.

    The bicarbonate-carbonate buffer system acts through the following reactions:

    \[2HCO_3^-\Leftrightarrow CO_3^{2-}+H_2O+CO_2\]

    \[HCO_3^-\Leftrightarrow CO_2+OH^-\]

    \[CO_3^{2-}+H_2O\Leftrightarrow CO_2+2OH^-\]

    These reactions suggest that during photosynthetic \(CO_2\) fixation, there is an accumulation of \(OH^-\), leading to a gradual increase in pH. Additionally, as active photosynthesis raises pH levels, the opposite occurs during respiration, with \(CO_2\) being released, although this effect is relatively small, typically accounting for less than 10% of total photosynthetic production.

    \[\mu _{ALG}(pH)=\frac{(pH-pH_{max,ALG}) (pH-pH_{min,ALG})^{2}}{(pH_{opt,ALG}-pH_{min,ALG})(((pH_{opt,ALG}-pH_{min,ALG})(pH-pH_{opt,ALG}))-((pH_{opt,ALG}-pH_{max,ALG})(pH_{opt,ALG}+pH_{min,ALG}-2\times pH)))}\]

    Where:

    \(pH\) is the cultivation pH.

    \(pH_{max}\) is the maximum pH for microalgae.

    \(pH_{min}\) is the minimum pH for microalgae.

    \(pH_{opt}\) is the optimum pH for microalgae.

  • Nitrogen influence
  • After carbon, nitrogen is the most crucial nutrient for microalgae's internal processes. Microalgae can grow using both ammonium and nitrate as a nitrogen source. It is observed that nitrate consumption is only used when there is no ammonium in the system due to the preference to assimilate ammonium, assuming that it is due to a smaller amount of energy spent when ammonium is assimilated. In its absence, our microalgae assimilates nitrate as a source of nitrogen.

    \[\mu _{ALG}([N-NH_4]) = \frac {X_{NH_4}}{X_{NH_4}+ K_{S, NH_4, ALG}+\frac{X_{NH_4}^{n NH4,ALG}}{K_{I, NH_4, ALG}}}\]

    Where:

    \(X_{NH_4}\) is the concentration of ammonium nitrogen.

    \(K_{S, NH_4, ALG}\) is the microalgae half-saturation constant for ammonium.

    \(K_{I, NH_4, ALG}\) is the microalgae inhibition constant for ammonium.

    \(n NH_4, ALG\) is the microalgae form parameter for ammonia.

    \[\mu _{ALG}([N-NO_3]) = \frac {X_{NO_3}}{X_{NO_3}+ K_{S, NO_3, ALG}+\frac{X_{NO_3}^{n NO3,ALG}}{K_{I, NO_3, ALG}}}\]

    Where:

    \(X_{NO_3}\) is the concentration of nitrate nitrogen

    \(K_{S, NO_3, ALG}\) is the half-saturation constant of microalgae for nitrate

    \(K_{I, NO_3, ALG}\) is the microalgae inhibition constant for nitrate

    \(n NO_3, ALG\) is the microalgae form parameter for nitrate.

  • Phosphorus influence
  • Phosphorus is essential for growth and various cellular processes, such as energy transfer and the biosynthesis of nucleic acids and DNA. Even though algal biomass contains less than 1% phosphorus, it often becomes one of the most critical limiting factors for algal growth.

    \[\mu _{ALG}([P-PO_4]) = \frac {X_{PO_4}}{X_{PO_4}+ K_{S, PO_4, ALG}}\]

    Where:

    \(X_{PO_4}\) is the concentration of phosphorus phosphate.

    \(K_{S, PO4, ALG}\) is the half-saturation constant of microalgae for phosphate.

  • Oxygen influence
  • The accumulation of photosynthetically generated oxygen is one of the main factors limiting the scale-up of closed photobioreactors, as the accumulation of oxygen inhibits photosynthetic processes. The uniqueness of the stabilization pond lies in its oxygenation operation mode, not relying on an electromechanical air supply.

    In addition to improving water quality, oxygenation will be essential for the consumption of oxygen by the heterotrophic bacteria in the process of degradation of organic matter, which results in carbon dioxide that will be used by microalgae again in photosynthesis to produce oxygen. This also influences the pH of the water, as previously mentioned.

    Therefore, there is a complex interaction between the concentration of dissolved oxygen, the degradation of organic matter, respiration, photosynthesis, and the pH of the pond. Considering this network of interactions involving oxygen, we decided to equate it to zero in our model for simplification. This does not mean that there is no oxygen in the pond since it is essential for its functioning. Instead, it means that all the oxygen being produced is already consumed by bacteria during the day, used at night for respiration, and released into the atmosphere. In the following day, the processes are repeated, allowing us to equate it to zero in the model. In addition, in the kinetic simulations, we are using time over the course of days and not the day and night periods.

    \[\mu _{ALG}(DO_2) = 1 - (\frac {DO_2}{DO_{2, max}})^m\]

    Where:

    \(DO_2\)(%) is the oxygen dissolved in the system.

    \(DO_2, max\)(%) is the maximum amount of dissolved oxygen for the microalgae strain.

    \(m\) is a form parameter.

  • Endogenous respiration
  • Endogenous respiration is essentially the opposite process of photosynthesis and represents the factor in the decay of microalgae. While photosynthesis uses carbon dioxide (CO2) and sunlight to produce glucose and release oxygen (O2), endogenous respiration consumes glucose (or other organic compounds) and O2 to produce energy in the form of adenosine triphosphate (ATP) and releases CO2. as a byproduct.

    \[m _{ALG}=m_{min,ALG}+\frac {m_{max,ALG\times I_{av}^{nresp}}}{I_{k,resp}^{nresp}+{I_{av}^{nresp}}}\]

    Where:

    \(m_{min, ALG}\) and \(m_{max,ALG}\) are the minimum and maximum breathing rates.

    \(I_{k, resp}\) is the irradiance needed to stop photosynthesis and start the respiration process.

    \(nresp\) is the form parameter for respiration.

II. Fundamental Processes

1. Throughout their growth process, microalgae (XALG) capture inorganic carbon (XCO2 and XHCO3), consume nutrients (XNH4, XNO3 and XPO4) and release oxygen (DO2) through photosynthesis;

2. The decomposition of organic matter by bacteria uses the oxygen generated by microalgae and leads to the generation of XCO2;

3. The biomass density of microalgae increases due to autotrophic growth but decreases due to endogenous respiration.

Process Equation
Growth by light limitation \(\mu _{ALG}(I_{av})=\frac{\mu _{ALG,max}\times I_{av}^{n}}{I_{k}^{n}+ I_{av}^{n}}\)
Average irradiance \(I_{av}=\frac{I_{0}}{K_{a}\times X_{ALG}\times h}(1-e^{-K_{a}\times X_{ALG}\times h})\)
Influence of temperature \(\mu _{ALG}(T)=\frac{(T-T_{max,ALG}) (T-T_{min,ALG})^{2}}{(T_{opt,ALG}-T_{min,ALG})(((T_{opt,ALG}-T_{min,ALG})(T-T_{opt,ALG}))-((T_{opt,ALG}-T_{max,ALG})(T_{opt,ALG}+T_{min,ALG}-2\times T)))}\)
Influence of carbon \(\mu _{ALG}(CO_{2}) = \frac {X_{CO_{2}} + X_{HCO_{3}}}{K_{S, C, ALG}+ X_{CO_2}+X_{HCO_3}+\frac{X_{CO_2}^{n C,ALG}}{K_{I, C, ALG}}}\)
Influence of pH \(\mu _{ALG}(pH)=\frac{(pH-pH_{max,ALG}) (pH-pH_{min,ALG})^{2}}{(pH_{opt,ALG}-pH_{min,ALG})(((pH_{opt,ALG}-pH_{min,ALG})(pH-pH_{opt,ALG}))-((pH_{opt,ALG}-pH_{max,ALG})(pH_{opt,ALG}+pH_{min,ALG}-2\times pH)))}\)
Influence of ammonium nitrogen \(\mu _{ALG}([N-NH_4]) = \frac {X_{NH_4}}{X_{NH_4}+ K_{S, NH_4, ALG}+\frac{X_{NH_4}^{n NH4,ALG}}{K_{I, NH_4, ALG}}}\)
Influence of nitrate nitrogen \(\mu _{ALG}([N-NO_3]) = \frac {X_{NO_3}}{X_{NO_3}+ K_{S, NO_3, ALG}+\frac{X_{NO_3}^{n NO3,ALG}}{K_{I, NO_3, ALG}}}\)
Influence of phosphate \(\mu _{ALG}([P-PO_4]) = \frac {X_{PO_4}}{X_{PO_4}+ K_{S, PO_4, ALG}}\)
Influence of oxygen \(\mu _{ALG}(DO_2) = 1 - (\frac {DO_2}{DO_{2, max}})^m\)
Endogenous respiration \(m _{ALG}=m_{min,ALG}+\frac {m_{max,ALG\times I_{av}^{nresp}}}{I_{k,resp}^{nresp}+{I_{av}^{nresp}}}\)

Now that the main aspects that influence the growth of our microalgae have been presented, we will present the degradation model and then how the two models, growth and degradation, will be integrated.

b. PET Degradation Model

In our degradation model, as mentioned before, we will use the enzymes FAST-PETase and MHETase to degrade PET. FAST-PETase binds to PET to initiate activity, transforming it into MHET, so that MHETase can act and transform MHET into terephthalic acid (TPA) and ethylene glycol (EG). The reaction mechanisms of our degradation model are represented below:

\[[PET] + [FASTPETase-MHETase]\leftrightarrow [C1]\to [MHET]+[FASTPETase-MHETase]\]

\[[MHET] + [FASTPETase-MHETase]\leftrightarrow [C2]\to [FASTPETase-MHETase]+[TPA]+[EG]\]

Figure 3. Representation of PET degradation mechanism by the linked enzymes FAST-PETase and MHETase>

Created with BioRender.com Components that are influenced by bacteria are marked with *

I. Growth factors

FAST-PETase acts on PET to transform it into Mono-(2-hydroxyethyl)terephthalic acid (MHET). There were no records of the actual degradation rate of PET by FAST-PETase, but we had information from the article "Machine learning‐aided engineering of hydrolases for PET depolymerization", which presented the results of the time required for degradation and the initial masses of PET films, from this, we were able to calculate a degradation rate to be used in our model. The data analyses made to get such a degradation rate are available in our GitHub here.

II. MHETase

MHETase acts on MHET to transform it into terephthalic acid (TPA) and ethylene glycol (EG). Its enzymatic activity was analyzed using the Michaelis-Menten kinetic equation, which calculates the speed at which an enzyme converts a substrate into a product.

\[v=k_{MHETase}\times \frac{[MHETase]\times [MHET]}{Km_{MHETase} + [MHET]}\]

Where:

\(v =\) Product formation speed.

\(k_{MHETase} =\) Rate of MHET degradation by the enzyme MHETase.

\([MHETase] =\) Enzyme concentration.

\([MHET] =\) MHET concentration.

\(Km =\) Michaelis-Menten constant (enzyme-substrate affinity).

III. Enzyme concentration

It is important to remember that, in our model, we are taking into account a directly proportional relationship between the microalgae population and the concentration of enzymes. We are assuming that the more microalgae, the more enzymes will be secreted, therefore, describing the secretion mechanism proved to be relevant for our model.

In our final model, we summarized the enzyme secretion rate and expression rate in one term named Ke, the enzyme production yield per algae, that is, the number of functional enzymes produced and secreted per microalgae cell. Since we found no research establishing enzyme production yield specifically for C. reinhardtii in terms of quantity, and there are records of million enzymes produced per cell from other organisms, we arbitrarily chose the value of 75 enzymes per cell.

c. Integration method

Now that we have explained the main mechanisms of our model, the combination of both of them, considering the proportional microalgae-enzyme relationship, will be presented.

I. Context

Although we are using the ABACO and BIOALGAE models as basis, which use the differential equations method, when considering only the microalgae in our model, and not the microalgae-bacteria consortium, the bacterial absence directly influenced how the amount of nutrients varied in the stabilization pond medium, resulting in an infinite supply of nutrients for the microalgae.

We realized that when, initially, we were using the differential equations provided by the base articles in our simulations, but since in the bacterial absence scenarios there were no nutrient variation and natural factors like temperature and irradiance exhibit relatively consistent patterns of μALG throughout the month, it became clear that, in our model, the calculated μALG from the base article was actually the maximum specific growth rate (μALGmax), which is the growth rate when nutrients are most ideal.

Consequently, our approach to solving the growth model, given that C. reinhardtii is known to grow similarly to bacteria and yeast, incorporated a pattern that aligns with microalgae's sigmoidal growth curve, encompassing the lag phase, the log phase, and the stationary phase. In this particular context, the Gompertz methodology proves to be the most suitable choice. By considering μALG as μALGmax in Gompertz modeling, we can estimate microalgae biomass concentration without the need to precisely predict how nutrients are utilized by the various organisms within the pond.

We chose to employ the Norton reparametrization, as it provides a well-characterized adjustment of the original Gompertz equation with easily interpretable parameters that have direct biological significance:

\[X(t) = X₀ \times e^{ln(A/X₀ ) (1 - e^{-\mu _{ALG,max}\times t})}\]

Where:

\(X(t)\) is the concentration of microalgae as a function of time.

\(X₀\) is the initial concentration of microalgae.

\(A\) is the maximum concentration of microalgae.

\(\mu_{ALG,max}\) is the situation-specific maximum growth rate.

\(t\) is a given point in time.

From that, we can estimate the enzyme concentration in the system as a function of algae concentration. As there are no general models to describe enzyme production in C. reinhardtii, we speculated a mechanistic model as follows:

\[E(X, t) = X_{(t)}\times Ke +E_{(t-1)}\times 1-De\]

Where:

\(E(X, t)\) is the enzyme concentration as a function of algae concentration and time.

\(X\) is the algae concentration.

\(Ke\) is the enzyme production yield per algae.

\(E_{(t-1)}\) is the concentration of enzymes at the previous time point.

\(\)De is the enzyme degradation rate per day.

Since the FAST-PETase Michaelis-Menten parameters are not described yet, it is fruitful to describe the PET concentration as a function of enzyme concentration. The equation is as follows:

\[PET(E, t) =PET_{(t-1)}-k_{FAST-PETase}\times E_{(t)}\]

Where:

\(PET(E, t)\) is the PET concentration as a function of enzyme concentration and time.

\(k_{FAST-PETase}\) is the rate of PET degradation by FAST-PETase.

\(E(t)\) is the enzyme concentration at time t.

In this way, we were able to evaluate how our models behave simultaneously, analyzing the degradation of PET microplastic according to the growth of our microalgae. The equation system was written in Python. You can check our code here.

Process Equation
Enzyme concentration \(E(X, t) = X_{(t)}\times Ke +E_{(t-1)}\times 1-De\)
PET concentration \(PET(E, t) =PET_{(t-1)}-k_{FAST-PETase}\times E_{(t)}\)
Product formation speed \(v=k_{MHETase}\times \frac{[MHETase]\times [MHET]}{Km_{MHETase} + [MHET]}\)

2. Variables and parameters

Variables are quantities that can change or vary over time or in response to different conditions, while parameters are constant values that affect the system's behavior but do not vary during the process or analysis. In the table below, we recorded these values, as well as the initial conditions, so that we could simulate our model.

Name Value Description Unit Source
System features
Inlet effluent
pH \(7.8 \pm 0.1\) pH of incoming effluent - [3]
\(CO_2\) \(20 000\) Carbon dioxide concentration \(mg\times m^{-3}\) [3]
\(HCO_3\) \(1 600\) Bicarbonate concentration \(mg\times m^{-3}\) [3]
\(N-NH_4^+\) \(30 900 \pm 1 500\) Ammonium concentration \(mg\times m^{-3}\) [3]
\(N-NO_3^-\) \(400 \pm 700\) Nitrate concentration \(mg\times m^{-3}\) [3]
\(P-PO_4^-\) \(119200 \pm 5100\) Phosphate concentration \(mg\times m^{-3}\) [1]
\(X_{PET}\) \(24.9\) to \(680\) Expected PET initial concentration \(mg\times m^{-3}\) [9]
GROWTH MODEL
\(X_{ALG0}\) \(0.0000003322\) Initial microalgae concentration \(\mu M/m^{-3}\) Estimated
\(X_{ALG,max}\) \(0.003322\) Maximum microalgae concentration \(\mu M/m^{-3}\) [8]
\(\mu_{ALG,max}\) \(1.591\) Maximum growth rate of microalgae \(day^{-1}\) [1]
\(I_k\) \(168\) Irradiance constant \(\mu E\times m^{-2}\times s^{-1}\) [3]
\(I_0\) \(393\) to \(934\)* Irradiance on the system surface \(\mu E\times m^{-2}\times s^{-1}\) [3]
\(I_{k,resp}\) \(134\) Irradiance needed to start the respiration process \(\mu E\times m^{-2}\times s^{-1}\) [3]
\(n\) \(1.7\) Form parameter - [3]
\(n_{resp}\) \(1.4\) Form parameter for respiration - [3]
\(K_a\) \(0.000007\) Biomass extinction coefficient \(m^2\times mg^{-1}\) [3]
\(h\) \(1.8\) Depth of culture \(m\) [1]
\(m_{min,ALG}\) \(0.01\) Minimum respiration rate \(day^-1\) [1]
\(m_{max,ALG}\) \(0.276\) Maximum respiration rate \(day^-1\) [1]
\(T_{max,ALG}\) \(49\) Maximum temperature for microalgae \(ºC\) [3]
\(T_{min,ALG}\) \(3.4\) Minimum temperature for microalgae \(ºC\) [3]
\(T_{opt,ALG}\) \(30\) Optimum temperature for microalgae \(ºC\) [3]
\(pH_{max,ALG}\) \(12.9\) Maximum pH for microalgae \(-\) [2]
\(pH_{min,ALG}\) \(1.8\) Minimum pH for microalgae \(-\) [2]
\(pH_{opt,ALG}\) \(8.5\) Optimum pH for microalgae \(-\) [2]
\(DO_2\) \(0\) Concentration of dissolved oxygen \(mg\times m^{-3}\) [3]
\(m\) \(4.15\) Form Parameter \(-\) [3]
\(K_{S,C,ALG}\) \(4\) Microalgal half-saturation constant for carbon \(mg\times m^{-3}\) [3]
\(K_{I,C,ALG}\) \(120000\) Microalgal inhibition constant for carbon \(mg\times m^{-3}\) [3]
\(K_{S,NH_4,ALG}\) \(1540\) Microalgae half-saturation constant for ammonium \(mg\times m^{-3}\) [3]
\(K_{I,NH_4,ALG}\) \(571000\) Microalgal inhibition constant for ammonium \(mg\times m^{-3}\) [3]
\(n_{NH_4,ALG}\) \(1\) Form parameter for ammonium \(-\) Estimated
\(K_{S,NO_3,ALG}\) \(2770\) Microalgal half-saturation constant for nitrate \(mg\times m^{-3}\) [3]
\(K_{I,NO_3,ALG}\) \(396600\) Microalgal inhibition constant for nitrate \(mg\times m^{-3}\) [3]
\(n_{NO_3,ALG}\) \(1\) Form parameter for nitrate \(-\) Estimated
\(K_{S,PO_4,ALG}\) \(430\) Microalgal half-saturation constant for phosphate \(mg\times m^{-3}\) [3]
DEGRADATION MODEL
\(Ke\) \(75\) Protein production yield \(enzyme \times cell^{-1}\) Estimated
FAST-PETase_MHETase
Name Value Description Unit Source
\(k_{FAST-PETase}\) \(33.5\) Rate of PET degradation by the enzyme FAST-PETase \(\mu M\times day^{-1}\times \mu M\) [7]
\(Km_{MHETase}\) \(23.17 \pm 1.65\) Michaelis-Menten constant \(\mu M\) [6]
\(K_{MHETase}\) \(2384640 \pm 224640\) Rate of MHET degradation by the enzyme MHETase \(dia^{-1}\) [6]

RESULTS

Setting up the scenarios

To assess the feasibility of our system, we conducted simulations under two distinct scenarios: summer and winter. We chose these scenarios due to the significant variations in irradiance and temperature expected in Foz do Iguaçu since nutrient availability and microplastic presence in wastewater are not likely to vary much throughout the year.

Summer Scenario: For this scenario, we approximated the irradiance as a constant 934 uE/m², based on data from the BIOALGAE article. While this value does not precisely match the actual irradiance in Foz do Iguaçu, it is a reasonable approximation for our purposes.

Winter Scenario: To maintain proportionality, we derived the actual irradiance values from CRESESB-Centro de Referência para Energia Solar e Eólica (cepel.br) and calculated an equivalent value of 393 uE/m² for winter. This approach accounts for the non-convertible units between solar irradiance (W/m²) and the constants used in the BIOALGAE and ABACO models (uE/m²), explaining our choice not to use actual values.

Outcomes

Summer simulations:

Graph 1. Summer Simulation - PET degradation over Hydraulic Retention Time by FAST-PETase
Graph 2. Summer Simulation - Chlamydomonas concentration over HRT
Graph 3. Summer Simulation - Functional FAST-PETase_MHETase concentration over HRT

From observing the graphs, it is noticeable that microalgae C. reinhardtii rapidly reaches the culture maximum concentration considering the summer scenario. With maximum culture concentration, protein quantity also reaches a high concentration quickly. Therefore, the accumulated amount of protein quickly degrades the PET microplastic. That way, more than 400mg of PET is degraded within the expected hydraulic retention time.

As a side note, as the reader can see, MHET concentrations are not presented in our results. That’s because, as other teams, such as Team:Humboldt Berlin/Model - 2019.igem.org, had already stated, the MHETase catalytic activity is so much more significant in comparison to PETase that MHET concentration tends to zero through the whole process. Our first simulations confirmed that statement is still valid even in the case of FAST-PETase. So, as a readability decision, we chose not to elaborate on MHET degradation further.

Winter simulations:

Graph 4. Winter Simulation - PET degradation over Hydraulic Retention Time by FAST-PETase
Graph 5. Winter Simulation - Chlamydomonas concentration over HRT
Graph 6. Winter Simulation - Functional FAST-PETase_MHETase concentration over HRT

On the other hand, it becomes clear that the microalgae population slowly decays during the coldest days of winter. Still, a relevant amount of microplastic is degraded in hydraulic retention time (about 110 mg of PET). However, it is possible to argue that since conditions are inadequate for growth, the heterologous expression also becomes sub-optimal to none and that enzyme hydrolytic activities are reduced. A more pessimistic result would indicate no PET degradation during winter.

Is it possible to degrade PET microplastic from wastewater using C. reinhardtii to produce FASTPETase linked to MHETase?

The results are promising for applying FAST-PETase_MHETase-producing microalgae in water treatment stabilization ponds to biodegrade PET microplastic. Our findings suggest degrading most PET in the water in less than 20 days is feasible. However, the degradation rate decreases significantly in winter due to limited microalgae growth resulting from low solar irradiance and low temperatures.

How may our simulations help future experiments?

Our simulations provide valuable insights for future investigations. The most significant observation is that FAST-PETase presents the potential to degrade PET quickly enough that wastewater treatment ponds could use it for PET degradation even in scenarios with lower enzyme concentrations. Future research teams face the challenge of improving the system to enable PET degradation in more challenging conditions, such as colder temperatures. Additionally, our work underscores the need for further omic studies in Chlamydomonas reinhardtii, particularly concerning protein production and secretion.

Also, from elaborating the model, it became clear that key points for the success of using C. reinhardtii to biodegrade microplastic present in wastewater still require experimental measurements. Through experimentation, it may be possible to answer two key questions:

  • How much FAST-PETase linked to MHETase is produced from our microalgae?
  • Will the C. reinhardtii behave as expected in a stabilization pond?

In conclusion

Let's revisit the initial questions our model and simulations aimed to address:

    How does the stabilization pond medium influence microalgae growth and, consequently, the amount of enzymes?

    It is evident that nutrient availability, pH, temperature, and light irradiance in the medium play essential roles in microalgae growth. Therefore, according to our simulations, typical wastewater conditions seem to provide ample nutrients for microalgae growth. However, it's vital to consider that temperature and solar irradiance can vary significantly depending on geographical characteristics and the season. As long as the conditions support C. reinhardtii log-phase growth, enzyme concentration should remain proportional to microalgae concentration. As enzyme concentration increases, more PET microplastic is removed from our water.

    How much microplastic can we degrade with the amount of secreted enzymes?

    In an ideal scenario, it is possible to completely degrade over 400 mg of PET microplastic within approximately 19 days, considering the enzymes secreted during this time.

    How long can it take to degrade a satisfactory amount of plastic?

    Our model perceives the data collected from the article by TIAN, Lei et al. [9] with great interest and optimism. The detection of polyethylene terephthalate (PET) microplastics in all influents of the study, with concentrations ranging from 24.9 mg/m³ to 680 mg/m³, underscores the viability of our system, which demonstrates the capability of PET-degrading enzymes in our simulation, suggesting that it is possible to degrade 400 mg/m³ of PET in 19 days of hydraulic retention time. The ability to degrade a substantial amount of PET in a relatively short period is promising, also being coherent with the operational system of stabilization ponds. This capacity could contribute to reducing PET's presence in the environment and, in turn, mitigating the negative impacts of microplastics on aquatic and terrestrial ecosystems. However, it's essential to emphasize that the practical application of this discovery must take into account various factors, such as the process scalability, biosafety, associated costs, and effectiveness under real-world conditions.

    In summary, the potential to degrade 400 mg of PET within a 19-day timeframe, taking into account the fluctuating PET concentrations detected in the influent of the TIAN, Lei et al. study, offers a compelling opportunity in which time emerges as a crucial factor significantly improved in the realm of PET plastic degradation. This vision becomes a reality as long as coordinated efforts succeed in transforming this potential into tangible and sustainable solutions for the global issue of microplastics.

      Model Improvements

      As stated earlier, our model accounts for many simplifications, and as many biomathematicians would say, a simple but measurable model is better than a complex immeasurable one. Nonetheless, we maintained a vision for advancing the model's sophistication while enhancing its practicality. This section delves into the potential enhancements and outlines our perspective on how future teams can leverage this model as a foundation for their own refinements and innovations. There are some reasons why we did not implement said improvements in our model: Firstly, time. As we finished the first version of the model, our team faced a substantial workload, with a primary focus on comprehending our chassis and constructing genetic circuits. As soon as we had some breathing space to start building on a better model, it was already time to write wiki scripts, and we concluded it was best to pass on what we learned through a carefully written wiki than to make a few changes in a model that had already served its greatest purpose. Also, paywalls limited our research as some data and equations were in inaccessible articles to which we only got access much later on the project.

        Improving the stabilization pond representation

        One of our initial simplifications involved omitting consideration of other microorganisms in the pond. That led us to end up being unable to establish resource usage, which limited our model. We did, though, build a base code structured in object-oriented programming, which accounts for the complete ecosystem of the stabilization pond as described by Sánchez-Zurano et al. [1] and Solimeno et al. [3]. We make this code available here and encourage other teams to use it in future projects within the scope of stabilization ponds.

        Enzyme production

        Studies on promotor activities are still limited in Chlamydomonas reinhardtii. However, a model proposed by Gerosa et al. [10] defines enzyme production based on bacterial specific growth rate, and it might apply to C. reinhardtii. This model is also interesting because usual laboratory measurements, e.g., microalgal concentration and protein quantification, already allow the verification of the model's capacity to describe the produced enzymes. This equation is also included in the skeleton of the more complete version of our code.

        Global Sensitivity Analysis

        A global sensitivity analysis (GSA) evaluates the impact of each model input on its output. Therefore, a GSA would help us understand how much variation is expected in our system according to distinct nutrient concentrations in the influent wastewater and temperature/irradiance. This is a significant step and quite an improvement to our simpler method of measuring an “optimistic” summer scenario and a “pessimistic” winter scenario. The methodology to the GSA in our model would be very similar to what was made by Team:UNILA LatAm/Model - 2021.igem.org in their sensitivity analysis. Yet, this part is not anywhere in our available code and would have to be developed by future users of our model.

      REFERENCES

      [1] SÁNCHEZ-ZURANO, Ana et al. ABACO: a new model of microalgae-bacteria consortia for biological treatment of wastewaters. Applied Sciences, v. 11, n. 3, p. 998, 2021.

      [2] SÁNCHEZ-ZURANO, Ana, et al. A novel photo-respirometry method to characterize consortia in microalgae-related wastewater treatment processes. Algal research, v. 47, p. 101858, 2020.

      [3] SOLIMENO, Alessandro et al. Integral microalgae-bacteria model (BIO_ALGAE): Application to wastewater high rate algal ponds. Science of the total environment, v. 601, p. 646-657, 2017.

      [4] NURDOGAN, Yakup; OSWALD, William J. Enhanced nutrient removal in high-rate ponds. Water science and technology, v. 31, n. 12, p. 33-43, 1995.

      [5] BOUTERFAS, Radia; BELKOURA, Mouhssine; DAUTA, Alain. Light and temperature effects on the growth rate of three freshwater [2pt] algae isolated from a eutrophic lake. Hydrobiologia, v. 489, p. 207-217, 2002.

      [6] KNOTT, Brandon C. et al. Characterization and engineering of a two-enzyme system for plastics depolymerization. Proceedings of the National Academy of Sciences, v. 117, n. 41, p. 25476-25485, 2020.

      [7] Hongyuan Lu1 , et al. Machine learning-aided engineering of hydrolases for PET depolymerization. Nature v604, 2022.

      [8] Robert A. Freudenberg, Thomas Baier, Alexander Einhaus, Lutz Wobbe, Olaf Kruse, High cell density cultivation enables efficient and sustainable recombinant polyamine production in the microalga Chlamydomonas reinhardtii, Bioresource Technology, Volume 323, 2021, 124542, ISSN 0960-8524, https://doi.org/10.1016/j.biortech.2020.124542.

      [9] Lei Tian, Ewa Skoczynska, Robert-Jan van Putten, Heather A. Leslie, Gert-Jan M. Gruter, Quantification of polyethylene terephthalate micro- and nanoplastics in domestic wastewater using a simple three-step method, Science of The Total Environment, Volume 857, Part 2, 2023, 159209, ISSN 0048-9697, https://doi.org/10.1016/j.scitotenv.2022.159209.

      [10] GEROSA, Luca; KOCHANOWSKI, Karl; HEINEMANN, Matthias; SAUER, Uwe. Dissecting specific and global transcriptional regulation of bacterial gene expression. Molecular Systems BiologyEMBO, 2013. (1) DOI: 10.1038/msb.2013.14. Disponível em: http://dx.doi.org/10.1038/msb.2013.14.


Molecular Modelling


3D Structure Prediction

As of the present, the experimental determination of the 3D structure of FAST-PETase : Linker : MHETase complex's remains elusive. However, the emergence of AlphaFold2, a cutting-edge prediction method, has paved the way for insightful predictions regarding the real-world configuration of this structure1. Employing AlphaFold2, we successfully predicted the intricate 3D structure. Our predictions have garnered a remarkable 91.02% similarity as evaluated by the ERRAT2 score, firmly establishing the reliability of our projected outcomes.

Notably, considering the computational resource intensiveness associated with AlphaFold2, we harnessed the computational power of the ColabFold2 Colab notebook for these predictions. Subsequently, we conducted a meticulous analysis of these projections, utilizing the PyMOL3 platform as a lens to reveal the subtleties and complexities encapsulated within our predictions.

Figure 1: Predicted 3D Structure of the FAST-PETase : 12 AA Flexible Linker : MHETase Complex using AlphaFold2

Outcome of Prediction:

In the pursuit of structural prediction, the establishment of a clear success metric is crucial. Thus, the evaluation of AlphaFold2's output emerged as a pivotal step in determining the credibility and viability of the obtained structure for subsequent analyses. Our model's assessment on the careful consideration of the following parameters:

pLDDT (Predicted Local-Distance Difference Test) Scores:

AlphaFold's per-residue prediction of its IDDT-Ca score4. To elaborate, the IDDT metric quantifies the proportion of accurately predicted interatomic distances, rather than focusing on the alignment of predicted and actual structures.

This evaluation acknowledges the precision of local structures and the successful alignment of individual domains. The pLDDT metric reflects these attributes, serving as an adept measure of localized confidence.

In Figure 1, the local confidence granted by AlphaFold is denoted by the intensity of blue coloring within the structure. Conversely, shades of red indicate regions of relatively lower confidence. This visualization has been made through the use of pLDDT scores, presented as a B-factor spectrum.

A beautiful sunset

Figure 2: Distribution of pDDT Scores as a 2D Line Graph

Furthermore, Figure 2 showcases the pLDDT scores, depicted in the form of a 2D line graph. This visualization provides a clear perspective on the distribution of pLDDT scores across the structure. A line graph adds a layer of clarity, allowing for a comprehensive assessment of the scores' variations and trends.

PAE (Predicted Aligned Error) Scores:

Recognizing that elevated pLDDT scores across all domains may not necessarily indicate AlphaFold's assurance regarding their relative orientations, we introduced the incorporation of PAE scores. In essence, PAE signifies AlphaFold's anticipatory estimation of positional discrepancy at residue x, contingent on an alignment between the predicted and actual structures centered around residue y. By design, PAE gauges the confidence about the interplay of residue pairs' spatial arrangements. While predominantly employed to evaluate relative domain positions, the utility of PAE extends seamlessly to situations where the assessment of pairwise confidence holds significance.

A beautiful sunset

Figure 3: Predicted Error Comparison within the Same Domain (Amino Acids 34-113)

A beautiful sunset

Figure 4: Predicted Error Variation across Different Domains (Amino Acids 34-862)

In this comprehensive analysis, a pivotal observation comes to the forefront: within a context of a relatively static domain (figure 3), the predicted error showcases a remarkably low magnitude. As we delve deeper into the intricacies, figure 4 presents a conspicuous contrast, revealing an upsurge in the predicted error magnitude. It's worth noting that this increase in predicted error is likely attributed to the presence of a flexible linker within the structural configuration.

ERRAT Score:

From a scale of 0 to 100, the ERRAT score5 emerges as a compelling gauge that appraises both the trustworthiness and the statistical reliability of residue conformations vis-à-vis reference structures. The underlying principle involves the computation of an error factor for each protein residue, accomplished by contrasting the atom positions within the model against a statistically derived distribution distilled from resolved protein structures. This rigorous analysis scrutinizes deviations between atom positions within the model and the anticipated distribution, effectively yielding an error factor tailored to each residue.

Culminating in an error confidence value for the entire structure, ERRAT substantiates its prowess as a judicious evaluator of structural fidelity.

In the realm of experimental benchmarks, structures hitherto resolved tend to exhibit values surpassing the 90 threshold. Our model boasts an ERRAT score of 91.02, echoing its adeptness in achieving a remarkable accord with this benchmark.

Ramachandran Plot:

The Ramachandran Plot6 stands as a critical tool for validating the structural integrity of proteins, providing a comprehensive assessment of the permissible orientations of phi (φ) and psi (Ψ) dihedral angles within amino acid residues. This analysis offers profound insights into the spatial arrangement and stability of the protein's three-dimensional structure.

Within this framework, the plot categorizes the conformational space into distinct regions, each indicating the feasibility of different angles for phi and psi rotations. The most favorable and allowed regions are recognized, while less favored but still acceptable regions are also discerned. Meanwhile, regions where steric clashes and energetic constraints impose limitations are designated as disallowed.

For quantitative evaluation of structural quality, the number of data points lying outside the most favorable region serves as a robust metric. In our specific protein model, the Ramachandran Plot was generated using PROCHECK7 which provided enlightening observations: around 83.2% of amino acid residues conform to optimal orientations, underscoring a predominant presence of permitted arrangements. Additionally, 11.3% of residues occupy a marginally less ideal yet still tolerable conformational space, while 4.2% exhibit conformations considered less favored. The remaining 1.3% of residues fall within disallowed regions, indicating instances of steric clashes and unfavorable energetics.

RMSD Score:

The Root Mean Square Deviation (RMSD) stands as a pivotal metric within the realm of structural biology, providing a quantitative gauge of the dissimilarity between two protein structures. Its role as a formidable tool is paramount in evaluating the accuracy and reliability of predicted protein structures, as showcased in the context of the FAST-PETase-Linker-MHETase complex we are investigating.

RMSD quantifies the average displacement of atoms between a reference structure, often derived from experimental analysis and the envisioned predictive structure. Within our distinctive scenario—focused on predicting a protein complex comprising both FAST-PETase and MHETase—it's crucial to meticulously assess the alignment of each component (FAST-PETase and MHETase) in isolation. This precise evaluation is conveyed through the provision of two distinct RMSD values:

FAST-PETase RMSD: With a specific value of 0.384, this metric indicates the average atomic displacement when comparing the envisioned FAST-PETase structure against its corresponding reference structure, spanning all 242 amino acid residues.

MHETase RMSD: In contrast, boasting a recorded value of 0.515, this metric encapsulates the mean atomic displacement between the predicted MHETase structure and its corresponding reference structure, spanning its full range of 501 amino acid residues.

The presentation of these dual RMSD values serves as a strategic approach to glean heightened insights into the resultant predictions. By distinctly evaluating the alignment of both FAST-PETase and MHETase, we gain a more nuanced perspective for quantitatively assessing the quality of our predictive outcomes. Lower RMSD values generally correspond to heightened ongruence between the predicted and reference structures, signifying a more accurate prediction. Conversely, higher RMSD values may indicate deviations from the reference structure, thereby highlighting areas that could benefit from further refinement in the prediction process.

In a comprehensive endeavor to enhance the presentation of our findings, we have created a visual representation that showcases the alignment between the predictions generated by AlphaFold (depicted in blue) and the experimental 3D structures of both FAST-PETase and MHETase.

Figure 5: Structural Alignment Visualization - AlphaFold Predictions vs. Experimental 3D Structures

This video, marked as Figure 5, serves as a visual aid, enabling viewers to witness the dynamic correlation between the predictive outcomes and the real-world structural conformation. Through this visualization, we aim to provide a more holistic understanding of the accuracy and alignment of our predictions within the FAST-PETase-Linker-MHETase complex.


Molecular Docking

Docking is a pivotal computational technique, serving as a potent tool for the investigation of protein-protein interactions and the assessment of binding affinity between molecules. In the context of our study concerning the FAST-PETase: Linker: MHETase complex, docking plays an indispensable role in elucidating the intricate molecular interactions that underlie its functionality.

Why Docking?

The experimental determination of the 3D structure of this complex remains elusive. However, recent advancements in computational biology, exemplified by AlphaFold2, have provided us with the means to predict its 3D structure with remarkable precision. Docking emerges as a pivotal tool as we dive deeper into comprehending the intricate interactions within this complex. Its primary objective is to elucidate binding energy and discern which component, be it the individual proteins FAST-PETase and MHETase or our composite assembly, exhibits a stronger affinity.

Furthermore, docking aids in the identification of potential challenges. For instance, it enables us to evaluate whether the binding sites of FAST-PETase and MHETase may overlap or clash. The presence of conflicts at these binding sites could indicate the necessity for structural adjustments within the complex to optimize its functionality.

Probing Protein-Protein Interactions:

Docking empowers us to investigate the interactions between proteins, specifically FAST-PETase and MHETase when linked by a linker molecule. It facilitates the precise localization of critical binding sites, commonly known as active sites, where these proteins make contact. To visualize and communicate our findings, we have presented the following videos, which illustrate our exploration of these active sites. This visual representation aids in our assessment of potential interference between them.

Figure 6: Binding Sites Analysis in the FAST-PETase : Linker : MHETase Complex (Color Key: Blue - FAST-PETase, Green - MHETase)

Figure 7: Analysis of the Interactions Between the FAST-PETase and MHETase Domains

For this second figure, we applied a relaxation process to our initial AlphaFold prediction to minimize the energy within the system. This step allowed us to observe the interactions between the FAST-PETase and MHETase domains. Fortunately, ColabFold natively provides us with this process.

Upon closer examination, the blue region of the protein represents the FAST-PETase domain, while the green region represents the MHETase domain. In the image, two amino acids highlighted in red, Asp112 from the FAST-PETase domain and Arg535 from the MHETase domain, form a salt bridge. A salt bridge is a combination of two non-covalent interactions: hydrogen bonding and ionic bonding. Although non-covalent interactions are typically considered relatively weak, these small stabilizing interactions collectively contribute significantly to the overall stability of a conformer.

Additionally, three other amino acids, depicted as sticks, are hydrophobic residues. These residues tend to cluster in regions unfavorable for interactions with water, a behavior known as the hydrophobic effect—a thermodynamic phenomenon. Fortunately, none of these amino acids are part of the binding site or active site of the FAST-PETase, which leads us to conclude that there are no interfering binding sites.

While performing a molecular dynamics simulation to delve deeper into these interactions would be beneficial, we find this initial test satisfactory to assess whether our proteins interfere with each other.

Assessing Binding Affinity:

Docking simulations play a pivotal role in the evaluation of binding affinity within the FAST-PETase : Linker : MHETase complex, comparing it with the binding affinity of FAST-PETase and MHETase individually, towards the ligands PET and MHET, respectively. Through these simulations, we can precisely quantify the strength of the interactions between these proteins and each ligand, shedding light on the nature of their binding.

Rather than opting for a computationally intensive flexible docking approach, we adopted a constrained docking strategy. In this approach, we maintained the receptor in a rigid state while allowing flexibility in the ligand. This decision aligns with the prevailing methodology employed by most contemporary programs. Additionally, it mitigates the computational demands on our systems, safeguarding their performance.

To accomplish our experiments, we employed two distinct algorithms: one for sampling, which predicts the various possible conformations, or "poses", that a ligand can assume when binding to the protein, and another for scoring, which computes the binding energies associated with each pose in relation to the protein. The conformation or pose with the lowest energy is identified as the most favorable and is considered the optimal binding configuration.

For the sampling algorithm, we employed the genetic algorithm AutoDock. This algorithm operates in a stochastic manner, generating a myriad of diverse conformations. From this pool of conformations, the algorithm selectively retains those with scores surpassing a predefined threshold. Ultimately, this process yields the top 9 poses characterized by the lowest energy levels.

For our analysis, we conducted a total of four experiments, consisting of two experiments utilizing the predicted 3D structure obtained through AlphaFold2 and two experiments employing the PDB files of FAST-PETase (structure 7SH6) and MHETase (structure 6QZ2) downloaded from the Protein Data Bank8.

For the ligands, we retrieved the 3D structures of PET and MHET from the ZINC15 Database9. A thorough literature review confirmed the already established binding sites of the proteins. Consequently, we employed a specific docking approach, wherein we precisely defined the region where the ligands should bind. Additionally, we utilized OpenBabel10 to standardize the structure format to match that of the proteins.

To prepare the 3D structures of the proteins, we employed AutoDockTools 1.5.711. This involved removing water molecules and all non-protein components, verifying and rectifying any missing atoms, adding hydrogen atoms to facilitate bonding between the ligand and the protein's binding site, and applying Kollman Charges to the protein. We also used MGLTools to establish the grid box around the binding sites of the proteins.

For the docking simulations, we chose to utilize AutoDock Vina12 due to its exceptional accuracy and high-speed performance. The comprehensive results obtained from AutoDock Vina can be found in the supplementary data. From this data, we generated Figures 8 and 9 using PyMOL, illustrating the 3D interactions between our protein and the ligands.

Figure 8: Binding Site-Ligand Interaction of the FAST-PETase Binding Site in the FAST-PETase : Linker : MHETase Complex

Figure 9: Binding Site-Ligand Interaction of the MHETase Binding Site in the FAST-PETase : Linker : MHETase Complex

Furthermore, we conducted 2D analysis for all four experiments using LigPlot+13 and have included these results in the supplementary data14.

Energy Affinity Findings:

Upon analyzing the docking results involving FAST-PETase, MHETase, and our complex, several intriguing trends have emerged:

Protein-ligand Complex

Affinity (\(kcal\times mol^{-1}\))

FAST-PETase : Linker : MHETase (PET) -6.3
FAST-PETase : Linker : MHETase (MHET) -6.2
MHETase (MHET) -5.8
FAST-PETase (PET) -5.6

Figure 10: Table displaying the energy values of the proteins when bound to their respective ligands.

Upon comparing the optimal ligand conformations for each protein, as determined by their energy affinity values, it becomes evident that our linker-bound protein exhibits a slightly higher binding affinity to both MHET and PET at their respective binding sites, as opposed to using FAST-PETase and MHETase separately. This suggests that the fusion of FAST-PETase and MHETase enzymes with a linker may not detrimentally impact binding affinity in comparison to non-linked proteins, and, in fact, may even enhance it. The docking results notably indicate a significant improvement in binding affinity within the FAST-PETase : Linker : MHETase complex, especially with regard to PET, when compared to FAST-PETase alone. Conversely, MHET affinity experiences a modest increase when our complex is employed. This suggests that the presence of the linker and the interaction between FAST-PETase and MHETase could potentially enhance overall binding affinity.

It's important to note that while our docking results provide valuable insights into the initial binding affinities, it's possible to achieve more accurate observations of ligand attraction over time through advanced methods like molecular dynamics simulations, which offer a dynamic perspective of the complex interactions between proteins and ligands.


Molecular Dynamics

As we conclude our exploration of molecular docking, we now set our sights on a thrilling journey into the dynamic behavior of our complex. Molecular dynamics (MD) serves as a computational simulation method that unveils the physical movements of atoms and molecules. During MD simulations, these atoms and molecules interact over a fixed time, providing insight into the dynamic evolution of the system. In its most common form, the trajectories of these entities are determined by numerically solving Newton's equations of motion for a system of interacting particles. The forces governing these interactions, along with their potential energies, are typically computed using interatomic potentials or molecular mechanical force fields, making MD an ideal choice for investigating our complex.

Why Molecular Dynamics?

Molecular dynamics offers a multitude of advantages, including the ability to observe atomic-level protein movements and interactions over time, detect structural alterations, assess domain affinity, predict the impact of modifications on protein stability, corroborate AlphaFold2-derived information, and evaluate protein flexibility in response to varying environmental conditions, such as temperature changes. By employing MD, we aim to gain a comprehensive understanding of how our proteins adapt and flex within their surroundings and to assess protein stability under diverse circumstances.

Cost-Effective and Accessible MD Simulations:

The utilization of MD simulations presents a multitude of computational challenges, including the potential strain on personal computers and the substantial resource demands associated with prolonged simulations. These challenges are accentuated when dealing with a complex of our scale, encompassing more than 900 amino acids. In addition to these challenges, our university, regrettably, faces budget constraints that prevent us from maintaining a functional cluster infrastructure.

In light of these obstacles, we proactively sought a solution by harnessing the capabilities of cloud computing, enabling us to conduct cost-effective and accessible MD simulations for our complex. Thankfully, our search led us to discover a robust pipeline15 ideally suited to address these specific challenges.

Within this collaborative notebook-based pipeline, we primarily utilized the OpenMM 716 simulation engine for all aspects of our MD simulations, except for force field calculations. For these critical force field calculations, we relied on AMBER's force field module17. The seamless integration of OpenMM and AMBER facilitates compatibility across a broad range of biomolecular systems and research applications. This integration ensures that our simulations can be meticulously tailored to address our specific objectives and remain readily adaptable to accommodate even more complex systems, should such a need arise.

In this context, we made three experiments, varying only the temperature to observe its influence on our complex. All other parameters were held constant, allowing us to isolate and analyze the temperature variable while keeping all other conditions consistent.

Simulation Parameters:

In our simulations, we employed the AMBER force field, specifically the ff19SB force field18, which represents the latest advancement in SB protein force fields. This updated force field has demonstrated improvements in amino acid-dependent properties, including helical propensities, and accurately captures dihedral angle preferences observed in actual protein structures stored in the Protein Data Bank.

The development and validation of the ff19SB force field, as detailed in the associated research paper, highlight its enhanced predictive power when paired with a more precise water model like OPC19. This combination proves particularly effective in modeling sequence-specific behaviors, protein mutations, and rational protein design when compared to the utilization of the TIP3P model, which exhibits notable limitations when used with the QM-based ff19SB force field. Therefore, we followed the authors' recommendation and incorporated OPC as our water model.

In our simulations, we utilized a 12 Ångstrom-sized box to encompass our protein and neutralized the system by introducing Na+ and Cl- ions at a concentration of 0.15. For the equilibration protocol, we conducted simulations over one nanosecond and conducted experiments at three different temperatures: 20°C (293.15 K), 25°C (298.15 K), and 30°C (303.15 K), all maintained at a pressure of 1 bar.

Regarding the production phase of the MD simulation protocol, we maintained the same configuration as during equilibration, conducting simulations over one nanosecond (1 ns) for each simulation at the same temperatures (20°C, 25°C, and 30°C). We acknowledge that this duration may appear limited, but as previously mentioned, resource constraints necessitated these choices.

MD Simulation Analysis:

RMSD Analysis:

Root Mean Square Deviation (RMSD), serves as a metric that quantifies the deviation of atomic positions over time when compared to a reference structure. By closely monitoring RMSD, we gain the ability to gauge the structural stability of our protein throughout the simulation. In a stable system, RMSD values tend to remain relatively low, while an increasing RMSD may signify structural fluctuations or deviations. Furthermore, RMSD analysis proves invaluable in assessing the stability of binding interactions between our proteins. Lower RMSD values for the complex in comparison to its unbound state indicate a robust and stable binding interaction. Conversely, higher RMSD values may suggest a weaker or transient interaction between the molecules. Moreover, the identification of unusually high RMSD values or abrupt spikes can serve as early indicators of noteworthy events within the system. These events may encompass phenomena such as protein unfolding, aggregation, or substantial perturbations. Recognizing these outliers becomes pivotal in pinpointing and comprehending anomalous behavior within the simulation.

Figure 11: RMSD Comparison Across Three Temperatures

When analyzing our RMSD results, a consistent profile emerged across all tested temperatures. Notably, we observed that lower temperatures corresponded to increased stability within our complex, an outcome aligned with our expectations. Importantly, no outliers were detected, indicating the likelihood of our protein maintaining its structural integrity.

It is important to acknowledge our constraints regarding simulation duration due to resource limitations. Despite this limitation, a noteworthy observation was made: at 20°C and 25°C, the RMSD profiles exhibited remarkable similarity, differing by a mere 0.03 in their respective means (averages). In contrast, the 30°C simulation displayed a slightly higher difference of 0.23+. While this difference is relatively small, it suggests that 30°C might be approaching a temperature range where protein stability could become compromised. Conversely, 20°C and 25°C appear to be favorable temperatures for our protein.

However, it is imperative to emphasize the need for extended simulation times to bolster the robustness of our conclusions. Additionally, introducing a 15°C simulation could provide valuable insights into the temperature-dependent behavior of our protein.

Furthermore, comparing our complex's stability with FAST-PETase and MHETase alone could shed light on which entity exhibits superior stability. These ongoing investigations will enhance our understanding of our complex's behavior and inform future research directions.

RMSF Analysis:

Root Mean Square Fluctuation (RMSF) analysis is a valuable tool for identifying regions within the protein that exhibit varying degrees of flexibility.

Figure 12: Comparison of RMSF at Three Temperatures

Here, we observe strikingly similar profiles across all three temperature experiments, characterized by fluctuations at the protein's beginning and end, as well as pronounced flexibility in our designated flexible linker region.

Figure 13: Molecular Dynamics Simulation at 30°C with RMSF Spectrum

This figure offers a visual representation of the molecular dynamics simulation at 30°C, using RMSF as a spectrum. In this representation, the reddest areas denote regions with the highest fluctuation, while the bluest areas signify the most stable regions.

Upon closer examination of these figures, it becomes evident that our flexible linker consistently exhibits the expected flexibility required for its functional role. These results further bolster our confidence in the 3D structure prediction generated by Alphafold, as the agreement between the predicted and observed dynamics is remarkably consistent.

Rg Analysis:

The radius of gyration (Rg) is a metric used to gauge the compactness or dispersion of biomolecules, including our protein. When applied to molecular dynamics simulations, analyzing the Rg graph yields valuable insights into the structural dynamics of the molecule.

Figure 14: Comparison of Rg at Three Temperatures

In Figure 14, we observe that at 20°C and 25°C, the Rg profiles exhibit remarkable similarity, characterized by a steady or slight decrease. This suggests that the molecule maintains its overall structural compactness, indicating relative stability with no major structural changes.

Conversely, when examining the 30°C simulation, we note a profile trending slightly upward, indicating a modest expansion or unfolding of the molecule. This implies that the molecule has become somewhat less compact and is undergoing structural alterations, such as a polymer chain extending.

It's important to note that while we've adjusted the graph's scale for clarity, the actual Rg values among the three simulations remain nearly identical.

In our molecular dynamics simulations, the monitoring of Rg functioned as a critical quality control metric, bolstering our confidence in the insights derived from RMSD analysis, even in instances where we couldn't meet the recommended simulation duration.

Pairwise Correlation Analysis:

In our final step, we conducted a pairwise correlation analysis using Pearson's Cross-Correlation (CC) graphs. This analysis provides essential insights into the interplay of motions between pairs of residues throughout our molecular dynamics simulation. To enhance clarity, we created a heatmap representation of the CC matrix.

Figure 15: Comparison of Pearson's Cross-Correlation Graphs at Three Temperatures

The values within the CC graph span a range from -1 to 1, denoting both the magnitude and direction of correlation. Positive values, depicted in shades of purple and tending towards 1, signify a positive correlation. This suggests that when one residue undergoes a particular movement or conformational change, the other residue tends to follow suit. Positive correlations often point to coordinated motions or structural stability. Conversely, negative values, visible in shades of green and approaching -1, indicate a negative correlation. In this scenario, when one residue experiences a specific motion or conformational change, the other tends to move or change in the opposite direction. Negative correlations may indicate anti-correlated motions or structural changes that counterbalance each other. White regions featuring values near zero indicate minimal or no correlation, implying that the movements of these residues are independent of one another.

Upon analyzing the CC graph, we observe a strong positive correlation within the FAST-PETase domain. Conversely, in the linker region, located around the 300th amino acid residue, we identify negative correlations. These negative correlations might highlight regions that counterbalance each other, contributing to structural integrity maintenance.

Notably, at 20°C, the graph tends to display slightly more regions in purple, suggesting pairs of residues that move in concert. Nevertheless, across all three temperature simulations, the CC graphs exhibit remarkable similarity. However, it's important to note that while these observations are intriguing, they warrant further investigation through extended simulation times to establish significance conclusively.


Final Considerations

In our pursuit of elucidating the structural intricacies of the FAST-PETase : Linker : MHETase complex, we have ventured through a diverse array of computational tools and analyses. This journey has yielded several critical insights into the complex's behavior and characteristics. However, as we conclude this model page, we must critically assess our accomplishments, acknowledge areas for potential enhancement, and offer guidance to future teams embarking on similar endeavors.

Our successful prediction of the complex's 3D structure using AlphaFold2 is a noteworthy achievement, supported by a remarkable 91.02% similarity as evaluated by the ERRAT score. However, the computational resource intensity associated with AlphaFold2 presents a limitation. Future teams should explore strategies to optimize resource allocation while maintaining prediction accuracy.

Molecular docking allowed us to explore protein-protein interactions and binding affinity within the complex. The results suggest that the linker-bound protein may enhance binding affinity, but further experiments are required to confirm this. Future research should consider the impact of linker length and conformation on binding affinity.

Our MD simulations offered insights into structural stability, flexibility, and the influence of temperature. While resource constraints limited our simulation duration, we observed significant insights. Future teams should aim for extended simulations and consider lower temperatures to comprehensively assess complex behavior.

While our findings are significant, we acknowledge the need for continued research to further solidify our insights. We encourage future teams to build upon our work, leveraging advances in computational biology and biotechnology to unlock deeper insights into this complex and its potential applications in plastic degradation.

Overall Recommendations:

  • Resource Optimization: We strongly recommend that future teams prioritize resource optimization. Exploring efficient resource allocation strategies for resource-intensive tasks, such as AlphaFold2 predictions and MD simulations, can significantly enhance computational capabilities. Collaboration with institutions offering high-performance computing resources could prove advantageous. If we were to start over, we would certainly consider this approach.
  • Temperature and pH Variation: To obtain a more comprehensive understanding of the complex's behavior across a broader spectrum of conditions, we encourage the exploration of simulations at varying temperatures and pH levels. This expanded parameter space can provide valuable insights into the complex's versatility and adaptability.
  • Linker Optimization: Investigate the influence of linker length and conformation on binding affinity. Optimizing these aspects can maximize the functionality of the complex, potentially leading to improved performance.
  • Extended Simulations: Whenever feasible, extend simulation durations to ensure the robustness of observations and conclusions. Longer simulations can reveal nuanced behaviors and validate the stability of biomolecular systems.
  • Experiment Validation: While we faced challenges in our wetlab experiments, we emphasize the importance of validating computational findings through experimental techniques whenever possible. Experimental validation can significantly enhance the model's reliability and impact.
  • In the future, we hope to see greater emphasis on the importance of cost-effective and efficient MD simulations in advancing biomolecular understanding. Access to computational resources should not be limited by financial constraints, and collaborative efforts should be encouraged to overcome such barriers.

    For those facing budget constraints similar to ours, we recommend exploring our contributions page. There, we delve into the role of cloud computing in scientific research, highlighting its advantages and adoption trends. We provide insights on how to make the most of cloud computing resources, ensuring that even those with limited resources can leverage its potential effectively.


    References:

    3D Structure Prediction Studies:

    1 Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.

    Full Text

    2 Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., & Steinegger, M. (2022). ColabFold: making protein folding accessible to all. Nature methods, 19(6), 679-682.

    Full Text

    3 PyMOL Molecular Graphics System. (Version 2.5). Schrödinger, LLC. Retrieved from https://pymol.org/2/

    4 Mariani, V., Biasini, M., Barbato, A., & Schwede, T. (2013). lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics, 29(21), 2722-2728.

    Full Text

    5 Colovos, C., & Yeates, T. O. (1993). Verification of protein structures: patterns of nonbonded atomic interactions. Protein science, 2(9), 1511-1519.

    Full Text

    6 Ramachandran, G. T., & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Advances in protein chemistry, 23, 283-437.

    Full Text

    7 Laskowski, R. A., MacArthur, M. W., Moss, D. S., & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. Journal of applied crystallography, 26(2), 283-291.

    Full Text


    Molecular Docking Studies:

    8 Burley, S. K., Bhikadiya, C., Bi, C., Bittrich, S., Chao, H., Chen, L., ... & Zardecki, C. (2023). RCSB Protein Data Bank (RCSB. org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic acids research, 51(D1), D488-D508.

    Full Text

    9 Irwin, J. J., & Shoichet, B. K. (2005). ZINC − a free database of commercially available compounds for virtual screening. Journal of chemical information and modeling, 45(1), 177-182.

    Full Text

    10 O'Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., & Hutchison, G. R. (2011). Open Babel: An open chemical toolbox. Journal of cheminformatics, 3(1), 1-14.

    Full Text

    11 Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S., & Olson, A. J. (2009). AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal of computational chemistry, 30(16), 2785-2791.

    Full Text

    12 Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry, 31(2), 455-461.

    Full Text

    13 Laskowski, R. A., & Swindells, M. B. (2011). LigPlot+: multiple ligand–protein interaction diagrams for drug discovery.

    Full Text


    Molecular Dynamics Studies:

    15 Arantes, P. R., Polêto, M. D., Pedebos, C., & Ligabue-Braun, R. (2021). Making it rain: cloud-based molecular simulations for everyone. Journal of Chemical Information and Modeling, 61(10), 4852-4856.

    Full Text

    16 Eastman, P., Swails, J., Chodera, J. D., McGibbon, R. T., Zhao, Y., Beauchamp, K. A., ... & Pande, V. S. (2017). OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLoS computational biology, 13(7), e1005659.

    Full Text

    17 Case, D. A., Aktulga, H. M., Belfon, K., Ben-Shalom, I., Brozell, S. R., Cerutti, D. S., ... & Kollman, P. A. (2021). Amber 2021. University of California, San Francisco.

    Full Text

    18 Tian, C., Kasavajhala, K., Belfon, K. A., Raguette, L., Huang, H., Migues, A. N., ... & Simmerling, C. (2019). ff19SB: Amino-acid-specific protein backbone parameters trained against quantum mechanics energy surfaces in solution. Journal of chemical theory and computation, 16(1), 528-552.

    Full Text

    19 Izadi, S., Anandakrishnan, R., & Onufriev, A. V. (2014). Building water models: a different approach. The journal of physical chemistry letters, 5(21), 3863-3871.

    Full Text