Modeling is the process of constructing simplified representations of intricate
real-world systems. These representations provide
researchers with the means to thoroughly examine, analyze, and derive valuable insights into the behavior of these
complex systems. Models have a variety of
applications, serving various purposes, including hypothesis testing, optimization, system design, and future
outcome predictions. They stand as formidable tools
that facilitate the exploration of intricate systems while also delivering the advantage of saving valuable time and
resources through the execution of virtual
experiments before embarking on physical ones.
Kinetics x Structural
To comprehensively address the multifaceted challenges posed by PET-type microplastic
degradation, our team embarked on a dual-model
approach, strategically dividing our research into two primary domains: kinetics and structural analysis. This
approach allows us to unravel the dynamics of
microalgae growth and PET degradation while recognizing the importance of understanding the FAST-PETase: Linker:
MHETase complex at a molecular level. By
seamlessly integrating these two distinct yet interconnected models, our project, Mhetyguá, significantly advances
our comprehension of PET-type microplastic
degradation from kinetic and structural perspectives, firmly establishing our position at the forefront of
innovative solutions for this critical environmental
challenge.
Our goal
The goal of our project, Mhetyguá, is to act on the degradation of PET-type
microplastics in sewage treatment plants, one of the
destinations of this environmental contaminant that does not yet have a mitigation method in Brazilian sanitation
institutions. We address the emerging concern
regarding microplastics, increasing population awareness, while also actively contributing to the iGEM community.
Some iGEM teams, such as TU Kaiserslautern 2019
and HU Berlin 2019, had previously focused on PET-related projects. Their work served as a valuable source of
inspiration during the early stages of our project's
development. We conducted some critical analysis of their projects to gain insights into potential innovative
contributions within the realm of PET degradation.
In this context, we plan to act on microplastics degradation using enzymes secreted by
genetically modified microalgae Chlamydomonas
reinhardtii, FAST-PETase and MHETase, interconnected by a linker. Initially, FAST-PETase binds to PET to initiate
activity, transforming it into MHET, so that
MHETase can act and transform MHET into terephthalic acid (TPA) and ethylene glycol (EG). Our idea would be to
foresee a possible application of microalgae in
stabilization ponds, one of the wastewater treatment approaches. However, in order for us to scale and validate the
process mechanism in sewage treatment systems,
we need to simulate and evaluate the dynamics of the microalgae growth and PET degradation processes by applying a
kinetic model, giving us the opportunity to
answer the following questions:
How does the stabilization pond medium influence microalgae growth? And, consequently, in the amount of enzymes?
How much microplastic can we degrade with the amount of secreted enzymes?
How long can it take to degrade a satisfactory amount of plastic?
Furthermore, we sought to deepen our comprehension of this protein_linker_protein
complex and its prospective applications in PET
degradation. Our approach involved an in-depth analysis of its 3D structure, binding affinities, and dynamic
behavior through advanced computational methods to
address the following inquiries:
How does temperature influence the stability and flexibility of the complex, and how do these dynamics impact
its overall behavior?
How do the computational methods enhance our understanding of complex protein structures and their dynamics?
What insights can computational predictions provide about the 3D structure and stability of our complex?
Our perceptions
From our perspective, the kinetics simulation results suggest that the system is
feasible, but its effectiveness varies with the
season. FAST-PETase has the potential to degrade PET in wastewater treatment systems, even at lower enzyme
concentrations. However, challenging conditions, such
as colder temperatures, require system improvements. Further studies on protein production and secretion by C.
reinhardtii are needed and nutrient availability
appears adequate for microalgae growth under typical wastewater conditions, but it's essential to consider that
temperature and solar irradiance can vary significantly.
In general, the model initially prioritized simplicity and measurability but held plans for improvements, including
enhancing the representation of stabilization
ponds, incorporating enzyme production models, and implementing a global sensitivity analysis to assess the impact
of model inputs. However, time constraints,
focus on project essentials, and access limitations to certain data hindered the integration of these improvements.
Nevertheless, the model's base code for
stabilization ponds and the enzyme production equation provide valuable resources for future teams working within
this scope.
Regarding structural analysis, which employed a groundbreaking approach, harnessing
AlphaFold2, a cutting-edge prediction method, to
predict the 3D structure of the FAST-PETase: Linker: MHETase complex. Our predictions demonstrated a remarkable
91.02% similarity based on the ERRAT2 score, firmly
establishing the reliability of our projections. We efficiently utilized computational resources, employing the
ColabFold Colab notebook and conducting detailed
analyses using the PyMOL platform. Furthermore, we evaluated the binding affinity of the complex to its respective
ligands, PET and MHET, through molecular docking
simulations. Our findings indicated that the fusion of FAST-PETase and MHETase with a linker may enhance binding
affinity, especially with PET. These results suggest
a promising avenue for plastic degradation. Additionally, we ventured into molecular dynamics simulations to explore
the dynamic behavior of our complex, emphasizing
the role of temperature in its stability. While our simulations provided valuable insights, extended simulation
times and additional temperatures are necessary for
robust conclusions.
Conclusions
In conclusion, our project represents a significant step toward addressing the issue of
PET-type microplastic degradation in sewage
treatment plants. Kinetic simulations have provided insights into fundamental questions, such as the influence of
stabilization pond medium on microalgae growth
and enzyme production, as long as conditions support the log-phase growth of C. reinhardtii, enzyme concentration
remains proportional to microalgae concentration,
allowing for efficient removal of PET microplastic from water. Furthermore, in an optimal scenario, our model has
demonstrated the potential to degrade 400 mg/m3
of PET within 19 days, aligning with PET concentrations ranging from 24.9 mg/m3 to 680 mg/m3
detected in the influent of TIAN, Lei et al. study [4]. This marks a
significant advancement in PET plastic degradation, emphasizing the crucial role of time as a key factor.
Our successful prediction of the complex's 3D structure using AlphaFold2 and the
favorable binding affinity results from molecular
docking simulations are encouraging signs of progress. The stability analysis via molecular dynamics simulations
suggests that lower temperatures, such as 20ºC
and 25ºC, are more conducive to maintaining the complex's structural integrity. However, resource limitations
necessitate further investigation with extended
simulation times and additional temperatures. Summing it up, our research forms a solid base for future work in
plastic degradation, offering valuable insights
into the structure and behavior of the FAST-PETase : Linker : MHETase complex.
References
[1] DYM, Clive. Principles of mathematical modeling. Elsevier, 2004.
[2] Rangwala, H., & Karypis, G. (Eds.). (2011). Introduction to protein structure prediction: methods and
algorithms. John Wiley & Sons.
[3] Haile, J. M. (1992). Molecular dynamics simulation: elementary methods. John Wiley & Sons, Inc.
[4] Lei Tian, Ewa Skoczynska, Robert-Jan van Putten, Heather A. Leslie, Gert-Jan M. Gruter.Quantification of
polyethylene terephthalate micro- and nanoplastics in domestic wastewater using a simple three-step method.
Science of The Total Environment, Volume 857, Part 2, 2023, 159209, ISSN 0048-9697,
https://doi.org/10.1016/j.scitotenv.2022.159209.
1. Kinetic Model
A kinetic model is a mathematical method to describe various processes' dynamic behavior.
It carries a set of equations that capture the underlying mechanisms governing these processes over time. In the
context of microalgae growth and enzymatic degradation, kinetic models provide a structured approach to understanding
and predicting how these biological systems evolve. They enable us to quantify factors influencing growth rates,
enzyme activities, and substrate conversion. This modeling approach offers advantages such as precision, scalability,
and the ability to optimize conditions for specific outcomes.
a. Stabilization ponds in sewage treatment plants and microalgae cultivation
Our kinetic model's first step aims to analyze the biomass growth of the microalgae
Chlamydomonas reinhardtii in stabilization ponds over time. Stabilization ponds, designed to promote biological
degradation of organic matter, are a possible implementation of our chassis, so that, in addition to being a method
for microplastic degradation, it will assist in the oxygenation of the pond medium through photosynthesis and will
act, through consumption, in the removal of excess nutrients that are present in wastewater, such as nitrogen and
phosphorus.
Do you want to know more about how
stabilization ponds work? Check our implementation here.
Mhetygua’s microalgae growth model considers several key factors influencing microalgae’s
proliferation in these ponds. These factors include solar irradiance, pH levels, temperature, nutrient and carbon
intakes, and the oxygenation of the pond environment.
b. Plastic degradation
In addition to biomass growth, the model also addresses the production of the
interconnected enzymes FAST-PETase and MHETase by microalgae and the degradation of PET by the enzyme chimera over
time. One of our assumptions is that the more microalgae we have, the more plastic degradation will occur as more
enzymes will be secreted. Therefore, the microalgae growth and PET degradation models maintain a direct proportional
relationship.
c. Integration method
Our system will be solved using the Gompertz model. In the context of dynamic systems such
as microalgae growth and plastic degradation, the Gompertz model is employed to depict population growth, particularly
in scenarios where growth is constrained and slows over time. Considering the interrelationship between the growth
model and the degradation model mentioned before, we can simulate and analyze the results within the framework of the
Gompertz equation. This model allows us to calculate how the microalgae population changes as it grows and tracks
simultaneously how the plastic concentration decreases due to enzymatic degradation.
MODEL
1.Essential Background
Before we get to the equations, it is essential to understand the processes involving our
microalgae and the pond. To model, paying attention to the main factors influencing the system is necessary.
Therefore, a literature review must be carried out together with a critical analysis of what information will be most
relevant to be used. Taking this into consideration, at the end of each model, we have separated the main processes to
facilitate understanding.
a. Growth rate model
The growth rate model is based on an existing model, called ABACO, by Sánchez-Zurano et
al., and with a theoretical basis from the BIOALGAE model by SOLIMENO et al. Both models bring a sewage treatment
approach with a microalgae-bacteria consortium, incorporating important growth aspects and characteristics of
microalgae, such as their dependence on light, endogenous respiration, and consumption of nutrients, in addition to
the influence of temperature and pH. It includes the interrelationship between the different microbial populations and
the balance between releasing and consuming relevant compounds such as oxygen and carbon dioxide. The two models,
then, serve as great tools for developing wastewater treatment processes using microalgae.
Although the authors' modeling is focused on the microalgae-bacteria consortium,
Mhetyguá’s model concentrates only on microalgae to simplify the number of variables. Considering that complex media,
such as wastewater, require the assessment of several aspects that influence microalgae growth, we focused only on the
chassis that would secrete the PET-degrading enzymes. However, we do not disregard the presence of bacteria when it
comes to oxygen consumption and carbon dioxide release, as this interaction is essential for the balance of substances
and the functioning of the metabolic mechanisms of the existing microbiota, and, of course, the bacteria occur
naturally in the pond, disregarding them completely would not be ideal in an attempt to represent the reality of the
pond when modeling.
In the ABACO model, the experimental part was carried out in a closed photobioreactor
(FBR). In this case, as our study medium would be a stabilization pond, similar to an open reactor, some adaptations
needed to be made.
In closed photobioreactors, high oxygen concentrations become an inhibition factor for photosynthesis,
consequently being an obstacle to the growth of microalgal biomass, which required some attention from the authors
regarding the maximum amount of dissolved oxygen for the specific strain of the microalgae. This was not a problem
for us since, to simplify our model, we are considering dissolved oxygen equal to zero, knowing that bacteria
consume the oxygen produced by photosynthetic activity and that it also ends up being released into the atmosphere
as well as at night, when oxygen is consumed by both microalgae and bacteria in the respiration process.
Another relevant point to adapt is the daily removal of part of the microalgal biomass from inside the
photobioreactor, which the authors carried out. This would not be done in our system since the treatment process in
stabilization ponds requires a hydraulic retention time, in which the focus would be on the removal and degradation
of unwanted components through microalgae and not on optimizing their biomass since we will not use biomass
afterward, mainly because any GMO should be discarded after the process.
As said before, there is no continuous inlet and outlet flow in the pond as in the
author’s continuous FBR. Microalgal biomass concentration varies according to the hydraulic retention time (HRT). The
HRT is how long water remains in the stabilization pond before leaving. During this period, microalgal biomass
concentration can increase or decrease depending on growth conditions.
The microalgae specific growth rate μALG is influenced by
crucial factors such as average light irradiance (Iav), temperature, pH, dissolved oxygen, and
CO2. Furthermore, the influence of the availability of nutrients, such as phosphate phosphorus
(P − PO4), ammonia nitrogen (N − NH4) , and nitrate nitrogen (N −
NO3), which are the most found in wastewater. As a decay factor, the model considers the
endogenous respiration (mALG) of the microalgae.
Having contextualized our model basis, we will now introduce the main aspects that involve
the growth of our microalgae in detail:
I. Growth factors
Light influence
The primary concern in the mass cultivation of photoautotrophic microalgae is the
efficient utilization of intense light for both the growth of cell mass and the production of secondary metabolites.
The amount of light energy these microorganisms receive depends on the photon flux density reaching the culture
surface.
\(I_{0}\) is the irradiance at the surface of the system.
\(K_{a}\) is the biomass extinction coefficient.
\(h\) is the depth of culture in the system.
Temperature influence
Temperature's impact on biochemical reactions makes it one of the most crucial
environmental factors influencing the biochemical composition of microalgae.
\(T_{max,ALG}\) is the maximum temperature for microalgae.
\(T_{min,ALG}\) is the minimum temperature for microalgae.
\(T_{opt,ALG}\) is the optimum temperature for microalgae.
Carbon influence
When it comes to autotrophic production, the primary carbon sources are
CO2 and HCO3. Unlike terrestrial plants, atmospheric
CO2 alone cannot meet the carbon requirements of autotrophic algal production systems. The
diffusion of CO2 from the atmosphere into open ponds supports growth to some extent. In our
model, CO2 is also made available for the photosynthetic process as a product of the
degradation of organic matter carried out by heterotrophic bacteria.
\(X_{CO_2}\) is the concentration of carbon dioxide.
\(X_{HCO_3}\) is the concentration of bicarbonate.
\(K_{S, C, ALG}\) is the microalgae half-saturation constant for carbon.
\(K_{I, C, ALG}\) is the microalgae inhibition constant for carbon.
\(n C, ALG\) is the microalgae form parameter for carbon.
pH influence
Generally, microalgae thrive in a slightly alkaline pH range. The bicarbonate-carbonate
system is the primary buffer in freshwater and is the most effective method for controlling and maintaining pH
levels.
The bicarbonate-carbonate buffer system acts through the following reactions:
\[2HCO_3^-\Leftrightarrow CO_3^{2-}+H_2O+CO_2\]
\[HCO_3^-\Leftrightarrow CO_2+OH^-\]
\[CO_3^{2-}+H_2O\Leftrightarrow CO_2+2OH^-\]
These reactions suggest that during photosynthetic \(CO_2\) fixation, there is an accumulation of \(OH^-\), leading
to a gradual increase in pH. Additionally, as active photosynthesis raises pH levels, the opposite occurs during
respiration, with \(CO_2\) being released, although this effect is relatively small, typically accounting for less
than 10% of total photosynthetic production.
After carbon, nitrogen is the most crucial nutrient for microalgae's internal processes.
Microalgae can grow using both ammonium and nitrate as a nitrogen source. It is observed that nitrate consumption is
only used when there is no ammonium in the system due to the preference to assimilate ammonium, assuming that it is
due to a smaller amount of energy spent when ammonium is assimilated. In its absence, our microalgae assimilates
nitrate as a source of nitrogen.
\(X_{NO_3}\) is the concentration of nitrate nitrogen
\(K_{S, NO_3, ALG}\) is the half-saturation constant of microalgae for nitrate
\(K_{I, NO_3, ALG}\) is the microalgae inhibition constant for nitrate
\(n NO_3, ALG\) is the microalgae form parameter for nitrate.
Phosphorus influence
Phosphorus is essential for growth and various cellular processes, such as energy
transfer and the biosynthesis of nucleic acids and DNA. Even though algal biomass contains less than 1% phosphorus,
it often becomes one of the most critical limiting factors for algal growth.
\(X_{PO_4}\) is the concentration of phosphorus phosphate.
\(K_{S, PO4, ALG}\) is the half-saturation constant of microalgae for phosphate.
Oxygen influence
The accumulation of photosynthetically generated oxygen is one of the main factors
limiting the scale-up of closed photobioreactors, as the accumulation of oxygen inhibits photosynthetic processes.
The uniqueness of the stabilization pond lies in its oxygenation operation mode, not relying on an electromechanical
air supply.
In addition to improving water quality, oxygenation will be essential for the
consumption of oxygen by the heterotrophic bacteria in the process of degradation of organic matter, which results
in carbon dioxide that will be used by microalgae again in photosynthesis to produce oxygen. This also influences
the pH of the water, as previously mentioned.
Therefore, there is a complex interaction between the concentration of dissolved oxygen,
the degradation of organic matter, respiration, photosynthesis, and the pH of the pond. Considering this network of
interactions involving oxygen, we decided to equate it to zero in our model for simplification. This does not mean
that there is no oxygen in the pond since it is essential for its functioning. Instead, it means that all the oxygen
being produced is already consumed by bacteria during the day, used at night for respiration, and released into the
atmosphere. In the following day, the processes are repeated, allowing us to equate it to zero in the model. In
addition, in the kinetic simulations, we are using time over the course of days and not the day and night periods.
\(DO_2\)(%) is the oxygen dissolved in the system.
\(DO_2, max\)(%) is the maximum amount of dissolved oxygen for the microalgae strain.
\(m\) is a form parameter.
Endogenous respiration
Endogenous respiration is essentially the opposite process of photosynthesis and
represents the factor in the decay of microalgae. While photosynthesis uses carbon dioxide
(CO2) and sunlight to produce glucose and release oxygen (O2),
endogenous respiration consumes glucose (or other organic compounds) and O2 to produce
energy in the form of adenosine triphosphate (ATP) and releases CO2. as a byproduct.
\(m_{min, ALG}\) and \(m_{max,ALG}\) are the minimum and maximum breathing rates.
\(I_{k, resp}\) is the irradiance needed to stop photosynthesis and start the respiration process.
\(nresp\) is the form parameter for respiration.
II. Fundamental Processes
1. Throughout their growth process, microalgae (XALG)
capture inorganic carbon (XCO2 and XHCO3), consume nutrients
(XNH4, XNO3 and XPO4) and release oxygen
(DO2) through photosynthesis;
2. The decomposition of organic matter by bacteria uses the oxygen generated by
microalgae and leads to the generation of XCO2;
3. The biomass density of microalgae increases due to autotrophic growth but
decreases due to endogenous respiration.
Now that the main aspects that influence the growth of our microalgae have been presented,
we will present the degradation model and then how the two models, growth and degradation, will be integrated.
b. PET Degradation Model
In our degradation model, as mentioned before, we will use the enzymes FAST-PETase and
MHETase to degrade PET. FAST-PETase binds to PET to initiate activity, transforming it into MHET, so that MHETase can
act and transform MHET into terephthalic acid (TPA) and ethylene glycol (EG). The reaction mechanisms of our
degradation model are represented below:
FAST-PETase acts on PET to transform it into Mono-(2-hydroxyethyl)terephthalic acid
(MHET). There were no records of the actual degradation rate of PET by FAST-PETase, but we had information from the
article "Machine learning‐aided engineering of hydrolases for PET depolymerization", which presented the results of
the time required for degradation and the initial masses of PET films, from this, we were able to calculate a
degradation rate to be used in our model. The data analyses made to get such a degradation rate are available in our
GitHub here.
II. MHETase
MHETase acts on MHET to transform it into terephthalic acid (TPA) and ethylene glycol
(EG). Its enzymatic activity was analyzed using the Michaelis-Menten kinetic equation, which calculates the speed at
which an enzyme converts a substrate into a product.
It is important to remember that, in our model, we are taking into account a directly
proportional relationship between the microalgae population and the concentration of enzymes. We are assuming that the
more microalgae, the more enzymes will be secreted, therefore, describing the secretion mechanism proved to be
relevant for our model.
In our final model, we summarized the enzyme secretion rate and expression rate in one
term named Ke, the enzyme production yield per algae, that is, the number of functional enzymes produced and secreted
per microalgae cell. Since we found no research establishing enzyme production yield specifically for C. reinhardtii
in terms of quantity, and there are records of million enzymes produced per cell from other organisms, we arbitrarily
chose the value of 75 enzymes per cell.
c. Integration method
Now that we have explained the main mechanisms of our model, the combination of both of
them, considering the proportional microalgae-enzyme relationship, will be presented.
I. Context
Although we are using the ABACO and BIOALGAE models as basis, which use the differential
equations method, when considering only the microalgae in our model, and not the microalgae-bacteria consortium, the
bacterial absence directly influenced how the amount of nutrients varied in the stabilization pond medium, resulting
in an infinite supply of nutrients for the microalgae.
We realized that when, initially, we were using the differential equations provided by the
base articles in our simulations, but since in the bacterial absence scenarios there were no nutrient variation and
natural factors like temperature and irradiance exhibit relatively consistent patterns of
μALG throughout the month, it became clear that, in our model, the calculated
μALG from the base article was actually the maximum specific growth rate
(μALGmax), which is the growth rate when nutrients are most ideal.
Consequently, our approach to solving the growth model, given that C. reinhardtii is known
to grow similarly to bacteria and yeast, incorporated a pattern that aligns with microalgae's sigmoidal growth curve,
encompassing the lag phase, the log phase, and the stationary phase. In this particular context, the Gompertz
methodology proves to be the most suitable choice. By considering μALG as
μALGmax in Gompertz modeling, we can estimate microalgae biomass concentration without the
need to precisely predict how nutrients are utilized by the various organisms within the pond.
We chose to employ the Norton reparametrization, as it provides a well-characterized
adjustment of the original Gompertz equation with easily interpretable parameters that have direct biological
significance:
\(X(t)\) is the concentration of microalgae as a function of time.
\(X₀\) is the initial concentration of microalgae.
\(A\) is the maximum concentration of microalgae.
\(\mu_{ALG,max}\) is the situation-specific maximum growth rate.
\(t\) is a given point in time.
From that, we can estimate the enzyme concentration in the system as a function of algae
concentration. As there are no general models to describe enzyme production in C. reinhardtii, we speculated a
mechanistic model as follows:
\[E(X, t) = X_{(t)}\times Ke +E_{(t-1)}\times 1-De\]
Where:
\(E(X, t)\) is the enzyme concentration as a function of algae concentration and time.
\(X\) is the algae concentration.
\(Ke\) is the enzyme production yield per algae.
\(E_{(t-1)}\) is the concentration of enzymes at the previous time point.
\(\)De is the enzyme degradation rate per day.
Since the FAST-PETase Michaelis-Menten parameters are not described yet, it is fruitful to
describe the PET concentration as a function of enzyme concentration. The equation is as follows:
\(PET(E, t)\) is the PET concentration as a function of enzyme concentration and time.
\(k_{FAST-PETase}\) is the rate of PET degradation by FAST-PETase.
\(E(t)\) is the enzyme concentration at time t.
In this way, we were able to evaluate how our models behave simultaneously, analyzing the
degradation of PET microplastic according to the growth of our microalgae. The equation system was written in Python.
You can check our code here.
Process
Equation
Enzyme concentration
\(E(X, t) = X_{(t)}\times Ke +E_{(t-1)}\times 1-De\)
Variables are quantities that can change or vary over time or in response to different
conditions, while parameters are constant values that affect the system's behavior but do not vary during the process
or analysis. In the table below, we recorded these values, as well as the initial conditions, so that we could
simulate our model.
Name
Value
Description
Unit
Source
System features
Inlet effluent
pH
\(7.8 \pm 0.1\)
pH of incoming effluent
-
[3]
\(CO_2\)
\(20 000\)
Carbon dioxide concentration
\(mg\times m^{-3}\)
[3]
\(HCO_3\)
\(1 600\)
Bicarbonate concentration
\(mg\times m^{-3}\)
[3]
\(N-NH_4^+\)
\(30 900 \pm 1 500\)
Ammonium concentration
\(mg\times m^{-3}\)
[3]
\(N-NO_3^-\)
\(400 \pm 700\)
Nitrate concentration
\(mg\times m^{-3}\)
[3]
\(P-PO_4^-\)
\(119200 \pm 5100\)
Phosphate concentration
\(mg\times m^{-3}\)
[1]
\(X_{PET}\)
\(24.9\) to \(680\)
Expected PET initial concentration
\(mg\times m^{-3}\)
[9]
GROWTH MODEL
\(X_{ALG0}\)
\(0.0000003322\)
Initial microalgae concentration
\(\mu M/m^{-3}\)
Estimated
\(X_{ALG,max}\)
\(0.003322\)
Maximum microalgae concentration
\(\mu M/m^{-3}\)
[8]
\(\mu_{ALG,max}\)
\(1.591\)
Maximum growth rate of microalgae
\(day^{-1}\)
[1]
\(I_k\)
\(168\)
Irradiance constant
\(\mu E\times m^{-2}\times s^{-1}\)
[3]
\(I_0\)
\(393\) to \(934\)*
Irradiance on the system surface
\(\mu E\times m^{-2}\times s^{-1}\)
[3]
\(I_{k,resp}\)
\(134\)
Irradiance needed to start the respiration process
\(\mu E\times m^{-2}\times s^{-1}\)
[3]
\(n\)
\(1.7\)
Form parameter
-
[3]
\(n_{resp}\)
\(1.4\)
Form parameter for respiration
-
[3]
\(K_a\)
\(0.000007\)
Biomass extinction coefficient
\(m^2\times mg^{-1}\)
[3]
\(h\)
\(1.8\)
Depth of culture
\(m\)
[1]
\(m_{min,ALG}\)
\(0.01\)
Minimum respiration rate
\(day^-1\)
[1]
\(m_{max,ALG}\)
\(0.276\)
Maximum respiration rate
\(day^-1\)
[1]
\(T_{max,ALG}\)
\(49\)
Maximum temperature for microalgae
\(ºC\)
[3]
\(T_{min,ALG}\)
\(3.4\)
Minimum temperature for microalgae
\(ºC\)
[3]
\(T_{opt,ALG}\)
\(30\)
Optimum temperature for microalgae
\(ºC\)
[3]
\(pH_{max,ALG}\)
\(12.9\)
Maximum pH for microalgae
\(-\)
[2]
\(pH_{min,ALG}\)
\(1.8\)
Minimum pH for microalgae
\(-\)
[2]
\(pH_{opt,ALG}\)
\(8.5\)
Optimum pH for microalgae
\(-\)
[2]
\(DO_2\)
\(0\)
Concentration of dissolved oxygen
\(mg\times m^{-3}\)
[3]
\(m\)
\(4.15\)
Form Parameter
\(-\)
[3]
\(K_{S,C,ALG}\)
\(4\)
Microalgal half-saturation constant for carbon
\(mg\times m^{-3}\)
[3]
\(K_{I,C,ALG}\)
\(120000\)
Microalgal inhibition constant for carbon
\(mg\times m^{-3}\)
[3]
\(K_{S,NH_4,ALG}\)
\(1540\)
Microalgae half-saturation constant for ammonium
\(mg\times m^{-3}\)
[3]
\(K_{I,NH_4,ALG}\)
\(571000\)
Microalgal inhibition constant for ammonium
\(mg\times m^{-3}\)
[3]
\(n_{NH_4,ALG}\)
\(1\)
Form parameter for ammonium
\(-\)
Estimated
\(K_{S,NO_3,ALG}\)
\(2770\)
Microalgal half-saturation constant for nitrate
\(mg\times m^{-3}\)
[3]
\(K_{I,NO_3,ALG}\)
\(396600\)
Microalgal inhibition constant for nitrate
\(mg\times m^{-3}\)
[3]
\(n_{NO_3,ALG}\)
\(1\)
Form parameter for nitrate
\(-\)
Estimated
\(K_{S,PO_4,ALG}\)
\(430\)
Microalgal half-saturation constant for phosphate
\(mg\times m^{-3}\)
[3]
DEGRADATION MODEL
\(Ke\)
\(75\)
Protein production yield
\(enzyme \times cell^{-1}\)
Estimated
FAST-PETase_MHETase
Name
Value
Description
Unit
Source
\(k_{FAST-PETase}\)
\(33.5\)
Rate of PET degradation by the enzyme FAST-PETase
\(\mu M\times day^{-1}\times \mu M\)
[7]
\(Km_{MHETase}\)
\(23.17 \pm 1.65\)
Michaelis-Menten constant
\(\mu M\)
[6]
\(K_{MHETase}\)
\(2384640 \pm 224640\)
Rate of MHET degradation by the enzyme MHETase
\(dia^{-1}\)
[6]
RESULTS
Setting up the scenarios
To assess the feasibility of our system, we conducted simulations under two distinct
scenarios: summer and winter. We chose these scenarios due to the significant variations in irradiance and temperature
expected in Foz do Iguaçu since nutrient availability and microplastic presence in wastewater are not likely to vary
much throughout the year.
Summer Scenario: For this scenario, we approximated the irradiance as a constant 934
uE/m², based on data from the BIOALGAE article. While this value does not precisely match the actual irradiance in Foz
do Iguaçu, it is a reasonable approximation for our purposes.
Winter Scenario: To maintain proportionality, we derived the actual irradiance values from
CRESESB-Centro de Referência para Energia Solar e Eólica (cepel.br) and calculated an equivalent value of 393 uE/m²
for winter. This approach accounts for the non-convertible units between solar irradiance (W/m²) and the constants
used in the BIOALGAE and ABACO models (uE/m²), explaining our choice not to use actual values.
Outcomes
Summer simulations:
From observing the graphs, it is noticeable that microalgae C. reinhardtii rapidly reaches
the culture maximum concentration considering the summer scenario. With maximum culture concentration, protein
quantity also reaches a high concentration quickly. Therefore, the accumulated amount of protein quickly degrades the
PET microplastic. That way, more than 400mg of PET is degraded within the expected hydraulic retention time.
As a side note, as the reader can see, MHET concentrations are not presented in our
results. That’s because, as other teams, such as Team:Humboldt Berlin/Model - 2019.igem.org, had
already stated, the MHETase catalytic activity is so much more significant in comparison to PETase that MHET
concentration tends to zero through the whole process. Our first simulations confirmed that statement is still valid
even in the case of FAST-PETase. So, as a readability decision, we chose not to elaborate on MHET degradation further.
Winter simulations:
On the other hand, it becomes clear that the microalgae population slowly decays during
the coldest days of winter. Still, a relevant amount of microplastic is degraded in hydraulic retention time (about
110 mg of PET). However, it is possible to argue that since conditions are inadequate for growth, the heterologous
expression also becomes sub-optimal to none and that enzyme hydrolytic activities are reduced. A more pessimistic
result would indicate no PET degradation during winter.
Is it possible to degrade PET microplastic from wastewater using C. reinhardtii to
produce FASTPETase linked to MHETase?
The results are promising for applying FAST-PETase_MHETase-producing microalgae in water
treatment stabilization ponds to biodegrade PET microplastic. Our findings suggest degrading most PET in the water in
less than 20 days is feasible. However, the degradation rate decreases significantly in winter due to limited
microalgae growth resulting from low solar irradiance and low temperatures.
How may our simulations help future experiments?
Our simulations provide valuable insights for future investigations. The most significant
observation is that FAST-PETase presents the potential to degrade PET quickly enough that wastewater treatment ponds
could use it for PET degradation even in scenarios with lower enzyme concentrations. Future research teams face the
challenge of improving the system to enable PET degradation in more challenging conditions, such as colder
temperatures. Additionally, our work underscores the need for further omic studies in Chlamydomonas reinhardtii,
particularly concerning protein production and secretion.
Also, from elaborating the model, it became clear that key points for the success of using
C. reinhardtii to biodegrade microplastic present in wastewater still require experimental measurements. Through
experimentation, it may be possible to answer two key questions:
How much FAST-PETase linked to MHETase is produced from our microalgae?
Will the C. reinhardtii behave as expected in a stabilization pond?
In conclusion
Let's revisit the initial questions our model and simulations aimed to address:
How does the stabilization pond medium influence microalgae growth and,
consequently, the amount of enzymes?
It is evident that nutrient availability, pH, temperature, and light irradiance in
the medium play essential roles in microalgae growth. Therefore, according to our simulations, typical
wastewater conditions seem to provide ample nutrients for microalgae growth. However, it's vital to consider
that temperature and solar irradiance can vary significantly depending on geographical characteristics and the
season. As long as the conditions support C. reinhardtii log-phase growth, enzyme concentration should remain
proportional to microalgae concentration. As enzyme concentration increases, more PET microplastic is removed
from our water.
How much microplastic can we degrade with the amount of secreted enzymes?
In an ideal scenario, it is possible to completely degrade over 400 mg of PET
microplastic within approximately 19 days, considering the enzymes secreted during this time.
How long can it take to degrade a satisfactory amount of plastic?
Our model perceives the data collected from the article by TIAN, Lei et al. [9] with
great interest and optimism. The detection of polyethylene terephthalate (PET) microplastics in all influents of
the study, with concentrations ranging from 24.9 mg/m³ to 680 mg/m³, underscores the viability of our system,
which demonstrates the capability of PET-degrading enzymes in our simulation, suggesting that it is possible to
degrade 400 mg/m³ of PET in 19 days of hydraulic retention time. The ability to degrade a substantial amount of
PET in a relatively short period is promising, also being coherent with the operational system of stabilization
ponds. This capacity could contribute to reducing PET's presence in the environment and, in turn, mitigating the
negative impacts of microplastics on aquatic and terrestrial ecosystems. However, it's essential to emphasize
that the practical application of this discovery must take into account various factors, such as the process
scalability, biosafety, associated costs, and effectiveness under real-world conditions.
In summary, the potential to degrade 400 mg of PET within a 19-day timeframe, taking
into account the fluctuating PET concentrations detected in the influent of the TIAN, Lei et al. study, offers a
compelling opportunity in which time emerges as a crucial factor significantly improved in the realm of PET
plastic degradation. This vision becomes a reality as long as coordinated efforts succeed in transforming this
potential into tangible and sustainable solutions for the global issue of microplastics.
Model Improvements
As stated earlier, our model accounts for many simplifications, and as many
biomathematicians would say, a simple but measurable model is better than a complex immeasurable one.
Nonetheless, we maintained a vision for advancing the model's sophistication while enhancing its practicality.
This section delves into the potential enhancements and outlines our perspective on how future teams can
leverage this model as a foundation for their own refinements and innovations. There are some reasons why we
did not implement said improvements in our model: Firstly, time. As we finished the first version of the
model, our team faced a substantial workload, with a primary focus on comprehending our chassis and
constructing genetic circuits. As soon as we had some breathing space to start building on a better model, it
was already time to write wiki scripts, and we concluded it was best to pass on what we learned through a
carefully written wiki than to make a few changes in a model that had already served its greatest purpose.
Also, paywalls limited our research as some data and equations were in inaccessible articles to which we only
got access much later on the project.
Improving the stabilization pond representation
One of our initial simplifications involved omitting consideration of other
microorganisms in the pond. That led us to end up being unable to establish resource usage, which limited
our model. We did, though, build a base code structured in object-oriented programming, which accounts for
the complete ecosystem of the stabilization pond as described by Sánchez-Zurano et al. [1] and Solimeno et
al. [3]. We make this code available here
and encourage other teams to use it in future projects within the scope of stabilization
ponds.
Enzyme production
Studies on promotor activities are still limited in Chlamydomonas reinhardtii.
However, a model proposed by Gerosa et al. [10] defines enzyme production based on bacterial specific growth
rate, and it might apply to C. reinhardtii. This model is also interesting because usual laboratory
measurements, e.g., microalgal concentration and protein quantification, already allow the verification of
the model's capacity to describe the produced enzymes. This equation is also included in the skeleton of the
more complete version of our code.
Global Sensitivity Analysis
A global sensitivity analysis (GSA) evaluates the impact of each model input on
its output. Therefore, a GSA would help us understand how much variation is expected in our system according
to distinct nutrient concentrations in the influent wastewater and temperature/irradiance. This is a
significant step and quite an improvement to our simpler method of measuring an “optimistic” summer scenario
and a “pessimistic” winter scenario. The methodology to the GSA in our model would be very similar to what
was made by Team:UNILA LatAm/Model -
2021.igem.org in their sensitivity analysis. Yet, this part is not anywhere in our available code and
would have to be developed by future users of our model.
REFERENCES
[1] SÁNCHEZ-ZURANO, Ana et al. ABACO: a new model of microalgae-bacteria consortia for biological
treatment of wastewaters. Applied Sciences, v. 11, n. 3, p. 998, 2021.
[2] SÁNCHEZ-ZURANO, Ana, et al. A novel photo-respirometry method to characterize consortia in
microalgae-related wastewater treatment processes. Algal research, v. 47, p. 101858, 2020.
[3] SOLIMENO, Alessandro et al. Integral microalgae-bacteria model (BIO_ALGAE): Application to wastewater
high rate algal ponds. Science of the total environment, v. 601, p. 646-657, 2017.
[4] NURDOGAN, Yakup; OSWALD, William J. Enhanced nutrient removal in high-rate ponds. Water science
and technology, v. 31, n. 12, p. 33-43, 1995.
[5] BOUTERFAS, Radia; BELKOURA, Mouhssine; DAUTA, Alain. Light and temperature effects on the growth rate
of three freshwater [2pt] algae isolated from a eutrophic lake. Hydrobiologia, v. 489, p. 207-217, 2002.
[6] KNOTT, Brandon C. et al. Characterization and engineering of a two-enzyme system for plastics
depolymerization. Proceedings of the National Academy of Sciences, v. 117, n. 41, p. 25476-25485, 2020.
[7] Hongyuan Lu1 , et al. Machine learning-aided engineering of hydrolases for PET depolymerization.
Nature v604, 2022.
[8] Robert A. Freudenberg, Thomas Baier, Alexander Einhaus, Lutz Wobbe, Olaf Kruse, High cell density
cultivation enables efficient and sustainable recombinant polyamine production in the microalga
Chlamydomonas reinhardtii, Bioresource Technology, Volume 323, 2021, 124542, ISSN 0960-8524,
https://doi.org/10.1016/j.biortech.2020.124542.
[9] Lei Tian, Ewa Skoczynska, Robert-Jan van Putten, Heather A. Leslie, Gert-Jan M. Gruter, Quantification
of polyethylene terephthalate micro- and nanoplastics in domestic wastewater using a simple three-step
method, Science of The Total Environment, Volume 857, Part 2, 2023, 159209, ISSN 0048-9697,
https://doi.org/10.1016/j.scitotenv.2022.159209.
[10] GEROSA, Luca; KOCHANOWSKI, Karl; HEINEMANN, Matthias; SAUER, Uwe. Dissecting specific and global
transcriptional regulation of bacterial gene expression. Molecular Systems BiologyEMBO, 2013. (1) DOI:
10.1038/msb.2013.14. Disponível em: http://dx.doi.org/10.1038/msb.2013.14.
Molecular Modelling
3D Structure Prediction
As of the present, the experimental determination of the 3D structure of FAST-PETase : Linker : MHETase complex's
remains elusive.
However, the emergence of AlphaFold2, a cutting-edge prediction method, has paved the way for
insightful predictions regarding the real-world configuration of this structure1.
Employing AlphaFold2, we successfully predicted the intricate 3D structure. Our predictions have garnered a remarkable
91.02%
similarity as evaluated by the ERRAT2 score, firmly establishing the reliability of our projected outcomes.
Notably, considering the computational resource intensiveness associated with AlphaFold2, we harnessed the
computational
power of the ColabFold2 Colab notebook for these predictions.
Subsequently, we
conducted a meticulous analysis of these projections, utilizing the PyMOL3
platform
as a lens to reveal the subtleties and complexities encapsulated within our predictions.
Figure 1: Predicted 3D Structure of the FAST-PETase : 12 AA Flexible Linker : MHETase Complex using AlphaFold2
Outcome of Prediction:
In the pursuit of structural prediction, the establishment of a clear success metric is crucial. Thus, the evaluation
of
AlphaFold2's output emerged as a pivotal step in determining the credibility and viability of the obtained structure
for
subsequent analyses. Our model's assessment on the careful consideration of the following parameters:
AlphaFold's per-residue prediction of its IDDT-Ca score4. To elaborate,
the IDDT
metric quantifies the proportion of accurately predicted interatomic distances, rather than focusing on the alignment
of
predicted and actual structures.
This evaluation acknowledges the precision of local structures and the successful alignment of individual domains. The
pLDDT
metric reflects these attributes, serving as an adept measure of localized confidence.
In Figure 1, the local confidence granted by AlphaFold is denoted by the intensity of blue coloring within the
structure. Conversely, shades of red indicate regions of relatively lower confidence. This visualization has been made
through the use of pLDDT scores, presented as a B-factor spectrum.
Figure 2: Distribution of pDDT Scores as a 2D Line Graph
Furthermore, Figure 2 showcases the pLDDT scores, depicted in the form of a 2D line graph. This visualization provides
a
clear perspective on the distribution of pLDDT scores across the structure. A line graph adds a layer of
clarity, allowing for a comprehensive assessment of the scores' variations and trends.
PAE (Predicted Aligned Error) Scores:
Recognizing that elevated pLDDT scores across all domains may not necessarily indicate AlphaFold's assurance regarding
their
relative orientations, we introduced the incorporation of PAE scores. In essence, PAE signifies AlphaFold's
anticipatory
estimation of positional discrepancy at residue x, contingent on an alignment between the predicted and actual
structures
centered around residue y. By design, PAE gauges the confidence about the interplay of residue pairs' spatial
arrangements. While predominantly employed to evaluate relative domain positions, the utility of PAE extends
seamlessly to
situations where the assessment of pairwise confidence holds significance.
Figure 3: Predicted Error Comparison within the Same Domain (Amino Acids 34-113)
Figure 4: Predicted Error Variation across Different Domains (Amino Acids 34-862)
In this comprehensive analysis, a pivotal observation comes to the forefront: within a context of a relatively static
domain (figure 3), the predicted error showcases a remarkably low magnitude. As we delve deeper into the intricacies,
figure 4 presents a conspicuous contrast, revealing an upsurge in the predicted error magnitude. It's worth noting
that
this increase in predicted error is likely attributed to the presence of a flexible linker within the structural
configuration.
ERRAT Score:
From a scale of 0 to 100, the ERRAT score5 emerges as a compelling gauge
that appraises both the trustworthiness and the statistical reliability of residue conformations vis-à-vis reference
structures. The underlying principle involves the computation of an error factor for each protein residue,
accomplished by
contrasting the atom positions within the model against a statistically derived distribution distilled from resolved
protein
structures. This rigorous analysis scrutinizes deviations between atom positions within the model and the anticipated
distribution, effectively yielding an error factor tailored to each residue.
Culminating in an error confidence value for the entire structure, ERRAT substantiates its prowess as a
judicious evaluator of structural fidelity.
In the realm of experimental benchmarks, structures hitherto resolved tend to exhibit values surpassing the 90
threshold.
Our model boasts an ERRAT score of 91.02, echoing its adeptness in achieving a remarkable accord with this benchmark.
Ramachandran Plot:
The Ramachandran Plot6 stands as a critical tool for validating the
structural
integrity of proteins, providing a comprehensive assessment of the permissible orientations of phi (φ) and psi (Ψ)
dihedral
angles within amino acid residues. This analysis offers profound insights into the spatial arrangement and stability
of the
protein's three-dimensional structure.
Within this framework, the plot categorizes the conformational space into distinct regions, each indicating the
feasibility
of different angles for phi and psi rotations. The most favorable and allowed regions are recognized, while less
favored but
still acceptable regions are also discerned. Meanwhile, regions where steric clashes and energetic constraints impose
limitations are designated as disallowed.
For quantitative evaluation of structural quality, the number of data points lying outside the most favorable region
serves as
a robust metric. In our specific protein model, the Ramachandran Plot was generated using
PROCHECK7 which provided enlightening
observations: around 83.2% of amino acid residues conform to optimal orientations, underscoring a predominant presence
of
permitted arrangements. Additionally, 11.3% of residues occupy a marginally less ideal yet still tolerable
conformational
space, while 4.2% exhibit conformations considered less favored. The remaining 1.3% of residues fall within disallowed
regions, indicating instances of steric clashes and unfavorable energetics.
RMSD Score:
The Root Mean Square Deviation (RMSD) stands as a pivotal metric within the realm of structural biology, providing a
quantitative gauge of the dissimilarity between two protein structures. Its role as a formidable tool is paramount in
evaluating the accuracy and reliability of predicted protein structures, as showcased in the context of the
FAST-PETase-Linker-MHETase complex we are investigating.
RMSD quantifies the average displacement of atoms between a reference structure, often derived from experimental
analysis
and the envisioned predictive structure. Within our distinctive scenario—focused on predicting a protein complex
comprising
both FAST-PETase and MHETase—it's crucial to meticulously assess the alignment of each component
(FAST-PETase and MHETase) in isolation. This precise evaluation is conveyed through the provision of two distinct RMSD
values:
FAST-PETase RMSD: With a specific value of 0.384, this metric indicates the average atomic displacement when comparing
the
envisioned FAST-PETase structure against its corresponding reference structure, spanning all 242 amino acid residues.
MHETase RMSD: In contrast, boasting a recorded value of 0.515, this metric encapsulates the mean atomic displacement
between
the predicted MHETase structure and its corresponding reference structure, spanning its full range of 501 amino acid
residues.
The presentation of these dual RMSD values serves as a strategic approach to glean heightened insights into the
resultant
predictions. By distinctly evaluating the alignment of both FAST-PETase and MHETase, we gain a more nuanced
perspective for
quantitatively assessing the quality of our predictive outcomes. Lower RMSD values generally correspond to heightened
ongruence between the predicted and reference structures, signifying a more accurate prediction. Conversely, higher
RMSD
values may indicate deviations from the reference structure, thereby highlighting areas that could benefit from
further
refinement in the prediction process.
In a comprehensive endeavor to enhance the presentation of our findings, we have created a visual representation that
showcases
the alignment between the predictions generated by AlphaFold (depicted in blue) and the experimental 3D structures of
both
FAST-PETase and MHETase.
Figure 5: Structural Alignment Visualization - AlphaFold Predictions vs. Experimental 3D Structures
This video, marked as Figure 5, serves as a visual aid, enabling viewers to witness the dynamic correlation between
the
predictive outcomes and the real-world structural conformation. Through this visualization, we aim to provide a more
holistic understanding of the accuracy and alignment of our predictions within the FAST-PETase-Linker-MHETase complex.
Molecular Docking
Docking is a pivotal computational technique, serving as a potent tool for the investigation of protein-protein
interactions and the assessment of binding affinity between molecules. In the context of our study concerning the
FAST-PETase: Linker: MHETase complex, docking plays an indispensable role in elucidating the intricate molecular
interactions that underlie its functionality.
Why Docking?
The experimental determination of the 3D structure of this complex remains elusive. However, recent advancements in
computational biology, exemplified by AlphaFold2, have provided us with the means to predict its 3D structure with
remarkable precision. Docking emerges as a pivotal tool as we dive deeper into comprehending the intricate
interactions
within this complex. Its primary objective is to elucidate binding energy and discern which component, be it the
individual
proteins FAST-PETase and MHETase or our composite assembly, exhibits a stronger affinity.
Furthermore, docking aids in the identification of potential challenges. For instance, it enables us to evaluate
whether
the binding sites of FAST-PETase and MHETase may overlap or clash. The presence of conflicts at these binding sites
could
indicate the necessity for structural adjustments within the complex to optimize its functionality.
Probing Protein-Protein Interactions:
Docking empowers us to investigate the interactions between proteins, specifically FAST-PETase and MHETase when linked
by
a linker molecule. It facilitates the precise localization of critical binding sites, commonly known as active sites,
where these proteins make contact. To visualize and communicate our findings, we have presented the following videos,
which illustrate our exploration of these active sites. This visual representation aids in our assessment of potential
interference between them.
Figure 6: Binding Sites Analysis in the FAST-PETase : Linker : MHETase Complex (Color Key: Blue - FAST-PETase,
Green
- MHETase)
Figure 7: Analysis of the Interactions Between the FAST-PETase and MHETase Domains
For this second figure, we applied a relaxation process to our initial AlphaFold prediction to minimize the
energy within the system. This step allowed us to observe the interactions between the FAST-PETase and
MHETase domains. Fortunately, ColabFold natively provides us with this process.
Upon closer examination, the blue region of the protein represents the FAST-PETase domain, while the green
region represents the MHETase domain. In the image, two amino acids highlighted in red, Asp112 from the
FAST-PETase domain and Arg535 from the MHETase domain, form a salt bridge. A salt bridge is a combination
of two non-covalent interactions: hydrogen bonding and ionic bonding. Although non-covalent interactions
are typically considered relatively weak, these small stabilizing interactions collectively contribute
significantly to the overall stability of a conformer.
Additionally, three other amino acids, depicted as sticks, are hydrophobic residues. These residues tend
to cluster in regions unfavorable for interactions with water, a behavior known as the hydrophobic effect—a
thermodynamic phenomenon. Fortunately, none of these amino acids are part of the binding site or active
site of the FAST-PETase, which leads us to conclude that there are no interfering binding sites.
While performing a molecular dynamics simulation to delve deeper into these interactions would be beneficial,
we find this initial test satisfactory to assess whether our proteins interfere with each other.
Assessing Binding Affinity:
Docking simulations play a pivotal role in the evaluation of binding affinity within the
FAST-PETase : Linker : MHETase complex, comparing it with the binding affinity of FAST-PETase and MHETase
individually, towards the ligands PET and MHET, respectively. Through these simulations, we can precisely
quantify the strength of the interactions between these proteins and each ligand, shedding light on the
nature of their binding.
Rather than opting for a computationally intensive flexible docking approach, we adopted a constrained docking
strategy. In this approach, we maintained the receptor in a rigid state while allowing flexibility in the
ligand. This decision aligns with the prevailing methodology employed by most contemporary programs.
Additionally, it mitigates the computational demands on our systems, safeguarding their performance.
To accomplish our experiments, we employed two distinct algorithms: one for sampling, which predicts the
various possible conformations, or "poses", that a ligand can assume when binding to the protein, and
another for scoring, which computes the binding energies associated with each pose in relation to the protein.
The conformation or pose with the lowest energy is identified as the most favorable and is considered the
optimal binding configuration.
For the sampling algorithm, we employed the genetic algorithm AutoDock. This algorithm operates in a
stochastic manner, generating a myriad of diverse conformations. From this pool of conformations,
the algorithm selectively retains those with scores surpassing a predefined threshold. Ultimately, this
process yields the top 9 poses characterized by the lowest energy levels.
For our analysis, we conducted a total of four experiments, consisting of two experiments utilizing the
predicted 3D structure obtained through AlphaFold2 and two experiments employing the PDB files of FAST-PETase
(structure 7SH6) and MHETase (structure 6QZ2) downloaded from the Protein Data
Bank8.
For the ligands, we retrieved the 3D structures of PET and MHET from the ZINC15
Database9. A thorough literature review confirmed the already
established binding sites of the proteins. Consequently, we employed a specific docking approach, wherein
we precisely defined the region where the ligands should bind. Additionally, we utilized
OpenBabel10 to standardize the structure format to match that
of the proteins.
To prepare the 3D structures of the proteins, we employed AutoDockTools
1.5.711. This involved removing water molecules and all
non-protein components, verifying and rectifying any missing atoms, adding hydrogen atoms to facilitate
bonding between the ligand and the protein's binding site, and applying Kollman Charges to the protein.
We also used MGLTools to establish the grid box around the binding sites of the proteins.
For the docking simulations, we chose to utilize AutoDock Vina12
due to its exceptional accuracy and high-speed performance. The comprehensive results obtained from AutoDock
Vina can be found in the supplementary data. From this data, we generated Figures 8 and 9 using PyMOL,
illustrating the 3D interactions between our protein and the ligands.
Figure 8: Binding Site-Ligand Interaction of the FAST-PETase Binding Site in the FAST-PETase : Linker :
MHETase Complex
Figure 9: Binding Site-Ligand Interaction of the MHETase Binding Site in the FAST-PETase : Linker :
MHETase Complex
Furthermore, we conducted 2D analysis for all four experiments using
LigPlot+13 and have included these results in the supplementary
data14.
Energy Affinity Findings:
Upon analyzing the docking results involving FAST-PETase, MHETase, and our complex, several intriguing trends
have emerged:
Protein-ligand Complex
Affinity (\(kcal\times mol^{-1}\))
FAST-PETase : Linker : MHETase (PET)
-6.3
FAST-PETase : Linker : MHETase (MHET)
-6.2
MHETase (MHET)
-5.8
FAST-PETase (PET)
-5.6
Figure 10: Table displaying the energy values of the proteins when bound to their respective
ligands.
Upon comparing the optimal ligand conformations for each protein, as determined by their energy affinity
values, it becomes evident that our linker-bound protein exhibits a slightly higher binding affinity to both
MHET and PET at their respective binding sites, as opposed to using FAST-PETase and MHETase separately.
This suggests that the fusion of FAST-PETase and MHETase enzymes with a linker may not detrimentally impact
binding affinity in comparison to non-linked proteins, and, in fact, may even enhance it. The docking results
notably indicate a significant improvement in binding affinity within the FAST-PETase : Linker : MHETase
complex, especially with regard to PET, when compared to FAST-PETase alone. Conversely, MHET affinity
experiences a modest increase when our complex is employed. This suggests that the presence of the linker
and the interaction between FAST-PETase and MHETase could potentially enhance overall binding affinity.
It's important to note that while our docking results provide valuable insights into the initial binding
affinities, it's possible to achieve more accurate observations of ligand attraction over time through
advanced methods like molecular dynamics simulations, which offer a dynamic perspective of the complex
interactions between proteins and ligands.
Molecular Dynamics
As we conclude our exploration of molecular docking, we now set our sights on a thrilling journey into the dynamic
behavior of our complex. Molecular dynamics (MD) serves as a computational simulation method that unveils the
physical movements of atoms and molecules. During MD simulations, these atoms and molecules interact over a fixed
time, providing insight into the dynamic evolution of the system. In its most common form, the trajectories of
these entities are determined by numerically solving Newton's equations of motion for a system of interacting
particles. The forces governing these interactions, along with their potential energies, are typically computed
using interatomic potentials or molecular mechanical force fields, making MD an ideal choice for investigating our
complex.
Why Molecular Dynamics?
Molecular dynamics offers a multitude of advantages, including the ability to observe atomic-level protein
movements and interactions over time, detect structural alterations, assess domain affinity, predict the impact of
modifications on protein stability, corroborate AlphaFold2-derived information, and evaluate protein flexibility
in response to varying environmental conditions, such as temperature changes. By employing MD, we aim to gain a
comprehensive understanding of how our proteins adapt and flex within their surroundings and to assess protein
stability under diverse circumstances.
Cost-Effective and Accessible MD Simulations:
The utilization of MD simulations presents a multitude of computational challenges, including the potential strain
on personal computers and the substantial resource demands associated with prolonged simulations. These challenges
are accentuated when dealing with a complex of our scale, encompassing more than 900 amino acids. In addition to
these challenges, our university, regrettably, faces budget constraints that prevent us from maintaining a
functional cluster infrastructure.
In light of these obstacles, we proactively sought a solution by harnessing the capabilities of cloud computing,
enabling us to conduct cost-effective and accessible MD simulations for our complex. Thankfully, our search led
us to discover a robust pipeline15 ideally suited to address these
specific challenges.
Within this collaborative notebook-based pipeline, we primarily utilized the OpenMM
716 simulation engine for all aspects of our MD simulations, except
for force field calculations. For these critical force field calculations, we relied on AMBER's force field
module17. The seamless integration of OpenMM and AMBER facilitates
compatibility across a broad range of biomolecular systems and research applications. This integration ensures
that our simulations can be meticulously tailored to address our specific objectives and remain readily adaptable
to accommodate even more complex systems, should such a need arise.
In this context, we made three experiments, varying only the temperature to observe its influence on our complex.
All other parameters were held constant, allowing us to isolate and analyze the temperature variable while keeping
all other conditions consistent.
Simulation Parameters:
In our simulations, we employed the AMBER force field, specifically the ff19SB force
field18, which represents the latest advancement in SB protein force
fields. This updated force field has demonstrated improvements in amino acid-dependent properties, including
helical propensities, and accurately captures dihedral angle preferences observed in actual protein structures
stored in the Protein Data Bank.
The development and validation of the ff19SB force field, as detailed in the associated research paper, highlight
its enhanced predictive power when paired with a more precise water model like
OPC19. This combination proves particularly effective in modeling
sequence-specific behaviors, protein mutations, and rational protein design when compared to the utilization of
the TIP3P model, which exhibits notable limitations when used with the QM-based ff19SB force field. Therefore,
we followed the authors' recommendation and incorporated OPC as our water model.
In our simulations, we utilized a 12 Ångstrom-sized box to encompass our protein and neutralized the system by
introducing Na+ and Cl- ions at a concentration of 0.15. For the equilibration protocol,
we conducted simulations over one nanosecond and conducted experiments at three different temperatures:
20°C (293.15 K), 25°C (298.15 K), and 30°C (303.15 K), all maintained at a pressure of 1 bar.
Regarding the production phase of the MD simulation protocol, we maintained the same configuration as during
equilibration, conducting simulations over one nanosecond (1 ns) for each simulation at the same temperatures
(20°C, 25°C, and 30°C). We acknowledge that this duration may appear limited, but as previously mentioned,
resource constraints necessitated these choices.
MD Simulation Analysis:
RMSD Analysis:
Root Mean Square Deviation (RMSD), serves as a metric that quantifies the deviation of atomic positions over
time when compared to a reference structure. By closely monitoring RMSD, we gain the ability to gauge the
structural stability of our protein throughout the simulation. In a stable system, RMSD values tend to remain
relatively low, while an increasing RMSD may signify structural fluctuations or deviations. Furthermore, RMSD
analysis proves invaluable in assessing the stability of binding interactions between our proteins. Lower
RMSD values for the complex in comparison to its unbound state indicate a robust and stable binding
interaction. Conversely, higher RMSD values may suggest a weaker or transient interaction between the
molecules. Moreover, the identification of unusually high RMSD values or abrupt spikes can serve as early
indicators of noteworthy events within the system. These events may encompass phenomena such as protein
unfolding, aggregation, or substantial perturbations. Recognizing these outliers becomes pivotal in
pinpointing and comprehending anomalous behavior within the simulation.
Figure 11: RMSD Comparison Across Three Temperatures
When analyzing our RMSD results, a consistent profile emerged across all tested temperatures. Notably, we observed
that lower temperatures corresponded to increased stability within our complex, an outcome aligned with our
expectations. Importantly, no outliers were detected, indicating the likelihood of our protein maintaining its
structural integrity.
It is important to acknowledge our constraints regarding simulation duration due to resource limitations. Despite this
limitation, a noteworthy observation was made: at 20°C and 25°C, the RMSD profiles exhibited remarkable similarity,
differing by a mere 0.03 in their respective means (averages). In contrast, the 30°C simulation displayed a slightly
higher difference of 0.23+. While this difference is relatively small, it suggests that 30°C might be approaching a
temperature range where protein stability could become compromised. Conversely, 20°C and 25°C appear to be favorable
temperatures for our protein.
However, it is imperative to emphasize the need for extended simulation times to bolster the robustness of our
conclusions. Additionally, introducing a 15°C simulation could provide valuable insights into the
temperature-dependent behavior of our protein.
Furthermore, comparing our complex's stability with FAST-PETase and MHETase alone could shed light on which entity
exhibits superior stability. These ongoing investigations will enhance our understanding of our complex's behavior and
inform future research directions.
RMSF Analysis:
Root Mean Square Fluctuation (RMSF) analysis is a valuable tool for identifying regions within the protein that
exhibit varying degrees of flexibility.
Figure 12: Comparison of RMSF at Three Temperatures
Here, we observe strikingly similar profiles across all three temperature experiments, characterized by fluctuations
at the protein's beginning and end, as well as pronounced
flexibility in our designated flexible linker region.
Figure 13: Molecular Dynamics Simulation at 30°C with RMSF Spectrum
This figure offers a visual representation of the molecular dynamics simulation at 30°C, using RMSF as a spectrum. In
this representation, the reddest areas denote regions with
the highest fluctuation, while the bluest areas signify the most stable regions.
Upon closer examination of these figures, it becomes evident that our flexible linker consistently exhibits the
expected flexibility required for its functional role. These results
further bolster our confidence in the 3D structure prediction generated by Alphafold, as the agreement between the
predicted and observed dynamics is remarkably consistent.
Rg Analysis:
The radius of gyration (Rg) is a metric used to gauge the compactness or dispersion of biomolecules, including our
protein.
When applied to molecular dynamics simulations, analyzing the Rg graph yields valuable insights into the structural
dynamics
of the molecule.
Figure 14: Comparison of Rg at Three Temperatures
In Figure 14, we observe that at 20°C and 25°C, the Rg profiles exhibit remarkable similarity, characterized by a
steady
or slight decrease. This suggests that the molecule maintains its overall structural compactness, indicating relative
stability with no major structural changes.
Conversely, when examining the 30°C simulation, we note a profile trending slightly upward, indicating a modest
expansion or
unfolding of the molecule. This implies that the molecule has become somewhat less compact and is undergoing
structural
alterations, such as a polymer chain extending.
It's important to note that while we've adjusted the graph's scale for clarity, the actual Rg values among the three
simulations remain nearly identical.
In our molecular dynamics simulations, the monitoring of Rg functioned as a critical quality control metric,
bolstering our
confidence in the insights derived from RMSD analysis, even in instances where we couldn't meet the recommended
simulation duration.
Pairwise Correlation Analysis:
In our final step, we conducted a pairwise correlation analysis using Pearson's Cross-Correlation (CC) graphs.
This analysis provides essential insights into the interplay of motions between pairs of residues throughout our
molecular
dynamics simulation. To enhance clarity, we created a heatmap representation of the CC matrix.
Figure 15: Comparison of Pearson's Cross-Correlation Graphs at Three Temperatures
The values within the CC graph span a range from -1 to 1, denoting both the magnitude and direction of correlation.
Positive values, depicted in shades of purple and tending towards 1, signify a positive correlation. This suggests
that when one
residue undergoes a particular movement or conformational change, the other residue tends to follow suit. Positive
correlations
often point to coordinated motions or structural stability. Conversely, negative values, visible in shades of green
and
approaching -1, indicate a negative correlation. In this scenario, when one residue experiences a specific motion or
conformational
change, the other tends to move or change in the opposite direction. Negative correlations may indicate
anti-correlated motions
or structural changes that counterbalance each other. White regions featuring values near zero indicate minimal or no
correlation,
implying that the movements of these residues are independent of one another.
Upon analyzing the CC graph, we observe a strong positive correlation within the FAST-PETase domain. Conversely, in
the linker
region, located around the 300th amino acid residue, we identify negative correlations. These negative correlations
might highlight
regions that counterbalance each other, contributing to structural integrity maintenance.
Notably, at 20°C, the graph tends to display slightly more regions in purple, suggesting pairs of residues that move
in concert.
Nevertheless, across all three temperature simulations, the CC graphs exhibit remarkable similarity. However, it's
important to
note that while these observations are intriguing, they warrant further investigation through extended simulation
times to
establish significance conclusively.
Final Considerations
In our pursuit of elucidating the structural intricacies of the FAST-PETase : Linker : MHETase complex, we have
ventured through a
diverse array of computational tools and analyses. This journey has yielded several critical insights into the
complex's behavior
and characteristics. However, as we conclude this model page, we must critically assess our accomplishments,
acknowledge areas for
potential enhancement, and offer guidance to future teams embarking on similar endeavors.
Our successful prediction of the complex's 3D structure using AlphaFold2 is a noteworthy achievement, supported by a
remarkable
91.02% similarity as evaluated by the ERRAT score. However, the computational resource intensity associated with
AlphaFold2
presents a limitation. Future teams should explore strategies to optimize resource allocation while maintaining
prediction
accuracy.
Molecular docking allowed us to explore protein-protein interactions and binding affinity within the complex. The
results suggest
that the linker-bound protein may enhance binding affinity, but further experiments are required to confirm this.
Future research
should consider the impact of linker length and conformation on binding affinity.
Our MD simulations offered insights into structural stability, flexibility, and the influence of temperature. While
resource
constraints limited our simulation duration, we observed significant insights. Future teams should aim for extended
simulations
and consider lower temperatures to comprehensively assess complex behavior.
While our findings are significant, we acknowledge the need for continued research to further solidify our insights.
We encourage
future teams to build upon our work, leveraging advances in computational biology and biotechnology to unlock deeper
insights into
this complex and its potential applications in plastic degradation.
Overall Recommendations:
Resource Optimization: We strongly recommend that future teams prioritize resource optimization. Exploring efficient
resource
allocation strategies for resource-intensive tasks, such as AlphaFold2 predictions and MD simulations, can
significantly enhance
computational capabilities. Collaboration with institutions offering high-performance computing resources could prove
advantageous.
If we were to start over, we would certainly consider this approach.
Temperature and pH Variation: To obtain a more comprehensive understanding of the complex's behavior across a
broader spectrum
of conditions, we encourage the exploration of simulations at varying temperatures and pH levels. This expanded
parameter space can
provide valuable insights into the complex's versatility and adaptability.
Linker Optimization: Investigate the influence of linker length and conformation on binding affinity. Optimizing
these aspects
can maximize the functionality of the complex, potentially leading to improved performance.
Extended Simulations: Whenever feasible, extend simulation durations to ensure the robustness of observations and
conclusions.
Longer simulations can reveal nuanced behaviors and validate the stability of biomolecular systems.
Experiment Validation: While we faced challenges in our wetlab experiments, we emphasize the importance of
validating
computational findings through experimental techniques whenever possible. Experimental validation can significantly
enhance the
model's reliability and impact.
In the future, we hope to see greater emphasis on the importance of cost-effective and efficient MD simulations in
advancing
biomolecular understanding. Access to computational resources should not be limited by financial constraints, and
collaborative
efforts should be encouraged to overcome such barriers.
For those facing budget constraints similar to ours, we recommend exploring our
contributions page. There, we delve into the role
of cloud computing in scientific research, highlighting its advantages and adoption trends. We provide insights on how
to make
the most of cloud computing resources, ensuring that even those with limited resources can leverage its potential
effectively.
References:
3D Structure Prediction Studies:
1 Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021).
Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.
2 Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., & Steinegger, M. (2022). ColabFold:
making protein folding accessible to all. Nature methods, 19(6), 679-682.
4 Mariani, V., Biasini, M., Barbato, A., & Schwede, T. (2013). lDDT: a local superposition-free score
for comparing protein structures and models using distance difference tests. Bioinformatics, 29(21), 2722-2728.
5 Colovos, C., & Yeates, T. O. (1993). Verification of protein structures: patterns of nonbonded atomic
interactions. Protein science, 2(9), 1511-1519.
7 Laskowski, R. A., MacArthur, M. W., Moss, D. S., & Thornton, J. M. (1993). PROCHECK: a program to
check the stereochemical quality of protein structures. Journal of applied crystallography, 26(2), 283-291.
8 Burley, S. K., Bhikadiya, C., Bi, C., Bittrich, S., Chao, H., Chen, L., ... & Zardecki, C.
(2023). RCSB Protein Data Bank (RCSB. org): delivery of experimentally-determined PDB structures alongside
one million computed structure models of proteins from artificial intelligence/machine learning.
Nucleic acids research, 51(D1), D488-D508.
9 Irwin, J. J., & Shoichet, B. K. (2005). ZINC − a free database of commercially available
compounds for virtual screening. Journal of chemical information and modeling, 45(1), 177-182.
10 O'Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., & Hutchison, G. R.
(2011). Open Babel: An open chemical toolbox. Journal of cheminformatics, 3(1), 1-14.
11 Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S., & Olson,
A. J. (2009). AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal
of computational chemistry, 30(16), 2785-2791.
12 Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking
with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry,
31(2), 455-461.
15 Arantes, P. R., Polêto, M. D., Pedebos, C., & Ligabue-Braun, R. (2021). Making it rain: cloud-based
molecular simulations for everyone. Journal of Chemical Information and Modeling, 61(10), 4852-4856.
16 Eastman, P., Swails, J., Chodera, J. D., McGibbon, R. T., Zhao, Y., Beauchamp, K. A., ... & Pande,
V. S. (2017). OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLoS
computational biology, 13(7), e1005659.
17 Case, D. A., Aktulga, H. M., Belfon, K., Ben-Shalom, I., Brozell, S. R., Cerutti, D. S., ... &
Kollman, P. A. (2021). Amber 2021. University of California, San Francisco.
18 Tian, C., Kasavajhala, K., Belfon, K. A., Raguette, L., Huang, H., Migues, A. N., ... &
Simmerling, C. (2019). ff19SB: Amino-acid-specific protein backbone parameters trained against quantum mechanics
energy surfaces in solution. Journal of chemical theory and computation, 16(1), 528-552.
19 Izadi, S., Anandakrishnan, R., & Onufriev, A. V. (2014). Building water models: a different
approach. The journal of physical chemistry letters, 5(21), 3863-3871.