Contributing our Learnings for Future Teams
Our project RareCycle is innovative in many ways. For example, our hardware is built with decentralized, community-based applications in mind. It also goes beyond laboratory scale and is constructed from cheap, easily accessible materials. We also incorporated state-of-the-art modeling into our project and, for the first time, performed molecular dynamics simulations of our metal-binding peptides. Often, we had to work through impasses, adapting our plans and getting to know the necessary tools and software with the help of experts. For future iGEM teams, we would like to contribute specific instructions and troubleshooting help for software and hardware so that they can build upon our advancements.
How to Model Molecular Dynamics
When you encounter molecular dynamics for the first time, the options can be overwhelming. To guide beginners, as we were a few months ago, here is an introduction to simulating molecular dynamics with AlphaFold2 and GROMACS. Our full modeling project can be found here!
1. Defining the Objective / Problem
Before evaluating which software, code or script to use, you must define the objective of the simulation precisely. To do this, you should thoroughly understand the problem you want to simulate. Then you can draw up a list of requirements that your software must meet. Only then should you start thinking about how to do it. This step might seem obvious; however, in molecular dynamics simulations small differences, such as which binding mechanisms occur, can make a huge difference.
2. Choosing the Suitable Software
When choosing suitable software, there are a few things to consider. The first is whether you want to stick exclusively to open-source software. Choosing this path makes it easier to pass on your work to future iGEM teams. For our team this was an obvious choice, especially since GROMACS and AlphaFold2 are both powerful tools.
Another important aspect is the computing power available at your university or on your home computer. The most significant drawback of GROMACS is that it needs considerable computing power if you aim for reasonable simulation runtimes. To give an example in numbers: one of our rather outdated home computers, fitted with a GeForce GTX 1050 and an Intel Core i5 from around 2016, computed one production run (100 ns simulation time) within 18 h. Note that computing times depend strongly on the number of simulated particles and the overall simulation time. These computing times were sufficient for us because we could start a production run at night and analyze it the following afternoon. However, your routine may differ. For computing times within minutes, you will need access to a cluster at your university's computing center. If no computing center is available, you might consider paying for a cloud computing solution or looking for an alternative software solution. Finally, be aware that GROMACS is typically run on Linux-based distributions, as we did, and that the entire handling is done from the command line. If you are new to this kind of operating system, it is not a big deal, as there is sufficient help on the internet. Just be sure to include this part in your timeline.
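To turn the numbers above into a rough plan, the following minimal sketch converts our measured run (100 ns in 18 h) into a throughput and extrapolates to other simulation lengths. It assumes that wall-clock time scales linearly with simulated time for the same system on the same hardware; the function names and the 250 ns example are our own illustration, and a different particle count will change the throughput.

```python
def ns_per_day(sim_ns: float, wall_hours: float) -> float:
    """Throughput in nanoseconds of simulated time per day of wall-clock time."""
    return sim_ns / wall_hours * 24.0


def wall_hours_for(sim_ns: float, throughput_ns_per_day: float) -> float:
    """Estimated wall-clock hours for a run, assuming linear scaling."""
    return sim_ns / throughput_ns_per_day * 24.0


# Measured on our home computer: 100 ns of simulation in 18 h.
throughput = ns_per_day(100.0, 18.0)
print(f"throughput: {throughput:.1f} ns/day")

# Hypothetical longer run on the same machine and the same system.
print(f"250 ns would take about {wall_hours_for(250.0, throughput):.1f} h")
```

GROMACS itself reports the achieved ns/day at the end of the log file of each run, so a short test run on your own hardware gives you the number to plug in.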
Regarding AlphaFold2, you will need significant computing power as well. For our team, the actual bottleneck of running AlphaFold2 locally was the massive memory and storage requirements. You will need at least 32 GB of RAM and, on top of that, around 3 TB of disk space for the sequence databases. This large amount of storage is especially troubling if you want to compute on a cluster, where you often do not have permission to occupy that much disk space. Fortunately, there is ColabFold, a version of AlphaFold2 that runs as a Jupyter notebook on Google Colab. The prediction may be less precise, but for rather small peptides this solution is perfectly fine. The upside is that ColabFold uses Google's cloud computing capacities and not your local resources.
3. Working with ColabFold and GROMACS
In case you also decided to go with ColabFold and GROMACS just as we did, here is our more detailed procedure on how to model molecular dynamics.
ColabFold
To use GROMACS, you will need a protein structure file. These files end in ‘.pdb’. Either you already have a protein structure file, for example from a research paper, or you only have the amino-acid sequence of your peptide. In the first case you don’t need ColabFold and can proceed immediately to the modeling with GROMACS. Otherwise, you will need to create a protein structure file with ColabFold in advance.
The actual simulation will be run entirely in GROMACS. Therefore, the precise structure and folding of our peptide in the ColabFold prediction is not critical. However, ColabFold gives us a good first guess of our peptide structure by making use of large sequence databases and a neural network. Most importantly, it generates a pdb-file which we can use in the further steps. As the behavior of the peptide in the given environment will be simulated with GROMACS anyway, we don’t need to bother optimizing the ColabFold prediction. Consequently, we work with the default settings of ColabFold and simply plug in our amino-acid sequence. After running the entire Jupyter notebook you can download the structure files. Note that ColabFold will give you multiple structure files. These are simply different guesses for the predicted structure. As explained earlier, the exact structure doesn’t matter, so you can simply take the highest-ranked guess (the one with the highest predicted confidence) as long as the predicted structure doesn’t appear to be entirely off. You can visualize the pdb-files with online pdb-file viewers. We used the Mol* 3D Viewer, which can be accessed through rcsb.org/3d-view.
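If you want to compare the downloaded guesses yourself, one option is to use the per-residue confidence (pLDDT), which AlphaFold2-style predictions store in the B-factor column of the pdb-file. The following is a minimal sketch, not part of ColabFold: the function names are our own, and it assumes standard fixed-column PDB formatting with the B-factor in columns 61 to 66.

```python
from pathlib import Path


def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column (columns 61-66) over all ATOM records."""
    values = [
        float(line[60:66])
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and len(line) >= 66
    ]
    if not values:
        raise ValueError("no ATOM records found")
    return sum(values) / len(values)


def pick_best(models: dict) -> str:
    """Return the name of the model with the highest mean pLDDT.

    `models` maps a file name to the text content of that pdb-file.
    """
    if not models:
        raise ValueError("no models given")
    return max(models, key=lambda name: mean_plddt(models[name]))


def load_models(folder: str) -> dict:
    """Read every .pdb file in `folder` (hypothetical download directory)."""
    return {path.name: path.read_text() for path in sorted(Path(folder).glob("*.pdb"))}
```

With the structure files unpacked into a folder, `pick_best(load_models("colabfold_results"))` would return the file name to carry over to GROMACS; the folder name here is only a placeholder. A visual check in a viewer is still worthwhile.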
GROMACS
There are plenty of GROMACS tutorials out there. From our point of view, the most helpful tutorial was ‘Lysozyme in water’, which is available at mdtutorials.com/gmx/lysozyme/index.html. The following text can be seen as a supplementary guide that adds our own experience. Exact commands are given in the mentioned tutorial [1] and in the GROMACS documentation [2].
First, you will need to convert the pdb-file into files that GROMACS can work with. This is done with the command ‘gmx pdb2gmx’. We chose the AMBER-ff99SB force field as it suits the TIP3P water model we used. Technically this step isn’t necessary; however, it prevents some errors related to the interaction of the structure file and the chosen force field. As with most GROMACS commands, you pass an input file and command-specific options, and you get an output file in return. In this case, the most important output is the topology file, which ends in ‘.top’. This file contains all the information on the peptide structure, such as bonds, pairs, angles and dihedrals. Next, you define the environment in which your peptide will be modeled. This includes your solvent and the surrounding geometry.
To keep things simple in the beginning, we chose a cubic geometry. We define a box with suitable dimensions for our peptide using the command ‘gmx editconf’. We position the peptide in the center of the box and set the distance to the box sides to 1 nm. In combination with periodic boundary conditions, this ensures that there is sufficient distance (at least 2 nm) between two periodic images of our peptide. Once the box is defined, we get a ‘.gro’ file, which will be processed further in the following steps. As a side note: there are many box geometries. While a cube is easy to handle, you can also optimize your geometry so that it requires fewer water molecules while your peptide still fits in the box. Ultimately, you try to model as few water molecules as possible without affecting the peptide.
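To get a feeling for how much the box geometry matters, the following sketch compares box volumes for the same distance between periodic images. The volume factors for the rhombic dodecahedron and the truncated octahedron are the ones listed in the GROMACS manual; the 3 nm peptide diameter and the water number density of about 33.4 molecules per nm³ are assumptions for illustration, and the water count ignores the volume taken up by the peptide.

```python
import math

# Approximate number density of liquid water (about 1 g/cm^3), in molecules per nm^3.
WATER_PER_NM3 = 33.4

# Box volume as a multiple of d^3, where d is the periodic image distance.
VOLUME_FACTORS = {
    "cubic": 1.0,
    "dodecahedron": math.sqrt(2.0) / 2.0,    # ~0.707
    "octahedron": 4.0 * math.sqrt(3.0) / 9.0,  # ~0.770
}


def image_distance(solute_diameter_nm: float, margin_nm: float = 1.0) -> float:
    """Distance between periodic images if the solute keeps `margin_nm` to each side."""
    return solute_diameter_nm + 2.0 * margin_nm


def box_volume(image_distance_nm: float, box_type: str = "cubic") -> float:
    """Box volume in nm^3 for a given image distance and box type."""
    return VOLUME_FACTORS[box_type] * image_distance_nm ** 3


d = image_distance(3.0)  # hypothetical peptide with a 3 nm diameter
for box_type in VOLUME_FACTORS:
    volume = box_volume(d, box_type)
    print(f"{box_type:>13}: {volume:6.1f} nm^3, roughly {volume * WATER_PER_NM3:.0f} waters")
```

The box type names match the values accepted by the `-bt` option of ‘gmx editconf’, so switching geometry later is a one-word change in the command.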
Before simulating the interaction between rare-earth metal ions and our peptide, the peptide must be ‘relaxed’. This basically means that the peptide folds into its most energy-efficient structure. Obviously, this structure is highly dependent on the solution medium.
We start off by relaxing the peptide in water with a few Na+ and Cl− ions to neutralize the system’s charge (commonly, the peptide is not neutrally charged). Consequently, we define the solvent as water with the command ‘gmx solvate’. As input we pass the gro-file we just created and get an updated gro-file as output. Subsequently, we add ions to our system to neutralize the net charge. This takes two commands, ‘gmx grompp’ followed by ‘gmx genion’. To run ‘gmx grompp’, you will need a ‘molecular dynamics parameter’ (‘.mdp’) file. An mdp-file is a simple text file that describes the parameters for the step at hand, in this case the ion configuration. You will encounter mdp-files again in the later steps of the simulation. mdp-files can be found online. Just be sure that they work with your chosen force field. You might need to adjust a few points for your specific problem. On top of that, you need your most recent gro-file and your topology file. ‘gmx grompp’ merges these files into a ‘tpr-file’, which can then be processed by the command ‘gmx genion’. That way, the desired neutral net charge of our system can be specified. tpr-files are binary files and are required for every run on the MD engine of GROMACS.
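The preparation steps described so far can be collected in a small Python wrapper. This is a sketch under assumptions rather than our exact workflow: the file names (‘peptide.pdb’, ‘box.gro’, ‘ions.mdp’ and so on) are placeholders, ‘amber99sb’ and ‘tip3p’ are the option values we would expect for the force field and water model named above, ‘spc216.gro’ is the generic three-point water box used in the ‘Lysozyme in water’ tutorial, and answering ‘SOL’ tells ‘gmx genion’ to replace solvent molecules with ions. Check each option against the GROMACS documentation for your version.

```python
import shutil
import subprocess


def preparation_commands(pdb: str = "peptide.pdb") -> list:
    """Return (argv, stdin_text) pairs for the system preparation steps."""
    return [
        # Topology and processed structure from the pdb-file.
        (["gmx", "pdb2gmx", "-f", pdb, "-o", "processed.gro", "-p", "topol.top",
          "-ff", "amber99sb", "-water", "tip3p"], None),
        # Cubic box, peptide centered, 1 nm to each box side.
        (["gmx", "editconf", "-f", "processed.gro", "-o", "box.gro",
          "-c", "-d", "1.0", "-bt", "cubic"], None),
        # Fill the box with water.
        (["gmx", "solvate", "-cp", "box.gro", "-cs", "spc216.gro",
          "-o", "solv.gro", "-p", "topol.top"], None),
        # Assemble the tpr-file needed by genion (ions.mdp must exist).
        (["gmx", "grompp", "-f", "ions.mdp", "-c", "solv.gro",
          "-p", "topol.top", "-o", "ions.tpr"], None),
        # Replace solvent molecules by Na+/Cl- until the system is neutral.
        (["gmx", "genion", "-s", "ions.tpr", "-o", "solv_ions.gro", "-p", "topol.top",
          "-pname", "NA", "-nname", "CL", "-neutral"], "SOL\n"),
    ]


def run_all(commands: list) -> None:
    """Run the commands in order and stop at the first failure."""
    if shutil.which("gmx") is None:
        raise RuntimeError("the 'gmx' executable was not found on PATH")
    for argv, stdin_text in commands:
        subprocess.run(argv, input=stdin_text, text=True, check=True)
```

Calling `run_all(preparation_commands("my_peptide.pdb"))` in a fresh working directory would execute the five steps one after another. Keeping the commands in a list also documents exactly what was run, which helps when you repeat the simulation with different starting conditions.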
To prevent effects like steric clashes during the production run, we perform an energy minimization in advance with the command ‘gmx mdrun’. This command will also be used for the main production run later. Just as when adding ions to our system, GROMACS needs an already assembled binary tpr-file. To create this file with ‘gmx grompp’ you will need a different mdp-file for energy minimization in addition to your gro-file. The required mdp-file can be found online (for example in the tutorial ‘Lysozyme in water’ [1]). The energy terms of the run can be extracted with the command ‘gmx energy’. The resulting output files are so-called ‘xvg-files’ and can be plotted with the Xmgrace plotting tool. In our case the tool didn’t work properly. Fortunately, xvg-files are basically text files and can easily be processed in Python. A script to quickly plot your results can be downloaded here. To tell whether your energy minimization was successful, make sure that the potential energy of the system has decreased adequately. The potential energy should be negative, on the order of 10^5 to 10^6 kJ/mol in magnitude, and nearly constant at the end of the run.
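To illustrate how little is needed to read an xvg-file in Python, here is a minimal sketch. It is not the script linked above; the function names are our own. It relies only on the fact that xvg-files contain comment lines starting with ‘#’, Xmgrace directives starting with ‘@’ (and occasionally ‘&’), and whitespace-separated numeric columns. The plotting helper assumes matplotlib is installed.

```python
def read_xvg(text: str):
    """Return the first two numeric columns of an xvg-file as two lists."""
    xs, ys = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip empty lines, comments (#) and Xmgrace directives (@, &).
        if not line or line[0] in "#@&":
            continue
        fields = line.split()
        xs.append(float(fields[0]))
        ys.append(float(fields[1]))
    return xs, ys


def plot_xvg(path: str, ylabel: str, out_png: str) -> None:
    """Plot the first data column of an xvg-file and save it as a PNG."""
    import matplotlib.pyplot as plt  # imported lazily so parsing works without it

    with open(path) as handle:
        xs, ys = read_xvg(handle.read())
    plt.plot(xs, ys)
    plt.xlabel("step or time (see xvg header)")
    plt.ylabel(ylabel)
    plt.savefig(out_png, dpi=150)
    plt.close()
```

For example, `plot_xvg("potential.xvg", "potential energy (kJ/mol)", "potential.png")` would plot a file written by ‘gmx energy’; the file names are placeholders.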
After relaxing the geometry and solvent through energy minimization, we need to bring the system to the proper temperature and density. This ensures that our system won’t collapse when the production run starts. We perform two separate equilibration phases. The first is conducted under ‘NVT’ conditions, meaning that the number of particles, the system’s volume and the temperature are held constant (isothermal and isochoric). Note that the temperature of the system won’t be constant at the very beginning but will quickly reach the target temperature. From that point on, the temperature should only be affected by minor noise.
Secondly, we perform an equilibration phase under ‘NPT’ conditions. In this case the number of particles, the pressure and the temperature are held constant (isothermal and isobaric). Both equilibration phases are performed with the MD engine of GROMACS. The necessary mdp-files can be found online. You may consider adapting the thermostat and barostat models. We found that the ‘Berendsen’ and ‘Nosé-Hoover’ models work well as thermostats. You should avoid the ‘Andersen’ model. Whatever exact model you use, make sure to use a thermostat model with velocity rescaling. To analyze your results, you can use ‘gmx energy’ again and plot the extracted data with the provided plotting tool. While analyzing you might encounter a highly fluctuating pressure. This is expected; the density, however, should fluctuate around a constant value.
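The difference between the two equilibration phases comes down to a handful of lines in the mdp-file. The sketch below renders only that temperature and pressure coupling fragment, so it is not a complete mdp-file. The option names are standard GROMACS mdp options; the values (300 K, the two coupling groups, the time constants and the default thermostat and barostat) follow the ‘Lysozyme in water’ tutorial and are assumptions that you should adapt to your system and GROMACS version.

```python
def coupling_block(ensemble: str,
                   thermostat: str = "V-rescale",
                   barostat: str = "Parrinello-Rahman",
                   temperature_k: float = 300.0) -> str:
    """Render the coupling section of an mdp-file for 'NVT' or 'NPT'."""
    if ensemble not in ("NVT", "NPT"):
        raise ValueError("ensemble must be 'NVT' or 'NPT'")
    options = {
        "tcoupl": thermostat,
        "tc-grps": "Protein Non-Protein",   # two separately coupled groups
        "tau_t": "0.1 0.1",                 # time constants in ps, one per group
        "ref_t": f"{temperature_k:g} {temperature_k:g}",  # target temperature in K
        "pcoupl": barostat if ensemble == "NPT" else "no",
    }
    if ensemble == "NPT":
        options.update({
            "pcoupltype": "isotropic",
            "tau_p": "2.0",                 # time constant in ps
            "ref_p": "1.0",                 # target pressure in bar
            "compressibility": "4.5e-5",    # water, in 1/bar
        })
    return "\n".join(f"{key:<16}= {value}" for key, value in options.items()) + "\n"


print(coupling_block("NVT"))
print(coupling_block("NPT"))
```

Generating the fragment from one function makes it easy to try a different thermostat, for example `coupling_block("NVT", thermostat="nose-hoover")`, without editing several files by hand.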
Now we can finally proceed with the production run. As always, you need to assemble your binary tpr-file first and then perform the MD simulation with ‘gmx mdrun’. The simulation time of your production run should be around 100 ns. In these 100 ns our peptide will relax, so that we can simulate the interaction between rare-earth metal ions and our peptide in the following steps. Be aware that the production run will take some time, as discussed in ‘Choosing the Suitable Software’. You can reduce computing time by enabling GPU support. To do this, you must install GROMACS with GPU support! If you missed this option when installing GROMACS the first time, you must reinstall GROMACS. If you want to use an NVIDIA graphics card, be sure that it supports CUDA.
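The simulation time is not set directly in the mdp-file but results from the number of steps (`nsteps`) multiplied by the time step (`dt`, in ps). The sketch below does this arithmetic and builds the ‘gmx mdrun’ call. The 2 fs time step is an assumption taken from the ‘Lysozyme in water’ tutorial, the file prefix ‘md’ is a placeholder, and `-nb gpu` only works with a GPU-enabled GROMACS installation.

```python
def nsteps_for(sim_time_ns: float, dt_ps: float = 0.002) -> int:
    """Number of MD steps needed to cover `sim_time_ns` with time step `dt_ps`."""
    return round(sim_time_ns * 1000.0 / dt_ps)


def mdrun_command(deffnm: str = "md", use_gpu: bool = True) -> list:
    """Build the argument list for the production run."""
    argv = ["gmx", "mdrun", "-deffnm", deffnm]
    if use_gpu:
        # Offload the non-bonded interactions to the GPU.
        argv += ["-nb", "gpu"]
    return argv


print("nsteps =", nsteps_for(100.0))        # value for the production mdp-file
print(" ".join(mdrun_command("md")))
```

For a 100 ns run with a 2 fs time step this gives 50 million steps, which is the value to enter as `nsteps` in the production mdp-file.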
To analyze the results, we look at the root-mean-square deviation (RMSD) of the peptide’s atomic positions relative to a reference structure, typically the starting structure. The RMSD should reach a constant level. In this case we can conclude that the peptide is no longer changing its fold and has reached at least a local energy minimum. To check whether this is also the global energy minimum, we can repeat the entire simulation with slightly different starting conditions and see whether we arrive at the same energy minimum.
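In practice the RMSD is computed by GROMACS itself (the ‘Lysozyme in water’ tutorial uses ‘gmx rms’, which also superimposes each frame onto the reference first). The sketch below only illustrates the underlying formula and one simple way to judge whether the curve has levelled off. The function names, the window comparison and the tolerance are our own assumptions, not an established convergence criterion, and the RMSD here is computed without any fitting.

```python
import math


def rmsd(coords, reference) -> float:
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point, ref_point in zip(coords, reference)
        for a, b in zip(point, ref_point)
    )
    return math.sqrt(squared / len(coords))


def has_plateaued(series, window: int, tolerance: float) -> bool:
    """True if the means of the last two windows differ by less than `tolerance`."""
    if window <= 0 or len(series) < 2 * window:
        raise ValueError("need at least two full windows of data")
    last = series[-window:]
    previous = series[-2 * window:-window]
    return abs(sum(last) / window - sum(previous) / window) < tolerance
```

Applied to the second column of an RMSD xvg-file, `has_plateaued(values, window=1000, tolerance=0.02)` would compare the last two blocks of 1000 frames; both numbers are placeholders that depend on your output frequency and system. Looking at the plot remains the more reliable check.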
Our BioBricks Parts
We have built upon the success of past iGEM teams to contribute important BioBrick parts for the recycling of rare-earth elements from e-waste. iGEM Bonn 2021 successfully cloned Lanmodulin into pET19b and kindly provided us with their plasmid. To make it suitable for surface expression including C-terminal fusion, the LanM gene was amplified with primers designed to eliminate the stop codon. In addition, we created a construct containing the lanthanide-binding “most common standard tag” and other lanthanide-binding peptides that can be purified, immobilized and used to bind lanthanides. This part can be found in the registry and is available to all future iGEM teams interested in the recycling of rare-earth elements. Have a look at our Parts page to learn more about that!
How to Build your own Bioremediation Unit
With our easily reproducible MycoFlux fungal bioreactor, we aim to enable iGEM teams around the world to cultivate their fungi under optimal conditions. As exemplified by our project RareCycle, fungal bioremediation is a promising topic with many applications - even in industrial settings. Our main contribution is a system that allows researchers and communities to explore fungal filtering and extraction processes beyond the lab scale. Our tutorials allow other iGEM teams to build their own functionalized fungal cultivation systems. Combined with our tutorial on fungal cultivation and the guides on the cultivation of different fungi that we gathered [3, 4], we empower everyone to plan, construct and incorporate their own fungal bioreactors into their respective projects. Learn more about our hardware on our Hardware Page!
Access and download the tutorial to replicate our Hardware project here!
References
- [1] From Proteins to Perturbed Hamiltonians: A Suite of Tutorials for the GROMACS-2018 Molecular Simulation Package, v1.0. Living J. Comp. Mol. Sci. 1 (1): 5068
- [2] GROMACS 2023.2 Manual. Zenodo
- [3] The 5 Easiest Mushrooms to Grow. Retrieved on 06.10.2023
- [4] Growing Gourmet and Medicinal Mushrooms (Third Edition). Berkeley: Ten Speed Press