Simulation and Verification of CRISPR-Cas13a System

Overview

Introduction: CRISPR/Cas13a system is a complex that includes Cas13a protein and crRNA, and the specific binding of crRNA to its target RNA activates the cleavage activity of the Cas13a/crRNA complex. After identifying that miR-21-5p is an optimal biomarker of breast cancer, we decided to engineer a CRISPR/Cas13a system that targets miR-21-5p. We designed crRNA that can specifically bind to miR-21-5p, and used molecular docking and molecular dynamics simulation to evaluate the stability of the Cas13a/crRNA complex that we have engineered for the detection of miR-21-5p.

Methods and results: We designed four different crRNAs that could potentially bind to miR-21-5p. Through thermodynamic evaluation with ViennaRNA[1], we determined that crRNA full length is the optimal sequence to detect miR-21-5p. Subsequently, we utilized HDOCK[2] and MOE[3] to simulate the assembly of the CRISPR/Cas13a complex, which verified the binding potential of crRNA full length from the perspective of interaction. We used molecular dynamics simulation to analyze the stability of the Cas13a/crRNA complex, which verified the stability of this complex from the perspective of energy. Together, these simulations demonstrated that the Cas13a/crRNA complex is efficient to bind to miR-21-5p, indicating our engineered CRISPR/Cas13a system should be reliable for the detection of miR-21-5p.

Design and Analysis of crRNA Sequences

CRISPR RNA (crRNA) consists of a loop region (repeat part) and a linear guide sequence (detection part). While the loop region is responsible for the interactions between crRNA and Cas protein, the linear guide sequence includes 20 or more nucleotides that is complementary to the target RNA, which is responsible for the interaction with target RNA. A stable and specific crRNA can aid Cas13a/crRNA complex to interact with its target RNA more efficiently.

We found the sequence of miR-21-5p in miRbase, and used it for our subsequent study (Table 1).

miRNA Sequence
miR-21-5p UAGCUUAUCAGACUGAUGUUGA
Table 1: The sequence of miR-21-5p

Based on the sequence of miR-21-5p, we designed 4 candidate crRNA sequences, which have the canonical loop region and different linear guide sequences that contained 20 or more nucleotides (crRNA full length include 22 nucleotides) complementary to miR-21-5p.

crRNA Sequence
crRNA1 GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC AACAUCAGUCUGAUAAGCUA
crRNA2 GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC CAACAUCAGUCUGAUAAGCU
crRNA3 GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC UCAACAUCAGUCUGAUAAGC
crRNA full length GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC UCAACAUCAGUCUGAUAAGCUA
Table 2: Sequences of candidate crRNAs

The predicted conformation of the 4 crRNAs are shown below[5]

Figure 1: Schematic diagram of the conformation of (A) crRNA1, (B) crRNA2, (C) crRNA3, (D) crRNA full length, with the following color-coded annotations: Green represents stems (canonical helices), yellow indicates interior loops, blue signifies hairpin loops, and orange highlights the 5' and 3' unpaired regions.

To identify the crRNA with the best interaction efficiency with miR-21-5p, we analyzed these crRNAs with a variety of approaches[4]. First, we evaluated the thermodynamic characteristics of these crRNAs with the following parameters [6].

  • Minimum free energy of crRNA sequence (MFE)
  • GC content
  • Minimum free energy of heterodimer, i.e., crRNA and miRNA complex (MFE heterodimer)

MFE evaluates the stability of conformation, which is calculated by RNAfold of ViennaRNA. The smaller the MFE score is, the more stable the conformation is. GC-content showed the proportion of Guanine and Cytosine in the sequence. Compared to other types of interactions between nucleotides, the pair of guanine and cytosine forms more hydrogen bonds, and therefore the sequence with higher GC-content should be more stable. MFE heterodimer evaluates the stability of the complex, which is also calculated by RNAcofold of ViennaRNA. A smaller score of MFE heterodimer indicates a more stable crRNA/miRNA complex and a higher interaction efficiency of crRNA with miRNA. The scores of MFE, GC content and MFE heterodimer are shown in the table below.

crRNA MFE (kcal/mol) GC content MFE heterodimer (kcal/mol)
crRNA1 -9.8 0.39 -41.6
crRNA2 -9.7 0.41 -40.9
crRNA3 -9.6 0.41 -39.8
crRNA full length -9.6 0.40 -42.5
Table 3: Thermodynamic parameters of different crRNAs

While the differences of these crRNAs on MFE and GC scores are minimum, the difference of these crRNAs on MFE heterodimer score is more obvious. Actually, MFE heterodimer is the most important parameter, since crRNA and miRNA work together in actual chemical reactions. The crRNA full length has the lowest MFE heterodimer, indicating that it has the best stability and bond forming properties. Notably, subsequent experimental data also demonstrates that crRNA full length is indeed the most efficient crRNA for the interaction with miR-21-5p.

Molecular docking - assembly inspection of CRISPR/Cas13a system

In addition to evaluating crRNA alone, we also need to verify whether Cas13a/crRNA complex can be assembled properly, since proper assembly is crucial for the function of CRISPR/Cas13a system. We simulated the assembly of Cas13a/crRNA complex with molecular docking, a method that is usually used for analyzing receptor-ligand interactions. Based on the properties of receptor and ligand, molecules with various possible conformations are docked and scored based on energy and binding conditions.

Based on the extent of model simplifications, molecular docking can be divided into rigid docking, semi flexible docking, and flexible docking. As Cas13a protein and crRNA belong to macromolecules, which are not suitable for flexible docking analysis, we analyzed the interactions between LwaCas13a protein and crRNA using a rigid docking algorithm, HDOCK server, and a semi flexible docking algorithm, MOE software.

Homologous modeling of protein & Preparation of docking files

The first step of molecular docking is to obtain the PDB structure files of receptors and ligands. Since PDB file of the LwaCas13a protein (1152AA)[7] is not available from the UniProt[8], we used a homologous modeling method to predict the structure and obtain the PDB structure file of LwaCas13a. Due to the 82.56% similarity between protein sequences of LwaCas13a and LbuCas13a, we used LbuCas13a as the template protein. We adopted the best result of SWISS-MODEL homology modeling as the PDB file[9]of LwaCas13a for subsequent docking simulations. Results of the homologous modeling are displayed with PyMol software[10] (Figure 2-4).

Figure 2: Template protein (LbuCas13a)[11]
Figure 3: Predicted structure of LwaCas13a with SWISS-MODEL
Figure 4: The overlay diagram of protein structures of LbuCas13a and LwaCas13a (blue-LbuCas13a, template; green-LwaCas13a)

LwaCas13a protein consists of 2 main lobes, the Recognition Lobe (REC) and the Nuclease Lobe (NUC). While the Recognition Lobe contains 2 domains, the NTD and Helical-1 domain, the NUC Lobe contains 5 domains, HEPN1-I, Helical-2, HEPN1-II, Linker and HEPN2[12]. Once crRNA sequence binds to the REC Lobe and target RNA, the LwaCas13a complex changes conformation and activates the trans-cleavage activity of CRISPR/Cas13a system [13]. The schematic diagrams of LwaCas13a structure are shown below.

Figure 5: Schematic diagram of REC and NUC lobes of LwaCas13a (yellow-REC blue-NUC)
Figure 6: Schematic diagram of different domains of LwaCas13a (red-NTD green-Helical-1 blue-HEPN1-I yellow-Helical-2 magenta-HEPN1-Ⅱ cyan-Linker orange-HEPN2)

To obtain the PDB file of crRNA, we input the previously obtained conformation information of crRNA and use 3dRNA server[14] to predict its three-dimensional structure. The optimal structure predicted by 3dRNA server is used as the ligand PDB file for molecular docking.

Figure 7: Predicted structure of crRNA full length with 3dRNA
Figure 8: Predicted structure of crRNA1 with 3dRNA
Figure 9: Predicted structure of crRNA2 with 3dRNA
Figure 10: Predicted structure of crRNA3 with 3dRNA

Evaluation of crRNA from molecular docking

We used HDOCK server to dock crRNA full length and other crRNAs on LwaCas13a protein, and compared their docking from two perspectives, docking score and RMSD (Figure 11 & 12). While docking score represents binding status of the receptor and ligand, RMSD evaluates the distance between two molecules. Low docking score and RMSD indicate stable docking. The docking score and RMSD of crRNA full length are lower than other crRNAs, demonstrating the superiority of crRNA full length on docking with LwaCas13a protein.

Figure 11: Docking scores of crRNAs on LwaCas13a protein
Figure 12: RMSD of crRNAs on LwaCas13a protein

Molecular docking retest-MOE

As HDOCK is a rigid docking algorithm based on a template, its docking score cannot always represent the binding energy faithfully. Therefore, we used MOE software for semi flexible docking of LwaCas13a and crRNA full length, and analyzed the interactions between LwaCas13a and crRNA full length[15].

Figure 13: LwaCas13a/crRNA full length complex
Figure 14: Interactions and Hydrogen Bonds between LwaCas13a and crRNA

The majority of interactions take place between the REC lobe and crRNA. While most interactions are consistent with previous published literatures[16-18] a small portion of the interactions is different, which may be caused by the minor differences between the structures of LwaCas13a and LbuCas13a proteins. It is worthy to note that different crRNAs can also lead to different scopes of action, which might lead to differences in the interaction between crRNA and Cas13a protein. Finally, the complex is docked with miRNA. Ternary complex is obtained by using MOE software, which is displayed by Discovery Studio Visualizer[19].

Figure 15: The predicted structure of LwaCas13a/crRNA/miRNA ternary complex

The number of binding sites among LwaCas13a, crRNA and miRNA are shown below[20].

Name Number of Binding Sites
miR-21-5p Helical-1 crRNA
4 8
Table 4: Number of binding sites between miR-21-5p and LwaCas13a/crRNA complex
Figure 16: Schematic diagram of interaction among LwaCas13a, crRNA and miRNA

In our simulation, LwaCas13a, crRNA, and miRNA can be successfully assembled into ternary complex, and strong interactions have been formed between each part (Figure 16). Consistently, the results of our wet laboratory experiment demonstrated that miR-21-5p could activate the cleavage activities of the LwaCas13a/crRNA complex, demonstrating that they formed a functional LwaCas13a/crRNA/miR-21-5p ternary complex together.

Molecular Dynamic Simulation

To analyze the stability of our CRISPR/Cas13a system, we used GROMACS software to implement molecular dynamics simulations[21]. The simulation conditions were carried out at a static temperature of 300K and normal pressure. The force field used Amber99sb-ildn. The solvent was water (tip3p model), and the total charge of the simulated system was neutralized by adding an appropriate amount of Na+ ions. Energy minimization was performed. Subsequently, isothermal and isovolumetric ensemble (NVT) equilibrium and isothermal and isobaric ensemble (NPT) equilibrium were also performed, with a coupling constant of 0.1ps and duration of 100ps. These steps helped our molecular dynamics simulations close to actual reaction conditions. Finally, molecular dynamics simulations were performed with a step size of 0.2fs and a total duration of 20ns.

After completing the simulation, GROMACS tools were used to analyze the trajectory and calculate the root mean square deviation (RMSD), root mean square fluctuation (RMSF), solvent accessible area (SASA), and hydrogen bonds (Hbonds) of the crRNA, and data was plotted into graphs, as shown in Figures 17-20.

RMSD represents the distance between the same atoms in different structures, e.g., target structure and reference structure. Stable and low RMSD indicates high stability of the molecule. In Figure 17, we can see that the line becomes flatten after a short period of simulation, demonstrating that the structure of crRNA full length is stable and can be used for docking.

Figure 17: RMSD of crRNA full length

RMSF is a time-to-time average of the change in atomic position, which reflects the flexibility of the crRNA during the entire simulation process. RMSF of crRNA determines the deviation of each residue from a reference position. Areas farther away from the active site and flexible areas generally have large RMSF. Through the analysis of the data in Figure 18, the flexibility of crRNA residues was analyzed.

Figure 18: RMSF of crRNA full length

The next step is to calculate the hydrogen bonds formed within the crRNA. When RNA structure changes, the number of hydrogen bonds also changes. In Figure 19, we can see that after 2500 ps, the number of hydrogen bonds enters into a stable state, which shows that the crRNA structure remains stable.

Figure 19: Hbonds of crRNA full length

For the analysis of SASA, the SASA of the crRNA does not change significantly and fluctuates around 115 nm2 after 3000 ps, which demonstrates that the crRNA structure is stable. In summary, analysis of RMSD, RMSF, SASA, and Hbonds all verifies the stability of crRNA.

Figure 20: SASA of crRNA full length

Furthermore, we conduct molecular dynamics analysis of LwaCas13a protein. The setting of GROMACS adopts the setting parameters mentioned above. The total simulation time is 20ns and the step size is 0.02fs. Figures 21, 22, 23, and 24 show the RMSD, RMSF, SASA, and Hbonds of LwaCas13a protein respectively, which demonstrates that LwaCas13a protein has good stability and can be used for subsequent study.

Figure 21: RMSD of LwaCas13a
Figure 22: RMSF of LwaCas13a
Figure 23: SASA of LwaCas13a
Figure 24: Hbonds of LwaCas13a

Finally, we performed molecular dynamics analysis of LwaCas13a/crRNA complex. In Figure 25, the RMSD reaches a relatively stable range after 4000 ps. In Figure 27, the solvent accessible area also remains within a stable range, demonstrating that the LwaCas13a/crRNA complex has good stability.

Figure 25: RMSD of complex
Figure 26: RMSF of complex
Figure 27: SASA of complex
Figure 28: Hbonds of complex

Figure 28 shows the number of hydrogen bonds formed by the complex. The number of hydrogen bonds remains relatively stable, suggesting that the initial conformation does not change significantly.

The binding energy between protein and crRNA is mainly Coulomb energy, which is generated from electrostatic interactions. Figure 29 shows the binding energy between LwaCas13a protein and crRNA.

Figure 29: Binding Free Energy of complex

Together, Low and stable RMSD, stable SASA and Hbonds indicate that the LwaCas13a/crRNA complex is stable. Furthermore, the complex has low binding energy, which further confirms that the LwaCas13a protein and crRNA form a stable complex.

Conclusion

We designed crRNA (crRNA full length) for our CRISPR/Cas13a system for specific detection of miR-21-5p. Molecular docking and molecular dynamics simulations verified the stability of our CRISPR/Cas13a system, which demonstrated that our CRISPR/Cas13a system has great capability on detection of miR-21-5p.

References

[1] R. Lorenz et al., (2011) "ViennaRNA Package 2.0" Algorithms Mol. Biol., vol. 6, no. 1, pp. 1-14. doi: 10.1186/1748-7188-6-26.

[2] Yan Y, Tao H, He J, Huang SY. The HDOCK server for integrated protein-protein docking. Nat Protoc. 2020 May;15(5):1829-1852. doi: 10.1038/s41596-020-0312-x.

[3] Molecular Operating Environment (MOE) (2014) Montreal, Quebec, Canada: Chemical Computing Group Inc.https://www.chemcomp.com/.

[4] L. Wang, J. Zhou, Q. Wang, Y. Wang, and C. Kang, (2020) "Rapid design and development of CRISPR-Cas13a targeting SARS-CoV-2 spike protein" Theranostics, vol. 11, no. 2, pp. 649-664. doi: 10.7150/thno.51479.

[5] Gendron P, Lemieux S, Major F. Quantitative analysis of nucleic acid three-dimensional structures. J Mol Biol. 2001 May 18;308(5):919-36. doi: 10.1006/jmbi.2001.4626.

[6] T. H. T. Chau, D. H. A. Mai, D. N. Pham, H. T. Q. Le, and E. Y. Lee, (2020) "Developments of riboswitches and toehold switches for molecular detection-biosensing and molecular diagnostics" Int. J. Mol. Sci., vol. 21, no. 9. doi: 10.3390/ijms21093192.

[7] Liu L, Li X, Wang J, Wang M, Chen P, Yin M, Li J, Sheng G, Wang Y. Two Distant Catalytic Sites Are Responsible for C2c2 RNase Activities. Cell. 2017 Jan 12;168(1-2):121-134.e12. doi: 10.1016/j.cell.2016.12.031.

[8] UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 2023 Jan 6;51(D1):D523-D531. doi: 10.1093/nar/gkac1052.

[9] Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018 Jul 2;46(W1):W296-W303. doi: 10.1093/nar/gky427.

[10] Schrödinger, L., & DeLano, W. (2020). PyMOL. Retrieved from http://www.pymol.org/pymol.

[11] Bienert S, Waterhouse A, de Beer TA, Tauriello G, Studer G, Bordoli L, Schwede T. The SWISS-MODEL Repository-new features and functionality. Nucleic Acids Res. 2017 Jan 4;45(D1):D313-D319. doi: 10.1093/nar/gkw1132.

[12] Yang J, Song Y, Deng X, Vanegas JA, You Z, Zhang Y, Weng Z, Avery L, Dieckhaus KD, Peddi A, Gao Y, Zhang Y, Gao X. Engineered LwaCas13a with enhanced collateral activity for nucleic acid detection. Nat Chem Biol. 2023 Jan;19(1):45-54. doi: 10.1038/s41589-022-01135-y.

[13] Liu L, Li X, Ma J, Li Z, You L, Wang J, Wang M, Zhang X, Wang Y. The Molecular Architecture for RNA-Guided RNA Cleavage by Cas13a. Cell. 2017 Aug 10;170(4):714-726.e10. doi: 10.1016/j.cell.2017.06.050.

[14] Wang J, Mao K, Zhao Y, Zeng C, Xiang J, Zhang Y, Xiao Y. Optimization of RNA 3D structure prediction using evolutionary restraints of nucleotide-nucleotide interactions from direct coupling analysis. Nucleic Acids Res. 2017 Jun 20;45(11):6299-6309. doi: 10.1093/nar/gkx386.

[15] Adasme MF, Linnemann KL, Bolz SN, Kaiser F, Salentin S, Haupt VJ, Schroeder M. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 2021 Jul 2;49(W1):W530-W534. doi: 10.1093/nar/gkab294.

[16] Abudayyeh OO, Gootenberg JS, Essletzbichler P, Han S, Joung J, Belanto JJ, Verdine V, Cox DBT, Kellner MJ, Regev A, Lander ES, Voytas DF, Ting AY, Zhang F. RNA targeting with CRISPR-Cas13. Nature. 2017 Oct 12;550(7675):280-284. doi: 10.1038/nature24049.

[17] Sinha S, Molina Vargas AM, Arantes PR, Patel A, O'Connell MR, Palermo G. RNA-mediated Allosteric Activation in CRISPR-Cas13a. bioRxiv [Preprint]. 2023 Jul 27:2023.07.27.550797. doi: 10.1101/2023.07.27.550797.

[18] Tambe A, East-Seletsky A, Knott GJ, Doudna JA, O'Connell MR. RNA Binding and HEPN-Nuclease Activation Are Decoupled in CRISPR-Cas13a. Cell Rep. 2018 Jul 24;24(4):1025-1036. doi: 10.1016/j.celrep.2018.06.105.

[19] Accelrys. Discovery Studio Visualizer, v20.1.0.19295, BIOVIA, Dassault Systèmes, San Diego, 2020.

[20] Schlee S, Straub K, Schwab T, Kinateder T, Merkl R, Sterner R. Prediction of quaternary structure by analysis of hot spot residues in protein-protein interfaces: the case of anthranilate phosphoribosyltransferases. Proteins. 2019 Oct;87(10):815-825. doi: 10.1002/prot.25744.

[21] RNA structural dynamics as captured by molecular simulations: A comprehensive overview. doi: 10.1021/acs.chemrev.7b00427.

[22] GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputersl. doi: 10.1016/J.SOFTX.2015.06.001

[23] GROMACS: Fast, flexible, and freeliu. doi: 10.1002/jcc.20291.