Simulation and Verification of CRISPR-Cas13a System
Overview
Introduction:
CRISPR/Cas13a system is a complex that includes Cas13a protein and crRNA, and the specific
binding of
crRNA to its target RNA activates the cleavage activity of the Cas13a/crRNA complex. After
identifying
that miR-21-5p is an optimal biomarker of breast cancer, we decided to engineer a CRISPR/Cas13a
system
that targets miR-21-5p. We designed crRNA that can specifically bind to miR-21-5p, and used
molecular
docking and molecular dynamics simulation to evaluate the stability of the Cas13a/crRNA complex
that we
have engineered for the detection of miR-21-5p.
Methods and results:
We designed four different crRNAs that could potentially bind to miR-21-5p. Through
thermodynamic
evaluation with ViennaRNA[1], we
determined that
crRNA full length is the optimal sequence to detect
miR-21-5p. Subsequently, we utilized HDOCK[2]
and
MOE[3] to simulate the assembly of the
CRISPR/Cas13a
complex, which verified the binding potential of crRNA full length from the perspective of
interaction.
We used molecular dynamics simulation to analyze the stability of the Cas13a/crRNA complex,
which
verified the stability of this complex from the perspective of energy. Together, these
simulations
demonstrated that the Cas13a/crRNA complex is efficient to bind to miR-21-5p, indicating our
engineered
CRISPR/Cas13a system should be reliable for the detection of miR-21-5p.
Design and Analysis of crRNA Sequences
CRISPR RNA (crRNA) consists of a loop region (repeat part) and a linear guide sequence
(detection part).
While the loop region is responsible for the interactions between crRNA and Cas protein, the
linear
guide sequence includes 20 or more nucleotides that is complementary to the target RNA, which is
responsible for the interaction with target RNA. A stable and specific crRNA can aid
Cas13a/crRNA
complex to interact with its target RNA more efficiently.
We found the sequence of miR-21-5p in miRbase, and used it for our subsequent study (Table 1).
miRNA |
Sequence |
miR-21-5p |
UAGCUUAUCAGACUGAUGUUGA |
Table 1: The sequence of miR-21-5p
Based on the sequence of miR-21-5p, we designed 4 candidate crRNA sequences, which have the
canonical
loop region and different linear guide sequences that contained 20 or more nucleotides (crRNA
full
length include 22 nucleotides) complementary to miR-21-5p.
crRNA |
Sequence |
crRNA1 |
GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC
AACAUCAGUCUGAUAAGCUA
|
crRNA2 |
GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC
CAACAUCAGUCUGAUAAGCU
|
crRNA3 |
GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC
UCAACAUCAGUCUGAUAAGC |
crRNA full length |
GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC
UCAACAUCAGUCUGAUAAGCUA |
Table 2: Sequences of candidate crRNAs
The predicted conformation of the 4 crRNAs are shown below[5]
Figure 1: Schematic diagram of the conformation of (A) crRNA1, (B)
crRNA2, (C)
crRNA3, (D) crRNA full length, with the following color-coded annotations: Green represents
stems
(canonical helices), yellow indicates interior loops, blue signifies hairpin loops, and orange
highlights the 5' and 3' unpaired regions.
To identify the crRNA with the best interaction efficiency with miR-21-5p, we analyzed these
crRNAs with
a variety of approaches[4]. First, we
evaluated
the thermodynamic characteristics of these crRNAs with
the following parameters [6].
- Minimum free energy of crRNA sequence (MFE)
- GC content
- Minimum free energy of heterodimer, i.e., crRNA and miRNA complex (MFE heterodimer)
MFE evaluates the stability of conformation, which is calculated by RNAfold of ViennaRNA. The
smaller
the MFE score is, the more stable the conformation is. GC-content showed the proportion of
Guanine and
Cytosine in the sequence. Compared to other types of interactions between nucleotides, the pair
of
guanine and cytosine forms more hydrogen bonds, and therefore the sequence with higher
GC-content should
be more stable. MFE heterodimer evaluates the stability of the complex, which is also calculated
by
RNAcofold of ViennaRNA. A smaller score of MFE heterodimer indicates a more stable crRNA/miRNA
complex
and a higher interaction efficiency of crRNA with miRNA. The scores of MFE, GC content and MFE
heterodimer are shown in the table below.
crRNA |
MFE (kcal/mol) |
GC content |
MFE heterodimer (kcal/mol) |
crRNA1 |
-9.8 |
0.39 |
-41.6 |
crRNA2 |
-9.7 |
0.41 |
-40.9 |
crRNA3 |
-9.6 |
0.41 |
-39.8 |
crRNA full length |
-9.6 |
0.40 |
-42.5 |
Table 3: Thermodynamic parameters of different crRNAs
While the differences of these crRNAs on MFE and GC scores are minimum, the difference of these
crRNAs
on MFE heterodimer score is more obvious. Actually, MFE heterodimer is the most important
parameter,
since crRNA and miRNA work together in actual chemical reactions. The crRNA full length has the
lowest
MFE heterodimer, indicating that it has the best stability and bond forming properties. Notably,
subsequent experimental data also demonstrates that crRNA full length is indeed the most
efficient crRNA
for the interaction with miR-21-5p.
Molecular docking - assembly inspection of CRISPR/Cas13a system
In addition to evaluating crRNA alone, we also need to verify whether Cas13a/crRNA complex can
be
assembled properly, since proper assembly is crucial for the function of CRISPR/Cas13a system.
We
simulated the assembly of Cas13a/crRNA complex with molecular docking, a method that is usually
used for
analyzing receptor-ligand interactions. Based on the properties of receptor and ligand,
molecules with
various possible conformations are docked and scored based on energy and binding conditions.
Based on the extent of model simplifications, molecular docking can be divided into rigid
docking, semi
flexible docking, and flexible docking. As Cas13a protein and crRNA belong to macromolecules,
which are
not suitable for flexible docking analysis, we analyzed the interactions between LwaCas13a
protein and
crRNA using a rigid docking algorithm, HDOCK server, and a semi flexible docking algorithm, MOE
software.
Homologous modeling of protein & Preparation of docking files
The first step of molecular docking is to obtain the PDB structure files of receptors and
ligands. Since
PDB file of the LwaCas13a protein (1152AA)[7]
is
not available from the UniProt[8], we used
a
homologous
modeling method to predict the structure and obtain the PDB structure file of LwaCas13a. Due to
the
82.56% similarity between protein sequences of LwaCas13a and LbuCas13a, we used LbuCas13a as the
template protein. We adopted the best result of SWISS-MODEL homology modeling as the PDB
file[9]of
LwaCas13a for subsequent docking simulations. Results of the homologous modeling are displayed
with
PyMol software[10] (Figure 2-4).
Figure 2: Template protein (LbuCas13a)[11]
Figure 3: Predicted structure of LwaCas13a with SWISS-MODEL
Figure 4: The overlay diagram of protein structures of LbuCas13a and
LwaCas13a
(blue-LbuCas13a, template; green-LwaCas13a)
LwaCas13a protein consists of 2 main lobes, the Recognition Lobe (REC) and the Nuclease Lobe
(NUC).
While the Recognition Lobe contains 2 domains, the NTD and Helical-1 domain, the NUC Lobe
contains 5
domains, HEPN1-I, Helical-2, HEPN1-II, Linker and HEPN2[12]. Once crRNA sequence binds to the REC Lobe
and target RNA, the LwaCas13a complex changes conformation and activates the trans-cleavage
activity of
CRISPR/Cas13a system [13]. The schematic
diagrams
of LwaCas13a structure are shown below.
Figure 5: Schematic diagram of REC and NUC lobes of LwaCas13a
(yellow-REC blue-NUC)
Figure 6: Schematic diagram of different domains of LwaCas13a
(red-NTD green-Helical-1 blue-HEPN1-I yellow-Helical-2 magenta-HEPN1-Ⅱ cyan-Linker orange-HEPN2)
To obtain the PDB file of crRNA, we input the previously obtained conformation information of
crRNA and
use 3dRNA server[14] to predict its
three-dimensional structure. The optimal structure predicted by
3dRNA server is used as the ligand PDB file for molecular docking.
Figure 7: Predicted structure of crRNA full length with 3dRNA
Figure 8: Predicted structure of crRNA1 with 3dRNA
Figure 9: Predicted structure of crRNA2 with 3dRNA
Figure 10: Predicted structure of crRNA3 with 3dRNA
Evaluation of crRNA from molecular docking
We used HDOCK server to dock crRNA full length and other crRNAs on LwaCas13a protein, and
compared their
docking from two perspectives, docking score and RMSD (Figure 11 & 12). While docking score
represents
binding status of the receptor and ligand, RMSD evaluates the distance between two molecules.
Low
docking score and RMSD indicate stable docking. The docking score and RMSD of crRNA full length
are
lower than other crRNAs, demonstrating the superiority of crRNA full length on docking with
LwaCas13a
protein.
Figure 11: Docking scores of crRNAs on LwaCas13a protein
Figure 12: RMSD of crRNAs on LwaCas13a protein
Molecular docking retest-MOE
As HDOCK is a rigid docking algorithm based on a template, its docking score cannot always
represent the
binding energy faithfully. Therefore, we used MOE software for semi flexible docking of
LwaCas13a and
crRNA full length, and analyzed the interactions between LwaCas13a and crRNA full length[15].
Figure 13: LwaCas13a/crRNA full length complex
Figure 14: Interactions and Hydrogen Bonds between LwaCas13a and
crRNA
The majority of interactions take place between the REC lobe and crRNA. While most interactions
are
consistent with previous published literatures[16-18] a small portion of the interactions is
different,
which may be caused by the minor differences between the structures of LwaCas13a and LbuCas13a
proteins.
It is worthy to note that different crRNAs can also lead to different scopes of action, which
might lead
to differences in the interaction between crRNA and Cas13a protein.
Finally, the complex is docked with miRNA. Ternary complex is obtained by using MOE software,
which is
displayed by Discovery Studio Visualizer[19].
Figure 15: The predicted structure of LwaCas13a/crRNA/miRNA
ternary complex
The number of binding sites among LwaCas13a, crRNA and miRNA are shown below[20].
Name |
Number of Binding Sites |
miR-21-5p |
Helical-1 |
crRNA |
4 |
8 |
Table 4: Number of binding sites between miR-21-5p and LwaCas13a/crRNA
complex
Figure 16: Schematic diagram of interaction among LwaCas13a, crRNA and
miRNA
In our simulation, LwaCas13a, crRNA, and miRNA can be successfully assembled into ternary
complex, and
strong interactions have been formed between each part (Figure 16). Consistently, the results of
our wet
laboratory experiment demonstrated that miR-21-5p could activate the cleavage activities of the
LwaCas13a/crRNA complex, demonstrating that they formed a functional LwaCas13a/crRNA/miR-21-5p
ternary
complex together.
Molecular Dynamic Simulation
To analyze the stability of our CRISPR/Cas13a system, we used GROMACS software to implement
molecular
dynamics simulations[21]. The simulation
conditions were carried out at a static temperature of 300K and
normal pressure. The force field used Amber99sb-ildn. The solvent was water (tip3p model), and
the total
charge of the simulated system was neutralized by adding an appropriate amount of Na+ ions.
Energy
minimization was performed. Subsequently, isothermal and isovolumetric ensemble (NVT)
equilibrium and
isothermal and isobaric ensemble (NPT) equilibrium were also performed, with a coupling constant
of
0.1ps and duration of 100ps. These steps helped our molecular dynamics simulations close to
actual
reaction conditions. Finally, molecular dynamics simulations were performed with a step size of
0.2fs
and a total duration of 20ns.
After completing the simulation, GROMACS tools were used to analyze the trajectory and calculate
the
root mean square deviation (RMSD), root mean square fluctuation (RMSF), solvent accessible area
(SASA),
and hydrogen bonds (Hbonds) of the crRNA, and data was plotted into graphs, as shown in Figures
17-20.
RMSD represents the distance between the same atoms in different structures, e.g., target
structure and
reference structure. Stable and low RMSD indicates high stability of the molecule. In Figure 17,
we can
see that the line becomes flatten after a short period of simulation, demonstrating that the
structure
of crRNA full length is stable and can be used for docking.
Figure 17: RMSD of crRNA full length
RMSF is a time-to-time average of the change in atomic position, which reflects the flexibility
of the
crRNA during the entire simulation process. RMSF of crRNA determines the deviation of each
residue from
a reference position. Areas farther away from the active site and flexible areas generally have
large
RMSF. Through the analysis of the data in Figure 18, the flexibility of crRNA residues was
analyzed.
Figure 18: RMSF of crRNA full length
The next step is to calculate the hydrogen bonds formed within the crRNA. When RNA structure
changes,
the number of hydrogen bonds also changes. In Figure 19, we can see that after 2500 ps, the
number of
hydrogen bonds enters into a stable state, which shows that the crRNA structure remains stable.
Figure 19: Hbonds of crRNA full length
For the analysis of SASA, the SASA of the crRNA does not change significantly and fluctuates
around 115
nm2 after 3000 ps, which demonstrates that the crRNA structure is stable. In summary, analysis
of RMSD,
RMSF, SASA, and Hbonds all verifies the stability of crRNA.
Figure 20: SASA of crRNA full length
Furthermore, we conduct molecular dynamics analysis of LwaCas13a protein. The setting of GROMACS
adopts
the setting parameters mentioned above. The total simulation time is 20ns and the step size is
0.02fs.
Figures 21, 22, 23, and 24 show the RMSD, RMSF, SASA, and Hbonds of LwaCas13a protein
respectively,
which demonstrates that LwaCas13a protein has good stability and can be used for subsequent
study.
Figure 21: RMSD of LwaCas13a
Figure 22: RMSF of LwaCas13a
Figure 23: SASA of LwaCas13a
Figure 24: Hbonds of LwaCas13a
Finally, we performed molecular dynamics analysis of LwaCas13a/crRNA complex. In Figure 25, the
RMSD
reaches a relatively stable range after 4000 ps. In Figure 27, the solvent accessible area also
remains
within a stable range, demonstrating that the LwaCas13a/crRNA complex has good stability.
Figure 25: RMSD of complex
Figure 26: RMSF of complex
Figure 27: SASA of complex
Figure 28: Hbonds of complex
Figure 28 shows the number of hydrogen bonds formed by the complex. The number of hydrogen bonds
remains
relatively stable, suggesting that the initial conformation does not change significantly.
The binding energy between protein and crRNA is mainly Coulomb energy, which is generated from
electrostatic interactions. Figure 29 shows the binding energy between LwaCas13a protein and
crRNA.
Figure 29: Binding Free Energy of complex
Together, Low and stable RMSD, stable SASA and Hbonds indicate that the LwaCas13a/crRNA complex
is
stable. Furthermore, the complex has low binding energy, which further confirms that the
LwaCas13a
protein and crRNA form a stable complex.
Conclusion
We designed crRNA (crRNA full length) for our CRISPR/Cas13a system for specific detection of
miR-21-5p.
Molecular docking and molecular dynamics simulations verified the stability of our CRISPR/Cas13a
system,
which demonstrated that our CRISPR/Cas13a system has great capability on detection of miR-21-5p.
References
[1] R. Lorenz et al., (2011) "ViennaRNA Package 2.0" Algorithms Mol. Biol., vol. 6, no. 1, pp.
1-14.
doi: 10.1186/1748-7188-6-26.
[2] Yan Y, Tao H, He J, Huang SY. The HDOCK server for integrated protein-protein docking. Nat
Protoc.
2020 May;15(5):1829-1852. doi: 10.1038/s41596-020-0312-x.
[3] Molecular Operating Environment (MOE) (2014) Montreal, Quebec, Canada: Chemical Computing
Group
Inc.https://www.chemcomp.com/.
[4] L. Wang, J. Zhou, Q. Wang, Y. Wang, and C. Kang, (2020) "Rapid design and development of
CRISPR-Cas13a targeting SARS-CoV-2 spike protein" Theranostics, vol. 11, no. 2, pp. 649-664.
doi:
10.7150/thno.51479.
[5] Gendron P, Lemieux S, Major F. Quantitative analysis of nucleic acid three-dimensional
structures. J
Mol Biol. 2001 May 18;308(5):919-36. doi: 10.1006/jmbi.2001.4626.
[6] T. H. T. Chau, D. H. A. Mai, D. N. Pham, H. T. Q. Le, and E. Y. Lee, (2020) "Developments of
riboswitches and toehold switches for molecular detection-biosensing and molecular diagnostics"
Int. J.
Mol. Sci., vol. 21, no. 9. doi: 10.3390/ijms21093192.
[7] Liu L, Li X, Wang J, Wang M, Chen P, Yin M, Li J, Sheng G, Wang Y. Two Distant Catalytic
Sites Are
Responsible for C2c2 RNase Activities. Cell. 2017 Jan 12;168(1-2):121-134.e12. doi:
10.1016/j.cell.2016.12.031.
[8] UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res.
2023
Jan 6;51(D1):D523-D531. doi: 10.1093/nar/gkac1052.
[9] Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP,
Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures
and
complexes. Nucleic Acids Res. 2018 Jul 2;46(W1):W296-W303. doi: 10.1093/nar/gky427.
[10] Schrödinger, L., & DeLano, W. (2020). PyMOL. Retrieved from http://www.pymol.org/pymol.
[11] Bienert S, Waterhouse A, de Beer TA, Tauriello G, Studer G, Bordoli L, Schwede T. The
SWISS-MODEL
Repository-new features and functionality. Nucleic Acids Res. 2017 Jan 4;45(D1):D313-D319. doi:
10.1093/nar/gkw1132.
[12] Yang J, Song Y, Deng X, Vanegas JA, You Z, Zhang Y, Weng Z, Avery L, Dieckhaus KD, Peddi A,
Gao Y,
Zhang Y, Gao X. Engineered LwaCas13a with enhanced collateral activity for nucleic acid
detection. Nat
Chem Biol. 2023 Jan;19(1):45-54. doi: 10.1038/s41589-022-01135-y.
[13] Liu L, Li X, Ma J, Li Z, You L, Wang J, Wang M, Zhang X, Wang Y. The Molecular Architecture
for
RNA-Guided RNA Cleavage by Cas13a. Cell. 2017 Aug 10;170(4):714-726.e10. doi:
10.1016/j.cell.2017.06.050.
[14] Wang J, Mao K, Zhao Y, Zeng C, Xiang J, Zhang Y, Xiao Y. Optimization of RNA 3D structure
prediction using evolutionary restraints of nucleotide-nucleotide interactions from direct
coupling
analysis. Nucleic Acids Res. 2017 Jun 20;45(11):6299-6309. doi: 10.1093/nar/gkx386.
[15] Adasme MF, Linnemann KL, Bolz SN, Kaiser F, Salentin S, Haupt VJ, Schroeder M. PLIP 2021:
expanding
the scope of the protein-ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 2021 Jul
2;49(W1):W530-W534. doi: 10.1093/nar/gkab294.
[16] Abudayyeh OO, Gootenberg JS, Essletzbichler P, Han S, Joung J, Belanto JJ, Verdine V, Cox
DBT,
Kellner MJ, Regev A, Lander ES, Voytas DF, Ting AY, Zhang F. RNA targeting with CRISPR-Cas13.
Nature.
2017 Oct 12;550(7675):280-284. doi: 10.1038/nature24049.
[17] Sinha S, Molina Vargas AM, Arantes PR, Patel A, O'Connell MR, Palermo G. RNA-mediated
Allosteric
Activation in CRISPR-Cas13a. bioRxiv [Preprint]. 2023 Jul 27:2023.07.27.550797. doi:
10.1101/2023.07.27.550797.
[18] Tambe A, East-Seletsky A, Knott GJ, Doudna JA, O'Connell MR. RNA Binding and HEPN-Nuclease
Activation Are Decoupled in CRISPR-Cas13a. Cell Rep. 2018 Jul 24;24(4):1025-1036. doi:
10.1016/j.celrep.2018.06.105.
[19] Accelrys. Discovery Studio Visualizer, v20.1.0.19295, BIOVIA, Dassault Systèmes, San Diego,
2020.
[20] Schlee S, Straub K, Schwab T, Kinateder T, Merkl R, Sterner R. Prediction of quaternary
structure
by analysis of hot spot residues in protein-protein interfaces: the case of anthranilate
phosphoribosyltransferases. Proteins. 2019 Oct;87(10):815-825. doi: 10.1002/prot.25744.
[21] RNA structural dynamics as captured by molecular simulations: A comprehensive overview.
doi:
10.1021/acs.chemrev.7b00427.
[22] GROMACS: High performance molecular simulations through multi-level parallelism from
laptops to
supercomputersl. doi: 10.1016/J.SOFTX.2015.06.001
[23] GROMACS: Fast, flexible, and freeliu. doi: 10.1002/jcc.20291.