Overview
In the CTC-FAST project, we replace conventional antibody-based CTC capture with the high-affinity interaction between folic acid (FA) and the folate receptor (FR) on CTCs. CTC labeling is instead performed with the fluorescent protein mGreenLantern (mGL) fused to C7, a peptide that recognizes FRα with high specificity (Xing, L., et al. 2018).
To anchor FA onto the wall of the main chamber in the CTC-FAST device, we use a DNA tetrahedron as a switchable platform. To generate the single-stranded DNA (ssDNA) that self-assembles into the DNA tetrahedron, we apply the rolling circle replication (RCR) mechanism to produce ssDNA in E. coli (Li, C., et al. 2023). The tetrahedral ssDNA is spliced out of the circular RCR ssDNA by a cis-auto splicing sequence with zinc ion as a cofactor (Jia, Y., et al. 2021). Docking of the DNA tetrahedron is mediated by the interaction between zinc finger proteins (ZFPs) on the wall of the main detection chamber and the ZFP recognition motifs within two edges of the DNA tetrahedron (Conrado, R. J., et al. 2012). Finally, the overhang at one vertex of the DNA tetrahedron is designed to be complementary to an FA-conjugated ssDNA (FA-ssDNA), or to other aptamers in the future. Together, replacing the antibody with a DNA tetrahedron makes CTC detection more convenient in terms of storage and transport.
Regarding the labeling of captured CTCs, we fused the green fluorescent protein mGreenLantern (mGL), which is up to sixfold brighter than eGFP, to the C7 peptide through a peptide linker. To determine the length of the peptide linker, we docked different mGL-linker-C7 variants onto FRα with Frodock (Ramírez-Aportela et al., 2016). Finally, we confirmed that the mGL-4A-C7 protein could label FRα-positive SKOV3 cells.
Introduction
The RCR mechanism: RepA protein, RCORI105 and RCORI65
Rolling circle replication (RCR) is a fundamental molecular process that plays an important role in DNA replication and in various molecular biology applications. It amplifies a circular DNA template unidirectionally, producing long single-stranded or double-stranded DNA products with repetitive sequences.
In RCR, a replication initiator enzyme (e.g. RepA) recognizes its corresponding double-stranded origin (DSO) on the target plasmid (e.g. pC194) and nicks one of the DNA strands. DNA polymerase then elongates the nicked strand, using the un-nicked strand as the template. As synthesis proceeds, the parental nicked strand is completely displaced by the newly synthesized strand, and the replication initiator enzyme then ligates the displaced strand into a circular ssDNA. This circular ssDNA can serve as the template for second-strand synthesis (Ruiz-Masó et al. 2015).
▲ Fig. 1: The mechanism of RCR
Deoxyribozyme: the cis-auto and trans-auto splicing of ssDNA
Deoxyribozymes are DNA molecules with catalytic activity. Since their discovery in 1994, deoxyribozymes have been shown to cleave RNA phosphodiester bonds and to mediate DNA phosphorylation, adenylation, deglycosylation, and ligation. More recent research indicates that DNA can also hydrolyze DNA phosphodiester bonds directly by folding into a specific structure and using zinc ion as a cofactor.
There are two classes of these DNA-hydrolyzing deoxyribozymes. Class I deoxyribozymes contain 15 conserved nucleotides within a loop region flanked by one or two base-paired stems; they recognize the substrate sequence GTTGAAG and hydrolyze the phosphodiester bond of the ApA dinucleotide. Class II deoxyribozymes carry 32 nucleotides within an unpaired bulge that is flanked by two base-paired stems (Gu et al. 2013). To apply deoxyribozymes to the auto-splicing of ssDNA, we selected the class I-R1 sequence for cis-auto splicing and the class I-R3 sequence for trans-auto splicing.
▲ Fig. 2: The secondary structure of the cis-auto (left) and trans-auto (right) splicing motifs.
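As a rough illustration of the class I substrate rule described above, the following Python sketch looks for the GTTGAAG motif in an ssDNA sequence and splits it between the two adenines. It assumes that the hydrolyzed ApA is the one inside the motif; the function name `split_at_class1_site` is hypothetical, and the example input is the trans-auto splicing sequence listed in the Design section of this page. The sketch only does the sequence bookkeeping and says nothing about folding or reaction efficiency.

```python
SUBSTRATE = "GTTGAAG"  # class I substrate motif quoted in the text


def split_at_class1_site(seq: str) -> tuple[str, str]:
    """Split seq between the two adenines (ApA) of the first GTTGAAG motif.

    Assumption: the hydrolyzed ApA is the one inside the motif.
    Hyphens and letter case are ignored. Raises ValueError if the
    motif is absent.
    """
    s = seq.upper().replace("-", "")
    start = s.find(SUBSTRATE)
    if start < 0:
        raise ValueError("no GTTGAAG motif found")
    # The cut falls right after the first A of the "AA" inside the motif.
    cut = start + SUBSTRATE.index("AA") + 1
    return s[:cut], s[cut:]


# Trans-auto splicing sequence as listed in the Design section.
trans_auto = "ATCAGGTCGATCGAGT-gttgaag-CTGGATGCTAGTGCAT"
five_prime, three_prime = split_at_class1_site(trans_auto)
print(five_prime)   # fragment ending in ...GTTGA
print(three_prime)  # fragment starting with AG...
```

Under this assumption the 5’ fragment keeps GTTGA and the 3’ fragment starts with AG.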
ssDNA binding proteins: the EcSBP
Single-stranded DNA binding proteins (SBPs) sequester and protect ssDNA. In E. coli, the SBP (EcSBP; GenBank: QGY26441.1) is associated with genome stability. The N-terminal domain of EcSBP is responsible for ssDNA binding.
▲ Fig. 3: The structure of tetrameric SBPs, from Shishmarev et al. 2014
ZFP proteins: the zinc finger binding domain from PBSII and Zif268
Zinc finger proteins (ZFPs) are abundantly expressed and have diverse functions; they can interact with DNA, RNA, and proteins. The classic zinc finger structure is maintained by a zinc ion coordinated by cysteine and histidine residues in different combinations. Importantly, zinc fingers show DNA-binding specificity. In our design, we use the zinc finger domains from Zif268 (Bulyk et al. 2001) and PBSII (Conrado, R. J., et al. 2012) for DNA tetrahedron docking.
mGL proteins:
The mGreenLantern (mGL) protein was recently developed for neuronal imaging. Compared with conventional eGFP, mGL is up to sixfold brighter in cells (Campbell et al. 2020).
▲ Fig. 4: The excitation and emission spectra of mGL, from FPbase (Lambert, 2019)
Peptide specific for FRα : C7 peptide
The sequence of the C7 peptide is Met-His-Thr-Ala-Pro-Gly-Trp-Gly-Tyr-Arg-Leu-Ser; it was identified by phage display library screening. Its ability to recognize FRα was demonstrated by FACS analysis of SKOV3 cells in vitro and by tumor staining in vivo. The equilibrium dissociation constant (KD) between C7 and FRα was 0.3 μM. Importantly, computational modeling and interaction analysis indicated that C7 binds FRα close to the entrance of the binding pocket, which avoids competition with folic acid (Xing et al. 2018).
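For sequence work, such as designing the fusion construct, the three-letter notation above is usually converted to one-letter code. The short Python sketch below does this with the standard amino acid abbreviations; the helper name `to_one_letter` is our own.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split("-"))


c7 = to_one_letter("Met-His-Thr-Ala-Pro-Gly-Trp-Gly-Tyr-Arg-Leu-Ser")
print(c7, len(c7))  # MHTAPGWGYRLS 12
```

The result confirms that C7 is a 12-residue peptide.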
DNA tetrahedron:
Over the past few years, the swift advancement of DNA nanotechnology has brought forth fresh perspectives for the detection and therapy of cancer. These nanostructures, assembled from single-stranded DNA, exhibit several features that make them suitable candidates for innovative cancer detection strategies.
One of the main advantages of DNA tetrahedron-based approaches in cancer detection is their high programmability and specificity. These structures can be precisely designed to carry various functional components, including ligands. This allows us to create tailored probes capable of targeting specific cancer-related biomarkers with a high degree of accuracy. Additionally, the DNA tetrahedron's inherent stability and rigidity contribute to the stability of the detection platform. This stability is essential for maintaining the integrity of the assembled nanostructure during various stages of the detection process, including sample preparation, target binding, and signal readout.
In summary, DNA tetrahedron structures offer a promising platform for the development of innovative cancer detection methods. Their programmability, versatility, and ability to carry multiple functional components make them well-suited for designing highly sensitive and specific assays that hold the potential to contribute significantly to improved cancer detection and patient outcomes.
Design
The design of DNA tetrahedron before engineering:
Initially, we planned to use an FA-conjugated DNA tetrahedron for CTC capture. We soon learned that splitting the design into an FA-conjugated adaptor ssDNA and a DNA tetrahedron with an overhang complementary to the adaptor has an advantage: the adaptor can be switched to an aptamer in the future. Accordingly, we designed four separate ssDNAs, which are partially complementary to each other, to assemble the DNA tetrahedron. To dock the FA-conjugated adaptor ssDNA onto the DNA tetrahedron, we extended TD-2 into TD-2F to create an overhang.
The sequences of the four separate ssDNAs (TD-1, TD-2, TD-3, and TD-4) and of TD-2F are listed below:
- TD-1:
ACATTCCTAAGTCTGAAacATTACAGCTTGCTACACgaGAAGAGCCGCCATAGTA
- TD-2:
TATCACCAGGCAGTTGAcaGTGTAGCAAGCTGTAATagATGCGAGGGTCCAATAC
- TD-3:
TCAACTGCCTGGTGATAaaACGACACTACGTGGGAAtcTACTATGGCGGCTCTTC
- TD-4:
TTCAGACTTAGGAATGTgcTTCCCACGTAGTGTCGTttGTATTGGACCCTCGCAT
- TD-2F (TD-2 with an overhang for adaptor ssDNA recognition):
GCTTGCACGCGTGCtattaatATCACCAGGCAGTTGAcaGTGTAGCAAGCTGTAATagATGCGAGGGTCCAATAC
▲ Fig. 5: The drawing shows the design of DNA tetrahedron before engineering
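The partial complementarity of the four strands can be checked with a short script. The Python sketch below is only a sanity check under one assumption of ours: that the uppercase blocks are the edge-forming segments and the lowercase bases are unpaired hinges. It splits each strand into its uppercase segments and reports every pair of segments that are exact reverse complements. The helper names (`revcomp`, `segments`, `find_edges`) are hypothetical.

```python
import re
from itertools import combinations

STRANDS = {
    "TD-1": "ACATTCCTAAGTCTGAAacATTACAGCTTGCTACACgaGAAGAGCCGCCATAGTA",
    "TD-2": "TATCACCAGGCAGTTGAcaGTGTAGCAAGCTGTAATagATGCGAGGGTCCAATAC",
    "TD-3": "TCAACTGCCTGGTGATAaaACGACACTACGTGGGAAtcTACTATGGCGGCTCTTC",
    "TD-4": "TTCAGACTTAGGAATGTgcTTCCCACGTAGTGTCGTttGTATTGGACCCTCGCAT",
}
TD_2F = ("GCTTGCACGCGTGCtattaatATCACCAGGCAGTTGAcaGTGTAGCAAGCTGTAATag"
         "ATGCGAGGGTCCAATAC")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def segments(strand: str) -> list[str]:
    """Uppercase runs only; lowercase bases are treated as hinges (assumption)."""
    return re.findall(r"[ACGT]+", strand)


def find_edges(strands: dict[str, str]) -> list[tuple[str, str, int]]:
    """List segment pairs from different strands that are reverse complements."""
    edges = []
    for (n1, s1), (n2, s2) in combinations(strands.items(), 2):
        for i, a in enumerate(segments(s1), 1):
            for j, b in enumerate(segments(s2), 1):
                if a == revcomp(b):
                    edges.append((f"{n1} seg{i}", f"{n2} seg{j}", len(a)))
    return edges


edges = find_edges(STRANDS)
for edge in edges:
    print(edge)

# TD-2F should be TD-2 plus a 5' extension (compared case-insensitively).
td2 = STRANDS["TD-2"]
assert TD_2F.upper().endswith(td2.upper())
extension = TD_2F[: len(TD_2F) - len(td2)]
print("TD-2F 5' extension:", extension, len(extension))
```

With the sequences as listed, the script finds six 17 bp duplexes, one for each pair of strands, which matches the six edges of a tetrahedron. It also shows that TD-2F differs from TD-2 only by a 20 nt extension at the 5’ end.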
We confirmed experimentally that the FA-conjugated adaptor ssDNA was successfully generated (link to proof of concept). However, assembly of the tetrahedron produced unexpected polyhedra. We therefore re-engineered the tetrahedron design and fused the four separate ssDNAs into a single tetrahedral ssDNA (link to engineering).
The design of DNA tetrahedron assembled from tetrahedral ssDNA:
The folding pathway of the tetrahedral ssDNA is illustrated in the figure below. Of the six edges, five consist of a 21 bp double helix. The last edge is a “twin double helix” that compensates for the necessary reversal of polarity of the complementary DNA strands. The 5’ terminus of the tetrahedral ssDNA starts at one vertex, and the PBSII and Zif268 binding motifs are located on the two farthest edges, respectively (Conrado, R. J., et al. 2012). These motifs anchor the tetrahedra on the wall of the main chamber in the CTC-FAST device through interaction with the PBSII and Zif268 proteins.
Because commercial production of the tetrahedral ssDNA is hampered by its highly self-complementary sequence, we decided to generate it by the RCR mechanism (see the section below for details). The RCR product is a circular ssDNA, so we fused the cis-auto splicing sequence to the 5’ terminus of the tetrahedral ssDNA. After auto-splicing, the RCR-generated circular ssDNA folds into a tetrahedron, and an 18-nucleotide overhang, the residue left after splicing, remains at the 3’ terminus. The 3’ terminus of the FA-conjugated adaptor ssDNA is complementary to this overhang, and its 5’ terminus contains the trans-auto splicing sequence. Finally, an exogenous ssDNA complementary to the trans-auto splicing sequence can be added to cleave the adaptor ssDNA and release the captured CTCs.
▲ Fig. 6: The design of DNA tetrahedron assembled from tetrahedral ssDNA.
The basic part of PBSII-Zif268 fusion protein:
BBa_K4674004
The composite part for PBSII-Zif268 fusion protein expression:
BBa_K4674013
Trans-auto splicing sequence (not submitted as a part):
ATCAGGTCGATCGAGT-gttgaag-CTGGATGCTAGTGCAT
Adaptor ssDNA (not submitted as a part):
ATCAGGTCGATCGAGT-gttgaag-CTGGATGCTAGTGCAT-TCAACTTAATGCTCAACTA
The italicized 3’ segment is the sequence complementary to the overhang left after cis-auto splicing.
The exogenous ssDNA for cleavage (not submitted as a part):
ATGCACTAGCATCCAG-TAGTTGAGCT-ACTCGATCGACCTGAT
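The relationship between the trans-auto splicing sequence and the exogenous cleavage ssDNA can be checked in a few lines. The Python sketch below assumes that the hyphens in the listed sequences separate a left arm, a middle segment, and a right arm, which is our reading of the layout rather than something stated explicitly. It tests whether each arm of the exogenous ssDNA is the reverse complement of the opposite arm of the trans-auto splicing sequence. All variable and function names are our own.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (case-insensitive)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


trans_auto = "ATCAGGTCGATCGAGT-gttgaag-CTGGATGCTAGTGCAT"
adaptor = "ATCAGGTCGATCGAGT-gttgaag-CTGGATGCTAGTGCAT-TCAACTTAATGCTCAACTA"
exogenous = "ATGCACTAGCATCCAG-TAGTTGAGCT-ACTCGATCGACCTGAT"

# Assumption: hyphens mark arm / middle segment / arm boundaries.
t_left, t_site, t_right = trans_auto.split("-")
e_left, e_core, e_right = exogenous.split("-")

# Each exogenous arm should pair with the opposite arm of the target.
arms_pair = e_left == revcomp(t_right) and e_right == revcomp(t_left)

# The adaptor should begin with the full trans-auto splicing sequence.
adaptor_has_trans = adaptor.startswith(trans_auto)

print("arms complementary:", arms_pair)
print("substrate motif:", t_site.upper())
print("adaptor starts with trans-auto sequence:", adaptor_has_trans)
```

For the sequences as listed, both 16 nt arms pair as expected, and the middle segment of the trans-auto sequence is the GTTGAAG substrate motif described in the Introduction.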
The design of RCR-mediated ssDNA production:
To produce ssDNA, we decided to reconstruct the RCR mechanism in E. coli. We selected the replication initiator enzyme RepA and its corresponding DSO, RCORI, from the R-plasmid pC194 (Noirot-Gros et al., 1994). The last 105 nucleotides of the pC194 DSO (RCORI-105) serve as the start point of RCR, while the first 65 nucleotides (RCORI-65) serve as the stop point. The cis-auto splicing sequence and the tetrahedral ssDNA sequence are inserted between RCORI-105 and RCORI-65. To activate RCR, we cloned the RepA gene into pET15b, so that its expression is regulated by the lac operator. Furthermore, we inserted a second ribosome binding site (RBS) followed by the coding sequence of the E. coli ssDNA binding protein (SSBP, the EcSBP introduced above) to protect the RCR-generated ssDNA. Finally, to purify the SSBP-bound ssDNA, we fused a 6xHis-tag to SSBP.
▲ Fig. 7: The design of RCR mediated circular ssDNA generation
The RCR-associated basic and composite parts are listed below:
The basic part of RCORI-105:
BBa_K4674007
The basic part of RCORI-65:
BBa_K4674008
The basic part of tetrahedral ssDNA with cis-auto splicing motif:
BBa_K4674006
The composite part of RCR targeting cassette:
BBa_K4674012
The basic part of RepA protein:
BBa_K4674002
The basic part of SSBP protein:
BBa_K4674003
The composite part of RepA and SSBP protein expression:
BBa_K4674011
▲ Fig. 8: The composite parts of the RepA and SSBP expression cassette and the RCR targeting cassette.
The design of CTC labeling: mGL-4A-C7 protein
To efficiently label the CTCs captured by the DNA tetrahedron, we first performed modeling (link to model) and decided to link the mGL protein and the C7 dodecapeptide with four alanines (mGL-4A-C7). To express the mGL-4A-C7 fusion protein, we designed the basic part mGL-4A-C7 (BBa_K4674001) and cloned it into pET15b. Expression of the fusion protein was induced with IPTG, and the protein was purified by FPLC. Finally, to confirm that mGL-4A-C7 is functional, we performed labeling experiments on the SKOV3 cell line, which serves as a mimic of captured CTCs because of its high FRα expression.
▲ Fig. 9: The schematic diagram of the mGL-4A-C7 protein for labeling CTCs. mGL is modified from GFP, and C7 is fused to the C-terminus of mGL by a 4A linker.
The associated parts are listed below:
The basic part of mGL:
BBa_K4674000
The basic part of mGL-4A-C7:
BBa_K4674001
The composite part of mGL-4A-C7 fusion protein expression:
BBa_K4674010
References
Bulyk, M. L., et al. (2001)
Exploring the DNA-binding specificities of zinc fingers with DNA microarrays.
Proc Natl Acad Sci U S A. 98(13):7158-63.
Campbell, B. C., et al. (2020)
mGreenLantern: a bright monomeric fluorescent protein with rapid expression and cell filling properties for neuronal imaging.
Proceedings of the National Academy of Sciences, 202000942.
Conrado, R. J., et al. (2012)
DNA-guided assembly of biosynthetic pathways promotes improved catalytic efficiency.
Nucleic acids research, 40(4), 1879–1889.
Gu, H., et al. (2013)
Small, highly-active DNAs that hydrolyze DNA.
J Am Chem Soc. 135(24): 9121–9129.
Jia, Y., et al. (2021)
DNA-Catalyzed Efficient Production of Single-Stranded DNA Nanostructures.
ScienceDirect.
Lambert, T. J. (2019)
FPbase: a community-editable fluorescent protein database.
Nature Methods. 16, 277–278.
Li, C., et al. (2023)
Construction of rolling circle amplification products-based pure nucleic acid nanostructures for biomedical applications.
Acta biomaterialia, 160, 1–13.
Noirot-Gros, M. F., et al. (1994)
Active site of the replication protein of the rolling circle plasmid pC194.
EMBO J. 13(18):4412-20.
Ruiz-Masó, J., et al. (2015)
Plasmid rolling-circle replication.
Microbiology Spectrum. 3; 10.1128
Shishmarev, D., et al. (2014)
Intramolecular binding mode of the C-terminus of Escherichia coli single-stranded DNA binding protein determined by nuclear magnetic resonance spectroscopy.
Nucleic Acids Res. 42(4):2750-7.
Xing, L., et al. (2018)
Identification of a peptide for folate receptor alpha by phage display and its tumor targeting activity in ovary cancer xenograft.
Sci Rep 8, 8426.