“Wholeness is not achieved by cutting off a portion of one’s being, but by integration of the contraries.” - Carl Jung

A description of the difficulties associated with the common use of Lactobacillus crispatus today:

Lactobacillus crispatus (L. crispatus) has many potential benefits as a probiotic and in treating diseases [1], [2]. However, it has not been widely adopted in research and industry because it is difficult to engineer. Making it more effective at treating diseases requires certain improvements, such as the ability to produce toxins that target harmful pathogens and a greater ability to adhere to specific sites. These improvements require precise genetic engineering interventions. As with other members of the Lactobacillus genus, genetic transformation of L. crispatus is difficult [3], [4]. Moreover, it is challenging to retain plasmids in this bacterium for extended periods of time [5], [6], which further complicates efforts to maximize its therapeutic potential.

In response to these challenges, and to make genetic manipulation more accessible, we developed a system based on stable integration into the bacterial genome using the CRISPR-Cas system. Genomic integration circumvents the plasmid retention problem and offers a sustained, straightforward, and effective approach to genetic modification of the probiotic bacterium [3], [4], [7], [8].

We have built in a tool that allows the currently implemented cassette to be replaced using common protocols such as restriction enzyme digestion and ligation. This cassette can be replaced with any target gene that industry or researchers desire, greatly increasing the potential for genetic manipulation of L. crispatus.

Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated proteins (Cas) together form an adaptive immune system found in many bacteria and archaea. It functions as a defense mechanism that protects these organisms from infection by phages and other foreign genetic elements. The system uses repeat-spacer arrays and Cas proteins with endonuclease activity to detect and destroy foreign genetic material [8], [9].

There are two classes of CRISPR-Cas systems: Class 1 and Class 2. Across these classes there are six types of CRISPR-Cas systems, labeled I to VI, as well as several subtypes. Cas proteins are the effector proteins of the CRISPR system. They recognize foreign DNA and RNA, cleave target genetic material, and facilitate the integration of new spacers into the CRISPR array so that the bacterium can destroy foreign DNA that it has encountered previously. In molecular biology, the CRISPR-Cas system has become one of the most extensively used genome editing tools worldwide due to its simple design, low cost, high efficiency, reproducibility, and short cycle time [8].

CRISPR Cas 9 is currently the most widely studied and applied system for genome editing. However, Lactobacillus crispatus naturally carries the proteins of a Cas 3 CRISPR system, which makes Cas 3 the preferable system in our case.

The implementation of the CRISPR-Cas system for genetic engineering involves the use of Cas proteins, protospacer adjacent motif (PAM sequence), guide RNA with spacers and repeats, and donor DNA with homology arms and target gene sequences. The design of these elements varies based on the type of CRISPR-Cas system being utilized and thus, our project necessitated the tailoring of each element to the CRISPR-Cas 3 system.


Schematic illustration of the CRISPR mechanism.

The elements in play when designing CRISPR-Cas systems can be divided into those found within the organism’s genome and those provided exogenously. Protospacers are specific DNA sequences found in bacterial genomes. CRISPR Cas 9 protospacers are ~20 base pairs (bp) long, whereas CRISPR Cas 3 protospacers are 33 base pairs long [3], [7]–[9]. The Cas protein cleaves this sequence, which enables various types of genetic editing such as knockout, knock-in, and gene silencing.

Also within the target’s genome is the protospacer adjacent motif (PAM), a short, conserved sequence (2-3 nucleotides) located next to the protospacer and recognized by the Cas protein. It is essential for the Cas protein to distinguish between self and foreign DNA. The PAM sequence of the CRISPR Cas 9 system is NGG, while the PAM sequence of the CRISPR Cas 3 system is AA/AAA [3], [7]–[9].

Once we have chosen the area of interest to target, defined by the protospacer and PAM sequence, we can design the guide RNA. Guide RNA, or gRNA, is a man-made RNA molecule containing a specific sequence, called a spacer, that is complementary to the desired protospacer sequence. The gRNA is made up of the spacer sequence and a scaffold sequence (repeats), which provides a secondary structure that assists in creating the nick.

Cas system proteins induce double strand breaks in the target genome. To perform a knock in event, which involves integrating a new gene of our choice into the genome, donor DNA is required. The donor includes a target gene sequence that should be flanked on both sides by arms that contain homologous sequences to the target genome, upstream and downstream relative to the intended nick region.

Further information on CRISPR can be found in the literature [3], [7]–[10].

Our project aims to perform genomic editing in Lactobacillus crispatus using the CRISPR-Cas 3 system, which is endogenous to the bacterium. This is accomplished by transforming a plasmid that contains a gRNA targeting a highly expressed gene, together with donor DNA that includes the gene to be knocked in. This approach is based on prior research that achieved successful genomic editing in the Lactobacillus crispatus strain NCK 1350 [3]. The following is a detailed explanation of the design process, showing how we ensured that the system is suitable for our target strain and adheres to the CRISPR methodology.

Our research into the probiotic properties of Lactobacillus crispatus led us to search for a specific strain to purchase. We chose L. crispatus strain ATCC 33820 for several reasons, including that its entire genome sequence is accessible and that the strain has been mentioned in prior research papers. Furthermore, proteins of the CRISPR Cas 3 system were identified within the genome of this strain. In parallel, we verified that the Cas 3 system proteins are also found in Lactobacillus crispatus NCK 1350, the strain used in the article that demonstrated successful editing of the L. crispatus genome with the CRISPR-Cas 3 system [3]. This verification was done by searching this strain's genome on the NCBI website for the Cas 3 proteins (GenBank: SGWL00000000.1). Comparing the BLAST results showed that the system proteins from the two strains were 99.24% similar. On this basis, we selected the ATCC 33820 strain for our research and designed an action plan that included genome editing using the CRISPR Cas 3 system.

For efficient genomic editing, it’s essential to identify a gene that is likely to be expressed in the bacteria to ensure that there is access to the desired sequence for the required proteins to initiate the nick. Our primary objective was to locate such a gene in our ATCC 33820 genome. This gene must on the one hand be highly expressed, and on the other hand, damage or removal of the gene should not prove lethal to the bacteria.

After a thorough review of various proteins, which involved analyzing their functionality, identifying their interacting proteins, and searching for alternative pathways with similar functions, we decided to work with the TetM gene, which codes for a protein that confers resistance to the antibiotic tetracycline [11]. Interfering with the expression of the TetM gene falls directly in line with the main goals of our project. These include reducing the widespread use of antibiotics, as well as reducing and preventing the proliferation of antibiotic resistance among bacteria.

The TetM gene fulfills our criteria: its expression can be induced and verified, and its disruption is not lethal in the absence of tetracycline. Expression can be triggered and verified by adding tetracycline to the growth medium and continuing with genetic editing only for those bacteria that survive this selection. The genetic engineering of the bacteria that express the TetM gene will then be performed in the absence of tetracycline, because once our gene of interest is integrated within the TetM sequence, the bacteria will lose their resistance. Disruption of TetM also provides a simple way to confirm that genomic editing succeeded. Following successful transformation of the plasmid, colonies can be transferred to a plate containing tetracycline. The expected outcome after genomic editing is reduced gene function and a lower colony count, up to a complete absence of colony growth on the tetracycline-containing plate. The edit can also be validated on the basis of the inserted gene, as will be discussed below.

After identifying the target area for the knock-in procedure, we conducted a comprehensive study of the CRISPR Cas 3 system, focusing on its use in Lactobacillus crispatus. The key findings of this study included the specific PAM motifs, the length requirements for homologous arms and protospacers, and the location of the nick in the DNA.

In addition, we reviewed the literature for plasmids that had been successfully transformed into L. crispatus. The pTRKH2 plasmid used by C. Hidalgo-Cantabrana et al. [3] was ordered. This plasmid is a shuttle vector for Escherichia coli (E. coli) and Gram-positive bacteria. It has a high copy number in Gram-positive bacteria and carries an erythromycin selectable marker for both E. coli and the Gram-positive bacteria [12].

The next pivotal step was the target plasmid design. This process was primarily focused on the meticulous planning of the guide RNA and donor DNA segments.

The guide RNA (gRNA) comprises several integral elements, including a promoter region, repeat sequences flanking the protospacer, a terminator segment, and recognition sites for specific restriction enzymes. The gRNA cassette has been engineered to incorporate these recognition sites at its termini, thereby enabling seamless, simple, and effective integration into the pTRKH2 plasmid.
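The layout of the gRNA cassette described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_grna_cassette` and every sequence below are short, arbitrary placeholders chosen for readability, not the actual parts or restriction sites used in our design.

```python
def build_grna_cassette(site_5p, promoter, repeat, spacer, terminator, site_3p):
    """Join the parts in the order: site, promoter, repeat-spacer-repeat, terminator, site."""
    parts = [site_5p, promoter, repeat, spacer, repeat, terminator, site_3p]
    for part in parts:
        # Reject empty parts and anything that is not plain A/C/G/T.
        if not part or set(part.upper()) - set("ACGT"):
            raise ValueError(f"invalid DNA part: {part!r}")
    return "".join(part.upper() for part in parts)


# Placeholder sequences for illustration only (hypothetical, not the real parts).
cassette = build_grna_cassette(
    site_5p="GGATCC",
    promoter="TATAAT",
    repeat="GTTTC",
    spacer="ACGTACGTAC",
    terminator="TTTTTT",
    site_3p="GAGCTC",
)
print(cassette)
```

Keeping the repeat as a single argument that is used on both sides of the spacer mirrors the requirement that identical repeat sequences flank the protospacer-matching spacer.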

The DNA repeat sequence specific to Lactobacillus crispatus strain NCK 1350 has been documented [3]. To determine whether it applies to strain ATCC 33820, we searched the ATCC 33820 genome for this sequence. A matching sequence was found within the node9 region of the genome at multiple adjacent loci, which appear to represent the CRISPR array of our strain. The presence of this sequence in ATCC 33820 supports using it in our design as the DNA repeat sequence that flanks the protospacer. This finding and the identification of the CRISPR array are also valuable because they serve as the foundation for characterizing a distinct leader promoter specific to our strain.
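A search of this kind can be sketched as a plain string scan on both strands. The repeat and genome strings below are toy placeholders, not the real NCK 1350 repeat or the ATCC 33820 contig, and the function names are hypothetical. Evenly spaced hits on one strand are what one would expect from a CRISPR array.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq, motif):
    """Return 0-based start positions of every match, including overlapping ones."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def find_repeat_hits(genome, repeat):
    """List (position, strand) for each occurrence of the repeat on either strand."""
    genome, repeat = genome.upper(), repeat.upper()
    hits = [(pos, "+") for pos in find_all(genome, repeat)]
    rc = reverse_complement(repeat)
    if rc != repeat:  # avoid double counting palindromic repeats
        hits += [(pos, "-") for pos in find_all(genome, rc)]
    return sorted(hits)


# Toy data: three copies of a made-up repeat separated by made-up spacers.
toy_repeat = "GATTC"
toy_genome = "TTTT" + "GATTC" + "ACGTAC" + "GATTC" + "CCATGG" + "GATTC" + "AAAA"
repeat_hits = find_repeat_hits(toy_genome, toy_repeat)
spacing = [b[0] - a[0] for a, b in zip(repeat_hits, repeat_hits[1:])]
print(repeat_hits, spacing)
```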

Figure 1: Schematic illustration of the CRISPR array found in the bacterial genome [13].

We then searched for protospacer adjacent motif (PAM) sequences within the TetM gene in the known bacterial genome, in order to identify the site for the insertion of our gene. Once a suitable PAM sequence was identified within the TetM gene, the 33 bases downstream of it were selected as the protospacer sequence. The protospacer length matches that previously reported [3].
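The selection step above can be illustrated with a minimal sketch that scans one strand for a PAM and reports the 33 bases that follow it. The function name and the toy target sequence are assumptions made for this example; the real TetM sequence is not reproduced here, and the sketch does not rank candidates or inspect the opposite strand.

```python
def find_protospacers(target, pam="AAA", length=33):
    """Return (pam_start, protospacer) for each PAM followed by `length` bases."""
    target, pam = target.upper(), pam.upper()
    results = []
    start = target.find(pam)
    while start != -1:
        proto_start = start + len(pam)
        protospacer = target[proto_start:proto_start + length]
        # Skip PAMs too close to the end to leave a full-length protospacer.
        if len(protospacer) == length:
            results.append((start, protospacer))
        start = target.find(pam, start + 1)
    return results


# Toy target: one AAA PAM followed by exactly 33 non-A bases (placeholder data).
toy_target = "GC" + "AAA" + "CGT" * 11 + "GC"
candidates = find_protospacers(toy_target)
print(candidates)
```

Passing `pam="AA"` covers the shorter PAM variant mentioned earlier; overlapping PAM hits are then reported as separate candidates.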

A more comprehensive investigation was required to determine the leader promoter sequence described in the main article [3]. Although the article did not explicitly present the sequence, it specified that the leader comes from the beginning of the CRISPR array region in the NCK 1350 genome and that 70% of the sequence consists of A and T nucleotides. The CRISPR array region contains spacers acquired from foreign DNA, separated by the DNA repeat sequences, which together allow the cell to clear foreign DNA that has already been added to the array. To pinpoint the leader promoter, we first identified the DNA repeat sequences of the CRISPR array within the NCK 1350 genome and located the beginning of the array at the first instance of the repeat. An upstream segment spanning 225 base pairs was recognized as the initiation site for transcription of the CRISPR array. This sequence exhibited the requisite 70% AT content and fell within the promoter length range of 100 to 1000 base pairs reported in the literature [14]. This sequence was then defined as the NCK 1350 leader promoter. Given the proven functionality of the NCK 1350 leader promoter in the referenced article [3], this sequence serves as a positive confirmation of our design process.

Because no promoters have been specifically characterized for Lactobacillus crispatus in the existing literature, and no bioinformatic tools are tailored to promoter identification from the complete genome sequence of this bacterium, we decided to characterize a distinct promoter specific to our bacterial strain. The process for identifying the ATCC 33820 leader promoter follows the same steps as our NCK 1350 work described above.

As mentioned earlier, the repeat sequence from NCK 1350 was identified within the ATCC 33820 strain as well. The first step in identifying the leader promoter was locating the beginning of the CRISPR array by searching for the first instance of the repeat sequence. A 183-base pair segment upstream of the CRISPR array region was then selected. According to the genome, this segment extends back to the end of the preceding gene, "Era". In terms of its length, the chosen segment complies with the criteria established for promoters. Furthermore, its AT content of 68% closely resembles that of the leader promoter described for NCK 1350 in the referenced article. Based on these findings, we can reasonably assume that the leader promoter specific to the ATCC 33820 strain was successfully identified. We aim to contribute to the scientific community by providing insights into a promoter specific to the commercially relevant strain ATCC 33820, and we plan to characterize it further to improve our understanding of this strain.
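The two checks used in this procedure, taking a fixed-length segment upstream of the first repeat and computing its AT content, can be sketched as follows. The function names and the toy sequences are assumptions for illustration; the sketch also assumes the array lies on the strand being scanned, which has to be confirmed for a real genome.

```python
def at_content(seq):
    """Fraction of A and T bases in a DNA string."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def upstream_leader(genome, repeat, length):
    """Return the `length` bases immediately upstream of the first repeat."""
    genome, repeat = genome.upper(), repeat.upper()
    first = genome.find(repeat)
    if first == -1:
        raise ValueError("repeat not found")
    if first < length:
        raise ValueError("not enough sequence upstream of the array")
    return genome[first - length:first]


# Toy data (placeholders): a 10-base AT-rich segment ahead of a made-up repeat.
toy_genome = "GGCC" + "ATATATATGC" + "GATTC" + "ACGT" + "GATTC"
leader = upstream_leader(toy_genome, "GATTC", 10)
print(leader, at_content(leader))
```

For the sequences discussed here, `length` would be 225 for NCK 1350 and 183 for ATCC 33820, and the resulting AT fractions would be compared with the 70% and 68% figures quoted in the text.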

To increase confidence in our design, we used the BPROM tool to predict bacterial promoter regions within the leader promoter sequences [15]. BPROM is a bacterial sigma 70 promoter recognition program with 80% accuracy. It improves gene and operon predictions in bacterial genomes by identifying regions before open reading frames. The tool identified concise promoter motifs along with the -35 and -10 regions in each sequence. However, it is worth noting that the tool indicated the absence of an appropriate reference for comparative analysis. Given this consideration, and after evaluating all the available evidence, we opted to proceed with the originally planned sequences rather than the shorter motifs identified by BPROM.

We chose to integrate a characterized Rho terminator, BBa_B1006, sourced from the iGEM registry [16]. This decision was in line with the collaborative ethos at the heart of the iGEM project. Additionally, it demonstrates our dedication to promoting, sharing, and validating previous iGEM contributions.

Recognition sequences for unique restriction enzymes, specifically enzymes with recognition sequences of at least six base pairs, were added to both ends of the cassette to allow its integration into the pTRKH2 plasmid. The recognition sites were selected so that each enzyme cuts only once, which makes the insertion more precise.
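Whether a candidate enzyme cuts a plasmid exactly once can be checked by counting its recognition site on both strands of the circular sequence. The sketch below is a simplified illustration under stated assumptions: the function names are hypothetical, the toy plasmid is a made-up 20-base sequence rather than pTRKH2, and only exact, unambiguous recognition sequences are handled.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites_circular(plasmid, site):
    """Count a recognition site on both strands of a circular molecule."""
    plasmid, site = plasmid.upper(), site.upper()
    # Append the first len(site)-1 bases so matches spanning the origin are found.
    extended = plasmid + plasmid[:len(site) - 1]

    def count(motif):
        return sum(1 for i in range(len(plasmid)) if extended.startswith(motif, i))

    rc = reverse_complement(site)
    # A palindromic site reads the same on both strands, so count it only once.
    return count(site) if rc == site else count(site) + count(rc)


# Toy circular plasmid (placeholder): ACGCGT spans the origin, GGATCC sits inside.
toy_plasmid = "CGT" + "TTTTGGATCCTTTT" + "ACG"
for name, site in [("ACGCGT", "ACGCGT"), ("GGATCC", "GGATCC"), ("GCATGC", "GCATGC")]:
    print(name, count_sites_circular(toy_plasmid, site))
```

A count of 1 marks a single-cutter suitable for opening the backbone, while a count of 0 identifies a site that is absent from the backbone and can therefore be placed inside a cassette without causing extra cleavage.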

Figure 2: Schematic illustration of the guide RNA cassette structure [13].

The donor DNA comprises several integral elements, including a promoter region, homologous arms, an RBS, the gene of interest, a terminator, and internal and external recognition sites for restriction enzymes. The terminator and the two leader promoter sequences are the same as those used in the guide RNA (gRNA).

Two homologous arms of 1000 base pairs each were designed: one upstream and the other downstream of the targeted genomic region where the CRISPR Cas 3 system is intended to create a nick. This length was chosen based on prior research that recognized its efficacy in enhancing homologous recombination events [3].
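Extracting the two arms around an intended cut position and wrapping them around a payload can be sketched as below. The function names, the toy genome, and the three-base payload are placeholders for illustration, and the genome is treated as a simple linear string.

```python
def homology_arms(genome, cut_index, arm_length=1000):
    """Return (upstream_arm, downstream_arm) flanking a cut just before cut_index."""
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("arm would run off the end of the sequence")
    upstream = genome[cut_index - arm_length:cut_index]
    downstream = genome[cut_index:cut_index + arm_length]
    return upstream, downstream


def build_donor(upstream, payload, downstream):
    """Place the payload (promoter, RBS, gene, terminator) between the arms."""
    return upstream + payload + downstream


# Toy example with 4-base arms; the design in the text uses 1000-base arms.
toy_genome = "AAAACCCCGGGGTTTT"
up_arm, down_arm = homology_arms(toy_genome, cut_index=8, arm_length=4)
donor = build_donor(up_arm, "ATG", down_arm)
print(up_arm, down_arm, donor)
```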

To identify a suitable ribosome binding site (RBS) for our strain, we compared the 16S rRNA regions of Escherichia coli and Lactobacillus crispatus [17]. The comparison revealed a difference of only one base between the regions, indicating a high degree of similarity. Given this observation, an RBS designed for Escherichia coli was selected on the assumption that Lactobacillus crispatus would recognize it [18]. This comparison was necessary because no information was available on the specific RBS characteristics of Lactobacillus crispatus 33820.
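A position-by-position comparison of two already aligned regions of equal length can be expressed in a few lines. The sequences below are made-up placeholders, not the actual 16S rRNA regions, and the helper name is hypothetical; the comparison reported above was carried out with an alignment tool [17].

```python
def mismatches(seq_a, seq_b):
    """Return (index, base_a, base_b) for every position where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


# Placeholder sequences that differ at a single position.
diffs = mismatches("GATTACA", "GATTGCA")
print(diffs)
```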

The gene encoding resistance to ampicillin was selected as the reporter gene for this study. It will be integrated through a knock-in process inside the region associated with the TetM gene within the bacterial genome. This strategy therefore combines a knock-in with a knock-down. The ampicillin resistance gene sequence was taken from Takara's p-BES plasmid [19], and its codons were optimized to match Lactobacillus crispatus [20], [21]. We selected the ampicillin resistance gene as a reporter because it allows rapid assessment of integration efficiency, enabling swift screening and validation of successful treatments, as will be explained in greater detail below.
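The idea behind codon optimization, swapping each codon for a synonymous codon preferred by the host without changing the encoded protein, can be illustrated with a minimal sketch. The `preferred` table below is a hypothetical three-entry example and does not reflect the real Lactobacillus crispatus codon usage table [20]; real tools also weigh codon frequencies and other sequence features.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Replace each codon with the preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError(f"{choice} does not encode {amino_acid}")
        out.append(choice)
    return "".join(out)


# Hypothetical preference table and toy coding sequence (Met-Leu-Lys-Gly-stop).
preferred = {"L": "TTA", "K": "AAA", "G": "GGT"}
original = "ATGCTGAAGGGCTAA"
optimized = recode(original, preferred)
print(optimized, translate(original) == translate(optimized))
```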

Similar to the design of the guide RNA (gRNA), distinctive recognition sequences for restriction enzymes were incorporated at both termini of the donor DNA cassette. This modification facilitates efficient integration into the pTRKH2 plasmid.

To make the system a simple, efficient, and versatile tool for both research and industrial purposes, two internal restriction enzyme recognition sequences were designed. These six-base recognition sequences are absent from the pTRKH2 plasmid, which guarantees that the plasmid will not undergo further cleavage. This design enables straightforward substitution and insertion of any desired genetic component for expression in the probiotic bacterium Lactobacillus crispatus.

Figure 3: Schematic illustration of the donor DNA cassette structure [13].

This planning approach culminates in the assembly of the pTRKH2 plasmid, housing both guide RNA (gRNA) and donor DNA featuring an interchangeable target region. This configuration makes our tool accessible and adaptable to a wide range of research requirements. As a result of this development, the probiotic bacteria, L. crispatus 33820, can be used in a wide range of scientific and industrial applications.

The desired plasmid is assembled in two main cloning steps. First, two restriction enzymes are used to cut the pTRKH2 plasmid and the two guide RNA cassettes. The two gRNAs differ only in the source of the promoter, one from the ATCC 33820 genome and the other from the NCK 1350 genome. After ligation, two plasmid variants containing gRNA are obtained, ready for the second cloning step. After the product of the first step is amplified in Escherichia coli, the donor DNA is inserted by an In-Fusion reaction. For practical reasons, our donor DNA was divided into two parts because of its length, as can be seen on our results page. At the end of this procedure, two distinct versions of the pTRKH2 plasmid, each containing both guide RNA (gRNA) and donor DNA, will be available for transformation into Lactobacillus crispatus.

Figure 4: Schematic illustration of homology-based plasmid assembly, such as Gibson assembly and InFusion.

Figure 5: Schematic illustration of the assembly of our plasmid. In pink and blue are the two different gRNAs, in green is the donor DNA.

After the transformation of the assembled plasmids into Lactobacillus crispatus, two outcomes are expected.

At the genomic level, the AmpR gene sequence was designed to be inserted within the TetM gene in the Lactobacillus crispatus 33820 genome. The result is a knock-in of the ampicillin resistance gene along with a decrease in functionality of the tetracycline resistance gene.

Figure 6: Schematic illustration of the integration of the donor DNA (ampicillin resistance) at the target site (tetracycline resistance).

The observable outcome is a change in the number of colonies growing on plates supplemented with ampicillin compared with plates containing tetracycline, and this difference distinguishes treated bacteria from their untreated counterparts. Efficient transformation and integration will be evident as increased growth of the genetically modified bacteria on ampicillin plates. Conversely, a notable reduction in colony number, potentially to the point of complete inhibition, will be observed on plates supplemented with tetracycline.

Figure 7: Schematic illustration of the expected growth on different antibiotics-containing petri dishes before and after the proposed genomic integration process.

Our unique system, dubbed CRIS, was purposely engineered to facilitate a streamlined, easy to use, and accessible approach to editing the Lactobacillus crispatus genome. This endeavor aims to yield a tailored variant of the probiotic bacterium, customized to specific requirements.

By means of a straightforward procedure, involving concurrent cleavage by the two restriction enzymes MluI and SphI, the plasmid segment harboring the AmpR gene can be substituted with any desired insert for expression by the bacterium, thus realizing the goal of genetic engineering of L. crispatus.
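The replacement step can be pictured as an in-silico substitution of everything between the two recognition sites. In the sketch below, the recognition sequences ACGCGT (MluI) and GCATGC (SphI) are the standard ones, while the function names, the toy plasmid, and the toy insert are placeholders. The sketch treats the plasmid as a linear string and keeps both sites intact; it does not model the overhangs produced by the enzymes, so in practice the new insert must carry compatible ends.

```python
MLUI = "ACGCGT"  # MluI recognition sequence
SPHI = "GCATGC"  # SphI recognition sequence


def positions(seq, motif):
    """Return every 0-based start position of motif, including overlapping hits."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


def swap_insert(plasmid, new_insert, site_a=MLUI, site_b=SPHI):
    """Replace the sequence between two unique sites, keeping both sites in place."""
    plasmid = plasmid.upper()
    hits_a, hits_b = positions(plasmid, site_a), positions(plasmid, site_b)
    if len(hits_a) != 1 or len(hits_b) != 1:
        raise ValueError("each site must occur exactly once")
    # Order the two sites by position so the function works for either orientation.
    (left, left_site), (right, _) = sorted([(hits_a[0], site_a), (hits_b[0], site_b)])
    return plasmid[:left + len(left_site)] + new_insert.upper() + plasmid[right:]


# Toy plasmid (placeholder): a six-base stand-in for the current insert between the sites.
toy_plasmid = "TTTT" + MLUI + "AAAAAA" + SPHI + "TTTT"
swapped = swap_insert(toy_plasmid, "CCC")
print(swapped)
```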

Figure 8: Schematic illustration of replacing the donor DNA.

Our newly devised tool facilitates the direct integration of genes into bacterial genomes, producing a stable and lasting genomic edit that allows constitutive expression. The inserted gene is replicated together with the genome, ensuring continuous production of the desired product irrespective of the culture medium used. This approach differs from conventional plasmid-based genetic engineering, which requires antibiotics to maintain the plasmid in the bacteria. That common practice increases the potential for horizontal transfer of antibiotic resistance genes among laboratory bacteria, and with it the risk of antibiotic resistance development. Because the integrated gene is instead passed on by vertical gene transfer, our solution addresses a critical goal advocated by the World Health Organization (WHO): mitigating the risk of antibiotic resistance emergence [22].

Employing the CRIS system increases the likelihood of sustained expression of the target gene within the bacterium, which simplifies the handling of probiotic strains like Lactobacillus crispatus. By design, our system allows the introduction of any desired gene through a simple restriction digest of our final plasmid. The existing cassette can easily be replaced with any desired cassette, such as those from the protect, attack and compete elements of our project, enabling their simple integration and adding functionality to this probiotic bacterium. This accessibility extends the applicability of the bacterium across various domains in both laboratory and industrial settings, with potential applications yet to be fully realized. Additionally, reduced reliance on specific growth media for target protein expression is a significant advantage and may expedite the transition from in vitro to in vivo experimentation. We are eager to advance the development of our system beyond the initial proof of concept.

On the results page, you can find details about the laboratory process we performed and the results we obtained as well as information regarding experiments that support our design model.


  1. M. Clabaut et al., “Draft Genome Sequence of Lactobacillus crispatus Strain V4, Isolated from a Vaginal Swab from a Young Healthy Nonmenopausal Woman,” Microbiol. Resour. Announc., vol. 8, no. 38, pp. e00856-19, Sep. 2019, doi: 10.1128/MRA.00856-19.
  2. “Cytoprotective Effect of Lactobacillus crispatus CTV-05 against Uropathogenic E. coli - PMC.” Accessed: Oct. 19, 2023. [Online]. Available:
  3. C. Hidalgo-Cantabrana, Y. J. Goh, M. Pan, R. Sanozky-Dawes, and R. Barrangou, “Genome editing using the endogenous type I CRISPR-Cas system in Lactobacillus crispatus,” Proc. Natl. Acad. Sci., vol. 116, no. 32, pp. 15774–15783, Aug. 2019, doi: 10.1073/pnas.1905421116.
  4. D. P. Stephenson, R. J. Moore, and G. E. Allison, “Transformation of, and Heterologous Protein Expression in, Lactobacillus agilis and Lactobacillus vaginalis Isolates from the Chicken Gastrointestinal Tract,” Appl. Environ. Microbiol., vol. 77, no. 1, pp. 220–228, Jan. 2011, doi: 10.1128/AEM.02006-10.
  5. S. S. Beasley, T. M. Takala, J. Reunanen, J. Apajalahti, and P. E. J. Saris, “Characterization and Electrotransformation of Lactobacillus Crispatus Isolated from Chicken Crop and Intestine,” Poult. Sci., vol. 83, no. 1, pp. 45–48, Jan. 2004, doi: 10.1093/ps/83.1.45.
  6. T. A. Kazi et al., “Plasmid-Based Gene Expression Systems for Lactic Acid Bacteria: A Review,” Microorganisms, vol. 10, no. 6, Art. no. 6, Jun. 2022, doi: 10.3390/microorganisms10061132.
  7. Understanding CRISPR-Cas9. [Online Video]. Available:
  8. Y. Xu and Z. Li, “CRISPR-Cas systems: Overview, innovations and applications in human disease research and gene therapy,” Comput. Struct. Biotechnol. J., vol. 18, pp. 2401–2415, Sep. 2020, doi: 10.1016/j.csbj.2020.08.031.
  9. J. D. Watson, Ed., Molecular biology of the gene, 7. ed., Student ed. in Always learning. Boston Munich: Pearson, 2014.
  10. “Optimizing CRISPR Knock-ins: Tips and Tricks For Successful Knock-in Editing,” Synthego. Accessed: Oct. 19, 2023. [Online]. Available:
  11. “ATCC Genome Portal,” ATCC Genome Portal. Accessed: Oct. 16, 2023. [Online]. Available:
  12. “Addgene: pTRKH2.” Accessed: Oct. 16, 2023. [Online]. Available:
  13. “BioRender.” Accessed: Oct. 07, 2023. [Online]. Available:
  14. “Frontiers | Classifying Promoters by Interpreting the Hidden Information of DNA Sequences via Deep Learning and Combination of Continuous FastText N-Grams.” Accessed: Oct. 16, 2023. [Online]. Available:
  15. “BPROM - prediction of bacterial promoters.” Accessed: Oct. 06, 2023. [Online]. Available:
  16. “Part: BBa_B1006:Hard Information -” Accessed: Oct. 07, 2023. [Online]. Available:
  17. “Multalin.” Accessed: Oct. 07, 2023. [Online]. Available:
  18. H. M. Salis, “Chapter two - The Ribosome Binding Site Calculator,” in Methods in Enzymology, vol. 498, Synthetic Biology, Part B, C. Voigt, Ed. Academic Press, 2011, pp. 19–42. doi: 10.1016/B978-0-12-385120-8.00002-4.
  19. “3380_UM.pdf.” Accessed: Oct. 07, 2023. [Online]. Available:
  20. “Codon usage table.” Accessed: Oct. 07, 2023. [Online]. Available:
  21. “Genius.” Accessed: Oct. 07, 2023. [Online]. Available:
  22. “WHO outlines 40 research priorities on antimicrobial resistance.” Accessed: Oct. 17, 2023. [Online]. Available: