iGEM Engineering Cycle
The experiments for our project have been designed using the fundamental principles of synthetic biology and the engineering design cycle: research, design, build, test, learn, improve.
Dry Lab Engineering success
In the dry lab section of the project, we conducted two engineering cycles, with the main aim of tuning the dataset, preprocessing strategy, data augmentation, modeling and hyperparameter selection.
CYCLE 1:
DESIGN:
Our primary focus is the early detection of the Mild Cognitive Impairment (MCI) stage. The dataset is available as raw structural MRI scans, which must be pre-processed before they are suitable for training. Initially, the team used a MATLAB package with a GUI-based interface for the pre-processing. Its drawbacks were that it was time-consuming and, because of the GUI, had large memory requirements. We therefore switched to Python-based packages that perform the same operations and ran them on a cloud platform.
The pre-processing was then carried out for 2294 images on the Google Colab platform, which provided adequate memory and compute. This was the first dataset used to train the model.
BUILD:
The pre-processed dataset is stored in Drive so that Colab can access it easily during training. Storing a large number of images in Drive caused another problem: the model's input/output calls to Drive timed out, because Drive had to search the whole collection to locate each image. To solve this, we arranged the images in a hierarchy of folders so that each one could be found quickly.
Next, the purely classical models and the hybrid quantum models were built using the PyTorch and PennyLane frameworks. For the classical model, we froze half of the classical layers to avoid overfitting and then trained it. For the hybrid approach, we used a quantum model consisting of an angle embedding followed by a trainable ansatz with entanglement.
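The folder hierarchy described above can be sketched as follows. This is a minimal illustration, not our actual script: the hash-based naming scheme, the two-level depth and the function names `shard_images` and `sharded_path` are all assumptions chosen for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sharded_path(sharded_dir, file_name, levels=2):
    """Return the nested location of file_name under sharded_dir.

    Assumption: subfolder names are the first `levels` hex characters of an
    MD5 hash of the file name, so the location can be computed directly
    instead of being searched for.
    """
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return Path(sharded_dir).joinpath(*digest[:levels], file_name)


def shard_images(flat_dir, sharded_dir, levels=2):
    """Copy every file from one flat folder into small nested subfolders."""
    for src in sorted(Path(flat_dir).iterdir()):
        if not src.is_file():
            continue
        dest = sharded_path(sharded_dir, src.name, levels)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
```

With two hexadecimal levels the images are spread over at most 256 folders, so no single directory listing grows with the full size of the dataset. Grouping by class label or subject ID would work equally well if those are available.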
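To show what such a quantum layer computes, here is a minimal framework-free state-vector sketch. It is not our PennyLane model: it assumes RY rotations for the angle embedding, RX rotations as the trainable ansatz, a chain of CNOT gates for entanglement and a Pauli-Z readout on every qubit. All function names are hypothetical.

```python
import math


def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def apply_1q(state, gate, target, n):
    """Apply a 2x2 gate to qubit `target` (qubit 0 is the leftmost bit)."""
    new = [0j] * len(state)
    shift = n - 1 - target
    for i, amp in enumerate(state):
        bit = (i >> shift) & 1
        for out in (0, 1):
            j = i ^ ((bit ^ out) << shift)  # same index with target bit = out
            new[j] += gate[out][bit] * amp
    return new


def apply_cnot(state, control, target, n):
    """Flip the target bit of every basis state whose control bit is 1."""
    new = [0j] * len(state)
    c_shift, t_shift = n - 1 - control, n - 1 - target
    for i, amp in enumerate(state):
        j = i ^ (1 << t_shift) if (i >> c_shift) & 1 else i
        new[j] += amp
    return new


def expval_z(state, qubit, n):
    shift = n - 1 - qubit
    return sum((1 - 2 * ((i >> shift) & 1)) * abs(a) ** 2
               for i, a in enumerate(state))


def hybrid_circuit(features, weights):
    """Angle embedding, then layers of trainable rotations and CNOTs."""
    n = len(features)
    state = [0j] * (2 ** n)
    state[0] = 1 + 0j
    for q, x in enumerate(features):        # angle embedding (assumed RY)
        state = apply_1q(state, ry(x), q, n)
    for layer in weights:                   # trainable ansatz (assumed RX)
        for q, w in enumerate(layer):
            state = apply_1q(state, rx(w), q, n)
        for q in range(n - 1):              # entanglement: chain of CNOTs
            state = apply_cnot(state, q, q + 1, n)
    return [expval_z(state, q, n) for q in range(n)]
```

Calling `hybrid_circuit([0.3, 1.1], [[0.1, -0.4]])` returns one expectation value in [-1, 1] per qubit, which a classifier head can then consume. In PennyLane the same structure is available through the `AngleEmbedding` and `BasicEntanglerLayers` templates, although the latter uses a closed ring of CNOTs rather than the open chain assumed here.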
TEST:
Testing consists of training the model and evaluating its accuracy at inference. The overall dataset is split into train and test sets for this purpose. We first trained the classical model for 50 epochs using the GPUs available in Colab. After 50 epochs, the classical model reached a test accuracy of around 70%, whereas the hybrid approach gave a test accuracy of around 52%.
Even after training for 50 epochs, we were initially unable to achieve good accuracy. One possible reason is that the dataset was too small to train the model.
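The split-and-evaluate procedure can be sketched in a few lines. This is an illustration only: the 20% test fraction, the fixed seed and the function names are assumptions, since the exact split ratio is not stated here.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and split into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Keeping the test images completely out of training is what makes the reported test accuracy a fair estimate of performance on unseen scans.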
LEARN:
Thus, in the next iteration of this cycle, we used 3K images and analyzed the performance. The classical model trained better and reached a test accuracy of around 90%.
CYCLE 2:
DESIGN:
In this cycle, the team’s primary focus was to combine the advantages of quantum computing with the deep learning model to gain accuracy and converge faster. We therefore tried a hybrid model with the deep learning model as a feature extractor and the quantum model as a classifier. Based on the results of cycle 1, the model was trained on the 3K-image dataset. Training ran on the Google Colab cloud platform with the GPU enabled.
We downsized the features from the classical model so that they map directly onto the Y rotation angles of the quantum model. To check that this downsizing would not cause problems, we trained a purely classical model with the downsizing and evaluated its performance. It achieved the same accuracy as the model without downsizing.
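One way to picture the downsizing step is a single linear layer that reduces the feature vector to one value per qubit, followed by a squashing function that keeps each value in a valid rotation range. The sketch below is an assumption-laden illustration: the tanh squash, the range of (-π/2, π/2) and the name `downsize_to_angles` are not taken from our model.

```python
import math


def downsize_to_angles(features, weights, biases):
    """Project a feature vector to one rotation angle per qubit.

    weights has one row per qubit; each row has one entry per feature.
    The tanh squash (an assumption) keeps every angle inside (-pi/2, pi/2).
    """
    angles = []
    for row, bias in zip(weights, biases):
        z = sum(w * f for w, f in zip(row, features)) + bias
        angles.append(math.pi / 2 * math.tanh(z))
    return angles
```

For example, a four-dimensional feature vector with a 2x4 weight matrix yields two angles, one for each qubit of a two-qubit embedding. In a PyTorch model the weights and biases would be learned together with the rest of the network.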
BUILD:
In this stage, we built the hybrid quantum model, with the features extracted by the classical model downsized to the number of qubits. As the initial model, we used a Y-based embedding followed by a trainable ansatz in the X direction. We chose this because, in the perpendicular dimension, a separating plane can be inserted directly between the classes.
However, after training, the hybrid QML model achieved an accuracy of only 65%, which is much lower than the classical model. The classical model was also trained during this process so that it would extract better features and pass them to the quantum model.
TEST:
Based on the training results, the hybrid model also performed poorly on the test set. We therefore moved to the next iteration of the cycle and used more qubits. This required a lot of training time and memory, so in the following iteration we increased the circuit depth instead. Greater depth causes the barren plateau issue, so we needed to find a middle ground between the two.
LEARN:
As the iterations proceeded, we found that embedding and training the ansatz along the same Y axis was helpful, and the hybrid model reached an accuracy of 80% on the test set.
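A single-qubit simplification offers one possible intuition for this observation. It is only an illustration: the multi-qubit circuit with entanglement behaves in a more complicated way, and the function names below are hypothetical. With an RY embedding of a feature x followed by a trainable RY(θ), the readout is ⟨Z⟩ = cos(x + θ), so θ shifts the point where the output crosses zero. With a trainable RX(θ) instead, ⟨Z⟩ = cos(x)·cos(θ), so θ only rescales the output and the zero crossing stays at the same x.

```python
import math


def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def expval_z_after(x, ansatz_gate, theta):
    """<Z> after an RY(x) embedding followed by one trainable rotation."""
    state = matvec(ry(x), [1, 0])              # embed the feature x
    state = matvec(ansatz_gate(theta), state)  # apply the trainable gate
    return abs(state[0]) ** 2 - abs(state[1]) ** 2


x, theta = 1.0, 0.7
same_axis = expval_z_after(x, ry, theta)   # equals cos(x + theta)
cross_axis = expval_z_after(x, rx, theta)  # equals cos(x) * cos(theta)
```

In this simplified picture, the same-axis ansatz gives the optimizer direct control over the decision threshold, which is consistent with the improvement we saw, though it does not prove that this is the cause.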
Wet Lab Engineering success
We conducted two cycles of engineering with three primary aims: to design a Clostridium-compatible CRISPR genetic system, to map out synthetic pathways, and to enable production of the biotherapeutic.
CYCLE 1:
DESIGN:
Two methods of CRISPR gene editing were designed around the Sp. Cas9 system for introduction into C. butyricum. We attempted to harness the Sp. Cas9 system by identifying a PAM sequence and designing a synthetic CRISPR array.
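Candidate target sites for Sp. Cas9 can be located computationally, because the nuclease recognises a 5'-NGG-3' PAM immediately downstream of a 20-nt protospacer. The sketch below scans one strand of a sequence for such sites. It is a minimal illustration with hypothetical function names, and it does no off-target or efficiency scoring.

```python
def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spcas9_sites(seq, protospacer_len=20):
    """List (start, protospacer, pam) for every NGG PAM on the given strand.

    `start` is the 0-based index of the first protospacer base.
    """
    seq = seq.upper()
    sites = []
    for i in range(protospacer_len, len(seq) - 2):
        pam = seq[i:i + 3]
        if pam[1:] == "GG":  # N can be any base, so only check the GG
            start = i - protospacer_len
            sites.append((start, seq[start:i], pam))
    return sites
```

To cover both strands, call `find_spcas9_sites` on the sequence and again on `reverse_complement(sequence)`, then convert the coordinates of the second set back to the forward strand.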
BUILD:
The target genes and mutations were designed, including gene deletions and modifications.
TEST:
Expression of the toxic cas9 gene resulted in very poor transformation, spurring us to develop an easily applicable and highly efficient genome editing tool. The CRISPR-Cas9 and endogenous CRISPR systems were tested by transforming plasmids into strains and inducing them, followed by screening for mutants. The protocol required that the mutations be verified by PCR, sequencing and phenotypic assays. The edits were tested for effects on sporulation and solvent production.
Unexpected result: Transformation of plasmids into C. butyricum failed.
Potential reasons: Cas9 causes toxicity and transformation inefficiencies, which suggests a need for inducible control. Leaky expression of Cas9 before induction could kill cells, and mutations may occur at low efficiency or be accompanied by unintended additional mutations.
LEARN AND IMPROVE:
We learned that Cas9 causes toxicity and transformation inefficiencies, hence the need for inducible control. To achieve this and add a layer of safety, we decided to use the riboswitch developed by the University of Nottingham iGEM team in 2019.
An extensive literature review helped us determine that the endogenous CRISPR machinery of C. butyricum gives higher efficiency, demonstrating the value of harnessing native CRISPR machinery (Zhou et al., 2021). Alternatively, we could use an anti-CRISPR system to control toxicity.
Other options we considered while revisiting the design included using other CRISPR nucleases such as Cas12a, or a base-editing nickase that creates a single-stranded break. However, these made editing Clostridium somewhat more challenging, and we decided against both (Huang et al., 2016).
Instead of the single plasmid system, we decided to adopt a double plasmid system with the gRNA and Cas9 on separate plasmids. This could help us pinpoint our mistakes better in future iterations.
CYCLE 2-3:
DESIGN:
The choice between a one-plasmid or two-plasmid CRISPR-recombineering system involves important design tradeoffs.
In the second cycle of this experiment, we created a two-plasmid system, with one plasmid housing the Cas9 nuclease and the other the gRNA. We also attempted to use the native megaplasmid of C. butyricum (Zhou et al., 2021). However, our design and experiments failed this time as well.
In the last design cycle, we chose between anti-CRISPR (Wasels et al., 2020) and codon-optimized Sp. Cas9 (Diallo et al., 2020). We chose to implement the codon-optimized Sp. Cas9 to make our engineered probiotic.
BUILD:
Our part:
- The pCas9 plasmid contains the cas9 gene under the control of an inducible promoter (Pcm-tetO2/1) that is repressed by TetR (Diallo et al., 2020).
- The pGRNA-template plasmid contains the guide RNA (gRNA) expression cassette and the editing template (Wasels et al., 2017).
- The gRNA targets the bacterial chromosome site for Cas9-induced double strand break (DSB).
- The editing template contains homology arms for repairing the DSB via homologous recombination.
- Addition of anhydrotetracycline induces cas9 expression and activation of the CRISPR-Cas9 system.
- This induces a DSB at the target chromosome site that can only be repaired using the provided editing template, allowing selection of properly edited cells.
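The logic of the editing template can be sketched for the simple case of a gene deletion: the template joins the sequences flanking the deleted region, and the edit is selectable only if it removes the gRNA target site, so that edited chromosomes are no longer cut. The code below is a minimal illustration; the function names and the default arm length are hypothetical and are not values from our protocol.

```python
def build_deletion_template(genome, del_start, del_end, arm_len=500):
    """Join the homology arms that flank genome[del_start:del_end].

    arm_len is an illustrative default, not a protocol value.
    """
    if not 0 <= del_start < del_end <= len(genome):
        raise ValueError("deletion coordinates must lie inside the genome")
    left_arm = genome[max(0, del_start - arm_len):del_start]
    right_arm = genome[del_end:del_end + arm_len]
    return left_arm + right_arm


def target_site_removed(template, target_site):
    """True if the protospacer-plus-PAM sequence is absent from the template.

    Only the given strand is checked; a full check would also test the
    reverse complement.
    """
    return target_site.upper() not in template.upper()
```

If `target_site_removed` returns False, cells repaired with the template would still carry the target site and would be cut again, so a different gRNA or a template with a mutated PAM would be needed.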
Advantages:
- Inducible control over Cas9 expression reduces toxicity.
- The editing template facilitates repair of the Cas9-induced DSB.
- The two-plasmid system separates cas9 from the gRNA for flexibility.
- It allows iterative rounds of genome editing.
- It enables scarless, markerless genome modifications.
TEST:
LEARN AND IMPROVE:
We created and characterized a two-plasmid genetic toolkit as a modular platform for engineering probiotics for the iGEM Competition. With the codon-optimized Sp. Cas9, it can be adapted to various genera and strains, and can therefore be extended to non-model gut commensals.
We designed an auxotrophy-based biocontainment measure, so that the strain survives only when supplemented with an unnatural amino acid (see the safety page).
This two-plasmid system has been shown to be effective for editing the Clostridium genus and is well suited to creating a butyrate-upregulating probiotic.
References
Zhou X, Wang X, Luo H, Wang Y, Wang Y, Tu T, Qin X, Su X, Bai Y, Yao B, Huang H, Zhang J. Exploiting heterologous and endogenous CRISPR-Cas systems for genome editing in the probiotic Clostridium butyricum. Biotechnol Bioeng. 2021 Jul;118(7):2448-2459. doi: 10.1002/bit.27753. Epub 2021 Apr 14. PMID: 33719068.
Vento JM, Crook N, Beisel CL. Barriers to genome editing with CRISPR in bacteria. J Ind Microbiol Biotechnol. 2019 Oct;46(9-10):1327-1341. doi: 10.1007/s10295-019-02195-1. Epub 2019 Jun 5. PMID: 31165970; PMCID: PMC7301779.
Huang H, Chai C, Li N, et al. CRISPR/Cas9-based efficient genome editing in Clostridium ljungdahlii, an autotrophic gas-fermenting bacterium. ACS Synth Biol. 2016;5:1355-1361.
Diallo M, Hocq R, Collas F, Chartier G, Wasels F, Wijaya HS, Werten MWT, Wolbert EJH, Kengen SWM, van der Oost J, Lopes Ferreira N, López-Contreras AM. Adaptation and application of a two-plasmid inducible CRISPR-Cas9 system in Clostridium beijerinckii. Methods. 2020;172:51-60. doi: 10.1016/j.ymeth.2019.07.022.
Wasels F, Jean-Marie J, Collas F, López-Contreras AM, Lopes Ferreira N. A two-plasmid inducible CRISPR/Cas9 genome editing tool for Clostridium acetobutylicum. J Microbiol Methods. 2017;140:5-11. doi: 10.1016/j.mimet.2017.06.010.
Wasels F, Chartier G, Hocq R, Lopes Ferreira N. A CRISPR/Anti-CRISPR Genome Editing Approach Underlines the Synergy of Butanol Dehydrogenases in Clostridium acetobutylicum DSM 792. Appl Environ Microbiol. 2020 Jun 17;86(13):e00408-20. doi: 10.1128/AEM.00408-20. PMID: 32385078; PMCID: PMC7301843.