Synthetic biology can be described as the genetic redesign of living organisms to fulfil a purpose, and with synthetic biology comes bioengineering.

Throughout the creation of AlgaGenix, we have applied bioengineering principles when designing the different parts of our genetic circuits, so that we can understand, modify and control the living systems we work with. The engineering design cycle (Define, Ask, Design, Build, Test, and Learn) serves as the basis for every bioengineering process, and our project is no exception.

One significant advantage of using this cycle as an outline for project development is that, once the first engineering cycle has started, you can re-enter it at any point if anything unexpected happens, to further develop and improve the process and the final product.

Define

Identifying and defining the problem is a crucial first step for every project.

Our project was primarily motivated by one of our main local problems: nitrate contamination of water.

Due to the climate crisis and intensive human activity such as farming, industry and sewage discharge, good-quality water has never been so scarce. Our water reservoirs are at all-time low levels. The greater issue, however, is that the little water we have left is of very poor quality: its main pollutants are nitrates, which are building up and putting a significant strain on our resources.

This issue has a significant effect on society: studies show that more than 40% of Catalonia’s municipalities, around 3 million people*, are affected as nitrate levels in local aquifers skyrocket. Finding a solution is urgent because nitrates have a major negative impact on health: they oxidize the iron in the hemoglobin in our blood, which impairs oxygen transport through our bodies.

Ask

Now that we have the problem we aim to tackle clearly defined, the next step is to ask ourselves "How can we help? How can we make a difference? What are the most urgent points that need to be addressed?"

Our goal is to design a functional, affordable unit able to absorb and convert pollutants, so that any person affected by contaminated waters of this nature can feel safe while drinking.

The solution we came up with is AlgaGenix. We aim to improve the natural ability of the model microalga Chlamydomonas reinhardtii to absorb nitrates, so that they are removed from polluted water. We also hope to go one step further and use the absorbed nitrates to produce a value-added product: cytokinin-rich biomass, which can later be used as a fertiliser.

To see an extended explanation of the questions our project revolves around, please visit Our approach.

Design

Finding the key genes involved in the nitrate assimilation pathway and in cytokinin synthesis was the first step in the construction of AlgaGenix.

Five genes can be distilled from the literature: Nitrate Reductase (NR), Nitrite Reductase (NiR) and Glutamate Synthetase (GS) for nitrates, and IPT and LOG for cytokinins. From now on, the different vectors containing these genes will be referred to as “pathway genes”.

Please visit the parts page for further details on how each gene functions and is used in our project.

Before getting started with the enhancement of nitrate assimilation and cytokinin production, we wanted to ensure that we would be able to detect changes in the concentration of nitrates in polluted water.

To this end, we decided to synthesize a nitrate sensor similar to that of Crozet et al. (2018), with one small change: our sensor does not have an antibiotic resistance gene linked to the reporter (GFP in the paper). Instead, we decided to make distinct cassettes: one containing mVenus (pCM1: mVenus), one containing mCherry (pCM1: mCherry), and an individual cassette for hygromycin resistance (pCM1: HygroR).

Each pCM1 will include a promoter (constitutive or nitrate-dependent), the gene and a constitutive terminator.

The resistance cassette will be used for colony selection and to confirm the assembly of the vectors. As for the fluorescent cassettes, the purpose of having two reporters is to measure two things at once: the expression level that results from where the plasmid is inserted in the Chlamydomonas genome, using a constitutive promoter (PPSAD, assembled into pCM1: mCherry), and the nitrate-dependent expression, using a promoter highly induced by nitrate (PNIT, assembled into pCM1: mVenus).

To summarize, the mCherry signal is expected to be constitutive, with an intensity that depends on the part of the genome it is inserted in, whereas the mVenus signal should appear in the presence of nitrates, thanks to the PNIT promoter.

Later, these 3 cassettes will be combined to create the final Level M of the sensor and included in the Level Ms of the different pathway genes.

After designing our sensor, we proceeded to design the pathway genes.

We decided to build Level Ms with individual genes, paired genes, and a combination of all genes. All of these Level Ms would be built from the same Level 1s; this eliminates the need to produce several Level 1s carrying the same gene and makes the most effective use of our resources.

Level 1s are transcriptional units with different pieces available for assembly, which gives us a wide range of options to connect the different parts of our vector.

Usually, Level 1s have an A1-B2 region that includes the promoter + 5’UTR, a B3-B5 region with the coding sequences, and a B6-C1 region containing the terminator + 3’UTR.

All AlgaGenix’s Level 1s will be put together using the following strategy:
- A1-B2: PPSAD promoter including 5’UTR region
- B3: coding sequence for the target gene, which may be NR, NiR, GS, IPT or LOG
- B4: an F2A self-cleaving peptide that acts as a link with the following B5 piece
- B5: a NanoLuc reporter
- B6-C1: TPSAD terminator including 3’UTR

There is one exception that differs slightly from this scheme: in the Cp-GS Level 1, we reduced the space occupied by the PPSAD promoter and 5’UTR so that a chloroplast transit peptide could fit as a B2 piece without redesigning all the other parts.

The goal of these Level 1s is to overexpress the genes.

Design of level M:

All Level Ms follow the same strategy; only the enhanced pathway gene changes, depending on which Level 1 is assembled into each Level M.

The general chassis is formed by four cassettes:
- Resistance cassette: pCM1: HygroR
- Gene cassette: pCM1: NR/NiR/GS/Cp-GS/IPT/LOG
- mVenus cassette: pCM1: mVenus (nitrate-dependent)
- mCherry cassette: pCM1: mCherry (constitutive)

We have organized the Level Ms into three big groups depending on the combination degree of the pathway genes:
- Individual genes
- Paired genes
- Combination of all genes

The position of the cassettes inside the Level M will vary depending on the combination degree:

Individual genes

The aim of this Level M is to confirm the expression and position of the integrated vectors.

It will have the resistance cassette (pCM1: HygroR) in position 4, the gene cassette (pCM1: NR, NiR, GS, Cp-GS, IPT or LOG) in position 5, followed by the two reporter cassettes: first mCherry (pCM1: mCherry with the PPSAD promoter) in position 6 and then mVenus (pCM1: mVenus with the PNIT promoter) in position 7.

Paired genes

The goal is to see the effects of overexpressing the gene pairs of each step: NR+NiR for the first steps of nitrate assimilation, GS+CpGS for the final step of nitrate assimilation, and IPT+LOG for cytokinin synthesis.

In this assembly it was important to plan how to move the different pieces so as to make the fewest changes possible: the more pieces we can keep in the same position as in the Level M for individual genes, the better. In the end, we moved the resistance cassette to position 3 (which means it has to be assembled with the proper overhangs for position 3) and NR, GS and IPT to position 4, while keeping the NiR, CpGS, LOG, mCherry and mVenus cassettes in the same positions.
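The two layouts described so far can be written down as a small data structure. The following is only an illustrative sketch: the function names and the dictionary representation are our own assumptions, and the positions are taken from the description above.

```python
# Reporter cassettes keep the same Level M positions in both layouts (from the text).
REPORTERS = {6: "pCM1: mCherry", 7: "pCM1: mVenus"}


def individual_layout(gene):
    """Level M with a single pathway gene: resistance in 4, gene in 5."""
    return {4: "pCM1: HygroR", 5: f"pCM1: {gene}", **REPORTERS}


def paired_layout(first_gene, second_gene):
    """Level M with a gene pair: resistance moves to 3, first gene to 4."""
    return {
        3: "pCM1: HygroR",
        4: f"pCM1: {first_gene}",
        5: f"pCM1: {second_gene}",
        **REPORTERS,
    }


def kept_in_place(individual, paired):
    """Cassettes that occupy the same position in both layouts."""
    return {pos: c for pos, c in paired.items() if individual.get(pos) == c}


if __name__ == "__main__":
    reused = kept_in_place(individual_layout("NiR"), paired_layout("NR", "NiR"))
    print(reused)  # cassettes that need no new overhangs for the paired assembly
```

Comparing the two dictionaries makes it easy to see which cassettes can be reused without being re-assembled with different overhangs.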

Combination of all genes

Lastly, to see the effect of overexpressing all genes at the same time, the final Level M would be the combination of all gene pairs. The problem is that MoClo’s Level M doesn’t have enough positions to fit the resistance cassette, the reporters and the cassettes of all genes in the same vector: it has up to 7 positions and we would need 9.

To solve this, we decided to create 2 more resistance cassettes, each conferring resistance to a different antibiotic, and perform three different transformations:
1. First transformation with the NR-NiR paired assembly with HygroR.
2. Second transformation with the GS-Cp-GS paired assembly with a different resistance.
3. Third and last transformation with the IPT-LOG paired assembly with another resistance.
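The position count behind this decision can be checked with a few lines of Python. This is a minimal sketch: the names are hypothetical, and the second and third resistance markers are left as placeholders because they are not specified here.

```python
LEVEL_M_POSITIONS = 7  # maximum number of positions, as stated above
GENE_PAIRS = [("NR", "NiR"), ("GS", "Cp-GS"), ("IPT", "LOG")]
# Only HygroR is named in the text; the other two are placeholders.
RESISTANCES = ["HygroR", "second resistance (unspecified)", "third resistance (unspecified)"]


def slots_needed(n_genes, n_reporters=2, n_resistance=1):
    """Positions required for one resistance cassette, the genes and both reporters."""
    return n_resistance + n_genes + n_reporters


def transformation_plan():
    """One paired Level M per transformation, each with its own resistance."""
    return [
        {"genes": pair, "resistance": res, "slots": slots_needed(len(pair))}
        for pair, res in zip(GENE_PAIRS, RESISTANCES)
    ]


if __name__ == "__main__":
    all_genes = sum(len(pair) for pair in GENE_PAIRS)
    print(slots_needed(all_genes), "positions needed for a single vector")
    for step in transformation_plan():
        print(step)
```

With six genes a single vector would need 9 positions, whereas each paired assembly needs 5 and fits within the 7 available.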

Once we had all the cloning plans, we went ahead with adapting the sequences for MoClo.

We added unique overhangs to each MoClo piece so that it would be assembled into the right backbone vector and in the right place. In addition, to make our sequences suitable for MoClo, we had to eliminate the internal recognition sites for BbsI and BsaI by replacing the affected codons with synonymous ones.

The following step was to purchase the designed sequences. However, this is the point where we first had to stop our engineering cycle.

Chlamydomonas reinhardtii’s genes have a characteristically high GC content, which made it impossible to have them synthesized commercially as designed.
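As an illustration of this domestication step, the sketch below scans a sequence for BbsI (GAAGAC) and BsaI (GGTCTC) recognition sites on both strands. It is not the tool we used, only a minimal example; the function names are hypothetical.

```python
# Recognition sequences of the two Type IIS enzymes used in MoClo.
SITES = {"BbsI": "GAAGAC", "BsaI": "GGTCTC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """List (enzyme, 0-based start, motif) for every site on either strand."""
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        for motif in (site, reverse_complement(site)):
            start = seq.find(motif)
            while start != -1:
                hits.append((enzyme, start, motif))
                start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    original = "ATGGGTCTCAAA"     # codons ATG GGT CTC AAA, contains a BsaI site
    domesticated = "ATGGGTCTGAAA"  # CTC -> CTG, both codons encode leucine
    print(find_sites(original))
    print(find_sites(domesticated))
```

In the usage example, swapping one leucine codon for a synonymous one removes the BsaI site without changing the encoded protein.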

Building

To overcome this major obstacle, we wrote a sequence optimization script.

Chlamydomonas reinhardtii is a unicellular green alga that serves as a model organism for research in several fields. One challenge that researchers encounter when working with Chlamydomonas is the high GC content of its genome, which often causes problems during gene synthesis because it is associated with repetitive sequences. Here, we present a Python script developed to optimize Chlamydomonas gene sequences with a very high GC content, making them more suitable for gene synthesis.

The first step of the optimization process is to provide an input sequence with a high local GC concentration, typically found in repetitive regions. This sequence serves as the basis for further analysis and optimization.

To address the high GC content issue, we have designed a codon table that associates each triplet codon with its corresponding amino acid and a fractional value. This fractional value represents the optimized probability of using a particular synonymous codon. By utilizing this codon table, we can make informed choices for synonymous codons to reduce the overall GC content in the gene sequence.
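A codon table of this kind could be represented as follows. This is a minimal sketch, not our actual table: the fractions shown are placeholders chosen for illustration, only two amino acids are included, and the function names are hypothetical.

```python
import random

# Hypothetical excerpt: codon -> (amino acid, fraction).
# The fractions are placeholder values, not the ones used in the project.
CODON_TABLE = {
    "GCC": ("A", 0.20), "GCG": ("A", 0.20), "GCT": ("A", 0.30), "GCA": ("A", 0.30),
    "GGC": ("G", 0.30), "GGG": ("G", 0.10), "GGT": ("G", 0.30), "GGA": ("G", 0.30),
}


def synonymous_codons(amino_acid):
    """Return the codons for an amino acid and their fractions."""
    codons = [c for c, (aa, _) in CODON_TABLE.items() if aa == amino_acid]
    weights = [CODON_TABLE[c][1] for c in codons]
    return codons, weights


def pick_codon(amino_acid, rng=random):
    """Draw one synonymous codon, weighted by its fraction in the table."""
    codons, weights = synonymous_codons(amino_acid)
    return rng.choices(codons, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print([pick_codon("A", rng) for _ in range(5)])
```

Giving the less GC-rich codons a higher fraction biases the draws towards a lower overall GC content while still allowing variety between repeated amino acids.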

One of the critical challenges when dealing with high GC content in Chlamydomonas gene sequences is the presence of repeated sequences. Repetitive, GC-rich stretches exacerbate the problem during gene synthesis, as they concentrate guanine (G) and cytosine (C) nucleotides locally. Therefore, the script is designed to identify these repetitive patterns within the gene sequence.
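One simple way to identify repetitive patterns is to look for subsequences of a fixed length (k-mers) that occur more than once. The sketch below illustrates the idea; the window length k and the function name are assumptions, and the actual script may detect repeats differently.

```python
from collections import defaultdict


def repeated_kmers(seq, k=8):
    """Map every k-mer that occurs more than once to its 0-based start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


if __name__ == "__main__":
    # Three consecutive GCC codons create a repeated 6-mer.
    print(repeated_kmers("GCCGCCGCCATG", k=6))
```

A sequence passes the repetition check when this function returns an empty dictionary for the chosen k.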

The heart of the optimization process involves an iterative loop. The script repeatedly makes codon changes based on the codon table and monitors the resulting sequence for two criteria:

Acceptable GC content: The GC content of the sequence should be lowered to a level where gene synthesis becomes feasible. In our case, the target was a GC content below the problematic 66.30% threshold.

Reduced Repetition: The script aims to reduce the frequency of repetitive sequences to ensure that the synthesized gene is unique and non-repetitive.

The iteration continues until the sequence meets both criteria. If the GC content and repetition issues persist, codon changes keep being applied, up to a maximum of 10,000 iterations, to arrive at an optimized sequence.

In conclusion, the Python script described provides an effective solution to the challenge of high GC content in Chlamydomonas reinhardtii’s gene sequences. By utilizing an optimized codon table and iterative optimization, the script can produce gene sequences with reduced GC content and minimal repetition, making them suitable for gene synthesis.

This approach ensures that Chlamydomonas genes can be synthesized without encountering the issues associated with high GC content and repetitive sequences, thus facilitating research and genetic engineering involving this important model organism.
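The loop described above can be sketched as follows. This is a simplified illustration rather than our actual script: the mini codon table and its fractions are placeholders, the 9-nucleotide repeat window is an assumption, and codons are swapped at random positions. Only the 66.30% threshold and the 10,000-iteration cap come from the description above.

```python
import random
from collections import Counter

# Hypothetical mini table: amino acid -> {codon: fraction}. Placeholder fractions.
SYNONYMS = {
    "A": {"GCC": 0.2, "GCG": 0.2, "GCT": 0.3, "GCA": 0.3},
    "G": {"GGC": 0.3, "GGG": 0.1, "GGT": 0.3, "GGA": 0.3},
    "E": {"GAG": 0.5, "GAA": 0.5},
}
CODON_TO_AA = {codon: aa for aa, table in SYNONYMS.items() for codon in table}


def gc_content(seq):
    """GC content of a sequence, in percent."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def has_repeats(seq, k=9):
    """True if any k-mer occurs more than once (k is an assumed window size)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return any(n > 1 for n in counts.values())


def optimize(seq, max_gc=66.30, max_iterations=10_000, seed=0):
    """Swap synonymous codons until both criteria are met or the cap is reached."""
    rng = random.Random(seed)
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for _ in range(max_iterations):
        current = "".join(codons)
        if gc_content(current) < max_gc and not has_repeats(current):
            break  # both criteria satisfied
        i = rng.randrange(len(codons))
        aa = CODON_TO_AA.get(codons[i])
        if aa is None:
            continue  # codon not covered by this toy table; leave it unchanged
        options = SYNONYMS[aa]
        codons[i] = rng.choices(list(options), weights=list(options.values()))[0]
    return "".join(codons)


if __name__ == "__main__":
    original = "GCCGCCGAGGAGGGCGGC"  # Ala Ala Glu Glu Gly Gly
    optimized = optimize(original)
    print(optimized, round(gc_content(optimized), 2))
```

Because every replacement is drawn from the synonymous codons of the same amino acid, the encoded protein is unchanged by construction; only the nucleotide sequence varies.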

Test

To verify the sequences produced by the script, we confirmed that their GC content was accepted by the company responsible for synthesizing them. We then aligned the optimized sequences with the originals to check that the encoded amino acids had not changed.
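The amino acid check can also be automated by translating both sequences and comparing the results. The following minimal sketch assumes the standard genetic code and uses hypothetical function names; it is an illustration of the check, not the alignment tool we used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA coding sequence; trailing incomplete codons are ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, usable, 3))


def same_protein(original, optimized):
    """True if both DNA sequences encode the same amino acid sequence."""
    return translate(original) == translate(optimized)


if __name__ == "__main__":
    print(translate("ATGGCCGGCTAA"))
    print(same_protein("ATGGCCGGCTAA", "ATGGCTGGTTAA"))
```

Any optimized sequence for which `same_protein` returns False should be rejected before ordering.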

The company accepted the sequences for synthesis and the alignment was a success.

Building

Once the sequences arrived, we continued with our second building step: the assembly of the constructs.

MoClo is based on Golden Gate assembly and uses Type IIS restriction enzymes: BbsI is used to assemble pCM0s, BsaI to assemble pCM1s, and BbsI once more to assemble pCMMs.

All transformations were first done in E. coli to speed up the process, because Chlamydomonas reinhardtii needs up to 10 days to grow, whereas with E. coli we could see colonies the day after plating.

We successfully assembled and transformed both Level 1 and Level M of the sensor in E. coli and Chlamydomonas reinhardtii.

Moreover, we were able to assemble the Level 1s of all the pathway genes, and we are currently working on transforming them into E. coli and Chlamydomonas reinhardtii.

Test

We tested the successful transformation of the sensor in Chlamydomonas reinhardtii:

We are currently checking its function and the transformation of pathway genes into Chlamydomonas reinhardtii.