Engineering Success

Demonstrate engineering success in a part of your project by going through at least one iteration of the engineering design cycle.

Purpose And Function

The purpose of our design is to allow E. coli bacteria to produce caddisfly silk solution: specifically, silk of the caddisfly species Glyphotaelius pellucidus. The silk made by caddisflies (including G. pellucidus) consists of two proteins: a long “heavy chain” h-fibroin and a “light chain” l-fibroin; these two proteins are fused together via disulfide bond formation to help create the structure of the caddisfly silk. [1]

Choosing A Species:

The first design-related decision was choosing which particular species of caddisfly to use for this project. First, for a species to be used for expressing its silk in E. coli, it needs to have the sequence for its silk proteins published; this alone narrows the possible choices considerably. Second, in order to further narrow our choices, we must pick between the three broad types of caddisflies: cocoon-makers, case-makers, and retreat-makers (i.e. net spinners). [2] The larvae of cocoon-making caddisflies use their silk to form cocoons, the larvae of case-making caddisflies use their silk to create protective cases around their growing bodies, and the larvae of retreat-making caddisflies use their silk to spin nets which catch food particles. [3]

For this project, we chose a case-making caddisfly for several reasons. First, retreat-making caddisflies exhibit a high proline content, ranging from 9.9 to 12.3%, while case-makers have a much lower proportion of proline, ranging from 4 to 5.6%. [2] Since proline tends to disrupt secondary structure formation in proteins, a lower proline content will make it easier to ensure the constructed sequence is capable of forming beta sheets. Second, the repeat pattern in the h-fibroin gene of retreat-makers is much more complicated than that of case-makers, [3] which would make it more difficult to truncate the h-fibroin gene of a retreat-maker. (See the section below, “Overcoming Challenges In Building The Design,” for more details on why the h-fibroin has to be truncated.)
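
The proline comparison above can be reproduced for any published fibroin sequence with a few lines of Python. This is only a minimal sketch: the function name `proline_fraction` and the toy peptide are illustrative assumptions, not sequences or code from our project.

```python
def proline_fraction(protein: str) -> float:
    """Return the fraction of residues in a protein sequence that are proline (P)."""
    # Keep letters only, so whitespace or line breaks from a FASTA body are ignored.
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    return residues.count("P") / len(residues)


# Made-up toy peptide (NOT a real fibroin fragment): 2 prolines in 10 residues.
toy_peptide = "GPSVSGPLGS"
print(f"{proline_fraction(toy_peptide):.1%}")
```

Running the same function on the full h-fibroin translation of each candidate species would give percentages that can be compared against the ranges cited above.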

Cocoon-making caddisflies have lower proline content, and an even simpler genetic repeat pattern than that of case-makers. [2] However, we decided against using a cocoon-making caddisfly, due to concerns that cocoon silk would not work well as an adhesive. Cocoon silk only needs to stick to itself in order to form a cocoon, whereas the silk spun by case-making caddisfly larvae sticks to pebbles and other underwater objects in order to form the larvae’s protective case.

Prof. Russell Stewart of the University of Utah pointed us to a pre-print (now a published study) which sequenced the genes of several different caddisfly species, including different case-making species. While the paper describes in words the genetic repeat patterns for each of the species sequenced, there is only one case-making caddisfly species for which the paper has a visual diagram showcasing the structure of the h-fibroin gene: Glyphotaelius pellucidus. [3] Knowing that a visual illustration of the h-fibroin genetic pattern would greatly help us in truncating the gene, we chose G. pellucidus, as it is the only case-making caddisfly species for which such a diagram exists.

The above figure comes from Heckenhauer, J., & Stewart, R. J., et al. (2023, August 18). “Characterization of the primary structure of the major silk gene, h-fibroin, across caddisfly (Trichoptera) suborders.” DOI: 10.1016/j.isci.2023.107253

Overcoming Challenges In Building The Design:

The first design challenge arises from the length of the heavy chain: the h-fibroin of G. pellucidus, without introns, is 23,595 base pairs long - simply too long to give to bacteria as a regular plasmid. [2] A second challenge is that the h-fibroin of caddisflies, including G. pellucidus, has too many repeated motifs in its DNA to be ordered and synthesized by a biotech company - in our case, our sponsor IDT (Integrated DNA Technologies). With regard to G. pellucidus, the h-fibroin contains two DNA motifs, labeled RM1 and RM2, which are repeated throughout most of the sequence. [3] Our multiple approaches to sequence analysis, which include scripts for the identification of repeats and visualization of GC content, may be found on our software GitLab.
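
As a rough illustration of what repeat identification can look like, the sketch below counts exact repeated k-mers in a DNA string. It is not the script from our GitLab; the function name `repeated_kmers`, the default k of 12, and the toy sequence are all assumptions made for this example.

```python
from collections import defaultdict


def repeated_kmers(dna: str, k: int = 12, min_count: int = 2) -> dict[str, list[int]]:
    """Map each k-mer seen at least min_count times to its 0-based start positions."""
    dna = dna.upper()
    positions = defaultdict(list)
    # Slide a window of width k across the sequence and record where each k-mer starts.
    for i in range(len(dna) - k + 1):
        positions[dna[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


# Made-up toy sequence: one 12-nt motif, a 4-nt spacer, then the same motif again.
toy_dna = "AAACCCGGGTTT" + "ACGT" + "AAACCCGGGTTT"
for kmer, starts in repeated_kmers(toy_dna).items():
    print(kmer, starts)
```

Exact k-mer matching only finds perfect repeats; near-identical copies of a motif would need an alignment-based approach instead.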

Fortunately, this repetitive nature of the h-fibroin allows for a solution to both the first and second challenges: the length of the h-fibroin can be truncated by reducing the number of repeats. In our design, we truncated the h-fibroin of G. pellucidus to contain only one RM1 and one RM2 region.
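
The idea of truncating by keeping one copy of each repeat can be sketched as follows. The example assumes the gene has already been annotated as an ordered list of labeled blocks; the labels `N-term` and `C-term`, the function name `truncate_repeats`, and the toy sequences are hypothetical and do not reflect the real G. pellucidus annotation or the exact procedure we followed.

```python
def truncate_repeats(blocks, repeat_labels=("RM1", "RM2")):
    """Keep non-repeat blocks and only the first occurrence of each repeat label."""
    seen = set()
    kept = []
    for label, seq in blocks:
        if label in repeat_labels:
            if label in seen:
                continue  # drop every later copy of this repeat motif
            seen.add(label)
        kept.append((label, seq))
    return kept


# Made-up annotated gene: terminal regions flanking alternating RM1/RM2 copies.
toy_gene = [
    ("N-term", "ATGAAA"),
    ("RM1", "GGTGGT"),
    ("RM2", "TCTTCT"),
    ("RM1", "GGTGGC"),
    ("RM2", "TCTTCA"),
    ("C-term", "TGCTAA"),
]

truncated = truncate_repeats(toy_gene)
construct = "".join(seq for _, seq in truncated)

# The kept blocks must still add up to whole codons, or the reading frame breaks.
if len(construct) % 3 != 0:
    raise ValueError("truncated construct is not a multiple of 3 nt")
print([label for label, _ in truncated], len(construct))
```

The frame check at the end matters in practice: removing repeat copies is only safe if each removed block is a whole number of codons.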

One might ask if changing the gene, by shrinking it via a reduction in the number of repeats, will affect the efficacy of the silk proteins produced. As Prof. Stewart informed us, however, research has shown that the number of repeats naturally varies within a population of caddisfly larvae of the same species. [2],[5] This suggests that an exact number of genetic repeats is not needed for silk protein efficacy.

The third design challenge comes from the need for disulfide bond formation. As mentioned before, the silk structure of caddisflies involves the h-fibroin protein and the l-fibroin protein being connected together by disulfide bonds. [1] However, E. coli is not normally well adapted to forming disulfide bonds in the proteins it expresses. [6] One solution we implemented was using SHuffle T7 strain E. coli as a chassis, since that strain has been engineered to boost disulfide bond formation in expressed proteins. [7] A second solution was co-expressing the fusion protein PDI-GPx7 - a protein created to further boost disulfide bonding for proteins forming in SHuffle T7 strain E. coli. [8]

A fourth design challenge, unfortunately, was created by our use of SHuffle T7 bacteria. SHuffle T7 E. coli are resistant to spectinomycin, the same antibiotic used for selection with the level 2 Golden Gate plasmid backbone (pJUMP49-2A(sfGFP)) used in this project. [9] This could severely hinder protein purification, as we would have no way of knowing which E. coli colonies took up the plasmid and which did not, and therefore which colonies to select for purification. (If the SHuffle T7 strain were not already resistant to spectinomycin, we would be able to assume that any surviving colony had received its resistance from the backbone of our assembly.) The best available solution was adding mCherry to the fourth slot in our Golden Gate plasmid assembly; this way, we will be able to see which colonies take up the plasmid and which don’t, as colonies which do take up the level 2 assembly will glow red.

This brings us to the final challenge: protein purification. In order to conduct protein purification, we added a His-tag and a SUMO sequence to the C-terminus of the l-fibroin. We chose this location for the following reasons. One, the N-terminus of the l-fibroin is the docking site for the disulfide bond with the C-terminus of the h-fibroin [10]; thus neither the N-terminus of the l-fibroin nor the C-terminus of the h-fibroin could be used. Two, in forming the silk structure, each h-fibroin monomer uses its N-terminus to attach to other h-fibroins [11]; thus the N-terminus of the h-fibroin could not be used either. Therefore, the C-terminus of the l-fibroin is the only possible location where a His-tag and SUMO sequence can be added for protein purification.

Testing The Design, And Learning From The Results:

A. Testing The Assembly

One key test we implemented on our design is determining whether the plasmid described above is properly assembled. One indicator of assembly is the color of the transformed bacterial colonies. If the colonies are green, that would suggest the GFP sequence from the plasmid backbone was not removed by the restriction enzyme (for our project, BsaI), and thus was not replaced with the four parts (truncated h-fibroin, l-fibroin, fusion protein, and mCherry) listed above.

Green colonies

After our first attempt at assembly resulted in green colonies (shown above), indicating a failed attempt, we ran two more tests: a restriction digest, to check whether our BsaI enzyme was actually cutting, and whole plasmid sequencing, to see whether the desired plasmid had been produced. The results of the restriction digest were visualized on a gel, as shown below. Read from left to right, lanes 1 and 7 are 1 kb plus ladders, and lanes 2 and 8 are 100 bp ladders. The lanes in the middle are pJUMP1A-h-fibroin, pJUMP1B-l-fibroin, pJUMP1C-fusion protein, and pJUMP1D-mCherry, respectively:

diagnostic restriction digest gel

The gel bands indicate that the BsaI enzyme used was not cutting. The bands that would be expected in lane 3 of the gel above, if the digest had worked, are shown below in the virtual digest. (All five lanes in the image below are for pJUMP1A-h-fibroin.) Since there is no band around 800 to 900 bp, nor any near 100 bp, there were no cuts. The three bands on the gel correspond to the lengths of the different uncut plasmids and the h-fibroin gene fragment.

virtual restriction digest for pJUMP1A-h-fibroin

A similar lack of expected results occurred with the other lanes in the gel. The expected results are shown below:

virtual restriction digest for pJUMP1B-l-fibroin
virtual restriction digest for pJUMP1C-fusion protein
virtual restriction digest for pJUMP1D-mCherry
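
For readers who want to reproduce this kind of virtual digest without dedicated software, the sketch below predicts BsaI fragment sizes for a circular plasmid. It relies on the known BsaI specificity (recognition site GGTCTC, cutting 1 nt downstream on the top strand and 5 nt downstream on the bottom strand) and measures fragments between top-strand cut points, ignoring the 4-nt overhangs. The function names and the toy plasmid are assumptions for illustration; they are not the tool or the sequences used to produce the images above.

```python
SITE = "GGTCTC"      # BsaI recognition site read on the top strand
SITE_RC = "GAGACC"   # the same site when it lies on the bottom strand


def bsai_cut_positions(plasmid: str) -> list[int]:
    """Return sorted top-strand cut positions (0-based) for a circular plasmid."""
    seq = plasmid.upper()
    n = len(seq)
    # Append the first 5 nt so that sites spanning the origin are also found.
    wrapped = seq + seq[:len(SITE) - 1]
    cuts = set()
    for i in range(n):
        window = wrapped[i:i + len(SITE)]
        if window == SITE:
            cuts.add((i + 7) % n)   # site (6 nt) + 1 nt downstream
        elif window == SITE_RC:
            cuts.add((i - 5) % n)   # top strand is cut 5 nt upstream of the site
    return sorted(cuts)


def circular_fragments(length: int, cuts: list[int]) -> list[int]:
    """Return fragment lengths, largest first; an empty list means the plasmid is uncut."""
    if not cuts:
        return []
    sizes = [(cuts[(j + 1) % len(cuts)] - cuts[j]) % length for j in range(len(cuts))]
    # A single cut linearizes the plasmid, giving one full-length fragment.
    return sorted((size or length for size in sizes), reverse=True)


# Made-up 56-nt toy plasmid with one forward and one reverse BsaI site.
toy_plasmid = "A" * 10 + SITE + "A" * 20 + SITE_RC + "A" * 14
cuts = bsai_cut_positions(toy_plasmid)
print(cuts, circular_fragments(len(toy_plasmid), cuts))
```

An empty fragment list from `circular_fragments` corresponds to the situation we observed on the gel: only uncut plasmid, with no smaller bands.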

We learned from these results that the BsaI we were using was not functional, and that new BsaI enzyme needed to be obtained for the next attempt at assembly.

The whole plasmid sequencing further confirmed that the enzyme did not remove GFP from the plasmid backbone, as shown below:

annotated plasmid for pJUMP1A-h-fibroin
annotated plasmid for pJUMP1B-l-fibroin
annotated plasmid for pJUMP1C-fusion protein
annotated plasmid for pJUMP1D-mCherry

B. Testing Protein Expression

Another key test is simply obtaining a product at the end. We designed the plasmid to have a His-tag on the l-fibroin C-terminus, in order to do protein purification. Therefore, if the plasmid is expressed correctly inside the SHuffle T7 cells, we will be able to purify a protein product.

C. For Future Projects: Ordering A Bigger H-Fibroin

Russell Stewart advised us that the smallest possible truncated h-fibroin could function as a control, against which longer truncated versions can be compared for silk properties such as tensile strength and adhesion.

When we attempted to design a longer sequence, we tested it on the IDT wizard and ran into a few issues with the compatibility score. The GC-rich regions would have to be removed in order for the sequence to meet IDT's synthesis requirements. Additionally, sequences that were too repetitive would have to be removed. We discarded the idea of creating a longer h-fibroin than our control because it seemed unlikely that we would be able to express and then purify enough protein to compare the qualities of the variants.
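
A simple way to locate GC-rich stretches before submitting a sequence is a sliding-window scan, sketched below. The 50-nt window and the 70% cutoff are placeholder assumptions chosen for the example; they are not IDT's actual scoring criteria, and the function names are hypothetical.

```python
def gc_fraction(dna: str) -> float:
    """Return the fraction of bases in dna that are G or C."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


def gc_rich_windows(dna: str, window: int = 50, threshold: float = 0.70):
    """Return (start, gc_fraction) for every window whose GC fraction exceeds threshold."""
    flagged = []
    for start in range(len(dna) - window + 1):
        gc = gc_fraction(dna[start:start + window])
        if gc > threshold:
            flagged.append((start, gc))
    return flagged


# Made-up toy sequence: AT-only flanks around a 50-nt GC-only core.
toy_dna = "AT" * 25 + "GC" * 25 + "AT" * 25
hits = gc_rich_windows(toy_dna)
print(f"{len(hits)} windows flagged, starting at {hits[0][0]} and ending at {hits[-1][0]}")
```

Plotting the per-window GC fraction against position gives the kind of GC-content visualization that makes problem regions easy to spot.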

D. Our Plan For Before Jamboree

Once we acquire new BsaI, we will repeat the Level 1 assembly with BsaI into pJUMP29-1A through 1D. After transformation into stable competent cells, we will inoculate the transformed cells in LB and miniprep to get DNA for the four Level 1 transcriptional units. Then we will do the Level 2 assembly with BsmBI into pJUMP49-2A. Following transformation, inoculation, and miniprep, we will send our final assembled plasmid for whole plasmid sequencing to confirm successful assembly. Next, we will transform the Level 2 assembled plasmid into SHuffle T7 Express cells for protein expression. From the plate, we will pick the red colonies that express mCherry to inoculate a large liquid culture. Lastly, we will use that culture to perform His-tag protein purification using column filtration and use a SUMO protease to cleave off the His-tag.