SOFTWARE

SeqPredict

“SeqPredict” is a machine learning-based tool developed by the iGEM team SVCE-CHENNAI 2023. It aims to streamline gene circuit design and assembly, saving teams time and effort, by locating RBS, Anderson promoter regions, and flanking regions within a given DNA sequence. These BioBricks play an important role in the design and development of successful synthetic biology projects.

The code linked below is a Python program that uses Biopython and the Tkinter library to provide a graphical user interface (GUI) for locating DNA features such as the RBS, the start codon, and flanking DNA sequences.

It searches for the RBS (ribosome binding site) motif and reports its position in both the forward and reverse directions. It also calculates the length of the flanking region and checks whether that length falls within the ideal range.
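The search described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the SeqPredict source. The motif `AGGAGG` (the Shine-Dalgarno consensus), the function names, the use of the reverse complement for the "reverse direction", and the reading of the flanking region as the spacer between the RBS and the start codon are all assumptions made for this example. Positions are 0-based.

```python
# Hypothetical constants: the actual tool may use a different motif or range.
RBS_MOTIF = "AGGAGG"        # assumption: Shine-Dalgarno consensus
START_CODON = "ATG"
IDEAL_MIN, IDEAL_MAX = 5, 7  # ideal flanking length in nucleotides

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_rbs(seq, motif=RBS_MOTIF):
    """Locate the first motif hit, the next ATG, and the spacer between them.

    Returns None when the motif is absent, otherwise a dict of findings.
    """
    seq = seq.upper()
    rbs_pos = seq.find(motif)
    if rbs_pos == -1:
        return None
    rbs_end = rbs_pos + len(motif)
    start_pos = seq.find(START_CODON, rbs_end)
    flank = None if start_pos == -1 else seq[rbs_end:start_pos]
    return {
        "rbs_pos": rbs_pos,
        "rbs": motif,
        "start_pos": None if start_pos == -1 else start_pos,
        "flank": flank,
        "ideal": flank is not None and IDEAL_MIN <= len(flank) <= IDEAL_MAX,
    }


def analyze(seq):
    """Run the search on the sequence and on its reverse complement."""
    seq = seq.upper()
    return {
        "forward": find_rbs(seq),
        "reverse": find_rbs(reverse_complement(seq)),
    }
```

For the sequence `TTAGGAGGACAGCTATGAAA`, this sketch reports the motif at index 2, the start codon at index 14, and a 6-nucleotide spacer `ACAGCT`, which falls inside the 5 to 7 range. No motif is found on the reverse complement.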


[You can see our source code here: SeqPredict]


Follow this step-by-step protocol to run the code and obtain its output:

1. Install Required Libraries:

Ensure that you have the necessary libraries installed: Tkinter and Biopython (imported in Python as Bio). Biopython can be installed using pip if it is not already installed. Tkinter is part of the Python standard library and ships with most Python installations.
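A quick way to confirm that both libraries are importable before launching the tool is sketched below. The helper name `check_modules` and the install hints are illustrative assumptions, not part of SeqPredict.

```python
import importlib

# Import names (not pip package names): Biopython is imported as "Bio".
# The hints are suggestions only; the exact command depends on your system.
REQUIRED_MODULES = {
    "tkinter": "usually bundled with Python (python3-tk on some Linux systems)",
    "Bio": "pip install biopython",
}


def check_modules(required=REQUIRED_MODULES):
    """Return {module_name: True/False} depending on whether it imports."""
    status = {}
    for name in required:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_modules().items():
        hint = "" if ok else "  -> " + REQUIRED_MODULES[name]
        print(f"{name}: {'found' if ok else 'missing'}{hint}")
```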

2. GUI Application Opens:

When you click 'Run', a GUI window titled "iGEM SVCE-CHENNAI 23" should open. This window contains the following elements:

  • A label "Enter Sequence."

  • An input field for entering a DNA sequence.

  • A "FIND" button to start the analysis.

  • A "CLEAR ENTRY" button to clear the input field.


3. Enter DNA Sequence:
Enter the DNA sequence you want to analyze into the input field. The sequence can only contain the characters 'A,' 'C,' 'G,' and 'T.' Other characters are not allowed.
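The character restriction above can be checked before running an analysis. The sketch below is one possible validation, not the tool's own code. The function names are hypothetical, and stripping surrounding whitespace and accepting lower-case input are convenience assumptions that the actual GUI may not make.

```python
ALLOWED_BASES = set("ACGT")


def invalid_characters(seq):
    """Return a sorted list of characters in seq that are not A, C, G, or T."""
    cleaned = seq.strip().upper()  # assumption: tolerate whitespace and case
    return sorted(set(cleaned) - ALLOWED_BASES)


def is_valid_dna(seq):
    """True only for a non-empty sequence made up solely of A, C, G, and T."""
    cleaned = seq.strip()
    return bool(cleaned) and not invalid_characters(cleaned)
```

Reporting the offending characters, rather than only rejecting the input, makes it easier to spot problems such as an `N` or a `U` pasted in from another file.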


4. Click "FIND":
After entering the DNA sequence, click the "FIND" button to initiate the analysis.


5. Output:
The program analyzes the DNA sequence for the presence of an RBS (ribosome binding site) in both the forward and reverse directions.
If an RBS is found in the forward direction, the output includes:

  • The position of the RBS

  • The RBS sequence

  • The position of the start codon (ATG)

  • The length of the flanking region

  • The flanking region sequence

  • A message indicating whether the flanking region is ideal (between 5 and 7 nucleotides) or not


If no RBS is found in the forward direction, the program displays the message “NO RBS SEQUENCE FOUND IN FORWARD DIRECTION”.
If an RBS is found in the reverse direction, the program shows the same information as for the forward direction.


If no RBS is found in the reverse direction, the program displays the message “NO RBS SEQUENCE FOUND IN REVERSE DIRECTION”.

6. Clear Entry:
Click the "CLEAR ENTRY" button to clear the input field and start a new analysis.

7. Repeat Analysis:
Repeat the analysis by entering a new DNA sequence and clicking the "FIND" button again.

8. Close the GUI:
Close the GUI window when you are done with the analysis.

PromoterStrengthPredict 2.0

“PromoterStrengthPredict”, a tool developed by our alumni from the SVCE-CHENNAI 2017 team, adopts a machine learning approach to predict the strength of sigma 70 promoters in E. coli, streamlining promoter selection for iGEM projects. We developed an upgraded version of this software, “PromoterStrengthPredict 2.0”, which also predicts RBS strength from an input RBS sequence. The RBS output is a 2-D plot of the RBS sequence score (X-axis) against the RBS strength value (Y-axis).

[You can see our source code here: PromoterStrengthPredictor 2.0]


Follow this step-by-step protocol to run the code and obtain its output:
1. Install Required Libraries:
Ensure that you have the necessary Python libraries installed: Matplotlib (which provides matplotlib.pyplot and pylab), NumPy, and Biopython (imported in Python as Bio). You can install them using pip if they are not already installed.
2. Enter Promoter Sequence:
  • Enter the newly characterized data to be added to the dataset.

  • Enter the promoter sequence you want to analyze into the input field. (The sequence can only contain the characters 'A,' 'C,' 'G,' and 'T.' Other characters are not allowed.)

  • The executed code will give you the following results:
    - Theta value
    - Predicted promoter strength
    - Predicted ln strength
    - R² value
    - Adjusted R² value
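For readers unfamiliar with the last two outputs, the sketch below computes R² and adjusted R² using their standard textbook definitions. Whether the tool computes them in exactly this way is an assumption. The function names and the sample numbers are illustrative only and are not taken from the tool's dataset.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Penalize R^2 for the number of predictors.

    Requires n_samples > n_features + 1.
    """
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


if __name__ == "__main__":
    # Made-up ln(strength) values, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
    r2 = r_squared(observed, predicted)
    # Two predictors, e.g. the -35 and -10 hexamer scores.
    print(r2, adjusted_r_squared(r2, n_samples=5, n_features=2))
```

Adjusted R² is always at most R² and drops when a predictor is added without improving the fit, which makes it the more conservative figure when comparing models with different numbers of features.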


3. Enter RBS Sequence:

  • Enter the newly characterized data to be added to the dataset.

  • Enter the RBS sequence you want to analyze into the input field. (The sequence can only contain the characters 'A,' 'C,' 'G,' and 'T.' Other characters are not allowed.)

  • The executed code will give you the following results:
    - Theta value
    - Predicted RBS strength
    - Predicted ln strength
    - R² value
    - Adjusted R² value
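The outputs listed above (theta, predicted strength, and predicted ln strength) are consistent with a linear regression fitted on the natural log of strength. The sketch below shows that idea for a single predictor, assuming a model of the form ln(strength) = theta0 + theta1 × score. The model form, the function names, and the way a sequence is turned into a numeric score are all assumptions made for this example. The tool's actual scoring and fitting procedure may differ.

```python
import math


def fit_line(scores, strengths):
    """Ordinary least squares fit of ln(strength) = theta0 + theta1 * score.

    Returns (theta0, theta1). Needs at least two distinct scores.
    """
    ln_y = [math.log(s) for s in strengths]
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(ln_y) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, ln_y))
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


def predict(theta, score):
    """Return (predicted ln strength, predicted strength) for one score."""
    theta0, theta1 = theta
    ln_strength = theta0 + theta1 * score
    return ln_strength, math.exp(ln_strength)
```

Fitting on the log scale keeps predicted strengths positive after exponentiation. Adding a newly characterized data point to the dataset, as described in the first bullet, would amount to appending one score and one measured strength to the input lists and refitting.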


4. Plot Analysis:
The program will analyze the sequences of the RBS (ribosome binding site) and the promoter region and generate a 3-D scatter plot for promoter strength and a 2-D plot for RBS strength prediction.
The 3-D scatter plot for the promoter strength prediction includes the following information:

  • -35 Hexamer score (X-axis)
  • -10 Hexamer score (Y-axis)
  • Promoter strength (Z-axis)
The 2-D plot for the RBS strength prediction includes the following information:
  • RBS sequence score (X-axis)
  • RBS strength (Y-axis)
