DVSensor – a tool for creating DART VADAR sensors


Description


DVSensor is a software tool which allows you to create DART VADAR sensors for any mRNA targets. DART VADAR sensors, first described by Gayet et al. [1], are mRNA constructs which facilitate conditional mRNA translation based on the detection of target mRNA molecules. The DART VADAR mRNA can only be translated after it hybridizes with its target mRNA, which activates the sensor and allows the translation of an encoded payload gene.

DVSensor takes a target mRNA sequence as input, either in FASTA or GenBank format, and generates possible DART VADAR sensor sequences as output. The sensor sequences can then be cloned into a suitable DART VADAR expression vector, such as our pASTERISK.

The software offers a variety of settings for controlling how the sensors are generated. In addition, it comes with a built-in feature for evaluating the specificity of the generated sensor sequences. By querying mRNA databases using BLAST, it can identify potential off-target mRNA transcripts other than the intended target that may also be able to activate the sensor. A menu for documentation and help is available in the software as well. DVSensor runs locally as a web app inside the browser and is available cross-platform. Alternatively, it could also be hosted on a web server.

We developed this software out of a need that arose during our engineering process, but we are also providing it as a contribution to iGEM and the scientific community. On our contribution page, we explain why we believe this tool will be immensely helpful for future iGEM teams.


Installation


To use this application, the following software is required:

  • Python (version 3.11 or higher)
  • Python package: nicegui (version 1.3.9 or higher)
  • Python package: Biopython (version 1.81 or higher)

To install Python, follow the guides on the official Python website. Python comes with a package manager called pip, which you can use to install Python packages. Run the following commands in a terminal / command line (cmd.exe on Windows) to install the required packages:

  • pip install nicegui
  • pip install biopython

On Linux systems, installing Python packages with pip may not work. In that case, either install the packages with the system's package manager or use a virtual environment. The official Python website provides more information. Also, this thread on the askubuntu forum may be helpful.
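The requirements listed above can be verified from within Python before launching the application. The following is a minimal sketch using only the standard library; the helper names parse_version and check_requirements are made up for this example and are not part of DVSensor.

```python
import sys
from importlib import metadata

# Minimum versions taken from the requirements list above.
REQUIRED = {"nicegui": (1, 3, 9), "biopython": (1, 81)}


def parse_version(text):
    """Turn '1.3.9' into (1, 3, 9); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(required=REQUIRED, min_python=(3, 11)):
    """Return a list of human-readable problems (empty if everything is fine)."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]} or higher is required")
    for name, minimum in required.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["All requirements satisfied."]:
        print(line)
```

Running the script inside the same environment (or virtual environment) that will run main.py shows whether pip installed the packages where Python can find them.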

The source code of the application can be found on our GitLab Repository inside the dvsensor/src directory. Download the src directory and launch the application by running the main.py file inside a terminal / command line: python main.py

Installing BLAST: In order to use the BLAST feature of DVSensor, a working system installation of BLAST is required. The latest BLAST installers can be downloaded from the NCBI website or installed using the package manager on Linux systems. To check if the installation was successful, run blastn -version in a terminal (cmd.exe on Windows) and see if it returns the BLAST version number.

In addition to the BLAST executables, a BLAST database is required. These can be downloaded from the NCBI website as well. A useful mRNA database is the RefSeq Select RNA database. Download the archive (refseq_select_rna.tar.gz) from the NCBI website and extract it. The extracted directory contains different files which should not be renamed, deleted or changed. You should also not add any new files to this directory. The database path refers to the path of this directory, and the database name refers to the stem of the filenames inside the directory (e.g. "refseq_select_rna" for the RefSeq Select RNA database). Both the database path and the database name have to be entered in order to run BLAST queries (see usage).
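Both checks described above (is blastn installed, and does the database directory contain files with the expected stem) can be scripted. The sketch below uses only the standard library; the function names are hypothetical, and matching files by the prefix "<database name>." is an assumption about how the database files are named.

```python
import shutil
import subprocess
from pathlib import Path


def blastn_version():
    """Return the first line printed by `blastn -version`, or None if blastn is not on PATH."""
    exe = shutil.which("blastn")
    if exe is None:
        return None
    result = subprocess.run([exe, "-version"], capture_output=True, text=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def database_files(db_path, db_name):
    """List the files in db_path whose names start with '<db_name>.' (sorted)."""
    directory = Path(db_path)
    if not directory.is_dir():
        return []
    prefix = db_name + "."
    return sorted(p.name for p in directory.iterdir() if p.name.startswith(prefix))
```

An empty list from database_files usually means that either the path or the database name does not match what is on disk.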

Additionally, there is a build.py file which you can use to create bundled executables with PyInstaller. These bundled executables can then be run on any machine without the need to install Python or any packages. In order to use this file, install the latest version of the PyInstaller package and run the file in the terminal / command line. Unfortunately, we could not provide pre-bundled executables in our repository, since the project storage space is limited. If you decide to create a bundled version, you can start the app by launching the dvsensor.exe file on Windows or the dvsensor binary on Linux that is created when you run the build.py file.


Usage


1. Start the application and open the web page

Open a terminal (on Linux) or a command line (cmd.exe on Windows) inside the src directory containing the main.py file. Make sure you have Python and the necessary dependencies installed (see installation). Run the following command to launch the application:

  • python main.py

Then open a web browser of your choice and enter this URL: http://localhost:8080. You should now see the start page of the web app.

Note: closing or reloading the browser window will shut down the application. This is intended behavior, but it means that the application has to be restarted afterwards. Use only the buttons inside the application window to navigate between pages, not the back/forward buttons of your browser.
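Since the page can take a few seconds to become available after launch, it can be convenient to wait until the local server accepts connections before opening the browser. This is a small standard-library sketch; wait_for_server is a hypothetical helper, and the default port 8080 is the one stated in this document.

```python
import socket
import time


def wait_for_server(host="localhost", port=8080, timeout=10.0, interval=0.25):
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, calling wait_for_server() right after starting python main.py in another terminal should return True once the start page can be loaded.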


Main menu

You are presented with two large buttons which allow you to generate new sensors or read the documentation, respectively. Clicking the right button opens the documentation menu (see image below). Clicking the left button opens a new page where you can upload your target mRNA sequence data.


Documentation menu

2. Upload a target sequence

You have two options for providing a sequence: uploading a file that contains a single-entry FASTA record or a GenBank record of your mRNA sequence, or entering the FASTA/GenBank record manually. The maximum size of a sequence record is limited to 500 KB.
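The constraints above (a single record, at most 500 KB) can be checked before uploading. The following sketch parses a single-entry FASTA record with the standard library only; read_single_fasta is a hypothetical helper and not the parser used by DVSensor, which relies on Biopython. Treating 500 KB as 500,000 bytes is an assumption.

```python
MAX_UPLOAD_BYTES = 500 * 1000  # 500 KB limit from the text; the exact byte count is assumed


def read_single_fasta(text, max_bytes=MAX_UPLOAD_BYTES):
    """Return (record_id, description, sequence) for a single-entry FASTA string.

    Raises ValueError for oversized input, a missing header, an empty
    sequence, or more than one record.
    """
    if len(text.encode("utf-8")) > max_bytes:
        raise ValueError("record exceeds the upload size limit")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record: missing '>' header line")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("more than one sequence record found")
    header = lines[0][1:].strip()
    # The first whitespace-delimited token of the header serves as the record ID.
    record_id, _, description = header.partition(" ")
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("record contains no sequence")
    return record_id, description, sequence
```

For instance, read_single_fasta(">demo_1 hypothetical transcript\nACGU\nccau\n") yields the ID "demo_1", the description "hypothetical transcript" and the sequence "ACGUCCAU".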


Upload menu

To upload a file, click on the plus button in the right corner. Select a file and click the upload button. To manually enter a sequence, paste the sequence into the text field and click "continue".

Confirm file upload

3. Edit the sequence record information

The sequence name and NCBI accession number are extracted from the uploaded sequence record. Usually, the sequence name gets interpreted as the accession number, which may not be desired. Therefore, you can change both the sequence name and accession number. Changing the name has no effect on the computation or the results, but the accession number has to be correct in order for BLAST queries to work properly. Click "OK" when you are done.


Sequence information menu

4. Setting options

The next menu provides different options for controlling how the sensors are generated and for controlling the BLAST queries. The different options are explained in the table below. Click "Run Analysis" when the desired options have been set.


Options menu

Option | Explanation
Target triplets | Select the triplets which should be located in the target mRNA and used for generating sensors. The triplet types differ in their efficiency for evoking a sensor activation (A-to-I editing). A rough ranking (higher to lower efficiency) would be: CCA≈GCA>UCA≫CAA≫CUA>ACA>CCU≈CGA>CCC. See [2] for more information. (default: CCA, GCA, UCA, CAA)
Target regions | Select the transcript regions in which targetable triplets should be located. Sensors in the 3'UTR are more efficient than sensors in the CDS or 5'UTR. See [1] for more information. (default: 3'UTR)
Run BLAST | Choose whether or not BLAST queries should be run to identify potential off-target transcripts. (default: True)
Only include hits overlapping with target triplet | Only include a BLAST hit if the alignment overlaps with the central triplet of the query sequence. This reduces the number of reported off-targets that have a low chance of activating the sensor, but may increase the risk of false negatives. (default: False)
Word size | BLAST-internal parameter. Number of exactly matching nucleotides that the BLAST algorithm uses to find matches. (default: 7)
E-value threshold | BLAST-internal parameter. Significance threshold for an alignment. (default: 0.05)
Min. percent identity | Minimum required percentage of exactly matching nucleotides in the alignment. (default: None)
Min. percent query coverage | Minimum required percentage of the query sequence covered by the alignment. (default: None)
Filter results by taxonomic ID | BLAST-internal parameter. Only report BLAST hits associated with the specified taxonomic ID. This limits the reported BLAST hits to a given organism, which is most likely the desired behavior. The taxonomic ID has to be represented in the BLAST database, otherwise this will produce an error. Taxonomic IDs are integer numbers and can be looked up in the NCBI Taxonomy database. (default: 9606 (Homo sapiens))
Path to BLAST database directory | Full path to the directory containing the BLAST database files.
BLAST database name | Name of the BLAST database. The name usually corresponds to the filename stem of the database files, e.g. "refseq_select_rna" for the RefSeq Select RNA database.
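To make the BLAST-related options more concrete, the sketch below shows how they could map onto a blastn command line and how the "overlapping with target triplet" filter could be evaluated. The flags used (-word_size, -evalue, -perc_identity, -qcov_hsp_perc, -taxids, -outfmt) are regular blastn options, but whether DVSensor passes exactly these flags is an assumption; both function names are hypothetical.

```python
def build_blastn_command(query_fasta, db_name, word_size=7, evalue=0.05,
                         perc_identity=None, query_coverage=None, taxid=9606):
    """Assemble a blastn argument list from the options in the table above."""
    cmd = [
        "blastn", "-query", str(query_fasta), "-db", db_name,
        "-word_size", str(word_size), "-evalue", str(evalue),
        # Tabular output: subject accession, e-value, bit score, query start/end.
        "-outfmt", "6 sacc evalue bitscore qstart qend",
    ]
    if perc_identity is not None:
        cmd += ["-perc_identity", str(perc_identity)]
    if query_coverage is not None:
        cmd += ["-qcov_hsp_perc", str(query_coverage)]
    if taxid is not None:
        cmd += ["-taxids", str(taxid)]
    return cmd


def overlaps_central_triplet(qstart, qend, query_length=123):
    """True if the 1-based alignment range [qstart, qend] covers any base of the central triplet."""
    first = (query_length - 3) // 2 + 1  # position 61 for a 123-nt query
    last = first + 2                     # position 63
    return qstart <= last and qend >= first
```

The returned list can be passed to subprocess.run with the database directory as working directory. With the defaults shown here, a hit aligned to query positions 1 to 60 would be discarded by the overlap filter, whereas a hit covering positions 1 to 61 would be kept.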

5. Run the analysis and export the results

The application will locate the selected triplets inside the selected transcript regions, and generate 123 bp sensors centered on those triplets. If the "run blast" option was selected, the trigger sequences (the segments of the target mRNA that are centered around a given triplet) will be queried against the specified database using BLAST to identify potential mRNA transcripts that may produce an off-target sensor activation. The results are reported in a table and are explained below in more detail.
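The triplet search described above can be sketched in a few lines. This is an illustration only, not the code used by DVSensor: find_triggers is a hypothetical function, positions are 1-based, and skipping triplets that lie too close to a transcript end for a full 123-nt window is an assumption of the sketch. Deriving the actual sensor sequence from a trigger follows the design rules in [1] and is not shown here.

```python
def find_triggers(mrna, triplets, region_start, region_end, length=123):
    """Yield (position, triplet, window_start, window_end, trigger) tuples.

    `position` is the 1-based index of the first nucleotide of the triplet.
    `region_start` and `region_end` are 1-based inclusive bounds of the region
    to search; the triplet must lie entirely inside that region.
    """
    mrna = mrna.upper().replace("T", "U")
    wanted = {t.upper().replace("T", "U") for t in triplets}
    flank = (length - 3) // 2  # 60 nt on each side of the triplet for length 123
    for i in range(region_start - 1, region_end - 2):
        triplet = mrna[i:i + 3]
        if triplet not in wanted:
            continue
        start, end = i - flank, i + 3 + flank  # 0-based, end exclusive
        if start < 0 or end > len(mrna):
            continue  # no room for a full window (assumption: such triplets are skipped)
        yield i + 1, triplet, start + 1, end, mrna[start:end]
```

Each yielded trigger is the segment that would be used as the BLAST query, and its window start and end correspond to the values reported in the "Range" column of the results.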


Results page

Column | Explanation
Position | Position of the targeted triplet (first nucleotide) in the target mRNA.
Triplet | Triplet type (e.g. "CCA")
Region | Transcript region in which the triplet is located (e.g. "3UTR")
Range | Start- and end-positions of the target mRNA segment targeted by the sensor.
%GC | GC-fraction of the sensor sequence.
In-frame start/stop codons | Number of start- and stop-codons in the sensor which are in-frame with the central stop codon that had to be edited (otherwise the sensor would not work as intended, see [1]).
Potential off-targets | List of potential off-target mRNA sequences identified by the BLAST search. The results are reported in the following format (or reported as "N.A." if the BLAST feature was deselected): <accession number>(E: <e-value> ; Score: <alignment score> ; Cov: <query coverage %>); ... Note that the reported BLAST hits are not decisively identified as being able to activate the sensor, but only as potential off-targets which warrant closer examination.
Sensor (5->3) | RNA-sequence of the sensor, 5'-to-3' orientation (123 bp).
Trigger (5->3) | RNA-sequence of the target mRNA segment targeted by the sensor, 5'-to-3' orientation (123 bp).
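Two of the reported quantities, the GC fraction and the number of in-frame start/stop codons, are easy to recompute for a given sensor sequence. The sketch below shows one possible way to do so; the function names are hypothetical, and it assumes that "in-frame" means sharing the reading frame of the codon at the center of the sensor, with that central codon itself excluded from the count.

```python
START_CODONS = {"AUG"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def gc_fraction(seq):
    """Fraction of G and C nucleotides in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def count_in_frame_codons(sensor):
    """Count start and stop codons in the reading frame of the central codon.

    The central codon itself is not counted. Returns (n_start, n_stop).
    """
    sensor = sensor.upper().replace("T", "U")
    center = (len(sensor) - 3) // 2  # 0-based index of the central codon (60 for 123 nt)
    n_start = n_stop = 0
    for i in range(center % 3, len(sensor) - 2, 3):
        if i == center:
            continue
        codon = sensor[i:i + 3]
        n_start += codon in START_CODONS
        n_stop += codon in STOP_CODONS
    return n_start, n_stop
```

A sensor with a count of (0, 0) contains no additional start or stop codons in the frame of the central codon.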


In the upper left corner, the target mRNA is depicted as a simplified ideogram, with the sizes of the different regions corresponding to their relative lengths. Selecting an entry in the table displays a red window in the ideogram, which indicates the region in the mRNA that is targeted by that particular sensor.

mRNA ideogram

The "cancel" button in the upper right corner allows you to cancel the running analysis. When the analysis is finished or if it was cancelled, a new button appears, which allows you to download the output table as a CSV file. Clicking the "home" button in the lower left corner brings you back to the start page.
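After exporting the table, the content of a "Potential off-targets" cell can be split into individual hits for further filtering. The following sketch parses the format given in the column description with a regular expression that tolerates varying whitespace; parse_off_targets is a hypothetical helper, and the accession numbers in the usage example are made up.

```python
import re

# One hit looks like: <accession>(E: <e-value> ; Score: <score> ; Cov: <coverage>)
HIT_PATTERN = re.compile(
    r"(?P<accession>[^\s;()]+)\s*\(\s*E:\s*(?P<evalue>[^;]+?)\s*;"
    r"\s*Score:\s*(?P<score>[^;]+?)\s*;\s*Cov:\s*(?P<cov>[^)]+?)\s*\)"
)


def parse_off_targets(cell):
    """Parse one 'Potential off-targets' cell into a list of dictionaries."""
    if cell.strip() in ("", "N.A."):
        return []
    hits = []
    for match in HIT_PATTERN.finditer(cell):
        hits.append({
            "accession": match.group("accession"),
            "evalue": float(match.group("evalue")),
            "score": float(match.group("score")),
            # Strip a trailing percent sign, in case the coverage is written with one.
            "coverage": float(match.group("cov").rstrip("% ")),
        })
    return hits
```

For example, parse_off_targets("NM_000001.1(E: 1e-20 ; Score: 110.0 ; Cov: 95.0); XM_000002.2(E: 0.003 ; Score: 40.1 ; Cov: 30.0)") returns two dictionaries, which can then be sorted by e-value or coverage.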

Cancel and Export Buttons

Troubleshooting



Solution: Restart the application and wait a few seconds. It may take a while for the page to load the first time.

Solution: The pages may look different depending on the screen size of your device. Use the zoom-in/out feature of your browser to adjust the content size.

Solution: Click the "Allow" button. The application does not send any data over the internet, but needs to open a locally running web server on your computer (which only you have access to). This is why the firewall might block the application.

Solution: You have most likely reloaded the page by accident, used your browser's navigation buttons, or otherwise shut down the app. Restart the application as usual.

Solution: Click the "go to main menu" button.

Solution: Make sure that the file you are uploading is a valid FASTA or GenBank record. This error usually indicates an error with the file encoding.

Solution: The file / sequence size for sequence record uploads is limited to 500KB, which is more than enough for mRNA records. Make sure that your file does not contain more than one sequence record.

Solution: Make sure that the file you are uploading / the sequence record you are entering is a valid FASTA or GenBank record. Also make sure that only one sequence record is contained within the file, as FASTA files can have multiple sequence records.

Solution: Make sure you have a working system installation of BLAST, in particular the blastn program which comes bundled with the NCBI BLAST+ executables. Run blastn -version in a terminal / command line and see if it returns the blastn version number. Note that you can use DVSensor without the BLAST feature, but you won't receive a report on potential off-targets.

Solution: This could have several reasons, all indicating a problem with the specified BLAST database. One reason could be that you entered the wrong database name. Make sure the name matches the filename stem of the files inside the database directory. You should never rename, delete or add new files in the database directory. If you still get an error, your blast database may be corrupted or there may be another issue. Use the blastdbcheck tool to test the database integrity by running this command inside the database directory: blastdbcheck -db <database name>.

Solution: Make sure you have a working system installation of BLAST, in particular the blastdbcheck program which comes bundled with the NCBI BLAST+ executables. Run blastdbcheck -version in a terminal / command line and see if it returns the blastdbcheck version number.

Solution: Make sure you have entered a correct path to the directory containing the BLAST database files.

Solution: Make sure you entered a correct taxonomic ID and that the database contains sequences associated with that taxonomic ID. If this does not resolve the error, check whether your BLAST setup works correctly by running a test query against your database: blastn -db <database_name> -query <query.fasta> -taxids <taxonomic ID> (for the RefSeq Select RNA database, <database_name>=refseq_select_rna). This command should be run in a terminal / command line inside the database directory. Choose a FASTA file for <query.fasta> with a record that you are sure is contained in the database.

Solution: On Chromium-based browsers (such as Google Chrome or Brave), you may need to allow automatic downloads for that page. Click the icon in the URL bar that indicates a blocked download and select "Always allow".


Technical Notes


The software was written in Python 3.11 using the nicegui library version 1.3.9 and the Biopython package version 1.81. It has been tested on these platforms:

  • Manjaro Linux x86_64, Firefox v118.0.1, BLAST+ v2.14.1
  • Manjaro Linux x86_64, Brave browser v1.58.131, BLAST+ v2.14.1
  • Windows 10, Chrome v117.0.5938.150, BLAST+ v2.14.1
  • Windows 10, Firefox v118.0.1, BLAST+ v2.14.1
  • Windows 10, Brave browser v1.58.131, BLAST+ v2.14.1

The application executes IO-blocking operations in a separate thread using the built-in threading module, which may cause issues when the application is not terminated properly. A better implementation could make use of the run.cpu_bound or run.io_bound functions which are available in newer versions of nicegui.

By default, the local web server runs on port 8080. This cannot be changed using a configuration file, but this is usually not required for a locally running web app anyway. If you want to host this app on a web server, change the port in the source code as described here.
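The general pattern mentioned above, running blocking work in a separate thread while keeping it cancellable, can be illustrated with the standard library alone. This is a generic sketch and not the implementation used in DVSensor; run_cancellable and start_analysis are hypothetical names.

```python
import threading


def run_cancellable(items, process, cancel_event, results):
    """Process items one by one, stopping early once cancel_event is set."""
    for item in items:
        if cancel_event.is_set():
            break
        results.append(process(item))


def start_analysis(items, process):
    """Start the work in a daemon thread; return (thread, cancel_event, results)."""
    cancel_event = threading.Event()
    results = []
    thread = threading.Thread(
        target=run_cancellable,
        args=(items, process, cancel_event, results),
        daemon=True,  # do not keep the interpreter alive if the app exits
    )
    thread.start()
    return thread, cancel_event, results
```

A "cancel" button handler would call cancel_event.set(); the worker then stops before starting the next item, and the partial results collected so far remain available for export.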


Authors and acknowledgement


This software was developed by Daniel Prib (contact: dpx0@mailbox.org, github: https://github.com/dpx0). The following image assets used in the software were obtained from flaticon.com under a free-to-use license:


  • [1] R. V. Gayet et al., “Autocatalytic base editing for RNA-responsive translational control,” Nat Commun, vol. 14, no. 1, p. 1339, Mar. 2023, doi: 10.1038/s41467-023-36851-z.
  • [2] L. Qu et al., “Programmable RNA editing by recruiting endogenous ADAR using engineered RNAs,” Nat Biotechnol, vol. 37, no. 9, Art. no. 9, Sep. 2019, doi: 10.1038/s41587-019-0178-z.