Our biosensor project utilizes the VbrK/VbrR two-component system (TCS) of Vibrio parahaemolyticus. In 2015, Li et al. showed that the VbrK/VbrR histidine kinase/response regulator pair controls the expression of a β-lactamase and thereby induces antibiotic resistance in V. parahaemolyticus. In this system, VbrR regulates the expression of blaA (vpa0477), which encodes a class A chromosomal carbenicillin-hydrolyzing (CARB) β-lactamase (Chiou et al., 2015). The blaA gene is located on the reverse strand of chromosome 2 (position 476708 ← 477559) of the V. parahaemolyticus genome (RIMD 2210633). However, the exact promoter region of blaA and the binding site of the response regulator VbrR are unknown.

Figure 1: Visual Overview of the VbrR binding site and blaA's promoter prediction. VbrR interacts with the σ-factor and increases its binding affinity to the promoter, following a recruitment of an RNA-polymerase that initiates the transcription at the transcription start site (TSS).

Bacterial promoters are located upstream of the transcription start site (TSS) and are essential for gene expression. In these regions, σ-factors such as σ70 bind and recruit the RNA polymerase, which initiates transcription. The structure of σ70 binding sites in promoters is well known and is usually defined by a -35 and a -10 region relative to the TSS (cf. Fig. 1) (Nordheim et al. 2018). Response regulators of the OmpR/PhoB superfamily, to which VbrR belongs (Cho et al. 2021), can interact with the σ70 factor and enhance its DNA-binding affinity (Canals et al. 2021, Lou et al. 2023), thereby controlling the expression of the blaA gene.
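
To make the -35/-10 architecture concrete, the following minimal Python sketch scans a sequence for pairs of boxes resembling the E. coli σ70 consensus (TTGACA and TATAAT). The function name, the spacer range of 15 to 19 bp, and the mismatch threshold are assumptions chosen for illustration; this is a simple motif search and not the method used for the predictions on this page.

```python
def hamming(a, b):
    """Number of positions at which two equally long strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_sigma70(seq, box35="TTGACA", box10="TATAAT",
                 spacers=range(15, 20), max_mismatch=4):
    """Return (pos35, pos10, spacer, mismatches) tuples, best match first.

    Positions are 0-based start indices of the -35 and -10 boxes.
    Consensus boxes, spacer range and threshold are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(box35) + 1):
        mm35 = hamming(seq[i:i + len(box35)], box35)
        for spacer in spacers:
            j = i + len(box35) + spacer
            if j + len(box10) > len(seq):
                break  # longer spacers would run past the sequence end
            total = mm35 + hamming(seq[j:j + len(box10)], box10)
            if total <= max_mismatch:
                hits.append((i, j, spacer, total))
    return sorted(hits, key=lambda h: h[3])


# Toy sequence with a perfect consensus promoter (hypothetical, not blaA)
toy = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGGGGG"
print(scan_sigma70(toy)[0])
```

Natural promoters rarely match the consensus perfectly, which is one reason why trained classifiers are usually preferred over plain motif scans.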

Our project aims to establish the VbrK/VbrR system in E. coli to detect the presence of β-lactam antibiotics, with blaA replaced by a β-galactosidase reporter gene. Our plasmid designs would therefore benefit from an accurate estimate of the promoter and of VbrR's binding site. Since bacterial promoters are typically compact and lie directly upstream of the gene they control, we searched for the promoter and VbrR's binding site in the region between the gene start of blaA and 300 bp upstream (477560 ← 477860). To simplify the visualization, we inverted the sequence of the promoter and blaA and selected the first base pair (bp) of the promoter as the overall first bp.
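
Because blaA lies on the reverse strand, its upstream region has higher genome coordinates than the gene and must be reverse-complemented to read in the direction of transcription. The sketch below shows one way to do this in Python. The helper names and the inclusive, 1-based coordinate convention are assumptions for illustration, and the toy chromosome is not real sequence data.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_of_minus_gene(chromosome, gene_end, length=300):
    """Upstream window of a reverse-strand gene, read 5'->3' on that strand.

    chromosome : forward-strand sequence as a string
    gene_end   : 1-based highest coordinate of the gene (477559 for blaA)
    length     : window size in bp

    The last base of the returned string is directly adjacent to the gene
    start. Inclusive 1-based coordinates are an assumption of this sketch.
    """
    first = gene_end + 1          # first upstream base, 1-based
    last = gene_end + length      # last upstream base, 1-based inclusive
    window = chromosome[first - 1:last]
    return reverse_complement(window)


# Toy example: gene occupies positions 1-4, take 3 bp upstream of it
toy_chromosome = "AAAAACGTTT"
print(upstream_of_minus_gene(toy_chromosome, gene_end=4, length=3))
```

With the real chromosome 2 sequence loaded as a string, calling the function with gene_end=477559 would return the region analyzed here, with the gene start at the right-hand end.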

Promoter Prediction

A pragmatic strategy to detect promoters is to employ one of the many available promoter prediction tools. While early tools primarily searched for specific sequence motifs, contemporary machine-learning approaches integrate data from diverse biological sources (Cassiano and Silva-Rocha, 2020). We opted for SAPPHIRE (Coppens and Lavigne, 2020), a neural-network-based classifier tailored for σ70 promoter prediction in Pseudomonas and Salmonella, because V. parahaemolyticus is related to both: all three genera belong to the class Gammaproteobacteria (Gupta et al. 2016, Gomila et al. 2015, Liu et al. 2022). It is important to note that machine-learning tools like SAPPHIRE base their predictions on a finite dataset, in this instance known promoters from Pseudomonas and Salmonella, so predictions for V. parahaemolyticus should be treated as estimates.

Figure 2: SAPPHIRE promoter prediction results. A, B: Predicted promoter regions annotated with p-value and approximate transcription start site (TSS). The black solid lines represent the TSS of blaA, located at 300 bp. The black circles indicate the sequences selected from the individual experiments. C: blaA and its potential promoter region. Yellow bars indicate the promoter sequences predicted by both experiments (Seq1, Seq2), resulting from the intersection of the selected Pseudomonas- and Salmonella-related promoter sequences.
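
The intersection step mentioned in the caption can be pictured as an interval overlap. The Python sketch below keeps only those stretches that are covered by a prediction from both models. The function name and all coordinates in the example are hypothetical, and the exact selection criteria behind Fig. 2 may differ.

```python
def intersect(intervals_a, intervals_b, min_overlap=1):
    """Return the overlapping parts of two lists of (start, end) intervals.

    Coordinates are inclusive. min_overlap is the smallest overlap in bp
    that is still reported (an illustrative parameter).
    """
    shared = set()
    for a_start, a_end in intervals_a:
        for b_start, b_end in intervals_b:
            start = max(a_start, b_start)
            end = min(a_end, b_end)
            if end - start + 1 >= min_overlap:
                shared.add((start, end))
    return sorted(shared)


# Hypothetical predictions from the two models (positions in the 300 bp region)
pseudomonas_hits = [(10, 50), (200, 240)]
salmonella_hits = [(30, 70), (100, 120), (220, 260)]
print(intersect(pseudomonas_hits, salmonella_hits))
```

Regions predicted by only one model, such as the interval (100, 120) in this toy example, are dropped.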

These computational predictions narrow down the candidate promoter regions (Seq1 and Seq2 in Fig. 2C) that may regulate our reporter gene.

VbrR Binding Site Prediction

With candidate promoter sites for σ-factor binding in hand, we next sought the binding site of VbrR. The VbrK/VbrR TCS is also involved in repressing the type III secretion system 1 (T3SS1). After activation of VbrK through S-nitrosylation of a cysteine residue, VbrR down-regulates the expression of exsC, a positive regulator of T3SS1, by binding directly to the exsC promoter (Fig. 3). In this case, VbrR competes with a σ-factor of the σ70 family (Gu et al., 2020). Gu and colleagues determined the VbrR binding site in the exsC promoter.

Figure 3: Identified binding site of VbrR by Gu et al. in the exsC promoter. The figure is adapted from Gu et al. (Gu et al., 2020). The black arrow indicates the TSS of exsC.

Under the assumption that the VbrR binding site upstream of the blaA promoter is similar to this 49 bp binding site, we performed local alignments of the 49 bp sequence against our 300 bp sequence containing the putative binding site.

Formula.

Local alignments determine whether two sequences (or subsequences of each) closely match one another. To this end, every possible way of matching the two sequences is scored according to the scoring function S(ai, bj), where ai and bj denote the nucleotides observed at positions i and j of the sequences a and b, respectively. Since we were interested not only in the best hit, i.e. the alignment with the highest score, we took the 1000 highest-scoring alignments and clustered them whenever the regions covered by two alignments were similar. An alignment that extended more than 20 bp beyond either side of the prior sequence was excluded from that cluster.
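
As an illustration of local alignment scoring, the Python sketch below implements a basic Smith-Waterman recursion and reports every positive-scoring alignment end point, best first, so that suboptimal hits can be inspected as well. The scoring values (match +2, mismatch -1, linear gap penalty -2) and the function name are assumptions for this sketch. The scoring function and tool actually used for Fig. 4 are not reproduced here.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Local alignment scores of query against target.

    Returns a list of (score, target_start, target_end) for every matrix
    cell with a positive score, sorted by descending score. Target
    coordinates are 0-based with an exclusive end. Scoring values are
    illustrative assumptions.
    """
    n, m = len(query), len(target)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # origin[i][j]: target index where the alignment ending at (i, j) starts
    origin = [list(range(m + 1)) for _ in range(n + 1)]
    hits = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == target[j - 1] else mismatch
            diag = score[i - 1][j - 1] + s
            up = score[i - 1][j] + gap      # gap in the target
            left = score[i][j - 1] + gap    # gap in the query
            best = max(0, diag, up, left)
            score[i][j] = best
            if best == 0:
                origin[i][j] = j            # a new alignment may start here
            elif best == diag:
                origin[i][j] = origin[i - 1][j - 1]
            elif best == up:
                origin[i][j] = origin[i - 1][j]
            else:
                origin[i][j] = origin[i][j - 1]
            if best > 0:
                hits.append((best, origin[i][j], j))
    hits.sort(key=lambda h: -h[0])
    return hits


# Toy example: the query occurs exactly once inside the target
print(smith_waterman("ACGT", "TTACGTTT")[0])
```

Taking the first 1000 entries of the returned list would correspond to the 1000 highest-scoring alignments described above, under the assumptions of this sketch.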

Results are shown in Fig. 4. In total, our procedure created 5 clusters (Fig. 4A). The first 4 clusters hold about 90% of all alignments (Fig. 4B) and were therefore selected as potential VbrR binding sites (Fig. 4C).
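
The grouping of alignments could be sketched as a greedy clustering in Python. In this interpretation, each alignment joins the first cluster whose top-scoring member it matches to within 20 bp at both ends, and otherwise opens a new cluster. This reading of the 20 bp rule, the function name, and the example hits are assumptions for illustration only.

```python
def cluster_alignments(hits, top_n=1000, tolerance=20):
    """Greedily cluster alignments by the target region they cover.

    hits      : iterable of (score, start, end) tuples
    top_n     : number of best-scoring alignments to keep
    tolerance : maximum shift in bp allowed at either end, relative to
                the cluster's top-scoring alignment (assumed rule)
    """
    ranked = sorted(hits, key=lambda h: -h[0])[:top_n]
    clusters = []
    for score, start, end in ranked:
        for cluster in clusters:
            _, seed_start, seed_end = cluster[0]
            if (abs(start - seed_start) <= tolerance
                    and abs(end - seed_end) <= tolerance):
                cluster.append((score, start, end))
                break
        else:
            # no existing cluster covers a similar region
            clusters.append([(score, start, end)])
    return clusters


# Hypothetical alignment hits (score, start, end)
example = [(40, 100, 149), (38, 105, 150), (30, 200, 249),
           (25, 110, 160), (10, 190, 240)]
for c in cluster_alignments(example):
    print(len(c), "alignments, top score", c[0][0])
```

Because the hits are processed in order of descending score, the first member of each cluster carries the top score of that cluster.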

Figure 4: Local alignment results using the 49 bp sequence identified by Gu et al. (Gu et al. 2020). A: Best 1000 local alignments grouped into 5 clusters. The top score refers to the best alignment included in the cluster. B: Number of sequences belonging to each cluster. C: Position of clusters 1 to 4 in the potential promoter region.

A subsequent study by Hong and colleagues (Hong et al. 2022) resolved the crystal structure of the VbrR-DNA complex. Similar to other response regulators of the OmpR/PhoB superfamily, VbrR exhibits an N-terminal receiver domain (RD) and a C-terminal DNA-binding domain (DBD), and exists mainly as a dimer in solution. Hong et al. identified two 7 bp DNA half-sites, S1 (TTCTAAT) and S2 (TTCATCG), within the VbrR binding site that are bound by the two DBD units of the dimer. Notably, these two sites are separated by only 2 bp.

In addition to aligning the 49 bp binding site, we mapped the two half-site motifs against our sequence to identify possible DNA half-sites that allow binding of VbrR. To that end, we again used alignments but evaluated only the gap size between the two half-sites and the number of mismatches. A gap size of up to 6 bp and a maximum of 5 mismatches were allowed. No binning was applied at this stage. Fig. 5A lists all hits by gap size and number of mismatches. Since no perfect alignment was found, we favored fewer mismatches over smaller gap sizes. Accordingly, the locations of sequences 4 and 6 were selected as potential binding sites (Fig. 5B), as they also overlapped with the binding sites determined in Fig. 4.
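
A half-site search of this kind can be sketched in Python as a sliding-window scan. The sketch assumes that S1 precedes S2 on the same strand, that gaps from 0 to 6 bp are allowed, and that the mismatch limit applies to both half-sites combined. These are assumptions for illustration, as is the function name, and the toy sequence is not the blaA region.

```python
S1, S2 = "TTCTAAT", "TTCATCG"  # half-sites reported by Hong et al.


def count_mismatches(a, b):
    """Number of positions at which two equally long strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_half_sites(seq, s1=S1, s2=S2, max_gap=6, max_mismatches=5):
    """Find S1-gap-S2 arrangements in seq.

    Returns (position, gap, mismatches) tuples sorted by fewest
    mismatches, then smallest gap. position is the 0-based start of
    the S1 match.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(s1) + 1):
        mm1 = count_mismatches(seq[i:i + len(s1)], s1)
        for gap in range(max_gap + 1):
            j = i + len(s1) + gap
            if j + len(s2) > len(seq):
                break  # S2 would run past the sequence end
            total = mm1 + count_mismatches(seq[j:j + len(s2)], s2)
            if total <= max_mismatches:
                hits.append((i, gap, total))
    return sorted(hits, key=lambda h: (h[2], h[1]))


# Toy sequence containing both half-sites separated by 2 bp
toy = "GGGG" + S1 + "AC" + S2 + "GGGG"
print(scan_half_sites(toy)[0])
```

Sorting by mismatches before gap size mirrors the preference stated above for fewer mismatches over smaller gaps. A complete search would also scan the reverse complement of the sequence.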

Figure 5: Alignment results of the two half-site sequences S1 and S2 identified by Hong et al. (Hong et al. 2022). A: Number of mismatches of the 6 alignments between the aligned half-sites and the promoter region. The gap size indicates the number of bp between the two half-sites. B: Position of the 6 half-site alignments in the promoter region. Pinkish bars indicate the selected binding sites.

These candidate sites provide a starting point for a deeper understanding of VbrR's binding behavior.

Synergizing Results: Integrating Promoter and Binding Sites Analysis

Finally, we integrated the gathered data. The exact interaction between VbrR and the σ70 factor remains unknown. However, by combining several studies on VbrR's superfamily with the promoter sequences predicted by SAPPHIRE, we can propose VbrR binding sites and their potential function. All predicted promoter regions and VbrR binding sites are shown in Fig. 6.

By referring to the knowledge from the OmpR/PhoB superfamily and the direct interaction with the binding sites, we are able to classify our suggested binding sites of VbrR into two categories: enhancing (green), located upstream of the -35 region, and σ70-activating (orange), directly located in the -35 and -10 regions. We predict that the promoter region is located within sequence positions 477560 ← 477668 (108 bp), and the enhancing region is positioned at 477707 ← 477761 (54 bp).
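
The two categories can be expressed as a simple interval rule, sketched here in Python. A site that overlaps the predicted promoter region is labeled σ70-activating, and a site that lies entirely upstream of it is labeled enhancing. Because blaA is on the reverse strand, upstream corresponds to higher genome coordinates. The function name, the labels, and the second example site are hypothetical.

```python
def classify_site(site, promoter, minus_strand=True):
    """Classify a binding site relative to a promoter region.

    site, promoter : (start, end) genome coordinates, inclusive
    minus_strand   : True if the gene is on the reverse strand, in which
                     case upstream means higher coordinates
    """
    s_start, s_end = site
    p_start, p_end = promoter
    if s_start <= p_end and p_start <= s_end:
        return "sigma70-activating"      # overlaps the -35/-10 region
    if minus_strand:
        is_upstream = s_start > p_end
    else:
        is_upstream = s_end < p_start
    return "enhancing" if is_upstream else "downstream"


PROMOTER = (477560, 477668)              # predicted promoter region
print(classify_site((477707, 477761), PROMOTER))   # predicted enhancing region
print(classify_site((477600, 477648), PROMOTER))   # hypothetical site
```

The rule only encodes position. Whether a site actually enhances or activates transcription has to be shown experimentally.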

Figure 6: Combination of the results of all three approaches (SAPPHIRE, local alignments, half-sites). Results show a promoter region (orange) and a potential enhancer region (green).

With those predictions we are now able to improve our plasmids in terms of size and functionality. However, experimental work is required to validate the predicted binding sites, promoter regions, and their function. We plan to perform Electrophoretic Mobility Shift Assays (EMSA) that would allow us to determine whether VbrR binds to specific DNA fragments. To validate the enhancing potential of VbrR binding sites, we will conduct in-gel fluorescence assays employing GFP, aiming to discern any observable increase in protein expression upon their inclusion.