Model

Production of immunoglobulins

Although immunoglobulins are produced by B cells, Ig synthesis and release depends on the reciprocal interaction of many different cell types of the immune system. Induction of the innate response is essential, as well as the activation and proliferation of T cells and B cells.

However, to evaluate the Ig response, we focused on T cells and B cells, mainly because the innate cellular response counteracts pathogens at a predominantly local level, and leads to the eventual elimination of them in a time frame probably longer compared to the fast neutralization of the humoral response.

The software suggested by professor Viola[1] required us to get in touch with the owner company, explaining our needs and not being sure if they could be satisfied on time. For this reason, we opted for other tools more quick to use, but still as reliable as possible.

For this purpose, we used the Immune Epitope Database & Tools (IEDP)[2] and the epitope prediction algorithm associated.

Using the IEDB database

First of all, we researched if there were any known linear epitopes of any M13 deposited protein in the database. The IEDB database had only G3P and G8P proteins registered, i.e. we could not assess the immunogenicity of the other proteins. The specified parameters of our query are listed below.

  • Epitope: any
  • Assay: T cell, B cell, MHC ligand;
  • Outcome: positive;
  • Organism: Inovirus (M13 bacteriophage);
  • Antigen: any material entity of the particle;
  • MHC restriction: any;
  • Host: Human;
  • Disease: any.

According to the software, the only immunogenic linear epitope known of the virus is TFMYVFSTF[3], which is a fragment (407-415aa) of the G3P protein.

Using prediction softwares

After the research among known data, we proceeded with a new query using prediction softwares listed in the IEDB website to look for the most probable not-known epitopes of the viral proteins. Since there were many different software, we selected a few of them based on our purpose and interests.

T cell epitopes prediction

Firstly, we looked into how the phage proteins would be processed by the immune system.

MHC-I processing prediction

The MHC class I complex is present on the surface of almost any human cell, except for immune cells. It binds intracellular pathogen peptides and exposes them to CD8+ T cells. Although MHC-I analysis is not very relevant for our project since the phage does not infect any human cell (i.e. it is very unlikely that a phage peptide would be exposed on MHC-I), we decided to look for MHC-I epitopes anyway because of the much higher accuracy of those predictions, compared to the MHC-II epitopes predictions.

NetCTL analysis

NetCTL[4][5] is a predictor of T cell epitopes along a protein sequence. This software requires a protein sequence as input data, giving in output any possible linear epitope with relative MHC binding affinity, cleavage affinity, TAP transport score and an overall prediction score. Thus, we use the software to select the most probable epitopes exposed on MHC-I proteins on the cell surface. Since the algorithm required to select an HLA supertype family, here is a table that lists HLA supertype classification[6]. We used the default A1 supertype value.

Parameters used in the queries:

  • Prediction method: NetCTL;
  • Weight on C terminal cleavage: 0.15;
  • Weight on TAP transport efficiency: 0.05;
  • Select Supertype: A1 supertype;
  • Threshold: 0.75.

Explaining output values:

  • #: residue position;
  • Peptide Sequence: amino acid residue;
  • Predicted MHC Binding Affinity: the value is given as 1–log50K(aff), where log50K is the logarithm with base 50.000, and aff is the affinity in nM units;
  • Rescale Binding Affinity: the predicted binding affinity is normalized by the 1st percentile score;
  • C Terminal Cleavage Affinity: predicted proteasomal cleavage score;
  • TAP Transport Efficiency: predicted TAP transport efficiency;
  • Prediction Score: overall prediction score.

G3P protein

Prediction score vs aa position, the red line represents the threshold.

G6P protein

Prediction score vs aa position, the red line represents the threshold.

G7P protein

Prediction score vs aa position, the red line represents the threshold.

G8P protein

Prediction score vs aa position, the red line represents the threshold.

G9P protein

Prediction score vs aa position, the red line represents the threshold.

MHC-II binding prediction

The MHC class II complex is present on the surface of antigen presenting cells. It binds internalized pathogen peptides and exposes them to CD4+ T cells. Since the phage tropism spares with high confidence any human cells, MHC-II antigen processing and binding affinity is the focus of our interest.

We analyzed the MHC-II processing of the protein, using MHC-II binding prediction software[7-17].

Since the MHC-II binding epitopes prediction tool is a bit more complex than the previous one, we add some more explanation on how it works for a more comprehensive reading the tool:

  • breaks the sequence into all possible peptides of chosen lengths;
  • predicts the binding affinity for each peptide based on the method;
  • compares the predicted affinity to that of a large set of randomly selected peptides;
  • assigns a percentile rank depending on individual predicted affinity;
  • through the consensus method, picks the median rank of the methods used - consensus percentile rank.

Parameters used in the queries:

  • Prediction method: NetMHClIpan 4.1 EL (recommended epitope predictor-2023.09);
  • Select species/locus: Human, HLA-DR;
  • Select alfa & beta chains separately if applicable: no;
  • Select full HLA reference set: no;
  • Select 7-allele HLA reference set: yes;
  • Select length(s): default (15).
Explaining output values

The predicted output is given in units of IC50 (nM) for the combinatorial library and SMM_align. Therefore a lower number indicates higher affinity. As a rough guideline, peptides with IC50 values <50 nM are considered high affinity, <500 nM intermediate affinity and <5000 nM low affinity. Most known epitopes have high or intermediate affinity. Some epitopes have low affinity, but no known T-cell epitope has an IC50 value greater than 5000.

For each peptide, a percentile rank for each of the three methods (combinatorial library, SMM_align and Sturniolo) is generated by comparing the peptide's score against the scores of five million random 15 mers selected from SWISSPROT database. And the adjusted percentile rank is the percentile rank adjusted based on the frequency of peptide lengths.

A small numbered percentile rank indicates high affinity. The median percentile rank of the three methods were then used to generate the rank for consensus method. The data for percentile rank was updated in the release 2.22 by recalculating the percentile data with a consistent sample peptides datasets for all the methods.

The cutoff of percentile rank was set on 10.0 (percentile rank ≤ 10.0), as the tool instructions suggested. Since there are several possible alleles for MHC-II complex, the number of results is very high: each predicted epitope can bound more than one couple of alleles, and vice versa.

Proteins analyzed, with no. of results in brackets:

  • G3P protein (246)
  • G6P protein (53)
  • G7P protein (1)
  • G8P protein (33)
  • G9P protein (2)

Immunogenicity prediction

CD4 T cell immunogenicity prediction software analysis

We then used another software to assess what linear epitopes of the phage proteins are immunogenic for T cells. For this purpose, we used the CD4 T cell immunogenicity prediction software[18][19], mainly because it allowed us to assess the immunogenicity for all seven most common alleles in the population[20] for each query. The software requires the protein sequence as input; the user can predict the T cell immunogenicity using the 7-allele method[20], immunogenicity method and combined method (IEDB recommended). The combined method predicts the final score that combines the predictions from the 7-allele method and immunogenicity method. In the results, the "Immunogenicity Score" ranges from 0 to 100, with low values identifying more immunogenic peptides and high values non-immunogenic peptides. And the "Combined Score" is given by combining both HLA binding and the immunogenicity prediction which presumably incorporate the capacity of being recognized by TCR. And low value identifying high capacity of being recognized by TCR.

Parameters used in the queries:

  • Prediction: IEDB recommend (combined);
  • Sort Peptides by: position in protein;
  • Select maximum percentile rank threshold: 50.

Proteins analyzed, with no. of results in brackets:

  • G3P protein (9)
  • G6P protein (10)
  • G7P protein (0)
  • G8P protein (5)
  • G9P protein (1)

The G7P protein query was negative, as it did not show any results.

Comparative analysis of epitopes

In order to reach the highest reliability possible, we then compared the results of the two softwares (MHC-Il binding prediction tool and CD4 T cell immunogenicity prediction tool), and thus the predicted epitope through their respective parameter (MHC-II binding prediction and T-cell immunogenicity prediction).

Our assumption was that the two software evaluate two different processes of the human immune response. On one hand we have MHC processing, on the other hand we have the binding affinity between epitopes and T cell receptors, thus the grade of probability to activate the T cells. In fact, it was our belief that a comparative analysis between the results of two software could lead to a more reliable prediction.

The comparative analysis was conducted as follows:

  • we selected all positive 15mers and 9-mers (epitopes) peptides predicted by the MHC-II binding prediction software, each of which represented a possible epitope exposed on MHC-II complex;
  • we selected all positive 15-mers and 9-mers (epitopes) peptides predicted by CD4 T cell immunogenicity prediction software;
  • we then compared the two lists, looking for the no. of identical epitopes present in both lists.

G3P protein

G6P protein

G8P protein

G8P protein

B cell epitopes prediction

We used two different softwares to predict B cell epitopes namely DiscoTope and Elliscope. Whereas the former predicts only three dimensional epitopes, with the latter we were able to predict linear epitopes too.

DiscoTope

DiscoTope[21][22] is a method for predicting discontinuous epitopes from 3D structures of proteins in PDB format. The chain ID for all the viral proteins was the same: A. The used version of the software was 2.0.

Since the 3D protein structure is needed, we could analyze G3P and G8P only.

Explaining output values:

  • Chain ID:The chain id of the protein chain used in prediction (specified by the user)
  • Residue ID:PDB Residue id
  • Residue Name:Name of the residue
  • Contact Number:The residue contact number is the number of Cα atoms in the antigen within a distance of 10 Å of the residue's Cα atom. A low contact number correlates with localization of the residue close to the surface or in protruding regions of the antigen's structures.
  • Propensity Score:This score tells about the probability/tendency of being part of an epitope for that particular residue. The propensity is reflected in amino acid epitope log-odds ratios, which were calculated on a set of 75 antigens. The propensity score is calculated by sequentially averaging epitope log-odds ratios within a window of 9 residues. Then the scores are summed up based on the proximity in the 3D structure of the antigen. For any given residue, the sequentially averaged log-odds scores from all residues within 10Å are summed to give the propensity score.
  • Discotope Score:This score is calculated by combining the contact numbers with propensity score. DiscoTope score above the threshold value indicates positive predictions and that below the threshold value indicates negative predictions.

G3P protein

Discotope score vs aa position, the red line represents the threshold.

G8P protein

Discotope score vs aa position, the red line represents the threshold.

ElliPro

ElliPro[23] predicts linear and discontinuous antibody epitopes based on a protein antigen's 3D structure. ElliPro accepts as an input protein structures in PDB format.

Parameters used for the queries:

  • Minimum score: 0.5 (default);
  • Maximum distance (Angstrom): 6 (default).

G3P protein

equation

Residue score vs position, the red line represents the threshold

Predicted linear epitopes:

Predicted discontinuous epitopes:

Linearized epitopes:

We also tried to analyze G8P, but the tool did not work for still unknown reasons, even after several attempts.

Allergic reactions

Allergic reactions can cause significant damage in individuals, especially if occure outside the hospital. For this reason, we designed an allergic reaction analysis of the phage proteins. Our idea was to use the data obtained from the immunogenicity essay to predict the probability of an allergic reaction to the phage.

Using the IEDB database

To do so, we first search in the IEDB database for any known allergies to the phage, i.e. if the only certain epitope of the phage (G3P). The query had no positive results.

Using prediction softwares

At this point, we tried to analyze the predicted epitopes with the previous softwares.

Predicting through T-cell linear epitopes

For T cell epitopes, we decided to search for the one we found in both the MHC-II binding prediction tool and the CD4+ T cell binding prediction tool, i.e. the ones resulted from the comparative analysis between the two softwares.

As for the parameters, we applied the following:

  • Epitope: linear peptide (input);
  • Homology rate: 80%, then exact match if 80% positive;
  • Receptor: anytype;
  • Assay: T cell, B cell, MHC ligand;
  • MHC Restriction: any;
  • Host: human;
  • Disease: allergic.

G3P protein

equation

G6P protein

equation

G8P protein

equation

G9P protein

equation

Predicting through B-cell epitopes

Since DiscoTope results are shown as a DiscoTope score assigned to each amino acid, we could not rebuild the discontinuous epitopes easily. Thus, for B cell epitopes, we started analyzing the linear epitopes obtained from ElliPro. As a matter of fact, the prediction principle relies on two possible assumptions.

  1. Compositional theory: we consider that every 3-dimensional epitope is continuous, i.e. every fragment of the protein can represent an epitope.
  2. Positional theory: we consider that every 3-dimensional epitope can be either continuous or, more likely, discontinuous, due to the spatial structure and the sterical hindrance of the proteins.

Composition theory

Following the composition theory assumption, we search in the database every linear epitope obtained from ElliPro analysis of G3P protein.

G3P protein

Linear epitopes:

Position theory

Following the position theory assumption, we search in the database every discontinuous epitope obtained from ElliPro analysis of G3P protein. Since the IEDB database can only get linear epitopes as input, we treated discontinuous epitopes as they were, precisely, linear. We based our research on the belief that in 3-dimensional space the discontinuous epitopes behave like linear ones due to the proximity (d≤8Å) of their residues.

Linearized discontinuous epitopes:

Autoimmunity

Finally, we repeated the same process for the autoimmunity risk. One other important risk to evaluate when developing a therapeutic agent is, in fact, the phenomenon of molecular mimetism, i.e. the fact that the therapeutic agent can be similar (in terms of chemical composition, 3D structure, overall charge distribution) to a human protein, thus carrying the risk of generating an autoimmune disease.

Using the IEDB database

We made a research of both peptidic and non-peptidic material of M13 for homology in any human autoimmune disease epitopes registered in the database, with no results. We applied the same parameter as for the allergic reaction query, selecting “autoimmune” instead of “allergy”. The result was still negative.

Using prediction softwares

We searched in the database if there were any known autoimmune disease epitopes with a certain homology to our T and B cell predicted epitopes.

Predicting through linear T-cell epitopes

G3P protein

equation

G6P protein

equation

G8P protein

equation

G9P protein

equation

Predicting through B-cell epitopes

Linear epitopes:

equation

Linearized discontinuous epitopes:

equation

Further investigation

Professor Viola made another important statement. Even if the phage is immunogenic, there is the possibility to make it less visible to the immune system. Besides the many different delivery methods to hide phage particles before they reach the target site, there is another kind of mimetism which depends directly on the amino acid composition of the coat proteins. As a matter of fact, it is possible to slightly change the amino acid sequence of the most immunogenic epitopes to make them less immunogenic. One tool to achieve this is Deimmunization[24]. Due to a time limit of the project, we decided not to look into this software.

References

  1. Moise L, Gutierrez AH, Bailey-Kellogg C, Terry F, Leng Q, Abdel Hady KM, VerBerkmoes NC, Sztein MB, Losikoff PT, Martin WD, Rothman AL, De Groot AS. The two-faced T cell epitope: examining the host-microbe interface with JanusMatrix. Hum Vaccin Immunother. 2013 Jul;9(7):1577-86. doi: 10.4161/hv.24615. Epub 2013 Apr 12. PMID: 23584251; PMCID: PMC3974887.
  2. Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, Wheeler DK, Sette A, Peters B. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 2019 Jan 8;47(D1):D339-D343. doi: 10.1093/nar/gky1006. PMID: 30357391.
  3. Mikkel Harndahl; Kasper Lamberth; Sune Justesen; Gustav Røder; Michael Madsen; Morten Nielsen; Claus Lundegaard; Mette Voldby Larsen; Sheila Tang; Søren Brunak; Ole Lund; Søren Buus. Large scale analysis of peptide-HLA class I interactions.
  4. Larsen M.V., Lundegaard C., Lamberth K., Buus S., Lund O., and Nielsen M. 2007. Large-Scale validation of methods for cytotoxic T-lymphocyte epitope prediction. BMC Bioinformatics 8:424. PMID: 17973982.
  5. Larsen M.V., Lundegaard C., Kasper Lamberth, Buus S,. Brunak S., Lund O., and Nielsen M. 2005. An integrative approach to CTL epitope prediction: A combined algorithm integrating MHC-I binding, TAP transport efficiency, and proteasomal cleavage predictions. European Journal of Immunology 35(8):2295-303. PMID: 15997466.
  6. Sidney, J., Peters, B., Frahm, N. et al. HLA class I supertypes: a revised and updated classification. BMC Immunol 9, 1 (2008). https://doi.org/10.1186/1471-2172-9-1.
  7. Wang P, Sidney J, Kim Y, Sette A, Lund O, Nielsen M, Peters B. 2010. Peptide binding predictions for HLA DR, DP and DQ molecules. BMC Bioinformatics. 11:568. PMID: 21092157.
  8. Wang P, Sidney J, Dow C, Mothé B, Sette A, Peters B. 2008. A systematic assessment of MHC class II peptide binding predictions and evaluation of a consensus approach. PLoS Comput Biol. 4(4):e1000048.PMID: 18389056.
  9. Jensen KK, Andreatta M, Marcatili P, Buus S, Greenbaum JA, Yan Z, Sette A, Peters B, Nielsen M. 2018. Improved methods for predicting peptide binding affinity to MHC class II molecules. Immunology 154(3):394-406.PMID: 29315598.
  10. Nielsen M, Lund O. 2009. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC Bioinformatics. 10:296. PMID: 19765293.
  11. Nielsen M, Lundegaard C, Lund O. 2007. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method. BMC Bioinformatics. 8:238. PMID: 17608956.
  12. Sidney J, Assarsson E, Moore C, Ngo S, Pinilla C, Sette A, Peters B. 2008. Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries. Immunome Res 4:2. PMID: 18221540.
  13. Sturniolo T, Bono E, Ding J, Raddrizzani L, Tuereci O, Sahin U, Braxenthaler M, Gallazzi F, Protti MP, Sinigaglia F, Hammer J. 1999. Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices. Nat Biotechnol. 17(6):555-561. PMID: 10385319.
  14. Kaabinejadian S, Barra C, Alvarez B, Yari H, Hildebrand WH, Nielsen M. 2022. Accurate MHC Motif Deconvolution of Immunopeptidomics Data Reveals a Significant Contribution of DRB3, 4 and 5 to the Total DR Immunopeptidome. Front Immunol. 13:835454. PMID: 35154160.
  15. Reynisson B., Alvarez B., Paul S., Peters B., Nielsen M. 2020. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 48(W1):W449-W454. PMID: 32406916.
  16. Jensen KK, Andreatta M, Marcatili P, Buus S, Greenbaum JA, Yan Z, Sette A, Peters B, Nielsen M. 2018. Improved methods for predicting peptide binding affinity to MHC class II molecules. Immunology 154(3):394-406. PMID: 29315598.
  17. Andreatta M, Karosiene E, Rasmussen M, Stryhn A, Buus S, and Nielsen M. 2015. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification. Immunogenetics.67(11-12):641-50. PMID: 26416257.
  18. Dhanda et. al.: Prediction of HLA CD4 immunogenicity in human populations, Frontiers in Immunology, 2018, 9, 1369.
  19. Paul et. al: Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes, Journal of Immunological Methods, 2015, 422, 28-34.
  20. Sinu Paul, Cecilia S. Lindestam Arlehamn, Thomas J. Scriba, Myles B.C. Dillon, Carla Oseroff, Denise Hinz, Denise M. McKinney, Sebastian Carrasco Pro, John Sidney, Bjoern Peters, Alessandro Sette. Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes, Journal of Immunological Methods, Volume 422, 2015, Pages 28-34, ISSN 0022-1759, https://doi.org/10.1016/j.jim.2015.03.022.
  21. P. H. Andersen, M. Nielsen and O. Lund. 2006. Prediction of residues in discontinuous B cell epitopes using protein 3D structures. Protein Science.15:2558-2567. [PMID: 17001032].
  22. J. V. Kringelum, C. Lundegaard, O. Lund, M. Nielsen. 2012. Reliable B cell epitope predictions: impacts of method development and improved benchmarking. PLoS Comput Biol. 8:(12):e1002829. [PMID: 23300419]
  23. Ponomarenko JV, Bui H, Li W, Fusseder N, Bourne PE, Sette A, Peters B. 2008. ElliPro: a new structure-based tool for the prediction of antibody epitopes. BMC Bioinformatics 9:514. PMID: 19055730.
  24. Sandeep Kumar Dhanda, Alba Grifoni, John Pham, Kerrie Vaughan, John Sidney, Bjoern Peters, Alessandro Sette (2017). Development of a strategy and computational application to select candidate protein analogs with reduced HLA binding and immunogenicity. Immunology. PMID: 2883308.

Modeling the infection of phages

We have tried to draft a mathematical model which recapitulates the expected mechanism of infection that the system should undergo. Namely, a pool of engineered non-lytic phages infecting a population of AMR bacteria, transfecting a CRISPRi machinery, in turn restoring susceptibility to antibiotics. We have divided the modeling aspects into 3 parts:

  • the bacterial growth;
  • the transfection of the phagemid;
  • the occurrence of mutations harnessing the therapeutic effect.

This has allowed us to create a nested model focusing on single decoupled aspects; this approach will enable future teams to re-use and adapt the model at their will, changing modeling strategies and structure without re-drafting the whole system.

Bacterial growth

In a general environment, we can assume that bacteria grow following a chemostat-like abstraction: the number of bacteria x depends on the nutrients abundance N and on a washout w. The latter can arise from washing, defecating, coughing or any other action or external source exerted on the colony or culture which can lead to the removal of part of the bacterial population. For the sake of simplicity, any other effect associated with bacterial basal death rate - apart from antimicrobial actions - will be assumed as included in this constant.

While the washout w can be fairly modeled as a first order kinetic, the dependence on nutrients can be modeled in several ways; neglecting possible space constraints, the most suitable model that we can adopt is probably through the Monod equation:

In this framework the growth is limited by the amount of nutrients through the parameter N (and the half growth constant Ks which correspond to the amount of nutrient for which the growth is halved). It is worth noting that under the assumption of nutrients overabundance (N→∞) this equation collapses in a classical exponential growth.

This result in a final form:

with ɑ(t) generic function describing the nutrients supply Nin distribution over time and ε conversion constant from bacterial to nutrient concentration.

Phagemid transfection

Once the AMR bacterial population evolution is described, it is now necessary to model the delivery of the CRISPRi system and its effect on the bacterial cells. Different approaches and possible similar systems are reported in literature [1-3]. However, we decided to explore a different interesting modeling approach based on the exploitation of delayed differential equations (DDE). This representation could catch realistic interesting scenarios since, once infected by phages, the cells require time to produce the CRISPRi system and thus become sensitive. However, not all the cells turn to be sensitive at the same time; during this delay, mutations could occur, leading to a decrease in the efficiency of the system. This type of modeling can be implemented through transit compartmental models (TCM), a type of modeling inherited from pharmacokinetics [4]. Through these models, the population can be divided into different subpopulations, depending on their state (or stage of infection) and the passage from a state to another one can be modeled as rates [5].

Briefly, assuming the existence of n intermediate transitory possible states xi and considering the average time τ to reach the final state xn+1, the rates k (also called latency, assumed equal between states) can be computed as:

Conveniently, depending on the number of intermediate states, the system can evolve with a first order dynamic (n=0) up to a step-like one (n→∞).

Adapted from [5]

Moving to a general version of our model, (neglecting for now the cell growth and the washout, for the sake of notation readability), we can call:

  • T the number of target pathogens;
  • TR the number of pathogens infected;
  • F the number of phages.

with δ absorption rate of phages in the bacterium (a second order dynamic with unit [mL/(#*h)] as reported in [6]) and bs burst size, which is the number of phages released by an infected cell during the lytic cycle. Since our system does not involve lytic phages but only engineered temperate ones with a single lysogenic cycle, the system can be be simplified in two versions:

  1. Administration of isolated phages

  2. Administration of probiotics P producing phages

Since the second version could lead to safety concerns, from now on we will take into consideration only the model 1, which is how PASTA has been conceived. By adopting the TCM formulation, the system can be expanded as:

with:

  • k=(n+1)/τ latency
  • Tr final state

For the sake of clarity, the model 2 would have required additional division in more stages also for P, with phages burst only at the final stage.

Mutations

Assuming the possibility that both the CRISPRi system passed by the phage or the target DNA can mutate, we can describe the following possible states:

equation

where:

  • rin is the rate of inhibition of antimicrobial resistance by CRISPRi circuitry
  • mM is the basal mutation rate of the genome, multiplied by NR which is the copy number of the antimicrobial resistance gene
  • my is the mutation rate of te transfected phagemid
  • T is the number of target bacteria w/o CRISPRi circuitry
  • Tx is the number of target bacteria w/ CRISPRi circuitry
  • Txr is the number of target bacteria w/ CRISPRi circuitry actively inhibiting AMR
  • Ty is the number of target bacteria w/ mutated CRISPRi circuitry
  • M is the number of target bacteria mutated on the target gene w/o CRISPRi circuitry
  • Mx is the number of target bacteria mutated on the target gene w/ CRISPRi circuitry
  • My is the number of target bacteria mutated on the target gene w/ mutated CRISPRi circuitry

The whole system can be therefore written as:

with

This system allows the designer to recapitulate the desired system in a biologically plausible framework; moreover, its structure enables a detailed analysis for fine tuning or exploration of the designed platform boundaries by varying the parameters and the number of intermediate states.

Final model

For the sake of simplicity, we have posed some assumption to our model: Saturation of nutrients (N→∞)

  • Only one intermediate state exist (possible states:{T, M, Tx, Ty, Mx, My, TRx, TRy, MRx, MRy, Txr}).
  • Equal mutation rates for bacterial genomes and CRISPRi-carrying phagemids (mM=my=m).
  • Single copy antimicrobial resistance expression cassettes (NR=1)

Additionally, we have introduced a first order kinetic A to describe the action of an antibiotic (constantly administered) on pathogens with restored susceptibility.

Under this assumptions, the simplified model can be written as:

equation

To simulate our system, we have set the following parameters from literature:

equation

while the initial amount of target pathogen T0 was set to the effect of the antibiotic A and the initial amount of phages F0 were the actual input of the simulations, thus were explored over a wide range of possible values, as reported in the following figures; all the other initial conditions for the different states were set to zero. Additionally, the amount of killed bacteria were shown and computed as A•Txr.

Simulations were carried out over a time frame of 10h, which was the maximum time at which all the phages were almost transfected (or washed out). Indeed, longer time would have the bias of a mixed population of mutants and non-transfected cells growing again without any therapeutic action.

graph

We developed a script in MatLab to understand how PASTA could be affected by tunable aspects such as the dosage of phages F(t0) and the killing effect of the antibiotic A. We set the initial amount of bacteria to be targeted equal to 5•107; meanwhile, for the two variables, we covered a wide range of possible values, namely:

  • for the initial amount of phages, we consider 13 different possibilities from 100 to 1012 [#/mL]
  • for the antibiotical killing rate we used 16 different values, from 10-5 to 1010 [1/h].

In the simulations that we have run, we calculated the number of killed bacteria using the MatLab function trapz, integrating the vector that describes the trend of Txr over time (i.e., the result from the ODE solver ode15s), multiplied by the antibiotic killing rate A (called “Killed”). We also calculate the amount of the bacteria that remain resistant to the antibiotic after the same time span (due to mutations in either the target DNA or the CRISPRi machinery) and so are still alive (called “Escapers”); to do this we integrated the sum of the variables M, Ty, Mx, My, TRy, MRx, MRy.

We can see the trend of these two lumped variables in the heatmap at the top of the figure. First of all, we can observe that the TCM ODE model was implemented properly, indeed a shift in the curves that were supposed to have a delay during the evolution of the system can be clearly observed. This is proof that this model is a valuable and easy to use modeling methodology for this kind of system.

As envisaged, the model demonstrates that there is an increase of killed bacteria and a decrease of escapers for increasing antibiotical killing rate and the amount of initial phage is rising. More interestingly, it can also be observed that when the antibiotical rate of killing is bigger than 10 h-1 and the initial number of phages is bigger than 106 #/mL we can appreciate that there is a high number of killed bacteria and only a few escapers, which suggest the existence of minimal thresholds for the design of the treatment.

In the model we also compute the number of mutants that can be formed, we consider a possible mutation in the gene target in the CRISPRi circuitry, or both. From the graphs in the figure, it is clear that the number of mutants is not negligible. As we can see in the Ty, Mx, My and M graphs, at the end of the simulation there is a rapid increase of these variables. It is worth noting that we considered mM = my , this means that the mutation rate of the plasmid and the genome are the same. Actually, this can be considered a worst-case scenario, since a single point mutation in the CRISPRi system (as well as in the target DNA) does not necessarily imply a complete disruption of the repression action [21].

Here the developed model implemented in Matlab:

References

  1. Barrangou, R., et al. Using CRISPR-Cas systems as antimicrobials. Current opinion in microbiology (2017); 37:155-160. doi: 10.1016/j.mib.2017.08.005
  2. Yosef, I., et al. Temperate and lytic bacteriophages programmed to sensitize and kill antibiotic-resistant bacteria. Proceedings of the National Academy of Sciences of the USA (2015); 112(23):7267-7272. doi: 10.1073/pnas.1500107112
  3. Bikard, D., et al. Exploiting CRISPR-Cas nucleases to produce sequence-specific antimicrobials. Nature biotechnology (2014); 32:1146-1150. doi: 10.1038/nbt.3043
  4. Savic, R. M., et al. Implementation of a transit compartment model for describing drug absorption in pharmacokinetic studies. Journal of pharmacokinetics and pharmacodynamics (2007); 34(5):711- 726. doi: 10.1007/s10928-007-9066-0
  5. Messina, M., (2020) Simulazione di interazioni tra comunità microbiche: confronto tra tecniche di inibizione di resistenze antibiotiche in batteri patogeni (Unpublished master's thesis). University of Pavia, Italy.
  6. Sinha, S., et al. Modeling bacteria–phage interactions and its implications for phage therapy. Advances in applied microbiology (2018); 103:103- 141. doi: 10.1016/bs.aambs.2018.01.005
  7. Cairns, B. J., et al. Quantitative models of in vitro bacteriophage–host dynamics and their application to phage therapy. PLoS Pathogens (2009); 5(1). doi: 10.1371/journal.ppat.1000253
  8. Lenski, R. E. Dynamics of interactions between bacteria and virulent bacteriophage. Advances in Microbial Ecology (1988); 10:1-44. doi:10.1007/978-1-4684-5409-3_1
  9. Lenski, R. E., et al. Constraints on the coevolution of bacteria and virulent phage: a model, some experiments and predictions for natural communities. The American naturalist (1985); 125(4):585-602. doi: 10.1086/284364
  10. Schrag, S. J., et al. Host-parasite coexistence: the role of spatial refuges in stabilizing bacteria-phage interactions. The American naturalist (1986); 148(2):348-377. doi: 10.1086/285929
  11. Smith, H. L., et al. Bacteriophage infection dynamics: multiple host binding sites. Mathematical modelling of natural phenomena (2009); 4(6):111-136. doi: 10.1051/mmnp/20094604
  12. Levin, B. R., et al. Resource-limited growth, competition, and predation: a model and experimental studies with bacteria and bacteriophage. The American naturalist (1977); 111(977). doi: 10.1086/283134
  13. Santos, S. B., et al. Population dynamics of a Salmonella lytic phage and its host: implications of the host bacterial growth rate in modelling. PLoS One (2014); 9(7). doi: 10.1371/journal.pone.0102507
  14. Kysela, D. T., et al. Optimal bacteriophage mutation rates for phage therapy. Journal of theoretical biology (2007); 249(3):411-421. doi: 10.1016/j.jtbi.2007.08.007
  15. Sternberg, S. H., et al. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature (2014); 507(7490):62-67. doi: 10.1038/nature13011
  16. Drake, J. W. A constant rate of spontaneous mutation in DNA-based microbes. Proceedings of the National Academy of Sciences USA (1991); 88(16):7160-7164. doi: 10.1073/pnas.88.16.7160
  17. Duncan, S. H., et al. The role of pH in determining the species composition of the human colonic microbiota. Environmental Microbiology (2009); 11(8):2112-2122. doi: 10.1111/j.1462-2920.2009.01931.x
  18. Stein, R. R., et al. Ecological modeling from time-series inference: insight into dynamics and stability of intestinal microbiota. PLoS Computational Biology (2013); 9(12). doi: 10.1371/journal.pcbi.1003388
  19. Wan, Z., et al. Measuring the rate of conjugal plasmid transfer in a bacterial population using quantitative PCR. Biophysical Journal (2011); 101(1):237-244. doi: 10.1016/j.bpj.2011.04.054
  20. Massoudieh, A., et al. Horizontal gene transfer on surfaces in natural porous media: conjugation and kinetics. Vadose Zone Journal (2007); 6(2):306-315. doi: 10.2136/vzj2006.0069
  21. Bellato, M., et al. CRISPR Interference Modules as Low-Burden Logic Inverters in Synthetic Circuits.Frontiers in Bioengineering and Biotechnology (2022). doi: 10.3389/fbioe.2021.743950

Modeling the relationship between optical density and actual growth between species.

An important aspect in the characterization of novel parts, is the choice of a reference to adopt for normalizing data, thus results can be shared and compared reliably between laboratories with different instrumentations. As proposed in Kelly et al.[1], genetic circuits (more precisely, promoter strengths) can be compared by normalizing the computed Scell for a reference strain bearing:

  • the transcriptional promoter BBa_J23101
  • the same RBS (e.g., BBa_B0034)
  • the same reporter gene (e.g., rfp, BBa_E1010)

Additionally, we also hypothesized that an additional aspect to be considered could have been a measurement bias rising from the possible different relationship between OD600 measured and the actual CFU of the culture; indeed, some of the strains used in our study displayed a slightly mucoid phenotype.

For this reason, liquid cultures of 3 different bacterial species used in this study were analyzed by measuring samples OD600 and plating proper dilutions to perform colony counting. The results of the analysis are reported in the following table:

Linear relationship between OD600 and CFU for three of the bacterial species in this project. Dashed lines represent the 95% confidence interval for the linear regression. Graph and regression produced in Graphpad Prism 10.

The graph above represents the fitting of the linear model. There are the values of the measurements, the line that describes the trend of the OD vs CFU, and the area between the dashed lines that describe the confidence interval. The confidence interval is the area where there is a high probability of finding unknown values.

The equation describing the linear relationship between OD600 and CFU (imposing a null intercept) for a certain species x is given by:

equation

This means that two species having the same CFU are in a relationship through the following equation:

equation

Considering an E. coli strain as a reference, the equivalent OD600 (on which then compute Scell or RFP/OD and thus RPUs units) can be obtained through the conversion constant between strains as:

equation

so:

equation

For the strains analyzed, we computed βK.pneumoniae=1.37 and βA.baumannii=0.60.

While the linearity between OD600 and CFU seems verified and a conversion coefficient has been computed, data gathered for this analysis are too poor and noisy to be reliably used. Indeed, due to time constraints, no additional replicates were performed; therefore, we decided not to include this additional normalization step in the analysis reported in the wiki, despite further investigation to assess the reliability of this normalization will be performed in the future.

References

  1. Kelly, J.R., et al. Measuring the activity of BioBrick promoters using an in vivo reference standard. Journal of Biological Engineering 3, 4 (2009). doi: 10.1186/1754-1611-3-4