**Why Agent-Based Models?**

Predicting how genetically engineered *P. fluorescens* behaves in soil is difficult because microbial interactions are intricate and the environment fluctuates. Deterministic models handle this poorly, as the actions of a single bacterium or colony can give rise to collective behaviours that are hard to anticipate. By giving each bacterium a few simple rules to follow, agent-based models can explore complex biological systems on multiple levels [1], including entity behaviours, microbial interactions and environmental effects (Figure 1).

**Aim of the Model**

To predict the behaviour of our genetically engineered *P. fluorescens* in the soil more precisely, we built two sub-models: a Microbial Community model that simulates how our bacteria move and colonise the root, and a Cell Level model that explores how the protein secretion system works in cells carrying genetic circuits. Additionally, we coupled the two sub-models to explore how cell-level effects shape antiflorigen protein secretion across populations, in pursuit of the ideal product application. Ultimately, we aim to provide suggestions for both experimental design and product development. We use NetLogo [2] as the agent-based modelling platform to build the models and run the simulations; it also lets us track the spatial movements of the entities during each simulation.

**What We Learned**

Our agent-based models gave us new insights into the behaviour of our bacteria in the soil, which inform both our experiments and our product development. We also integrated the model results into the Applied Design, giving suggestions for the product design and for precise application methods. Furthermore, by analysing the large amounts of data generated by the models, we captured a series of real-life biological patterns. This suggests that agent-based modelling frameworks can be powerful tools for studying complex biological systems.

**Microbial Community Model**

In the Microbial Community model, we explore bacterial behaviour at the population level in the soil. The agents in this sub-model are natural *P. fluorescens* and our modified *P. fluorescens*. These agents follow several simple rules (see below) within a closed 20 x 20 cm field of soil. The output of this sub-model is the colonisation efficiency of our modified *P. fluorescens* on the roots (i.e., the percentage of the population of our bacteria on the root).
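The output metric can be written as a one-line calculation. The sketch below is only an illustration: it assumes that colonisation efficiency means the share of root-associated *P. fluorescens* that belongs to the modified strain, which is one possible reading of the definition above. The function and argument names are hypothetical and are not taken from the NetLogo model.

```python
def colonisation_efficiency(modified_on_root: int, natural_on_root: int) -> float:
    """Percentage of the on-root population that is the modified strain.

    Assumption: efficiency = modified / (modified + natural), counted on the root.
    Returns 0.0 when no bacteria are on the root at all.
    """
    total_on_root = modified_on_root + natural_on_root
    if total_on_root == 0:
        return 0.0
    return 100.0 * modified_on_root / total_on_root
```

If the intended definition is instead the share of all modified bacteria that end up on the root, only the denominator changes.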

**Model Validation**

We modelled the spatial distribution of the modified *P. fluorescens* on the root by summing the population of our bacteria in each root region over 30 simulations; darker colours mark where more colonies are likely to be found (Figure 2a and 2b). Because wheat and cherry trees have comparable root lengths, we used wheat data for comparison (Figure 2c). Our bacteria colonise mostly the first 6 cm of root below ground, and colonisation decreases from the top layer to the deeper layers of the root (Figure 2a and 2b), as also seen on the wheat root (Figure 2c). Moreover, we experimentally measured the colonisation patterns of our bacteria on the root of *Arabidopsis thaliana*, and the model's findings corresponded with these wet lab observations (Figure 2c). This suggests that the model captures the essential aspects of the real-world system.
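The aggregation step described above (adding up the per-region populations across repeated simulations) could look roughly like this. It is a minimal sketch, assuming each run is exported as a mapping from a root depth bin in cm to a bacterial count; the data layout and function names are hypothetical.

```python
from collections import Counter


def aggregate_root_counts(runs):
    """Sum per-region counts over all runs.

    Each run is assumed to be a dict: depth bin (cm below ground) -> count.
    """
    totals = Counter()
    for run in runs:
        totals.update(run)  # Counter.update adds counts rather than replacing them
    return dict(totals)


def fraction_within_depth(totals, max_depth_cm):
    """Share of the summed population found in bins shallower than max_depth_cm."""
    overall = sum(totals.values())
    if overall == 0:
        return 0.0
    shallow = sum(count for depth, count in totals.items() if depth < max_depth_cm)
    return shallow / overall
```

Calling `fraction_within_depth(totals, 6)` on the summed counts gives the proportion of colonies in the first 6 cm, which is the quantity the heat maps visualise.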

**Colonisation Dynamics**

We conducted a parameter sweep, exploring combinations of parameter values through an extensive series of simulations. Here we show four example plots from different parameter conditions, with 30 repetitions per condition, which lead to different colonisation dynamics (Figure 3). Within the simulation time, the populations of natural and modified *P. fluorescens* reach steady states, or constant levels, after around four to nine days. This means that, when applying our product to an orchard, it would be better to apply it at least ten days in advance so that the population has settled before the product needs to take effect (see Model Suggestions below).
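One simple way to read a "time to steady state" off a simulated population curve is sketched below. This is an assumption about how such a value could be extracted, not the procedure used for Figure 3: the series is taken to hold one population value per day, and the `window` and `tolerance` parameters are hypothetical.

```python
def time_to_steady_state(series, window=3, tolerance=0.05):
    """Return the first index from which every later value stays within
    `tolerance` (relative) of the final value, or None if no such index
    leaves at least `window` samples.
    """
    if len(series) < window:
        return None
    final = series[-1]
    band = tolerance * abs(final)
    # Only start points that leave at least `window` samples are considered.
    for start in range(len(series) - window + 1):
        if all(abs(value - final) <= band for value in series[start:]):
            return start
    return None
```

Applied to each of the 30 repetitions of a parameter condition, this yields a distribution of settling times that can be compared across conditions.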

**Predictability of the Microbial Community Model**

Subsequently, we used random forest regression to test the predictability of the Microbial Community model. Although the model is stochastic, the mean squared error is low and the R-squared score is over 0.99 (Figure 4), which suggests that the behaviour of our bacteria would be relatively controllable at the population level in the soil.
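For readers unfamiliar with the two scores, they can be computed from any set of predictions as follows. The sketch does not reproduce the random forest fit itself; it only shows the standard definitions of mean squared error and R-squared, with `y_true` standing for simulated outputs and `y_pred` for the regressor's predictions (both names are placeholders).

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

An R-squared close to 1 means the regressor explains almost all of the variance in the simulated output, which is why a score above 0.99 is read here as high predictability.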

**Cell Level Model**

The biosensor system comprises two genetic circuits: one activated by a specific root exudate, salicylic acid (the screening of root exudates is described on the wet lab page), and the other by N-acyl-L-homoserine lactone (AHL) (Figure 5). AHL is produced continually but only affects the cells above a certain concentration threshold. Consequently, the target protein is synthesised only in the presence of salicylic acid and a sufficiently large bacterial population.
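The two circuits together behave like a logical AND gate, which can be sketched in a few lines. The function name, the boolean treatment of salicylic acid, and the threshold argument are illustrative assumptions; no threshold value is given in the text.

```python
def target_protein_expressed(salicylic_acid_present: bool,
                             ahl_concentration: float,
                             ahl_threshold: float) -> bool:
    """AND gate: expression needs the root signal AND a quorum.

    Assumption: AHL only acts strictly above the threshold, which stands in
    for a sufficiently large bacterial population.
    """
    quorum_reached = ahl_concentration > ahl_threshold
    return salicylic_acid_present and quorum_reached
```

Either input alone leaves the gate off, which is what restricts protein production to dense populations near the root.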

In the Cell Level model, we simplify the several genetic circuits into a single one, and the production of trigger RNAs and AHL is represented in a simplified form. In this sub-model, we assume that the AHL concentration is always high enough to activate protein secretion. The agents of this sub-model are the molecules in the cell (e.g., RNA polymerase, proteins, etc.). The promoter strength, ribosome binding site strength, and protein degradation rate are mostly left at the default values of an example model in NetLogo [12]. The sub-model follows several rules (see below) to perform transcription, translation, etc., exploring protein secretion in cells with artificial genetic circuits.
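To give a feel for why a molecule-level model is noisy, here is a much-reduced stochastic sketch of transcription, translation and degradation. It is not the NetLogo model: molecule movement is replaced by per-tick probabilities, and every parameter name and default value below is a made-up placeholder rather than a setting from the example model [12].

```python
import random


def simulate_cell(ticks, p_transcribe=0.1, p_translate=0.2,
                  p_mrna_decay=0.05, p_protein_decay=0.01, seed=0):
    """Return the protein count after each tick of a toy single-cell model."""
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    mrna = 0
    protein = 0
    history = []
    for _tick in range(ticks):
        # Transcription: at most one new mRNA per tick.
        if rng.random() < p_transcribe:
            mrna += 1
        # Translation: each mRNA may yield one protein this tick.
        new_protein = sum(1 for _ in range(mrna) if rng.random() < p_translate)
        # Degradation of mRNA, then of protein (including newly made protein).
        mrna -= sum(1 for _ in range(mrna) if rng.random() < p_mrna_decay)
        protein += new_protein
        protein -= sum(1 for _ in range(protein) if rng.random() < p_protein_decay)
        history.append(protein)
    return history
```

Running the function with different seeds produces visibly different trajectories around the same overall trend, which mirrors the run-to-run variability seen in a molecule-level simulation.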

**Model Validation**

We compared our model with our own lab data. The model results, collected from 30 repeated simulations (orange and black regions), fit the data generated in the lab (blue and red lines) (Figure 6). This suggests that the Cell Level model can explain the protein secretion system in the cells.

**Predictability of the Cell Level Model**

Unlike the Microbial Community model, the Cell Level model shows more variability because of the random movements of molecules in the cell, which makes protein secretion harder to predict from the parameters. This is reflected in the larger mean squared error and lower R-squared value (Figure 7). We hypothesise that this is because the dynamics of protein production can differ between runs even though the overall trend of protein secretion resembles the real data.

**Model Interaction**

We used LevelSpace [13], an extension in NetLogo, to let the two models interact. The output of the interactive model is the total protein secretion level in the system.
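Conceptually, the coupled output is a sum of cell-level secretion over the cells tracked by the population model. The sketch below is a plain Python stand-in, not LevelSpace code: how the sub-models actually exchange data is not described here, and `total_secretion`, `toy_cell_model` and the state fields `on_root` and `rate` are hypothetical.

```python
from typing import Callable, Iterable


def total_secretion(cell_states: Iterable[dict],
                    cell_model: Callable[[dict], float]) -> float:
    """Sum the secretion reported by a cell-level model for every cell."""
    return sum(cell_model(state) for state in cell_states)


def toy_cell_model(state: dict) -> float:
    """Placeholder cell model: a cell secretes at its rate only when on the root."""
    return state["rate"] if state["on_root"] else 0.0
```

Passing the list of cell states from the population level together with a cell model yields the system-wide secretion level; in this toy version, cells away from the root contribute nothing.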