Dry lab results

Flux Balance Analysis (FBA) :

COBRApy is a Python module for modeling and simulating the entire metabolism of a cell. It allows you to build models that reference all metabolic reactions and the enzymes, reagents, products and genes involved in these reactions. Among other things, it enables FBA to be performed by directly importing models created by researchers and bioinformaticians.

Overview :

Accessibility	COBRApy modeling is fast enough and widely accessible for a modern computer. The amount of data required to build the model is relatively low compared with other types of modeling. Prebuilt databases for a number of organizations are available online, and can be imported in a variety of formats.
Computing power	FBA modeling is capable of handling large models exceeding 2000 metabolic reactions
POSSIBILITIES	A multitude of scenarios can be studied with this approach: studying the complete suppression of a metabolic reaction, the suppression of pairs of reactions, gene deletion, inhibition of reactions, optimizing growth media (for more efficient production of a product, for example), studying host-pathogen interaction.

P. putida model KT2440 :

It is possible to import an organism model created by researchers from databases on the Internet or directly from articles in the supplementary data. The "xml" format I used to import a model of the KT2440 strain was available and accessible free of charge on the Internet from the BiGG Model databaseAll I had to do was read it using a python program to access all the genes/metabolites/reactions. The model, which they have named iJN1463, is already used in several scientific articles, and to date counts 2927 reactions for 1462 genes and 2153 metabolites.

All I had to do was read it using a python program to access all the genes/metabolites/reactions. The model, which they have named iJN1463, is already used in several scientific articles, and to date counts 2927 reactions for 1462 genes and 2153 metabolites.

The three COBRA classes :

Reactions	Metabolites	Genes
For reactions, you have access to the identifier, full name, formula with metabolite stoichiometry and reversibility.	For metabolites, we have their formula, the compartment in which they are found, and the reactions in which they are involved.	For genes, we can find out whether they are functional, and in which reactions they are involved.

It is possible to create or modify a model, its reactions, metabolites and genes and all their characteristics (those displayed in the tables). This will enable us to add the pathways we want to the model and study the resulting metabolic variations.

Notion of "Flux” :

In the context of FBA, "flux" refers to the rate or amount of material converted by a specific biochemical reaction in a metabolic network. It represents the rate at which a particular reaction occurs.

When running an FBA test, the aim is to optimize the distribution of flux in the metabolic network to achieve a specific objective, usually maximizing biomass production or maximizing the production of a specific metabolite. FBA assumes that the metabolic network is in a stable state, where metabolite production and consumption are balanced.

Each reaction in the metabolic network is associated with a flux value, which represents the rate of that particular reaction. The flux value can be positive, indicating a direct reaction rate, or negative, indicating a reverse reaction rate. A flux value of zero indicates that the reaction is not active or has no flow through it.

By optimizing the flux distribution, FBA calculates flux values for all reactions that best meet the specified objective function (for instance, maximizing biomass production). Optimization is usually performed using linear programming or other optimization algorithms.

The resulting flux distribution provides insight into the metabolic activity and efficiency of the organism under the specified conditions. It can help identify key reactions, bottlenecks and potential targets for metabolic engineering to enhance desired phenotypes or produce specific metabolites.

In many cases, the exact physical units of flux may not be specified or relevant because FBA focuses on the relative rates of reactions rather than their absolute values. Indeed, FBA deals primarily with stoichiometric coefficients and steady-state assumptions, rather than precise kinetic rate constants.

Flux Variability Analysis (FVA) :

FVA is another interesting approach that complements FBA. It is used in metabolic modeling to determine the range of possible flux values for each reaction in a metabolic network. It provides information on the flexibility and variability of flux distributions within the model. We have also used it extensively to find out the minimum and maximum fluxes across the reactions in our model. The main objective of FVA is to identify the minimum and maximum flux values that a specific reaction can achieve under specific constraints and optimization criteria. By calculating the flux variability for each reaction, FVA helps to characterize potential flux limits and ranges within the metabolic network.

Sophorolipid pathway:

We therefore sought to create the four reactions for sophorolipid biosynthesis in the iJN1463 model. The first reaction is the omega-1 hydroxylation of a fatty acid (preferably oleic acid) by the CYP52 family cytochrome P450. The next two reactions take place independently of each other, but are both glycosylations of hydroxylated fatty acids, catalyzed by UDP-glucosyltransferase A1 and UDP-glucosyltransferase B1. Finally, the last reaction is a lactone esterase-mediated lactonization of the acid sophorolipid produced by the two glycosylations. All these reactions are reversible, except for the last one, according to the various databases (KEGG, ENZYME, BiGG...).

The first reaction involves fatty acids, of which there can be many (palmitic, palmitoleic, stearic, linoleic, epoxystearic and arachidonic acids), and taking them all into consideration would be extremely difficult to model. The preferred substrate is oleic acid, so I settled for this fatty acid. The reaction equation is :

Oleic acid + H+ + NADPH + O2 <=> Hydroxy-oleic-acid + H2O + NADP

The second reaction involves a glucose linked to a UDP supplied by the enzyme, which binds to the hydroxylated fatty acid to form a glycolipid. This UDP + glucose complex carries two negative charges, taken into account by the metabolite already integrated into the model. The equation is as follows:

Hydroxy-oleic-acid + UDP-alpha-D-glucose <=> Glucolipid 1 + H+ + UDP

The third reaction is similar, involving only the addition of another UDP-coupled glucose to the glucolipid to form an acidic sophorolipid:

Glucolipid 1 + UDP-alpha-D-glucose <=> Glucolipid 2 + H+ + UDP

The fourth reaction is lactonization, an intramolecular esterification between the carboxyl group of the hydroxylated fatty acid and the hydroxyl group of C6 sophorose. It takes place in an extracellular environment (the enzyme is normally secreted) and is irreversible, making it the critical reaction of the project. The reaction equation is :

Acid sophorolipid => Lactonized sophorolipid + H2O

Model objective

In COBRApy, the model objective refers to the mathematical representation of the "objective" function optimized during FBA (or other optimization-based analyses). The function defines the goal or criterion that the model aims to optimize. It usually represents a specific cellular objective, such as maximizing the production of a desired metabolite, minimizing the uptake of certain nutrients, or optimizing biomass growth. It is a linear combination of the fluxes of certain reactions in the metabolic network, where the coefficients of the reactions determine their relative importance in the objective. COBRApy allows us to define and manipulate the model's objective using its attribute.

We have established some clear objectives that allow us to learn more about our model and the impact of adding metabolic pathways in the bacterium :

- product yield:If we wish to maximize the production of a specific metabolite of interest, we can define the objective of maximizing its production yield. This involves maximizing the flow through the reactions leading to the desired product, while taking into account the constraints and limitations of the metabolic network.

- Optimizing flow distribution:instead of focusing on a specific objective, we can explore the optimal distribution of flows in the network without a predefined goal. This approach makes it possible to analyze network capacities, identify key reactions or paths, or assess the effects of disturbances.

- Studying biomass: the addition of the new pathway to the bacteria will inevitably impact its growth. The study of biomass enables us to evaluate bacterial growth and the metabolites they consume and release. By setting this objective for the model, we can compare the biomass of P. putida KT2440 and P. putida modified by the metabolic pathway.

Sophorolipid production

Having correctly added the channels to the model, the first obvious objective we could ask of the model was the maximum sophorolipid production it could offer. Once the simulation had run, the model revealed that the maximum flux that the sophorolipid production pathway could achieve was around 41.60817. This maximum depends on production, the import of oleic acid and the presence of all substrates in the intracellular medium; we'll see later how we found the flux-limiting reactions.

A look at the metabolites secreted by the model reveals that lactonized sophorolipids are found in the extracellular medium.

Model construction code :


          import cobra
          from cobra.io import load_model
          from pathlib import Path
          from cobra.io import load_json_model, save_json_model
          import logging

          # Load template in JSON format
          data_dir = Path(".") / ".." / "Programmation" 
          data_dir = data_dir.resolve()

          print("mini test files: ")
          print(", ".join(str(i) for i in data_dir.glob('iJN1463*.*')))
          mini_json_path = data_dir / "iJN1463.json"
          model = load_json_model(str(mini_json_path.resolve()))
          model

          # Creating the 4 reactions
          from cobra import Model, Reaction, Metabolite
          # Reaction catalyzed by cytochrome p450
          reaction1 = Reaction('OH_OCDCEA')
          reaction1.name = 'omega-1 fatty acid hydroxylation'
          reaction1.subsystem = 'Sophorolipid production'
          reaction1.lower_bound = -1000.  
          reaction1.upper_bound = 1000. 
          # Reaction catalyzed by UDP-glucosyltransferase A1
          reaction2 = Reaction('GLU_HFAG')
          reaction2.name = 'hydroxy fatty acid glycosylation'
          reaction2.subsystem = 'Sophorolipid production'
          reaction2.lower_bound = -1000.  
          reaction2.upper_bound = 1000. 
          # Reaction catalyzed by UDP-glucosyltransferase B1
          reaction3 = Reaction('GLU2_HFAG')
          reaction3.name = 'second hydroxy fatty acid glycosylation'
          reaction3.subsystem = 'Sophorolipid production'
          reaction3.lower_bound = -1000.  
          reaction3.upper_bound = 1000. 
          # Lactone esterase-catalyzed reaction
          reaction4 = Reaction('LACTOZ')
          reaction4.name = 'acid sophorolipid lactonization'
          reaction4.subsystem = 'Sophorolipid production'
          reaction4.lower_bound = 0.  
          reaction4.upper_bound = 1000. 
          cobra.Reaction("OH_OCDCEA", lower_bound=None)
          cobra.Reaction("GLU_HFAG", lower_bound=None)
          cobra.Reaction("GLU2_HFAG", lower_bound=None)

          # Meatbolites creation
          # First reaction
          oleic_acid = model.metabolites.get_by_id("ocdcea_c")
          o2 = model.metabolites.get_by_id("o2_c")
          red_NADPH = model.metabolites.get_by_id("nadph_c")
          ox_NADPH = model.metabolites.get_by_id("nadp_c")
          proton = model.metabolites.get_by_id("h_c")
          h2o = model.metabolites.get_by_id("h2o_c")
          h2o_E = model.metabolites.get_by_id("h2o_e")
          ho_oleic_acid = Metabolite(
              'HO_ocdcea_c',
              formula='C18H33O3',
              name='18-hydroxy-oleic-acid',
              compartment='c')
          # Second reaction
          UDP_glucose = model.metabolites.get_by_id("udpg_c")
          glucolipid1 = Metabolite(
              'GLU1_HO_ocdcea_c',
              formula='C24H43O8',
              name='18-glu1-oleic-acid',
              compartment='c')
          UDP = model.metabolites.get_by_id("udp_c")
          # 3th reaction 
          acid_sophorolipid = Metabolite(
              'acsopho_c',
              formula='C30H53O13',
              name='acid sophorolipid',
              compartment='c')
          # 4th reaction 
          lacto_sophorolipid = Metabolite(
              'lacsopho_e',
              formula='C30H51O12',
              name='lactonized sophorolipid',
              compartment='e')

              # Adding reactions to the model
              reaction1.add_metabolites({
                  oleic_acid: -1.0,
                  o2: -1.0,
                  red_NADPH: -1.0,
                  ho_oleic_acid: 1.0,
                  h2o: 1.0,
                  proton: -1.0,
                  ox_NADPH: 1.0
              })

              reaction2.add_metabolites({
                  ho_oleic_acid: -1.0,
                  UDP_glucose: -1.0,
                  glucolipid1: 1.0,
                  proton: 1.0,
                  UDP: 1.0
              })

              reaction3.add_metabolites({
                  glucolipid1: -1.0,
                  UDP_glucose: -1.0,
                  acid_sophorolipid: 1.0,
                  proton: 1.0,
                  UDP: 1.0
              })

              reaction4.add_metabolites({
                  acid_sophorolipid: -1.0,
                  lacto_sophorolipid: 1.0,
                  h2o_E: 1.0
              })

              print(reaction1.reaction)
              print(reaction2.reaction)
              print(reaction3.reaction)
              print(reaction4.reaction)

              # Adding genes associated with reactions
              reaction1.gene_reaction_rule = '( CYP52 )'
              reaction2.gene_reaction_rule = '( UGTA1 )'
              reaction3.gene_reaction_rule = '( UGTB1 )'
              reaction4.gene_reaction_rule = '( SBLE )'

              # Adding reactions to the model 
              model.add_reactions([reaction1])
              model.add_reactions([reaction2])
              model.add_reactions([reaction3])
              model.add_reactions([reaction4])
              print(f'{len(model.reactions)} reactions')
              print(f'{len(model.metabolites)} metabolites')
              print(f'{len(model.genes)} genes')

              # Export of acid sophorolipid to the EC medium
              exported_metabolite = model.metabolites.get_by_id("acsopho_c")
              exported_metabolite_id = "acsopho_c"

              # Export reaction creation
              export_reaction = model.add_boundary(
                  exported_metabolite,
                  type="export",
                  reaction_id="EX_acsopho_c"
              )

              # Define bounds
              export_reaction.lower_bound = 0   # No uptake
              export_reaction.upper_bound = 1000  # Adjust as needed

              # Add oleic acid and glucose in the medium 
              medium = model.medium
              medium["EX_glc__D_e"] = 100
              medium["EX_ocdcea_e"] = 100
              model.medium = medium

              # Demand reaction for lactic sophorolipids  
              target_metabolite_id = "lacsopho_e" 
              target_metabolite = model.metabolites.get_by_id(target_metabolite_id)
              demand_reaction_soph = model.add_boundary(target_metabolite, type="demand")
              demand_reaction_soph.lower_bound = 0
              demand_reaction_soph.upper_bound = 1000  # Adjust the upper bound as needed

              # Define model objective
              from cobra.util.solver import linear_reaction_coefficients
              model.objective = "LACTOZ" 
              linear_reaction_coefficients(model)
              print(model.objective.expression)
              print(model.objective.direction)

              # Model solution (equal to the maximal flux)
              solution = model.optimize()
              print(solution)
              from cobra.flux_analysis import flux_variability_analysis
              soph_list = [model.reactions.OH_OCDCEA,model.reactions.GLU_HFAG,model.reactions.GLU2_HFAG,model.reactions.LACTOZ]
              flux_variability_analysis(model, reaction_list=soph_list)
              save_json_model(model, "compare1.json") # Save the new model with json format

              # Deletion of genes of the model to see consequences on model solution
              import pandas as pd
              deletion_results = single_gene_deletion(model, model.genes[:1465])
              # Save the DataFrame to an Excel file
              excel_file_path = "gene_deletion_results.xlsx"
              deletion_results.to_excel(excel_file_path, index_label="Gene_ID")

Study of the impact of the new pathway under optimized conditions on KT2440 metabolism

To compare metabolic fluxes between the strain KT2440 WT and the KT2440 with the new sophorolipid production pathway.. We compare the fluxes of all the reactions, select those that are significantly different and compare them in heatmaps. Here are the heatmaps comparing the difference in fluxes in the metabolic networks between the model that is optimized to produce the maximum amount of sophorolipid and the control whose target is simply set to allow the best growth. To isolate the reactions that are significantly impacted by the addition of the new pathway, we have chosen only those that exceed the absolute value of 5. Reactions with a negative flux are those that are less used in the new strain, while reactions with a positive flux are those more solicited to meet the model's objective of producing a maximum of extracellular sophorolipids.

Figure 1: Heatmap of the fluxes of each reaction being used more (left) or less (right) by the sophorolipid-producing model compared with the control.

Increased flux	Reactions	Reactions role	Comments
156.65	Htex	Proton import from extracellular medium ⟶ periplasm	Imported protons are used as reagents in certain reactions
132.20	PYRt2rpp and PYRtex	Entry of pyruvate from periplasm to cytosol via proton symport (reversible) \| Passive diffusion of pyruvate from EC medium to periplasm	Pyruvate enters the cytosol and is used to regenerate NAD into NADH
93.62	GAPDi_napd	Dehydrogenation of 3-Phospho-D-glyceroyl phosphate to 3-P-glyceraldehyde (-1 NADH, +1 NAD)	The reagent comes from the PGK reaction, and the reaction seems to serve to regenerate NADH into NAD.
83.12	GALUi	Conversion of glucose-1-P to UDP-glucose (udpg)	Glucose imported from the external environment is transformed into UDP-glucose to meet the two glycosylations of oleic acid in the sophorolipid pathway.
82.78	CO2 échange extérieur	Release of CO2 into the EC medium	Optimizing sophorolipid production appears to release significantly more CO2 than in the control condition.
82.78	CO2 échange extérieur	Release of CO2 into the EC medium	Optimizing sophorolipid production appears to release significantly more CO2 than in the control condition.
81.99	NDPK2	Using ATP to make UTP	The UTP will undoubtedly be used to form UDP-glucose, and the flow through the reaction is almost the same as in UDP-glucose synthesis, meaning that this is the only source of UTP. This reaction therefore consumes sufficient ATP.
66.11	H20tpp	Passive diffusion of water from the periplasm to the cytosol	Water seems to be widely used for efficient sophorolipid synthesis. Here it's from the periplasm.
62.06	PGK	Transformation of 3-phospho-D-glycerate into 3-Phospho-D-glyceroyl phosphate (-1 ATP)	The product is used by the GAPDi reaction to regenerate NADH into NAD.
57.87	PGM	Mutation of D-glycerate-2-phosphate to 3-phospho-D-glycerate	The 3-phospho-glycerate then serves as the product of the PGK reaction (regeneration of NADH to NAD).
56.82	NADH16pp	NADH dehydrogenation: 4.0 h_c + nadh_c + q8_c → nad_c + q8h2_c + 3.0 h_p	Use of 8-quinone(?) to regenerate NADH into NAD.
0.35	GND	Conversion of 6-phospho-D-gluconate to D-ribulose-5P (-1 NADP, +1 NADPH)
46.92	GLCabcpp and HEX	Transport of glucose by the ABP system from the periplasm to the cytosol (-1 ATP) \| Phosphorylation of glucose into glucose-6-P by hexokinase (-1 ATP)	Glucose imports from the outside world increased by 1819%
41.61	FACOAE181	Consumption of an octadecanoyl-CoA in oleic acid	This reaction is the only one to produce oleic acid in the KT2440 model, and as it has the same flux as the sophorolipid synthesis pathways, we deduce that it is the limiting reaction.
40.81	OCDCEAtexi and FACOAL181t2pp	Irreversible transport of oleic acid from EC medium to periplasm \| Irreversible transport of oleic acid from periplasm to cytosol by fatty acid CoA ligase (-1 ATP)	Transport of oleic acid to the intracellular environment.
40.81	OCDCEAtexi and FACOAL181t2pp	Irreversible transport of oleic acid from EC medium to periplasm \| Irreversible transport of oleic acid from periplasm to cytosol by fatty acid CoA ligase (-1 ATP)	Transport of oleic acid to the intracellular environment.
35.93	RPE	Conversion of D-ribulose-5P to xylulose-5-P	D-ribulose-5P is produced by the GND reaction
35.70	H2Otex	Water import from EC environment to periplasm	Water seems to be used for efficient sophorolipid synthesis. Here it's from the EC environment.
32.37	O2tex and O2tpp	Oxygen import from EC medium to periplasm
31.56	GAPD	Conversion of glyceraldehyde-3-P to 3-phospho-D-glyceryl phosphate (-1 NAD, +1 NADH)
31.36	ADK1	Adenylate kinase amp_c + atp_c ⇌ 2.0 adp_c
26.42	NH4 external exchange	Ammonium output
24.08	ATPS4rpp	Synthesis of ATP from ADP (ATP synthase)
18.76	TKT2	Transformation of erythrose-4-P and xylulose-5-P into fructose-6-P and glyceraldehyde-3-P
18.64	GLUDy	Consumption of L-glutamate to 2-oxo-glutarate (-1 NADP, +1 NADPH +1NH4)
17.21	TALA	Transformation of glyceraldehyde-3-P and sedoheptulose-7-P into fructose-6-P and erythrose-4-P
17.16	TKT1/td>	Transformation of alpha-ribose-5-P and xylulose-5-P into glyceraldehyde-3-P and sedoheptulose-7-P
16.70	GLCtex	Passive diffusion of glucose from EC medium to periplasm
9.61 7.57	ICDHyr, PDH, SUCDi, and MDH	Conversion of isocitrate to 2-oxo-glutarate (-1 NADP, +1 NADPH) \| Conversion of pyruvate (-1 NAD, +1 NADH) \| Conversion of succinate to fumarate \| Conversion of L-malate to oxaloacetate (-1 NAD, +1 NADH)
7.57	ASPTA	Consumption of 2-oxo-glutarate + L-aspartate <--> oxaloacetate + L-glutamate
7.11	FUM	Consumption of an L-malate fumarate
7.06	ACONTa, ACONTb and CS	Consumption of citrate to cis-aconitate \| Consumption of cis-aconitate to isocitrate \| Consumption of oxaloacetate to citrate

Decreased flux	Reactions	Reactions role	Comments
-156.65	H+ external exchange		All imported protons are exported to the EC, which must surely be a sign that the model is in equilibrium.
-132.20	EX_pyr_e	External release of pyruvate
-109.46	PPK	Consumption of ATP to make ADP
-83.12	PGMT	Mutase of glucose-1-P to glucose-6-P	Glucose-1-P is used to make UDP-glucose, so glucose-6-P is used less in this model.
-82.78	CO2tex and CO2tpp	Passive diffusion of CO2 from the EC medium to the cytosol
-80.57	EDA and EDD	Consumption of 2-Dehydro-3-deoxy-D-gluconate 6-phosphate to pyruvate and glyceraldehyde-3-P \| Consumption of 6-phospho-D-gluconate to 2-Dehydro-3-deoxy-D-gluconate 6-phosphate
-57.87	ENO	Consumption of 2-P-glycerate to phosphoenlopyruvate	Glycerate-2-P is used as one of the very first red block metabolites for NAPDH production, so there's less flux in this ENO reaction./td>
-47.03	PYK	Consumption of phosphoenolpyruvate to pyruvate (regeneration of ATP)	The loss of flow here is surely the consequence of ENO's under-utilization.
-45.62	NADPHQR2	Regeneration of NADPH to NADP	As NADPH is used to a greater extent in the sophorolipid production model, it is normal that its consumption to produce NADP is used to a lesser extent.
-40.81	Oleic acid external exchange	Release of oleic acid into the environment
-36.20	PGI	Consumption of glucose-6-P to fructose-6-P
-32.37	O2 external exchange	Release of oxygen into the environment
-30.22	GLCDpp, GNK and GLCNt2rpp	Dehydrogenation of glucose to gluconate \| Consumption of gluconate to 6-phospho-D-gluconate (-1ATP) \| Proton symport to bring gluconate into the cytosol	The whole route of importing, manufacturing and consuming gluconate is less widely used.
-26.42	NH4tex and NH4tpp	Ammonium transport from EC medium to cytosol
-16.70	Glucose external exchange	Glucose export to the EC environment
-15.13	CYTBO3_4pp	Regeneration of ubiquinol-8 to ubiquinone-8	Ubiquinone is widely used in many regeneration reactions, for example.
-14.48	RPI	Consumption of alpha-D-ribose-5P to D-ribulose-5P
9.61	SUOCAS	Consumption of a succinate to form succinyl-CoA (-1ATP)
-7.19	GLNS_copy1	Consumption of L-glutamate to L-glutamine (-1 ATP)
-5.38	HCO3E	Proton production with CO2 and H2O	The amount of proton is balanced by using protons from the cytosol to make water and CO2.

These are the metabolic pathways used by putida to overproduce sophorolipids. This is a comparative metabolic map showing only those reactions that are over-utilized by the sophorolipid-producing bacterium compared with the WT bacterium. The size of the arrows depends on the utilization of the reaction (the wider the arrow, the more the reaction is used by the bacterium). In this model, 4 metabolic groups are distinguished:

- UDP-glucose: the blue block corresponds to the use of glucose and UDP to produce UDP-glucose for oleic acid glycosylation.

- Production of NAD+ and NADPH:The red block corresponds to the production of NADPH, which will be used directly for the hydroxylation of oleic acid in the first reaction.

- Production of NADPH : The green block consumes NAD+ to produce NAPDH, which will also fuel the first sophorolipid production reaction.

- Sophorolipid production: This is the purple block. Production depends on the amount of oleic acid present in the bacteria and the concentration of NADPH/UDP-glucose. Here, the extracellular transport of acid sophorolipids has been modeled, but there's no information on whether this transport is efficient or not. It is assumed to transport 100% of acid sophorolipids.

* It should be noted here that proton outflow into the extracellular medium is just as important as proton inflow, but this is not shown here for aesthetic reasons.

What's impressive here is the large amount of NAD+ produced. Looking at the NAD+-consuming reactions in the model and their respective fluxes, we can isolate two of the reactions that consume the most NAD+ in the model: 14.55% is used by the AKGDH reaction, which is the reaction that produces succinyl-CoA, a succinate precursor used by the model but not shown on the map. And 38.14% of the NAD+ is consumed by the MDH reaction, which consumes L-malate to produce oxaloacetate, visible on the map.

Study of the impact of the new biomass pathway on KT2440 metabolism

Comparison of the metabolites secreted and recovered from the EC medium by KT2440 bacteria with the sophorolipid pathway and with the "biomass" model objective (i.e. bacterial population growth) showed no change between the two models. On the other hand, some reactions were impacted in the model with the presence of the sophorolipid pathway even though it was not used directly to produce sophorolipids. The aim of this comparison is to see whether, even when the objective cell model is not set to produce sophorolipids, the pathway can be used to feed other metabolic reactions.

Figure 2: Heatmap comparing the flux of reactions in the sophorolipid-producing model with the control when the model objective is set to "Biomass".

Study of metabolites consumed and produced under optimized sophorolipid production conditions

We thought it would be interesting to compare the WT and sophorolipid models in the metabolites they use and secrete, in order to optimize the production environment. Here, oleic acid was added to the medium under saturated conditions to observe the difference in bacterial uptake.

Figure 3: Comparison of the recovery (left) and secretion (right) fluxes of metabolites in the external environment between the WT and sophorolipid-producing models.

The production of sophorolipids does not involve the consumption of ammonium from EC media, it does not release acetate, it consumes a lot of glucose and O2, and it releases more water and CO2 than the WT model.

Screening KT2440 to identify genes important in sophorolipid production

To perform the screening, we can use a function already integrated into COBRApy which allows us to delete each gene to see the consequence on the solution of the model. These results were saved in an excel file for all the genes that showed a solution below 38 (instead of the optimized 41.61). From this, 29 genes were found to be important for optimal sophorolipid production, including 6 that were absolutely necessary. It should be noted that no deletions resulted in more efficient production, but this may be due to the limiting reaction of the model, which sets the maximum flux at 41.61. It should be noted that the genes presented in this table for solution 0.0 correspond both to genes required for sophorolipid production in some cases, and to genes required for bacterial survival in others.

Objective model solution	Solution = 0.0	Solution = 21.47	Solution = 33.73	Solution = 37.88
Genes ID	PP_1611, PP_4863, pWW0_131, PP_5013, PP_1141, PP_3514	PP_2017, PP_1417, PP_5336, PP_2492, PP_4495, PP_0127, PP_4829, PP_3073	PP_4427	PP_1002, PP_3947, PP_2136, PP_2800, PP_3766, PP_3515, PP_4701, PP_0323, PP_1649, PP_0983, PP_1803, PP_1807, PP_2001
Gene	kdsA-I, livF-II, 3mbzdh, ubiB, livK, oplB	pepN, citT, purE, yqhD, aroP-II, dsbA, cobG, hbdH	NaN	arcD-I, nicA, fadB, NaN, gloA, opiA, pgi-II, soxB, ldhA, NaN, wbpV, kdsA-II, metZ
Associated reaction	KDOPS, Amino acid transport, Benzaldehyde metabolism, OPHHX, Amino acid transport, ATP hydrolyzing	Aminopeptidase, citrate transport, AIRC3, proprionate production and aldehyde consumption, tyrosine and phenylalanine transport, DsbA protein reoxidation reaction, precorrin reaction, acetoacetate production	Ectoin ABC transport	arginine import, nicotinate hydroxylation, β-oxidation of fatty acids, aspartate & glutamate production, Lactoylglutathione lyase, Carbamoylsarcosine & glutamate production, glucose isomerization, sarcosine oxydase, pyruvate production by lactate, LPS extracellular transport, UDP-glucose production, 3-deoxy -D-manno-octulosonic -acid 8-phosphate production, homocysteine & acetate production

Study of the relationship between sophorolipid production and consumption of important metabolites

The notion of "production envelopes" in COBRApy refers to an analysis that determines the limits of production rates of a metabolite of interest in a metabolic model. It's a useful approach for understanding the production potential of specific metabolites by a model organism. First, we need to define the objective of our analysis. We choose a particular metabolite we wish to produce, in our case, sophorolipids. The results of these simulations are then used to generate a graph called the "production envelope". This graph generally displays the production rate of the metabolite of interest relative to another condition. The production envelope can be used to identify the optimal culture conditions for maximizing metabolite production. We can observe how variations in carbon sources or other substrates influence metabolite production. This can be useful for designing bioprocesses or optimizing metabolic production. These envelopes are used to represent combinations of metabolite production rates that the organism can achieve while satisfying model constraints. They are calculated using linear equations based on metabolic fluxes

In reality, these graphs should not be seen as the effect on sophorolipid production or bacterial growth of the consumption of a metabolite, but rather from the point of view of the model's objectives, and we need to think with these in mind. You have to read the graphs and understand that you're asking the model to produce a maximum amount of sophorolipids or bacteria, while at the same time forcing it to consume a metabolite by determining its flux. We look at maximum production as a function of the determined flow of metabolite consumption, which is a little more subtle than simply saying that production is a function of metabolite consumption in the environment. Here, for example, we can observe the production of sophorolipids as a function of the glucose exchange flux imposed by the model. In other words, if I force my model to have maximum glucose consumption with a flux of -20, then the flux that will go into sophorolipid production will only be around 10 instead of 41.6

Figure 4: Curves representing the growth (A) or sophorolipid production flux (B-F) of the model in relation to the metabolite flux imposed on the model: (A) glucose flux, (B) ammonium flux, (C) oleic acid flux, (D) carbon dioxide flux, (E) glucose flux and (F) oxygen flux.

We can see the growth rate of our bacteria as a function of the amount of glucose exchanged (Fig.4 A), and we can see that the rate is constant until the bacteria start producing glucose for export rather than for consumption, which tends to decrease growth linearly.
Sophorolipid production does not involve ammonium consumption from the external environment, as we have already seen in figure? Setting ammonium consumption at around -5 slightly reduces sophorolipid production in proportion to the amount of ammonium absorbed (Fig.4 B), probably because ammonium disrupts the metabolic pathways that are supposed to produce sophorolipids in an optimized way. The presence of oleic acid is necessary for sophorolipid production. One difficulty encountered in building the model was to have the model's metabolism produce oleic acid directly, since the model's metabolism was not complete and the reactions producing oleic acid were only partially functional. To best simulate oleic acid consumption, we added oleic acid to the external environment, with the bacterium absorbing it directly for sophorolipid production. This is shown here (Fig.4 C), where the amount of oleic acid absorbed by the model varies sophorolipid production. Obviously, the best production implies the necessary and sufficient consumption of oleic acid, as can be seen here at around 41.00 absorption. Excess consumption leads to a drop in production, probably because the excess lipid supplies another metabolic pathway and causes the metabolism to reorganize to cope with all these constraints. And obviously, lowering oleic acid absorption below the optimal threshold lowers sophorolipid production. Efficient sophorolipid production inevitably leads to a release of CO2 (Fig.4 D), at a flux of over 75, compared with the WT model, which releases CO2 at a flux of around 30 (see previous figure). We can therefore say that the sophorolipid pathway increases CO2 release by around 40% according to this model.
As expected, glucose consumption proved extremely important as a source of sugar to form the glycolipids intermediate to sophorolipid production. Production efficiency depends directly on the amount of glucose added to the external medium, and is linearly proportional to the amount of glucose taken up by the bacteria (Fig.4 E).
Similarly, O2 consumption is directly correlated with sophorolipid production efficiency (Fig.4 F), and has increased from a flux of 60 to a flux of 100. Looking more closely at the model, the two reactions that consume O2 are the oleic acid hydroxylation reaction at 41.61% and the regeneration of ubiquinol-8 in the periplasm at 58.39%.

The code to obtain these results :


          import cobra
          from cobra.io import load_model
          from pathlib import Path
          from cobra.io import load_json_model, save_json_model
          import logging
          import matplotlib.pyplot as plt
          import seaborn as sns
          import pandas as pd
          import numpy as np

          # Load models for both conditions
          model_without_pathway = cobra.io.load_json_model('KT2440WT.json')
          model_with_pathway = cobra.io.load_json_model('compare1.json')
          # Set the objective for both models (e.g., biomass maximization)
          model_without_pathway.objective = 'BIOMASS_KT2440_WT3'
          model_with_pathway.objective = 'LACTOZ'
          # Solve both models
          solution_without_pathway = model_without_pathway.optimize()
          solution_with_pathway = model_with_pathway.optimize()

          # Compare fluxes for each reaction
          for reaction in model_without_pathway.reactions:
            flux_without_pathway = solution_without_pathway.fluxes[reaction.id]
            flux_with_pathway = solution_with_pathway.fluxes[reaction.id]
            flux_difference = flux_with_pathway - flux_without_pathway

            print(f"Reaction: {reaction.id}")
            print(f"Flux without pathway: {flux_without_pathway}")
            print(f"Flux with pathway: {flux_with_pathway}")
            print(f"Flux difference: {flux_difference}")
            print()

            import numpy as np
            metabolite_ids = ["ocdcea_e", "glc__D_e", "o2_e", "nh4_e"]

            # Calculate and print the difference in fluxes for each metabolite
            for metabolite_id in metabolite_ids:
                metabolite_flux_without_pathway = model_without_pathway.reactions.get_by_id(f"EX_{metabolite_id}").flux
                metabolite_flux_with_pathway = model_with_pathway.reactions.get_by_id(f"EX_{metabolite_id}").flux
                flux_difference = -1 * (metabolite_flux_with_pathway - metabolite_flux_without_pathway)
                flux_proportion = ((metabolite_flux_with_pathway-metabolite_flux_without_pathway) / metabolite_flux_without_pathway)*100

                
                print(f"Metabolite: {metabolite_id}")
                print(f"Flux without pathway: {metabolite_flux_without_pathway}")
                print(f"Flux with pathway: {metabolite_flux_with_pathway}")
                print(f"Difference in fluxes: {flux_difference}")
                print(f"Proportion in fluxes: {flux_proportion} ")
                print("--------------------")

            flux_differences_condition1 = [4.23, 12.32, 60.40, 30.0]  # Replace with actual data
            flux_differences_condition2 = [41.61, 100.0, 100.0, 0.0]  # Replace with actual data
            x = np.arange(len(metabolite_ids))
            bar_width = 0.4
            plt.bar(x - bar_width/2, flux_differences_condition1, width=bar_width, edgecolor='black', label="WT KT2440 model")
            plt.bar(x + bar_width/2, flux_differences_condition2, width=bar_width, edgecolor='black',label="Optimized sophorolipid prod. model")


            plt.xlabel("Metabolites")
            plt.ylabel("Flux Difference")
            plt.title("Uptake metabolites")
            plt.xticks(x, metabolite_ids)
            plt.legend()
            plt.savefig("Uptake metabolites optimized model.png")
            plt.show()
          
            # Get a list of metabolites present in both models
            metabolites_to_compare = set(model_without_pathway.metabolites) & set(model_with_pathway.metabolites)

            # Compare the quantities of metabolites
            for metabolite in metabolites_to_compare:
                original_amount = model_without_pathway.metabolites.get_by_id(metabolite.id).summary().flux
                new_pathway_amount = model_with_pathway.metabolites.get_by_id(metabolite.id).summary().flux
                difference = new_pathway_amount - original_amount
                
                print(f"Metabolite: {metabolite.id}")
                print(f"Original Amount: {original_amount}")
                print(f"New Pathway Amount: {new_pathway_amount}")
                print(f"Difference: {difference}")
                print("--------------------")

                # Create a dictionary to store the flux differences for each reaction
                flux_differences = {}

                # Iterate through reactions and calculate flux differences
                for reaction in model_without_pathway.reactions:
                    flux_without_pathway = solution_without_pathway.fluxes[reaction.id]
                    flux_with_pathway = solution_with_pathway.fluxes[reaction.id]
                    flux_difference = flux_with_pathway - flux_without_pathway
                    
                    if flux_difference <= -5:
                        flux_differences[reaction.id] = flux_difference

                # Create a DataFrame from the flux differences dictionary
                flux_df = pd.DataFrame.from_dict(flux_differences, orient='index', columns=['Flux Difference'])

                # Create a heatmap using Seaborn
                plt.figure(figsize=(5, 10))
                sns.heatmap(flux_df, cmap="RdBu_r", center=0, annot=True, fmt=".2f")
                plt.title("Flux differences sophorolipid pathway VS control")
                plt.xlabel("")
                plt.ylabel("Reactions")

                # Rotate x-axis labels for better visibility
                plt.xticks(rotation=90)
                plt.tight_layout()

                # Show the heatmap
                plt.savefig("Reduced sophorolipid pathway comparaison positive flux.png")
                plt.show()

References :

- Ebrahim, A., Lerman, J. A., Palsson, B. O., & Hyduke, D. R. (2013). Cobrapy: Constraints-based reconstruction and analysis for python. BMC Systems Biology, 7, 74. https://doi.org/10.1186/1752-0509-7-74

- Bigg model http://bigg.ucsd.edu

- Cobra documentation https://cobrapy.readthedocs.io/en/latest/0 documentation