Extracting the produced hydrocarbons from the yeast cells by cell lysis, the usual laboratory technique, is expected to be infeasible in an industrial setting, so the hydrocarbons need to be recoverable from the medium itself. In addition, an increase in intracellular hydrocarbons could be toxic to our yeast, Yarrowia lipolytica. To tackle both issues, our initial solution was to overexpress an existing efflux pump in Y. lipolytica so that the hydrocarbons are actively pumped out of the cells.

We chose the ATP-Binding Cassette 2 (ABC2) transporter as a candidate because it has been reported in the literature to improve alkane tolerance in Y. lipolytica by pumping alkanes out of the cell.[1]

Efflux pumps are known to exhibit affinity towards a broad range of substrates, and in the context of our project, the ABC2 transporter is expected to show similar affinity towards both fatty acids and alkanes. This poses an issue because, in the last step of our engineered pathway, fatty acids (palmitic acid, stearic acid, oleic acid and linoleic acid) are converted into the corresponding alkanes and alkenes (pentadecane, heptadecane, heptadecene and heptadecadiene respectively) by the photodecarboxylase from C. variabilis (CvFAP). An increased efflux of fatty acids from the cell therefore drains substrate from our engineered pathway, decreasing the overall efficiency of our solution.

To overcome this issue, we decided to engineer the ABC2 transporter from Y. lipolytica with the objective of biasing its affinity towards alkanes over fatty acids. We propose that overexpressing our engineered efflux pump in the yeast cells should make extraction of the desired hydrocarbons from the medium feasible without compromising the flux through our engineered pathway.

Structure and Mechanism of ABC Transporters

ATP-binding cassette transporters are a family of ubiquitous membrane-bound proteins found in all prokaryotes as well as in plants, fungi, yeast and animals, and they play a crucial role in the transport of a wide range of substances across biological membranes. Structurally, ABC transporters comprise two Transmembrane Domains (TMDs), which contain helices that form a channel for substrate transport, and two Nucleotide Binding Domains (NBDs), which are involved in ATP hydrolysis.

The ATP switch model describes the mechanism of the ABC transporters.

ATP switch model

The model involves repeated communication between the NBDs and TMDs via non-covalent conformational changes. It describes the mechanism of the pump as a consequence of constant switching between the open-dimer (NBDs separated) and closed-dimer (NBDs dimerized) conformations of the NBDs. It involves the following steps.

Engineering Principle

As the aim of this project was to engineer the existing ABC2 transporter in Y. lipolytica to make it more specific towards alkanes than fatty acids, we first went through the literature and analysed the binding sites associated with fatty acids and alkanes. Both were characterized as hydrophobic cavities whose length correlates with the length of the bound molecule. The only point of difference we found between the two was the presence of a basic amino acid residue in fatty acid binding sites, which coordinates with the head group.[3] We designed our engineering cycle of docking, mutation and structure prediction to exploit this feature.

The strategy that we finalised was to dock the protein structure of the ABC2 pump with the different fatty acids synthesized by Y. lipolytica (i.e., palmitic acid, stearic acid, oleic acid and linoleic acid), identify the basic residue involved in the binding, and mutate it to decrease the binding affinity of the ligand for that specific site. Each cycle of docking and mutation is followed by running the mutant amino acid sequence through AlphaFold 2.0 to predict the structure of the variant. Multiple such cycles are carried out to reach a mutant pump with a considerable decrease in binding affinity towards fatty acids without affecting its affinity towards the alkanes.
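The mutation step of each cycle can be illustrated with a minimal sketch. The helper `apply_mutations`, the "K2A"-style notation and the toy sequence are assumptions made for illustration only; they are not the ABC2 sequence or the mutations chosen in this project.

```python
# Hypothetical helper: apply point mutations written as "K2A"
# (wild-type residue, 1-based position, replacement residue).
def apply_mutations(sequence: str, mutations: list[str]) -> str:
    residues = list(sequence)
    for mutation in mutations:
        wild_type = mutation[0]
        position = int(mutation[1:-1])
        replacement = mutation[-1]
        index = position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation}: position outside the sequence")
        # Guard against off-by-one errors by checking the expected residue.
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


toy_sequence = "MKTLLRAVH"  # toy sequence, NOT the ABC2 transporter
variant_sequence = apply_mutations(toy_sequence, ["K2A", "R6A"])
print(variant_sequence)
```

Checking the wild-type residue before substituting makes it harder to mutate the wrong position when mutations accumulate over several cycles. The resulting sequence would then be passed to structure prediction.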

Docking-Mutation-Prediction Cycles

We performed 3 rounds of molecular docking and mutations and generated 3 variants of the pump from the initial structure predicted by AlphaFold 2.0. AutoDock Vina was used for the molecular docking in each cycle.

AutoDock Vina

AutoDock Vina is one of the fastest and most widely used open-source docking engines. It is a turnkey computational docking program that is based on a simple scoring function and rapid gradient-optimization conformational search.[4][5]

The Cycles

The pump of interest, the ABC2 transporter of Y. lipolytica, had not been structurally characterized in the literature. Hence, we ran an AlphaFold 2.0 structure prediction, and the model with the highest confidence score was used for docking.

The SDF files for the ligands available in the following databases were used:

Fig: Structures of the 4 fatty acids

In each cycle, the ligands and the protein structures were prepared for docking in AutoDock Vina and the necessary .pdbqt files were generated. The docking was then performed on IISER Pune’s PARAM Brahma supercomputer, taking as inputs the .pdbqt files generated earlier and a configuration file specifying the grid box, exhaustiveness and energy range.
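A configuration file of this kind might be generated as in the sketch below. The keys (`receptor`, `ligand`, `center_*`, `size_*`, `exhaustiveness`, `energy_range`, `num_modes`) are standard AutoDock Vina options, but the file names and grid-box numbers are placeholders and the remaining values are Vina's defaults; none of them are the settings used in this project.

```python
def vina_config(receptor, ligand, center, size,
                exhaustiveness=8, energy_range=3, num_modes=9):
    """Return the text of an AutoDock Vina configuration file."""
    axes = ("x", "y", "z")
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    # Grid box: centre and edge lengths, both in angstrom.
    lines += [f"center_{a} = {c:.3f}" for a, c in zip(axes, center)]
    lines += [f"size_{a} = {s:.3f}" for a, s in zip(axes, size)]
    lines += [
        f"exhaustiveness = {exhaustiveness}",
        f"energy_range = {energy_range}",
        f"num_modes = {num_modes}",
    ]
    return "\n".join(lines) + "\n"


config_text = vina_config(
    receptor="abc2_wild_type.pdbqt",  # hypothetical file name
    ligand="palmitic_acid.pdbqt",     # hypothetical file name
    center=(0.0, 0.0, 0.0),           # placeholder grid-box centre
    size=(30.0, 30.0, 30.0),          # placeholder grid-box size
)
print(config_text)
```

The text would be saved to a file and passed to Vina through its `--config` option.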

Each output gave 9 possible instances of ligand binding.
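The binding modes in a Vina output file can be ranked by their predicted affinity. The sketch below assumes the usual `REMARK VINA RESULT:` line that Vina writes for each mode in the output .pdbqt file; the sample text and its scores are made up for illustration and are not results from this project.

```python
def parse_vina_scores(pdbqt_text: str) -> list[float]:
    """Collect the predicted affinity (kcal/mol) of every binding mode."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, RMSD l.b., RMSD u.b.
            scores.append(float(line.split()[3]))
    return scores


# Made-up excerpt of an output file with three modes (illustration only).
sample_output = """MODEL 1
REMARK VINA RESULT:    -6.1      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:    -5.8      2.104      3.870
ENDMDL
MODEL 3
REMARK VINA RESULT:    -5.5      1.932      4.215
ENDMDL
"""

scores = parse_vina_scores(sample_output)
best_mode = scores.index(min(scores)) + 1  # modes are numbered from 1
print(scores, best_mode)
```

More negative scores indicate stronger predicted binding, so the lowest value marks the most favourable pose.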

Fig: ABC2 transporter docked with palmitic acid

In each case, the binding sites were visualized using PyMOL and the interacting basic residues were identified by analysing the docked conformations. The identified basic residues were then mutated with the objective of decreasing the binding affinity. All mutations were made on the amino acid sequence of the preceding mutant or wild-type protein using Benchling as the platform.
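The residues were identified here by visual inspection; a distance-based screen is a different, automatable technique that could serve the same purpose. The sketch below assumes atoms have already been read into simple tuples, and the 4.0 angstrom cutoff, the function name and the coordinates are all illustrative assumptions.

```python
import math

BASIC_RESIDUES = {"LYS", "ARG", "HIS"}


def basic_residues_near_ligand(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return (residue name, residue number) pairs for basic residues that
    have at least one atom within `cutoff` angstrom of any ligand atom.

    protein_atoms: list of (res_name, res_number, (x, y, z))
    ligand_atoms:  list of (x, y, z)
    """
    hits = set()
    for res_name, res_number, coords in protein_atoms:
        if res_name not in BASIC_RESIDUES:
            continue
        if any(math.dist(coords, lig) <= cutoff for lig in ligand_atoms):
            hits.add((res_name, res_number))
    return sorted(hits, key=lambda hit: hit[1])


# Toy coordinates, not taken from the ABC2 model.
protein = [
    ("LYS", 10, (0.0, 0.0, 0.0)),
    ("ALA", 11, (1.0, 0.0, 0.0)),
    ("ARG", 50, (20.0, 0.0, 0.0)),
]
ligand = [(3.0, 0.0, 0.0)]
print(basic_residues_near_ligand(protein, ligand))
```

In this toy case only the lysine is both basic and within the cutoff, so it would be the candidate for mutation.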

The first two sets of docking and mutations focused on the transmembrane domain of the protein, which is characterized in the literature as the binding site for alkanes and fatty acids. In these cycles, the identified basic residues were mutated to alanine because of its hydrophobic nature and its tendency to promote alpha helix formation. The aim of these two cycles was to eliminate all possible sites that can accommodate fatty acids as substrates for the pump and to create more sites that accommodate alkanes.

The last cycle was focussed on the opening of the ABC2 transporter, and the basic residues were converted to aspartic acid, making it less probable for the fatty acids with -COOH head group to access the pump.

A brief summary of the three cycles is given below:

Variant 3, obtained at the end of the last cycle, was chosen as the engineered mutant efflux pump and was used to validate the pump’s specificity against that of the wild type.

Validation of the Pump Using MD Simulations

Once the engineering cycles were completed, the next task at hand was to validate our mutant pump’s binding affinity against that of the wild type pump in the case of fatty acids and alkanes. We decided to use GROMACS MD simulations for this.

MD simulations

MD simulation is a technique for computing the dynamical trajectory of a system of N particles by integrating Newton’s equations of motion, given the initial conditions, the boundary conditions and the force field. Mathematically, the problem reduces to solving the classical equations of motion:

$$ m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_{\mathbf{r}_i} U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N), \qquad i = 1, \ldots, N $$

Here m_i is the mass of the ith particle and U(r1, r2..., rN) is the potential energy, which depends on the position vectors of the N particles (r1, r2..., rN). Being a system of N coupled second-order nonlinear differential equations, it generally cannot be solved analytically. An appropriate numerical integration algorithm is therefore used to solve the system and compute the trajectories (the position of the ith particle, ri, as a function of time).
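As an illustration of such an integration algorithm, the sketch below applies the velocity Verlet scheme to a single particle in a one-dimensional harmonic potential. This is a toy system chosen because its exact solution is known; the scheme, time step and parameters are assumptions for illustration and not a description of the integrator settings used in our simulations.

```python
import math


def velocity_verlet(position, velocity, force, mass, dt, n_steps):
    """Integrate Newton's equation of motion for one particle in 1D."""
    trajectory = [position]
    acceleration = force(position) / mass
    for _ in range(n_steps):
        # Update the position using the current velocity and acceleration.
        position = position + velocity * dt + 0.5 * acceleration * dt ** 2
        # Recompute the force at the new position.
        new_acceleration = force(position) / mass
        # Update the velocity with the average of old and new accelerations.
        velocity = velocity + 0.5 * (acceleration + new_acceleration) * dt
        acceleration = new_acceleration
        trajectory.append(position)
    return trajectory, velocity


k = 1.0  # spring constant of the toy potential U(x) = 0.5 * k * x**2


def harmonic_force(x):
    return -k * x  # F = -dU/dx


trajectory, final_velocity = velocity_verlet(
    position=1.0, velocity=0.0, force=harmonic_force,
    mass=1.0, dt=0.01, n_steps=1000,
)
total_energy = 0.5 * final_velocity ** 2 + 0.5 * k * trajectory[-1] ** 2
# With these parameters the exact solution is x(t) = cos(t).
print(trajectory[-1], math.cos(10.0), total_energy)
```

The numerical position stays close to the analytical one and the total energy stays close to its initial value of 0.5, which is the behaviour expected of a stable integrator at a small time step.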


GROMACS is a versatile package to perform molecular dynamics, i.e., simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions.

We decided to simulate four systems for validation: the wild-type pump with heptadecane, the wild-type pump with palmitic acid, the variant pump with heptadecane and the variant pump with palmitic acid. These systems were generated using CHARMM-GUI as the platform.


CHARMM-GUI is a web-based platform to interactively build complex systems and prepare their inputs with well-established and reproducible simulation protocols for state-of-the-art molecular simulations using widely used simulation packages such as CHARMM, NAMD, GROMACS, AMBER, GENESIS, Tinker, LAMMPS, Desmond, and OpenMM.

The server supports a variety of input generator modules, and the Membrane Builder module was chosen for our purpose. It takes in inputs in the form of the protein structures docked with the ligand, .sdf files of the ligand and different parameters for the desired system. The input structures for the module were generated for the 4 systems by docking the efflux pumps with their respective ligands at the opening of the pump using AutoDock Vina and choosing the pose with lowest energy. The .sdf files associated with each ligand were downloaded from PubChem.

Once the input structures are uploaded, CHARMM GUI determines the orientation of the protein with respect to the membrane bilayer by running PPM 2.0.

PPM 2.0

PPM stands for Positioning of Proteins in Membranes. It is a server that calculates the rotational and translational positions of transmembrane and peripheral proteins and peptides in membranes, using their 3D structures as input.

The parameter values for our system were determined by following the CHARMM-GUI tutorial for Heterogeneous Membrane Protein Complex.[7]

Default values were used in most of the fields except:

At the end of the Membrane Builder module, CHARMM-GUI generates a topology file (.top), input structure files (.psf, .gro, .pdb), an index file (.ndx) and the following .mdp scripts for GROMACS:

These scripts were used to run GROMACS MD simulations and calculate the trajectories on IISER Pune’s PARAM Brahma supercomputer.
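A staged GROMACS run of this kind typically alternates `gmx grompp` (to build a run input file from an .mdp script) and `gmx mdrun` (to execute it). The sketch below only assembles the command lines; the stage names and file names are hypothetical placeholders rather than the names of the files generated for this project, and the exact options used on the cluster are not reproduced here.

```python
import shlex


def stage_commands(stage, start_structure, topology, index):
    """Build the grompp and mdrun command lines for one simulation stage."""
    grompp = [
        "gmx", "grompp",
        "-f", f"{stage}.mdp",    # run parameters for this stage
        "-c", start_structure,   # starting coordinates
        "-r", start_structure,   # reference coordinates for restraints
        "-p", topology,
        "-n", index,
        "-o", f"{stage}.tpr",    # run input file produced by grompp
    ]
    mdrun = ["gmx", "mdrun", "-v", "-deffnm", stage]
    return [grompp, mdrun]


def pipeline_commands(stages, start_structure, topology, index):
    """Chain stages so that each one starts from the previous output."""
    commands = []
    structure = start_structure
    for stage in stages:
        commands.extend(stage_commands(stage, structure, topology, index))
        structure = f"{stage}.gro"  # mdrun -deffnm writes <stage>.gro
    return commands


# Hypothetical stage and file names, for illustration only.
commands = pipeline_commands(
    stages=["minimization", "equilibration", "production"],
    start_structure="system.gro",
    topology="topol.top",
    index="index.ndx",
)
for command in commands:
    print(shlex.join(command))
```

Printing the commands instead of executing them keeps the sketch usable without a GROMACS installation; on a cluster the same lines would go into a job script.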

Each system was simulated three times for 100 ns, and the trajectories were visualized using VMD. VMD is a molecular visualization program for displaying, animating and analysing large biomolecular systems using 3-D graphics and built-in scripting.

The visualized trajectories from the 4 x 3 = 12 MD simulations are summarized below.

Wild Type - Heptadecane

Wild Type - Palmitic Acid

Variant - Heptadecane

Variant - Palmitic Acid



In conclusion, the in silico engineering of the ATP-Binding Cassette 2 (ABC2) transporter in Yarrowia lipolytica represents a promising solution to the challenges of extracting hydrocarbons from our engineered yeast cells and to the potential toxicity of intracellular hydrocarbons. By attempting to bias the pump's affinity towards alkanes over fatty acids through a planned strategy of docking, mutation and structure prediction, we generated a possible design for the variant efflux pump. The specificity of this engineered pump was assessed through molecular dynamics simulations, in which a trend of improved binding affinity for alkanes and reduced binding affinity for fatty acids was observed.

The design that we have come up with acts as a preliminary model for the engineered efflux pump on which further engineering efforts can be undertaken. As the next step, cycles of random mutations in the TMD coupled with GROMACS MM-PBSA Binding Energy calculation could be performed with the objective of biasing the specificity of the pump. But because of the limitations in time and computational resources we were not able to incorporate this into our project, and we propose this as part of the future implementation.

The results of our simulations highlight the potential of the variant pump to facilitate the extraction of desired hydrocarbons from the growth medium while maintaining the efficiency of the engineered pathway. With further research and experimentation, this engineered efflux pump can contribute to the improved efficiency of production of biofuels in Y.lipolytica.