Interleukin-8 (IL-8) is a chemokine that signals by binding chemokine receptors, particularly CXCR1. It is a small globular protein whose receptor-binding sites lie primarily in its N-loop and 40s loop regions. These regions contain specific charged and hydrophobic residues that are essential for receptor recognition. When IL-8 encounters CXCR1, these binding sites engage complementary sites on the receptor's N-terminal domain, such as ND-CXCR1(1–38). The resulting stable protein-protein complex initiates cellular responses such as chemotaxis and immune cell activation, which in turn support the immune response and the regulation of inflammation.
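This section lays out our process for predicting the structure of the anti-IL8 antibody in scFv format. The amino acid sequence of the antibody consists of the following parts, listed from the N-terminus to the C-terminus: a signal peptide, the heavy-chain variable domain, a flexible glycine-serine linker, the light-chain variable domain, and the C-terminal tags.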
MPLLLLLPLLWAGALA
EVQLLESGGGLVQPGGSLRLSCAASGFTFSYYGMGWVRQAPGKGLEWVSGISYSGSGTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDYVGNLDYWGQGTLVTVSS
GGGGSGGGGSGGGGS
DIQMTQSPSSLSASVGDRVTITCRASQSISSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQQSDTPSTFGQGTKLEIK
RTDYKDHDGDYKDHDIDYKDDDDKAAALPETGGHHHHHH
Therefore, the full amino acid sequence that we will work with is
MPLLLLLPLLWAGALAEVQLLESGGGLVQPGGSLRLSCAASGFTFSYYGMGWVRQAPGKGLEWVSGISYSGSGTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDYVGNLDYWGQGTLVTVSSGGGGSGGGGSGGGGSDIQMTQSPSSLSASVGDRVTITCRASQSISSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQQSDTPSTFGQGTKLEIKRTDYKDHDGDYKDHDIDYKDDDDKAAALPETGGHHHHHH
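As a quick consistency check, the parts listed above can be concatenated programmatically and written out as a FASTA record. The sketch below is only illustrative: the segment labels (`signal_peptide`, `VH`, `linker`, `VL`, `tags`) and the record name `anti_IL8_scFv` are our own naming assumptions, while the sequences are copied from the text.

```python
# Segments of the construct in N- to C-terminal order.
# Labels are illustrative assumptions; sequences are copied from the text above.
SEGMENTS = [
    ("signal_peptide", "MPLLLLLPLLWAGALA"),
    ("VH", "EVQLLESGGGLVQPGGSLRLSCAASGFTFSYYGMGWVRQAPGKGLEWVSGISYSGSGTYYADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDYVGNLDYWGQGTLVTVSS"),
    ("linker", "GGGGSGGGGSGGGGS"),
    ("VL", "DIQMTQSPSSLSASVGDRVTITCRASQSISSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQQSDTPSTFGQGTKLEIK"),
    ("tags", "RTDYKDHDGDYKDHDIDYKDDDDKAAALPETGGHHHHHH"),
]


def assemble(segments):
    """Concatenate the segment sequences into one string."""
    return "".join(seq for _, seq in segments)


def segment_ranges(segments):
    """Return {label: (start, end)} using 1-based, inclusive residue numbers."""
    ranges = {}
    start = 1
    for label, seq in segments:
        end = start + len(seq) - 1
        ranges[label] = (start, end)
        start = end + 1
    return ranges


def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record wrapped at `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


full_sequence = assemble(SEGMENTS)
ranges = segment_ranges(SEGMENTS)
fasta_text = to_fasta("anti_IL8_scFv", full_sequence)
```

The `ranges` dictionary is handy later on, because it tells us which residue indices belong to the signal peptide, linker and tags when reading the per-residue confidence plots.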
Note that we keep the signal peptide and the tags attached during the modelling process, because we do not know whether (and how) they interact with Interleukin-8 during docking.
AlphaFold2 is a deep learning system that predicts protein structures from amino acid sequences. To predict the structure of the antibody we used ColabFold, an open-source implementation that runs AlphaFold2 in a notebook, specifically the AlphaFold2_mmseqs2 notebook. This notebook differs from full AlphaFold2 and AlphaFold2 Colab in that it uses MMseqs2 (Many-against-Many sequence searching) for homology detection and MSA pairing, in place of AlphaFold2's original search tools.
We ran ColabFold under two different schemes: one without templates, and one with PDB70 as the template database. In each scheme we also relaxed the top-ranked structure with AMBER.
The two schemes give very similar average predicted aligned errors and predicted lDDT scores. In both, the prediction is poor near the ends, where the signal peptide and tags are attached, and in the middle, where the linker is located.
With PDB70 | Without Templates |
---|---|
![]() | ![]() |
![]() | ![]() |
![]() | ![]() |
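AlphaFold produces a per-residue estimate of its confidence on a scale from 0 to 100. This confidence measure is called pLDDT and corresponds to the model’s predicted score on the lDDT-Cα metric. It is stored in the B-factor fields of the mmCIF and PDB files available for download (although, unlike a B-factor, higher pLDDT is better). pLDDT is also used to colour-code the residues of the model in the 3D structure viewer. The following rules of thumb provide guidance on the expected reliability of a given region: residues with pLDDT above 90 are expected to be modelled to high accuracy; between 70 and 90, the backbone is generally expected to be modelled well; between 50 and 70, confidence is low and the region should be treated with caution; below 50, the coordinates should not be interpreted and the region may be disordered.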
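Because pLDDT is written to the B-factor field, it can be read back from a PDB file with a few lines of code. The following is a minimal sketch using only fixed-column parsing of `ATOM` records. The function names, the demo residues and the exact band boundaries (90, 70, 50, treated as exclusive lower limits) are assumptions made for illustration, and `format_ca_record` exists only to generate well-aligned demo data.

```python
def format_ca_record(serial, res_name, res_seq, plddt, chain="A"):
    """Build a minimal fixed-width PDB ATOM record for a C-alpha atom (demo data only)."""
    x = y = z = 0.0   # dummy coordinates
    occ = 1.0         # dummy occupancy
    return (
        f"ATOM  {serial:5d}  CA  {res_name:>3} {chain}{res_seq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{plddt:6.2f}"
    )


def read_ca_plddt(pdb_text):
    """Return {residue_number: pLDDT} taken from the B-factor column of CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name, 23-26 the residue number,
        # and 61-66 the B-factor (here: pLDDT).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT value to a coarse confidence band (assumed thresholds)."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


# Hypothetical demo records; real values come from the downloaded model file.
demo_pdb = "\n".join([
    format_ca_record(1, "MET", 1, 38.2),
    format_ca_record(2, "PRO", 2, 55.0),
    format_ca_record(3, "GLU", 17, 93.4),
    format_ca_record(4, "VAL", 18, 81.7),
])

scores = read_ca_plddt(demo_pdb)
bands = {res: confidence_band(value) for res, value in scores.items()}
```

Applied to a real model file, the same `read_ca_plddt` call would let us average pLDDT separately over the signal peptide, linker, tags and variable domains.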
The pLDDT per position is also given as a plot for the five models made in every run and gives a simpler overview:
Our protein clearly has two domains, so per-residue confidence alone does not tell us how reliably the domains are placed relative to each other. For that we use the Predicted Aligned Error (PAE) plot provided by AlphaFold, which is a 2D plot with one axis per residue index.
The colour at (x, y) corresponds to the expected distance error in residue x’s position when the prediction and the true structure are aligned on residue y. Dark blue is good (low error), red is bad (high error). For example, aligning on residue 150:
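One way to turn a PAE plot into numbers is to average the matrix over blocks that correspond to the domains. The sketch below assumes the PAE matrix is available as a list of rows in which `pae[i][j]` is the expected error of residue `j` when aligned on residue `i`. That indexing convention, the function name and the toy 4 × 4 matrix are assumptions for illustration and should be checked against the actual output file.

```python
def mean_block_pae(pae, aligned_on, scored):
    """Mean PAE over all pairs (i, j) with i in `aligned_on` and j in `scored`.

    Both arguments are iterables of 0-based residue indices.
    """
    aligned_on = list(aligned_on)
    scored = list(scored)
    if not aligned_on or not scored:
        raise ValueError("both residue sets must be non-empty")
    total = sum(pae[i][j] for i in aligned_on for j in scored)
    return total / (len(aligned_on) * len(scored))


# Toy matrix: two "domains" of two residues each (hypothetical values, in Å).
toy_pae = [
    [0.5, 1.0, 20.0, 22.0],
    [1.0, 0.5, 21.0, 19.0],
    [18.0, 20.0, 0.5, 1.5],
    [22.0, 20.0, 1.5, 0.5],
]
domain_a = range(0, 2)
domain_b = range(2, 4)

within_a = mean_block_pae(toy_pae, domain_a, domain_a)
within_b = mean_block_pae(toy_pae, domain_b, domain_b)
a_to_b = mean_block_pae(toy_pae, domain_a, domain_b)
b_to_a = mean_block_pae(toy_pae, domain_b, domain_a)
```

Low within-domain means combined with high between-domain means would indicate that each domain is folded confidently but that their relative orientation is uncertain.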
Low pLDDT does not necessarily mean that the prediction has failed: some regions with low pLDDT may simply be unstructured in isolation, which is consistent with the linker and the terminal tags scoring poorly here.
AlphaFold predictions are distributed as mmCIF files that use the model archive extension of the format, which carries resources and information about the predicted protein. These files contain the molecular description, the taxonomy ID, global quality measures and per-residue quality scores.
Impact of structural bioinformatics:
AlphaFold can produce several candidate structures for a given amino acid sequence, and one of the tools provided for judging them is the multiple sequence alignment (MSA) plot. Alignments for two different chains can be combined according to organism and other information. They can be either paired or left unpaired, and which option gives better results depends on the sequences and on how well the software predicts each of them.
In the unpaired MSA case, the plot shows the number of sequences covering each position, and from this input the software produced one model with minimal error in relative positions. In the paired MSA case, most models are better than in the unpaired case, although their quality decreases.
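The per-position sequence count that such an MSA plot displays can be computed directly from an alignment. The following is a minimal sketch that counts, for every alignment column, how many sequences have a residue rather than a gap. The function name, the gap characters and the toy alignment are assumptions for illustration and do not reproduce ColabFold's own plotting code.

```python
def column_coverage(alignment, gap_chars="-."):
    """Return, for each alignment column, the number of non-gap characters."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        sum(1 for row in alignment if row[col] not in gap_chars)
        for col in range(length)
    ]


# Toy alignment (hypothetical sequences); the first row is the query.
toy_msa = [
    "MKTAYIA",
    "MK-AYIA",
    "--TAY--",
    "MKTA---",
]

coverage = column_coverage(toy_msa)
```

Positions with low coverage have little evolutionary information behind them, which is one reason why designed elements such as linkers and tags tend to be predicted with low confidence.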
Thus, modelling several proteins together as one larger assembly can help us determine its structure and the relative inter-atomic distances within it more reliably.
MSAs also help us assess the structural accuracy of larger proteins. For example, suppose we want the PAE graph for a protein P that is made of 2 copies of protein A, 1 copy of B and 2 copies of C. We can build an MSA for each component, use them to generate several candidate structures of P with their respective PAE graphs, and then choose the best one. This simplifies the analysis and increases the information available about the protein.
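To our understanding, ColabFold accepts a complex as a single query in which the chain sequences are separated by `:`, with repeated chains written out once per copy. The sketch below builds such a query for the A2 B1 C2 example. The function name and the three short placeholder sequences are hypothetical and stand in for the real sequences of A, B and C.

```python
def build_complex_query(components):
    """Join chain sequences with ':' according to their copy numbers.

    `components` is a list of (sequence, copies) tuples in the desired chain order.
    """
    chains = []
    for sequence, copies in components:
        if copies < 1:
            raise ValueError("each component needs at least one copy")
        chains.extend([sequence] * copies)
    return ":".join(chains)


# Placeholder sequences (hypothetical), standing in for proteins A, B and C.
protein_a = "MKTAYIAK"
protein_b = "GSHMLEDP"
protein_c = "PLVQWERT"

query_p = build_complex_query([(protein_a, 2), (protein_b, 1), (protein_c, 2)])
```

The resulting string contains five chains, so the PAE graph of each predicted model of P has one diagonal block per chain, and the off-diagonal blocks describe how confidently the chains are placed relative to one another.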