Project Description


Our Project

Antibody humanization has reached a certain technical maturity; nevertheless, many animal species need antibody drugs and still lack a proper solution. We aim to tackle this problem with artificial intelligence, drawing on its powerful fitting, learning, and predictive capabilities. On the one hand, we build an extensive homologous library of antibody Framework Regions (FRs) and Complementarity Determining Regions (CDRs) from various species. On the other hand, we optimize antibody immunogenicity through mutagenesis of a given sequence and adapt multi-dimensional scoring tools.

Based on this, we have developed an artificial intelligence-based antibody drug design system that integrates antibody sequence generation, antibody sequence scoring, antibody structure scoring, and antibody result visualization into a single system.

The goal of this project is to support antibody research in more species by expanding the dataset and developing immunogenicity optimization algorithms, and to provide solutions for antibody sequence generation, sequence scoring, and structure scoring.

Our project starts from antibody sequences of other origins (such as mouse-derived antibodies) raised against an antigen relevant to the target species, automatically generates antibody sequences that may have better efficacy and lower immunogenicity, and scores them comprehensively. The core of our project comprises three components:

1. Collecting annotated antibody sequences from hundreds of different species and constructing a multi-species FR homology database.

2. Proposing a rational approach for scoring antibody immunogenicity that covers diverse species.

3. Delivering a comprehensive set of automated tools for designing antibodies for diverse target species, enabling multi-dimensional evaluation of the generated sequences.

Figure 1: Project overview


Antibodies are proteins produced by the immune system in response to foreign substances called antigens. Their primary function is to recognize and neutralize these antigens to protect the body from infections and diseases. However, antibodies obtained in one species (e.g., murine) cannot be used directly in another species, as they may trigger an immunogenic antidrug antibody (ADA) response. Therefore, to obtain antibodies for a specific species without conducting in vivo experiments, it is necessary to perform a process we call "species-specification": designing antibodies suitable for a given species based on existing antibodies from other species, such as mouse-derived antibodies. This process modifies the antibodies to reduce their immunogenicity in the new species.

In our early research, we learned that the humanization of murine antibodies has been under development since the 1980s and has now reached considerable biotechnological maturity. More recently (especially since 2023), AI technologies have enabled significant breakthroughs in antibody design, particularly in the highly variable CDR H3 region. In this context, transgenic mice, antibody humanization, and AI-driven de novo design have all proven effective for obtaining antibodies suitable for human use. In short, the field has come a long way in designing antibodies with high affinity and low immunogenicity for humans.

However, after talking with a biotech company, we realized that there are still plenty of gaps to fill in designing antibodies for species other than humans.

For example, antibody drugs are needed for pet dogs and cats as well as for animals such as pigs, cows, and sheep, which are of great importance to the livestock industry. Efforts to address this problem are therefore necessary, and our goal is to extend the progress made in antibody design to a broader range of species with the help of artificial intelligence.

Our Solution

During our discussions, the company offered us many insights and much advice about the field and the industry. Of the two technical roadmaps, AI de novo design and using AI methods to extend murine-derived antibodies to other species, we chose the latter, because the former has not yet been widely accepted by the industry and cannot yet achieve industrial-level reliability.

Antibodies are divided into constant and variable regions. Within the variable region are the hypervariable CDRs (where the antibody contacts the antigen) and the relatively conserved, species-specific FRs.

Figure 2 & 3: Antibody structure

Accordingly, our project is divided into two main parts.

The first part is collecting antibody variable region sequences from various species and using them to construct a homologous library of Framework Regions (FRs) and Complementarity Determining Regions (CDRs) for each species, providing a comprehensive and reusable resource for future research.

The second part focuses on mutation-based optimization of antibody immunogenicity to achieve our goal of antibody species-specification. That is, it centers on antibody FR + CDR design, drawing on achievements in the field of antibody humanization to avoid a potential immunogenic antidrug antibody (ADA) response when antibodies are applied to a new species.

In our project, we identify the CDRs and FRs of antibody sequences and guide the directed evolution of a given sequence with immunogenicity scores: each mutation changes the sequence's score, and the score determines which mutations are kept. In essence, our algorithm aims to adapt existing antibody humanization scoring tools to many other species. Subsequently, we also apply other scoring methods, including structure scoring (structural rationality scoring and affinity scoring), to a given antibody to evaluate it on multiple levels.

More Detail

In the first part, because PDB files are lacking for antibodies of many species, we first extended the dataset by using BLAST to search for homologous sequences and valuable frames, and then filtered, harmonized, and processed the sequences. We also completed annotations for the Framework Regions (FRs) and Complementarity-Determining Regions (CDRs) and constructed a multi-species FR homology database. This will facilitate the work of other researchers in related fields, making the development of antibody drugs for multiple species more accessible.
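As a rough illustration of how annotated sequences could be organized into a per-species FR library, here is a minimal Python sketch. It is not our actual pipeline: the function names, the toy sequence, and the assumption that an upstream annotation step supplies six region boundaries per sequence are all hypothetical.

```python
from collections import defaultdict

# Canonical order of regions in an antibody variable domain.
REGION_NAMES = ("FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4")


def split_regions(sequence, boundaries):
    """Split a variable-region sequence into its seven regions.

    `boundaries` holds six increasing cut indices (hypothetical output of
    an upstream annotation step); cut i separates region i from region i+1.
    """
    cuts = [0, *boundaries, len(sequence)]
    if len(boundaries) != 6 or any(a > b for a, b in zip(cuts, cuts[1:])):
        raise ValueError("expected six increasing cut indices inside the sequence")
    return {
        name: sequence[cuts[i]:cuts[i + 1]]
        for i, name in enumerate(REGION_NAMES)
    }


def build_fr_library(records):
    """Group framework regions by species.

    `records` is an iterable of (species, sequence, boundaries) tuples.
    Only FR1 to FR4 are stored; the CDRs are dropped in this sketch.
    """
    library = defaultdict(list)
    for species, sequence, boundaries in records:
        regions = split_regions(sequence, boundaries)
        frameworks = {n: s for n, s in regions.items() if n.startswith("FR")}
        library[species].append(frameworks)
    return dict(library)


# Toy example: letters stand in for residues, not a real antibody.
toy_records = [("dog", "AAABBCCCDDEEEFFGGG", [3, 5, 8, 10, 13, 15])]
print(build_fr_library(toy_records))
```

Keying the library by species keeps lookups simple when a template for a given target species is needed; a real database would also store provenance and the numbering scheme used for annotation.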

In the second part, we propose a strategy based on immunogenicity optimization, considering that far less antibody data is available for other species than for humans. Traditionally, antibody humanization relies on CDR grafting and back mutation, where the initial step is to select homologous sequences from homology libraries as templates for further design. However, this approach may not be applicable to data-limited species, because homology matching on a small number of samples may be unreliable. Therefore, in our project, we identify the CDRs and FRs of antibody sequences and guide the directed evolution of a given sequence with immunogenicity scores, keeping the mutations that improve the score.
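The score-guided mutation idea can be sketched as a simple greedy search in Python. This is only an illustration under stated assumptions, not our actual algorithm: `optimize_framework` and `make_toy_score` are hypothetical names, lower scores are assumed to mean lower immunogenicity, and the toy score (mismatches against a reference string) merely stands in for a real immunogenicity scoring tool.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def optimize_framework(sequence, mutable_positions, score_fn, max_rounds=50):
    """Greedy score-guided point mutation.

    In each round, every single substitution at the mutable (FR) positions
    is scored and the best one is kept if it lowers the score. Positions
    not listed (e.g. CDRs) are never changed.
    """
    current, current_score = sequence, score_fn(sequence)
    for _ in range(max_rounds):
        best, best_score = current, current_score
        for pos in mutable_positions:
            for aa in AMINO_ACIDS:
                if aa == current[pos]:
                    continue
                candidate = current[:pos] + aa + current[pos + 1:]
                candidate_score = score_fn(candidate)
                if candidate_score < best_score:
                    best, best_score = candidate, candidate_score
        if best_score >= current_score:
            break  # no single mutation improves the score any further
        current, current_score = best, best_score
    return current, current_score


def make_toy_score(reference):
    """Hypothetical stand-in score: number of mismatches to a reference."""
    def score(seq):
        return sum(a != b for a, b in zip(seq, reference))
    return score


# Toy run: positions 3 to 5 play the role of a CDR and stay fixed.
toy_score = make_toy_score("AAAWWWAAA")
optimized, final_score = optimize_framework(
    "ACDEFGHIK", mutable_positions=[0, 1, 2, 6, 7, 8], score_fn=toy_score
)
print(optimized, final_score)  # AAAEFGAAA 3
```

Restricting `mutable_positions` to framework residues reflects the intent of preserving the antigen-binding CDRs while the frameworks are adapted to the target species. A greedy search can stall in local optima, so a practical system might use a stochastic or population-based search instead.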

For more technical details, please see the corresponding page on the wiki.

Why AI

In this era of big data, many potential sequence rules can be mined from existing sequences. Owing to the potent fitting, learning, and predictive capabilities of AI algorithms, applying AI solutions to biotechnological problems offers numerous benefits, including faster research and discovery, personalization, enhanced data analysis, and better-informed antibody engineering decisions. AI algorithms can create a cross-species antibody design model by analyzing antibody sequences and structural data from different species. Such a model can extend antibody design to more species, including pet dogs, pet cats, and livestock animals. By analyzing vast amounts of antibody sequence and structural data, AI can identify crucial sequence and structural features, providing valuable guidance for antibody engineering decisions.

The integration of AI's automation technology enables efficient screening and discovery of new potential antibody candidates from extensive antibody libraries, significantly increasing the efficiency and success rate of antibody discovery. Furthermore, AI algorithms automate the simulation and computation of numerous antibody sequences to optimize their properties such as affinity, specificity, stability, and production efficiency. This contributes to enhancing the therapeutic effectiveness and production efficiency of antibodies.

Our AI approach brings several advancements in antibody species diversification. First, with computationally guided directed evolution and rational analysis of antibody sequences, our project can markedly reduce the material resources and time consumed by wet-lab experimental validation. Second, we gather large-scale data on FRs and CDRs from diverse species into an organized homologous library, which makes further antibody-related research more convenient. Last but not least, in terms of drug development, by generating comprehensive knowledge of given antibody sequences and providing antibodies suitable for multiple species, our algorithm can expedite the drug and treatment development process and open up new possibilities for disease therapies.