LLM generated summaries for protein classification at InterPro
Date: 19 March 2025 @ 09:00 - 17:00
This webinar will explore how Large Language Models (LLMs) can accelerate protein classification by automatically generating descriptive annotations for previously unannotated protein families. Traditionally, curating protein family descriptions relies on manual literature review and expert knowledge, a time-consuming approach that often delays integration into biological databases. In this session, we will discuss our workflow, which uses LLMs to synthesise functional summaries from existing curated data and thereby streamlines the annotation process. We will also present a comparative evaluation of a state-of-the-art GPT model and a fine-tuned local model, demonstrating that smaller, cost-effective LLMs can produce high-quality descriptions that support rapid protein classification.
Organizer: European Bioinformatics Institute (EBI)
Event types:
- Workshops and courses
Scientific topics: Machine learning