Neoantigen 2: Non-normal-Database-Generation

Abstract

Proteogenomics leverages mass spectrometry (MS)-based proteomics data alongside genomics and transcriptomics data to identify neoantigens—unique peptide sequences arising from tumor-specific mutations. In the initial section of this tutorial, we will construct a customized protein database (FASTA) using RNA-sequencing files (FASTQ) derived from tumor samples. Following this, we will conduct sequence database searches using the resultant FASTA file and MS data to identify peptides corresponding to novel proteoforms, specifically focusing on potential neoantigens. We will then assign genomic coordinates and annotations to these identified peptides and visualize the data, assessing both spectral quality and genomic localization.

About This Material

This is a hands-on tutorial from the GTN, usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • Why must we generate a customized fusion database for proteogenomics research?

Learning Objectives

  • Downloading databases related to 16S rRNA data
  • Obtaining better neoantigen identification results

Licence: Creative Commons Attribution 4.0 International

Keywords: Proteomics, label-free

Target audience: Students

Resource type: e-learning

Version: 1

Status: Active

Prerequisites:

  • Introduction to Galaxy Analyses
  • Proteomics

Date modified: 2025-01-14

Date published: 2025-01-14

Authors: James Johnson, Katherine Do, Subina Mehta

Contributors: Pratik Jagtap, Timothy J. Griffin

Scientific topics: Proteomics

