Proteogenomics 1: Database Creation

Abstract

Proteogenomics matches mass spectrometry (MS) based proteomics data against genomics and transcriptomics data to identify peptides and to provide protein-level evidence of gene expression. In the first section of the tutorial, we create a protein database (FASTA) from RNA-sequencing files (FASTQ). We then search the MS data against the resulting FASTA file to identify peptides corresponding to novel proteoforms. Finally, we assign genomic coordinates and annotations to the identified peptides and visualize the data to assess spectral quality and genomic localization.
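The tutorial builds its database with Galaxy tools. As a rough illustration of the underlying idea only, the sketch below merges several FASTA sources (known proteins, variant sequences, splice-junction sequences) into one non-redundant database. The function names and the toy sequences are hypothetical and are not taken from the tutorial.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_databases(sources: Iterable[str]) -> Dict[str, str]:
    """Merge FASTA texts into a sequence -> header mapping.

    Sources are read in order, and the first header seen for each unique
    sequence is kept, so earlier sources take priority.
    """
    merged: Dict[str, str] = {}
    for text in sources:
        for header, seq in parse_fasta(text):
            merged.setdefault(seq.upper(), header)
    return merged


def to_fasta(merged: Dict[str, str], width: int = 60) -> str:
    """Serialize the merged mapping back to FASTA text."""
    lines: List[str] = []
    for seq, header in merged.items():
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy, made-up inputs standing in for the three kinds of sequence sources.
known = ">ref|P1 known protein\nMKTAYIAKQR\n"
variant = ">var|P1_K2R single amino acid variant\nMRTAYIAKQR\n"
splice = ">sj|J1 junction peptide\nMKTAYIAKQR\n>sj|J2 junction peptide\nMKTAGGSQR\n"

merged = merge_databases([known, variant, splice])
print(to_fasta(merged))
```

Listing the known proteins first means a sequence that also appears in a variant or junction source keeps its reference header, which is one simple way to avoid reporting a known protein as novel.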

About This Material

This is a hands-on tutorial from the Galaxy Training Network (GTN), usable either for individual self-study or as teaching material in a classroom.

Questions this will address

How do you create a customized protein database from RNA-seq data?

Learning Objectives

Generating a customized protein sequence database with variants, indels, splice junctions and known sequences.

Licence: Creative Commons Attribution 4.0 International

Keywords: Proteomics, proteogenomics

Target audience: Students

Resource type: e-learning

Version: 29

Status: Active

Prerequisites:

Introduction to Galaxy Analyses

Date modified: 2025-05-08

Date published: 2018-11-20

Authors: James Johnson, Pratik Jagtap, Praveen Kumar, Ray Sajulga, Subina Mehta, Timothy J. Griffin

Contributors: Anthony Bretaudeau, Björn Grüning, Helena Rasche, James Johnson, Melanie Föll, Nicola Soranzo, Saskia Hiltemann, Subina Mehta

Scientific topics: Proteomics
