Using the VGP workflows to assemble a vertebrate genome with HiFi and Hi-C data

Abstract

The Vertebrate Genomes Project (VGP), a project of the Genome 10K (G10K) Consortium, aims to generate high-quality, near error-free, gap-free, chromosome-level, haplotype-phased, annotated reference genome assemblies for every vertebrate species. The VGP has developed a fully automated de novo genome assembly pipeline that combines three technologies: PacBio high-fidelity (HiFi) reads, all-versus-all chromatin conformation capture (Hi-C) data, and, optionally, Bionano optical map data. The pipeline consists of nine distinct workflows. This tutorial gives a quick example of how to run these workflows for the scenario that is, in our experience, the most common: assembling a genome from HiFi reads combined with Hi-C data, both generated from the same individual.

About This Material

This is a hands-on tutorial from the Galaxy Training Network (GTN), usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • What combination of tools can produce the highest quality assembly of vertebrate genomes?
  • How can we evaluate the quality of the resulting assembly?

Learning Objectives

  • Learn the tools necessary to perform a de novo assembly of a vertebrate genome
  • Evaluate the quality of the assembly

Licence: Creative Commons Attribution 4.0 International

Keywords: Assembly, VGP, eukaryote, pacbio

Target audience: Students

Resource type: e-learning

Version: 23

Status: Active

Prerequisites:

  • Introduction to Galaxy Analyses
  • Quality Control

Date modified: 2024-10-08

Date published: 2022-04-06

Authors: Alex Ostrovsky, Anton Nekrutenko, Brandon Pickett, Cristóbal Gallardo, Delphine Lariviere, Linelle Abueg

Scientific topics: Sequence assembly
