Date: No date given

This Learning Pathway collects the results of Intellectual Output 2 in the Gallantries Project

Keywords: beginner, data-science, galaxy-interface, microbiome, variant-analysis, visualisation

Learning objectives:

Analyze and preprocess Nanopore reads
Apply ML techniques to analyse your own datasets
Be able to write simple shell scripts for running multiple workflows concurrently or sequentially.
Build complex and customized plots from data in a data frame.
Check quality reports generated by FastQC and NanoPlot for metagenomics Nanopore data
Create a number of Circos plots using the Galaxy tool
Create clean, non-repetitive workflows
Describe what faceting is and apply faceting in ggplot.
Familiarise yourself with the various different track types
Filter, annotate and report lists of variants
Identify pathogens based on the found virulence factor gene products via assembly, identify strains and indicate all antimicrobial resistance genes in samples
Identify pathogens via SNP calling and build the consensus genome of the samples
Identify yeast species contained in a sequenced beer sample using DNA
Inspect metagenomics data
Interpret and visualize the results obtained from ML analyses on omics datasets
Learn about SRA aligned read format and vcf files for Runs containing SARS-CoV-2 content
Learn about the Rule Based Uploader
Learn even more about the Rule Based Uploader
Learn how to change a workflow using the workflow editor
Learn how to extract a workflow from a Galaxy history
Learn how to use Pangolin to assign annotated variants to lineages.
Learn how to use Workflow Parameters to improve your Workflows
Learn to use the planemo run subcommand to run workflows from the command line.
Modify the aesthetics of an existing ggplot plot (including axis labels and color).
Perform taxonomic profiling of the samples and visualize the results down to species level
Perform variant linkage analyses for phenotypically selected recombinant progeny
Plot an E. coli genome in Galaxy
Preprocess the sequencing data to remove adapters, poor quality base content and host/contaminating reads
Produce scatter plots, boxplots, and time series plots using ggplot.
Relate all samples' pathogenic genes for tracking pathogens via phylogenetic trees and heatmaps
Run metagenomics tools
Set universal plot settings.
Understand and master dataset collections
Understand the differences between categories of ML algorithms and the kinds of problems to which each can be applied
Understand different applications of ML in different -omics studies
Understand how to search the metadata for these Runs to find your dataset of interest and then import that data in your preferred format
Understand key aspects of workflows
Understand the ML taxonomy and the commonly used machine learning algorithms for analysing -omics data
Use Kraken2 to assign taxonomic labels
Use Nanopore data for studying soil metagenomics
Use joint variant calling and extraction to facilitate variant comparison across samples
Use some basic, widely used R packages for ML
Use the scientific library matplotlib to explore tabular datasets
Visualize the microbiome community of a beer sample
Add tracks for the annotations, sequencing data, and variants to a genome plot

Event types:

Workshops and courses

Sponsors: ELIXIR Europe, University of Freiburg, de.NBI

Scientific topics: Metagenomics, Microbial ecology, Taxonomy, Sequence analysis, Metabarcoding, Public health and epidemiology, Sequence assembly, Pathology

Content provider

Learning Pathway Gallantries Grant - Intellectual Output 2 - Large-scale data analysis, and introduction to visualisation and data modelling