Data Manipulation Olympics - SQL

Abstract

Scientific analyses often chain a number of tools, run one after the other, to get from raw data to scientific insight. Between these specialized tools, simple data manipulation steps are often needed as a kind of "glue". For example, tool A may produce a file that contains all the information tool B needs as input, but tool B expects the columns in a different order. Or, in genomic data analysis, some tools expect chromosome X to be listed as chrX, while others simply expect X. In these situations, extra data manipulation steps are needed to prepare files for input to analysis tools.
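As a minimal sketch of the two "glue" steps described above (reordering columns and rewriting chrX as X), the following runs a SQL query against an in-memory SQLite database from Python. The table name `tool_a_output`, its columns, and the sample rows are hypothetical and chosen only for illustration; they are not taken from the tutorial's datasets.

```python
import sqlite3

# In-memory database standing in for the output of a hypothetical "tool A".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tool_a_output "
    "(chrom TEXT, start_pos INTEGER, end_pos INTEGER, gene TEXT)"
)
conn.executemany(
    "INSERT INTO tool_a_output VALUES (?, ?, ?, ?)",
    [
        ("chrX", 100, 200, "geneA"),
        ("chr1", 50, 75, "geneB"),
        ("chrX", 300, 450, "geneC"),
    ],
)

# Reorder the columns (gene first) and strip a leading "chr" prefix,
# as a hypothetical "tool B" might require.
query = """
SELECT gene,
       CASE WHEN chrom LIKE 'chr%' THEN substr(chrom, 4) ELSE chrom END
           AS chrom_short,
       start_pos,
       end_pos
FROM tool_a_output
ORDER BY chrom_short, start_pos
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

The `CASE` expression leaves values without the prefix untouched, so the same query could be applied to input that already uses the short naming.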

About This Material

This is a hands-on tutorial from the GTN (Galaxy Training Network), usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • How can I do basic data manipulation in SQL?
  • Which functions are available to convert, reformat, filter, sort, etc. my data stored in a database?

Learning Objectives

  • Familiarize yourself with data manipulation in SQL
  • Perform basic SQL query tasks in Galaxy
  • Reason about the expected outcome of tools

Licence: Creative Commons Attribution 4.0 International

Keywords: Foundations of Data Science, cyoa, jupyter-notebook, sql

Target audience: Students

Resource type: e-learning

Version: 4

Status: Active

Date modified: 2024-05-07

Date published: 2023-06-16

Authors: Helena Rasche, Saskia Hiltemann

Scientific topics: Software engineering

