e-learning

A Docker-based interactive Jupyterlab powered by GPU for artificial intelligence in Galaxy

Abstract

JupyterLab is a popular integrated development environment (IDE) for a variety of data science tasks such as prototyping analyses, creating meaningful plots, and manipulating and preprocessing data. Python is one of the most used languages in such an environment. Given the usefulness of JupyterLab, particularly on online platforms, a robust JupyterLab notebook application has been developed that is powered by GPU acceleration and contains numerous packages such as pandas, NumPy, SciPy, scikit-learn, TensorFlow and ONNX to support modern data science projects. It has been developed as an interactive Galaxy tool that runs in an isolated Docker container. The container is built on the nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu20.04 base image. Moreover, a Galaxy tool (run_jupyter_job) can be executed using BioBlend, which uses Galaxy's remote job handling for long-running machine learning and deep learning training. The training happens remotely on a Galaxy cluster, and the resulting datasets, such as trained models and tabular files, are saved in a Galaxy history for further use.
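As a rough illustration of the remote-job idea described in the abstract, the sketch below shows how a Galaxy tool could be launched from a notebook with BioBlend. It is a minimal sketch under stated assumptions, not the tutorial's actual code: the input name `input_notebook`, the helper names `build_tool_inputs` and `submit_training_job`, and the default history name are hypothetical, and the tool ID, server URL and API key must come from your own Galaxy instance. Only `GalaxyInstance`, `histories.create_history` and `tools.run_tool` are BioBlend calls.

```python
from typing import Any, Dict


def build_tool_inputs(notebook_dataset_id: str) -> Dict[str, Any]:
    """Assemble the input payload for a remote training job.

    The key "input_notebook" is a hypothetical parameter name; check the
    tool form in your Galaxy instance for the real input names.
    """
    # "hda" marks the value as a dataset that already lives in a history.
    return {"input_notebook": {"src": "hda", "id": notebook_dataset_id}}


def submit_training_job(
    galaxy_url: str,
    api_key: str,
    tool_id: str,
    notebook_dataset_id: str,
    history_name: str = "jupyterlab remote training",  # hypothetical default
) -> Dict[str, Any]:
    """Create a history and run the given tool in it via BioBlend."""
    # Imported lazily so the payload helper works without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url=galaxy_url, key=api_key)
    history = gi.histories.create_history(name=history_name)
    # Galaxy schedules the job on its cluster; outputs land in the history.
    return gi.tools.run_tool(
        history_id=history["id"],
        tool_id=tool_id,
        tool_inputs=build_tool_inputs(notebook_dataset_id),
    )
```

Separating the payload construction from the network call keeps the part that depends on the tool's real parameter names in one small, easily adjusted function.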

About This Material

This is a Hands-on Tutorial from the GTN which is usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • How to use JupyterLab and its several features?
  • How to use it for creating input datasets and writing artificial intelligence (AI) algorithms?

Learning Objectives

  • Learn to use JupyterLab - an online Python editor designed for developing AI algorithms
  • Explore several of its features, such as Git, Jupyter notebook workflows and integration with Galaxy
  • Develop AI algorithms using TensorFlow
  • Send long-running jobs to Galaxy's cluster and save results in its history
  • Reproduce results from recent scientific publications - COVID CT scan segmentation and 3D protein structure prediction

Licence: Creative Commons Attribution 4.0 International

Keywords: Statistics and machine learning, deep-learning, image-segmentation, interactive-tools, jupyter-lab, machine-learning, protein-3D-structure

Target audience: Students

Resource type: e-learning

Version: 8

Status: Active

Prerequisites:

  • Introduction to Galaxy Analyses
  • Introduction to deep learning
  • JupyterLab in Galaxy

Date modified: 2024-07-09

Date published: 2022-04-06

Authors: Anup Kumar

Contributors: Anup Kumar, Björn Grüning, Helena Rasche, Kaivan Kamali, Saskia Hiltemann

Scientific topics: Statistics and probability

