Regulations/standards for AI using DOME

Abstract

With the significant drop in the cost of many high-throughput technologies, vast amounts of biological data are being generated and made available to researchers. Machine learning (ML) has emerged as a powerful tool for analyzing data related to cellular processes, genomics, proteomics, post-translational modifications, metabolism, and drug discovery, offering the potential for transformative medical advancements.

About This Material

This is a hands-on tutorial from the Galaxy Training Network (GTN), usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • How should data provenance be documented to ensure transparency in AI research?
  • What strategies can be employed to manage redundancy between training and test datasets in biological research?
  • Why is it important to make datasets and model configurations publicly available, and how can this be achieved?
  • What are the key considerations in selecting and documenting optimization algorithms and parameters for AI models?
  • How can the interpretability of AI models be enhanced, and why is this crucial in fields like drug design and diagnostics?

Learning Objectives

  • Explain the importance of data provenance and dataset splits in ensuring the integrity and reproducibility of AI research.
  • Develop a comprehensive plan for documenting and sharing AI model configurations, datasets, and evaluation results to enhance transparency and reproducibility in their research.

Licence: Creative Commons Attribution 4.0 International

Keywords: Statistics and machine learning, ai-ml, elixir

Target audience: Students

Resource type: e-learning

Version: 3

Status: Active

Prerequisites:

Introduction to Galaxy Analyses

Date modified: 2025-05-19

Date published: 2025-03-11

Authors: Fotis E. Psomopoulos, Stella Fragkouli

Contributors: Anup Kumar, Bérénice Batut, Saskia Hiltemann, Stella Fragkouli

Scientific topics: Statistics and probability
