e-learning

Text-mining with the SimText toolset

Abstract

Exploring the PubMed literature for a large number of biomedical entities (e.g., genes, diseases, or experiments) can be time-consuming and challenging, especially when assessing associations between entities. Here, we use SimText, a toolset for literature research that lets you collect text from PubMed for any given set of biomedical entities, extract associated terms, and analyze the similarities among the entities and their key characteristics in an interactive tool.

About This Material

This is a hands-on tutorial from the GTN that can be used for individual self-study or as teaching material in a classroom.

Questions this will address

  • How can I automatically collect PubMed data for a set of biomedical entities such as genes?
  • How can I analyze similarities among biomedical entities at large scale based on PubMed data?

Learning Objectives

  • Learn how to use the SimText toolset
  • Upload a table with biomedical entities in Galaxy
  • Retrieve PubMed data for each of the biomedical entities
  • Extract biomedical terms from the PubMed data for each biomedical entity
  • Analyze the similarity among the biomedical entities based on the extracted data in an interactive app
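
To make the last objective concrete, here is a minimal Python sketch of one way to compare entities by the terms extracted for them. It is an illustration only: the entity names and term sets are made up, and Jaccard similarity is an assumed stand-in measure, not necessarily what the SimText tools compute.

```python
from itertools import combinations

# Hypothetical extracted terms per entity (illustrative values, not real PubMed results).
entity_terms = {
    "GENE_A": {"epilepsy", "seizure", "sodium channel"},
    "GENE_B": {"epilepsy", "seizure", "potassium channel"},
    "GENE_C": {"cardiomyopathy", "arrhythmia"},
}


def jaccard(a, b):
    """Jaccard similarity of two term sets: shared terms divided by all terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(terms_by_entity):
    """Return a dict mapping each sorted entity pair to its Jaccard similarity."""
    return {
        (x, y): jaccard(terms_by_entity[x], terms_by_entity[y])
        for x, y in combinations(sorted(terms_by_entity), 2)
    }


similarities = pairwise_similarity(entity_terms)
for pair, score in similarities.items():
    print(pair, round(score, 2))
```

Entities that share many terms score close to 1, and entities with no terms in common score 0. A similarity table of this kind is the sort of input that clustering or an interactive view could then be built on.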

Licence: Creative Commons Attribution 4.0 International

Keywords: Statistics and machine learning, interactive-tools

Target audience: Students

Resource type: e-learning

Version: 7

Status: Active

Prerequisites:

Introduction to Galaxy Analyses

Date modified: 2024-03-05

Date published: 2021-04-05

Authors: Daniel Blankenberg, Dennis Lal group, Marie Gramm

Contributors: Björn Grüning, Helena Rasche, Martin Čech, Saskia Hiltemann, dlal-group

Scientific topics: Statistics and probability

