Clustering in Machine Learning
Abstract
The goal of unsupervised learning is to discover hidden patterns in unlabeled data, and clustering is one approach to it. In this tutorial, we discuss clustering, its types, and a few algorithms for finding clusters in data. Clustering groups data points based on their similarity: each group, called a cluster, contains data points that are highly similar to one another and only weakly similar to data points in other clusters. The goal is to divide a set of data points so that similar items fall into the same cluster and dissimilar items fall into different clusters. Later in the tutorial, we discuss how to choose a similarity metric between data points and how different clustering algorithms use it.
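As a rough illustration of the idea that points in a cluster are more similar to each other than to points in other clusters, the following minimal Python sketch compares average distances within and between two groups. The toy points, the group assignment, and the helper names `mean_within` and `mean_between` are assumptions made for this example and are not taken from the tutorial; Euclidean distance is used here as one possible similarity metric among many.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical toy data: two hand-assigned groups of 2-D points.
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
cluster_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]


def mean_within(points):
    """Average Euclidean distance over all pairs of points inside one group."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_between(points_1, points_2):
    """Average Euclidean distance over all pairs drawn from two different groups."""
    pairs = list(product(points_1, points_2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


within_a = mean_within(cluster_a)
within_b = mean_within(cluster_b)
between = mean_between(cluster_a, cluster_b)

print(f"mean distance within A:  {within_a:.2f}")
print(f"mean distance within B:  {within_b:.2f}")
print(f"mean distance between A and B: {between:.2f}")
```

For a grouping that reflects real structure in the data, the within-group distances should come out clearly smaller than the between-group distance. A clustering algorithm has to find such a grouping on its own, without being given the assignment.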
About This Material
This is a hands-on tutorial from the Galaxy Training Network (GTN). It can be used for individual self-study or as teaching material in a classroom.
Questions this will address
- How can clustering algorithms be used to group data into different clusters?
Learning Objectives
- Learn clustering background
- Learn hierarchical clustering algorithm
- Learn k-means clustering algorithm
- Learn DBSCAN clustering algorithm
- Apply clustering algorithms to different datasets
- Learn how to visualize clusters
Licence: Creative Commons Attribution 4.0 International
Keywords: Statistics and machine learning
Target audience: Students
Resource type: e-learning
Version: 13
Status: Active
Prerequisites:
Introduction to Galaxy Analyses
Date modified: 2024-01-15
Date published: 2020-05-08
Contributors: Alireza Khanteymoori, Anup Kumar, Björn Grüning, Mélanie Petera, Saskia Hiltemann
Scientific topics: Statistics and probability