Machine Learning - Unsupervised Learning

This article explains what unsupervised learning is and introduces its main types.

What is Unsupervised Learning?

Unsupervised learning is a type of machine learning that works on unlabeled data. Unlike supervised learning, where every example comes with a label, the data contains only input variables (x) and no corresponding output labels (y). The algorithm is therefore not guided by a predefined outcome. Instead, it is tasked with discovering the inherent structure of the data, that is, with finding interesting and insightful patterns in it.

Types of Unsupervised Learning

Three commonly used types of unsupervised learning are:

Clustering

Clustering groups similar data points together in order to identify inherent groupings within the data. For instance, Google News uses clustering to group related news articles from different sources; DNA microarray analysis uses it to group genes with similar expression patterns; and businesses use it for customer segmentation, grouping customers with similar buying behavior.
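To make the idea of grouping similar points concrete, here is a minimal sketch of k-means, one common clustering algorithm among many. The article does not name a specific algorithm, so the choice of k-means is an assumption, as are the function name `kmeans`, the toy customer data, and the decision to seed the centroids with the first k points (real implementations usually pick starting points at random).

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    # Assumption: seed centroids with the first k points so the run is
    # reproducible; production code normally uses random or k-means++ seeding.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
              for p in points]
    return centroids, labels

# Hypothetical customers: (store visits per month, average basket size).
customers = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8),
             (8.0, 9.0), (1.2, 2.2), (8.5, 8.5)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

No labels are supplied anywhere: the two customer segments emerge purely from the distances between points, which is what distinguishes this from a supervised classifier.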

Anomaly Detection

Anomaly detection identifies unusual data points in a dataset: outliers that deviate significantly from the rest of the data. It is particularly useful in fraud detection, network security, and fault detection.
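As an illustration of flagging points that deviate significantly from the rest, the sketch below scores each value by its distance from the median, scaled by the median absolute deviation (a "modified z-score"). This is only one simple approach and is not taken from the article; the function name `find_anomalies`, the sample readings, and the conventional cutoff of 3.5 are all assumptions.

```python
import statistics

def find_anomalies(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that a single extreme
    # value cannot inflate, unlike the standard deviation.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no measurable spread, so nothing can be scored
    anomalies = []
    for i, v in enumerate(values):
        score = 0.6745 * abs(v - median) / mad
        if score > threshold:
            anomalies.append((i, v))
    return anomalies

# Hypothetical sensor readings with one obviously faulty value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)  # [(6, 25.0)]
```

A median-based score is used here because, on a sample this small, the outlier itself inflates the mean and standard deviation enough that a plain z-score with a cutoff of 3 would not flag it.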

Dimensionality Reduction

Dimensionality reduction compresses data by reducing the number of variables (features) under consideration to a smaller set of essential ones. Besides compressing the data, this can improve the performance of machine learning models by removing noise and redundant information, which makes the learning process more efficient.
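One widely used way to do this is principal component analysis (PCA), which the article does not mention by name; it is used here as an assumed example. The sketch finds the single direction of greatest variance with power iteration and replaces each two-feature row with one number along that direction. The helper names `first_principal_component` and `project`, and the sample data, are hypothetical.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (feature means, unit vector along the direction of most variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Sample covariance matrix (d x d) of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly applying the covariance matrix pulls the
    # vector toward the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # constant data: there is no direction of variance
        v = [x / norm for x in w]
    return means, v

def project(rows, means, component):
    """Compress each row to a single coordinate along the component."""
    return [sum((x - m) * c for x, m, c in zip(row, means, component))
            for row in rows]

# Hypothetical data: the second feature is roughly twice the first,
# so one variable carries almost all of the information.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0), (5.0, 10.1)]
means, component = first_principal_component(rows)
scores = project(rows, means, component)
print(component)  # points roughly along the direction (1, 2)
print(scores)     # one number per row instead of two
```

Because the two features are nearly redundant, the one-dimensional scores preserve the ordering and spacing of the original rows while halving the number of variables.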