How Machine Learning Models Work

Want to find out a little more about what machine learning is and which machine learning models are most commonly used? It is worth highlighting that there is a vast range of models and sub-categories of machine learning, involving layers of complexity that can be difficult to understand as a beginner. As a result, the models are discussed here at an overview level. In this guide, we go over the most frequently used models and how they function.

Two main categories of machine learning models

First of all, it is important to establish that the machine learning models covered in this guide fall under one of two main categories: supervised learning and unsupervised learning.

Supervised learning

Supervised learning is a type of machine learning in which the system learns a function that maps an input to an output. The function is learned from the example input-output pairs the computer has been given.

Within supervised learning, there are two further types of model. These are:

Regression models

Regression models in machine learning are used for predicting a continuous value, so the output is continuous. In practice, for example, a regression model could be used to predict house values.

Machine learning is a bit like a Russian doll, as there are many different models and sub-categories. For example, regression models also have a number of types:

Simple linear regression: assumes a linear relationship between the target variable and a single predictor
Polynomial regression: involves applying linear regression to polynomial features of a given degree
Decision tree regression: decision trees can be used for both regression and classification (which we will get to in the next section); each level of the tree requires the identification of a splitting attribute
Neural network regression: a model that is multi-layered by nature and takes its name from the networks of neurons in the human brain. Each node in the hidden layers applies a function to the inputs it receives, and together the layers produce an output
Random forest regression: combines the predictions of a number of decision regression trees, typically by averaging them
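
To make the simplest item in the list above more concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares, using only the Python standard library. The function names and the house-price figures are hypothetical and chosen purely for illustration; a real project would more likely rely on an established library.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a continuous value for a single input."""
    return slope * x + intercept


# Hypothetical training data: floor area (square metres) and price (thousands).
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_simple_linear_regression(areas, prices)
estimated_price = predict(slope, intercept, 100)
print(f"Estimated price for 100 square metres: {estimated_price:.1f}")
```

The model here is nothing more than a slope and an intercept learned from example input-output pairs, which is the "function mapping an input to an output" described under supervised learning.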

Classification models

Classification models have an output that is discrete, and therefore they predict categorical responses, such as whether a tumour is benign or cancerous, or whether an email is spam or not.

Classification also has a number of different sub-category models. Some of these are:

Logistic regression: works in a similar way to linear regression, but is used primarily to calculate the probability of outcomes (typically just two)
Support vector machine: used for both classification and regression. The goal of this algorithm is to identify a hyperplane in an N-dimensional space that separates the data points into classes
Decision tree, neural network and random forest models are all used here too, as in the previously mentioned regression models. The main difference is that the output is not continuous but discrete
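
As a rough illustration of how logistic regression turns a linear score into a probability, the sketch below trains a one-feature model with plain gradient descent. Everything in it is an assumption made for the example: the function names, the learning rate, the number of epochs, and the tiny dataset, whose feature is taken to be already centred around zero.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(xs, labels, learning_rate=0.1, epochs=2000):
    """Fit weight and bias by gradient descent on the average log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, labels):
            # Difference between predicted probability and true label (0 or 1).
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def predict_probability(weight, bias, x):
    """Probability that input x belongs to class 1."""
    return sigmoid(weight * x + bias)


# Hypothetical, already-centred feature; label 1 goes with positive values.
features = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
labels = [0, 0, 0, 1, 1, 1]

weight, bias = fit_logistic_regression(features, labels)
probability = predict_probability(weight, bias, 2.0)
predicted_class = 1 if probability >= 0.5 else 0
print(f"P(class 1 | x = 2.0) = {probability:.3f}, predicted class = {predicted_class}")
```

The final thresholding step is what makes the output discrete: the model itself produces a probability, and the class label comes from comparing that probability with 0.5.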

Unsupervised learning

Unsupervised learning is primarily used to identify patterns in, and draw inferences from, the input data. This is all done without any reference to labelled outcomes. There are two main types of unsupervised learning:

Clustering models

This is a machine learning technique (including k-means clustering and hierarchical clustering) that involves grouping similar data points together, and it is used in a variety of circumstances. For example, it is commonly used for fraud detection, document classification and customer segmentation.
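
For a feel of how k-means clustering groups points without any labels, here is a small sketch in plain Python. It is a simplified version under stated assumptions: the function names are hypothetical, the data is made up, and the centroids are seeded with the first k points instead of the random initialisation that is more common in practice.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=10):
    """Return (centroids, assignments) after a fixed number of iterations."""
    # Simplification: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignments


# Hypothetical 2D data with two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]

centroids, assignments = k_means(points, k=2)
print("Cluster assignments:", assignments)
print("Centroids:", centroids)
```

Note that the algorithm never sees a label: the cluster numbers it returns are arbitrary identifiers, and deciding what each cluster means (for example, a customer segment) is left to whoever interprets the result.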

Dimensionality reduction models

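This unsupervised learning technique involves reducing the number of features in a dataset. There are a variety of techniques that can be used, but most fall into one of two categories: feature extraction, which builds a smaller set of new features from combinations of the original ones, or feature elimination, which simply drops some of the original features.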
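
One simple form of feature elimination is to drop features that barely vary across the dataset, since they carry little information for telling samples apart. The sketch below shows that idea in plain Python; the function names, the threshold value and the data are all hypothetical, and this is only one of many possible elimination criteria.

```python
def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def drop_low_variance_features(rows, threshold=0.01):
    """Keep only the columns whose variance exceeds the threshold.

    Returns the reduced rows and the indices of the columns that were kept.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    kept = [i for i, col in enumerate(columns) if variance(col) > threshold]
    reduced = [[row[i] for i in kept] for row in rows]
    return reduced, kept


# Hypothetical dataset: the first feature is constant and adds no information.
rows = [
    [1.0, 5.0, 0.2],
    [1.0, 7.0, 0.9],
    [1.0, 6.0, 0.4],
    [1.0, 8.0, 0.7],
]

reduced, kept = drop_low_variance_features(rows)
print("Kept feature indices:", kept)
print("Reduced data:", reduced)
```

Feature extraction works differently: instead of discarding columns, it computes new ones from the originals, so it would not fit into a filter of this shape.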