Machine Learning Glossary

A glossary of machine learning terms, frameworks, tasks and technologies.
Created on October 20 | Last edited on November 22
This glossary is a work in progress. It is limited at present, but is being actively worked on.
A
B
Batch Normalization: Batch normalization is a technique that standardizes (normalizes) the inputs to layers deep in a neural network, using the mean and variance of each mini-batch, to reduce internal covariate shift.
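As a rough illustration of the standardization step in the batch normalization entry, the sketch below normalizes a batch of scalar activations to zero mean and unit variance. The function name `batch_normalize` and the `eps` value are assumptions made for this example, and the learnable scale and shift parameters used in practice are left out.

```python
import math

def batch_normalize(values, eps=1e-5):
    """Standardize a batch of scalar activations (hypothetical helper)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # eps guards against division by zero when the batch has no variance.
    return [(v - mean) / math.sqrt(variance + eps) for v in values]

normalized = batch_normalize([2.0, 4.0, 6.0, 8.0])
```

After this step the batch has a mean of roughly 0 and a variance of roughly 1, whatever the scale of the original activations.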
BERT: Bidirectional Encoder Representations from Transformers, better known as BERT, is a language model introduced by Google in 2018. It raised state-of-the-art performance on a range of NLP tasks and became the stepping stone for many later architectures.
﻿
C
Catastrophic Forgetting: The tendency of neural networks to abruptly lose the ability to perform one task when they are subsequently trained on another.
Classification: Classification is a supervised learning task concerned with predicting which category a data point belongs to. A model learns from labeled examples and then assigns new data to one of the known categories. Classification algorithms are especially useful in cases where there is a large volume of historical data to be categorized.
Cross Entropy Loss: Cross entropy loss measures how well the class probabilities predicted by a classification model match the true labels. The loss is a non-negative number with no upper bound: 0 corresponds to a perfect model, and confident wrong predictions are penalized heavily. The goal is generally to get the loss as close to 0 as possible.
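A minimal sketch of cross entropy for a single example may make the range of the loss concrete. The function name `cross_entropy` and the example probabilities are assumptions for illustration; real frameworks average this quantity over a batch and usually work from logits.

```python
import math

def cross_entropy(predicted_probs, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(predicted_probs[true_class])

confident_and_right = cross_entropy([0.1, 0.9], true_class=1)
confident_and_wrong = cross_entropy([0.9, 0.1], true_class=1)
```

The first value is close to 0, while the second is greater than 1, which shows that the loss is not confined to the interval between 0 and 1.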
Cross-Validation: See K-Fold Cross-Validation.
D
DataLoader: The DataLoader in PyTorch is a class that fetches data from a Dataset and serves it to the model in batches, optionally shuffling it. Generally, one DataLoader is created for training and another for testing.
Dropout: Dropout is a machine learning technique where you randomly remove (or "drop out") units in a neural net during training, which simulates training large numbers of architectures simultaneously. Importantly, dropout can drastically reduce the chance of overfitting during training.
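The sketch below shows one common way to apply dropout to a list of activations, often called inverted dropout, where the surviving units are rescaled so the expected sum stays the same. The function name `dropout`, the `drop_prob` parameter, and the optional `rng` argument are assumptions made for this example.

```python
import random

def dropout(activations, drop_prob=0.5, rng=None):
    """Zero each unit with probability drop_prob and rescale the rest."""
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]

dropped = dropout([1.0] * 8, drop_prob=0.5, rng=random.Random(0))
```

Each output is either 0.0 (the unit was dropped) or 2.0 (the unit was kept and rescaled). At test time dropout is normally switched off, which corresponds to `drop_prob=0.0` here.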
Decision trees: Decision trees are supervised learning models used for classification and regression problems. A decision tree learns rules that “branch” off into different predictions based on the features of a data point, in order to predict some value of a new data point.
﻿
E
F
FinBERT: FinBERT is a BERT-based language model adapted to financial text. It is used mainly for sentiment analysis of financial documents such as news and reports.
﻿
G
H
I
Image Classification: In image classification, a system assigns a label to an entire image. In the simplest (binary) case, it produces a Boolean true or false, answering whether a particular image belongs to a certain class or not.
Image Segmentation: In image segmentation, the system assigns a label to every pixel in an image. A class is assigned to each pixel, defining what the system believes it to be and which object it belongs to.
Figure 1.2: Image segmentation classifying each pixel in an image
J
K
K-Fold Cross-Validation: K-fold cross-validation is a procedure in which a dataset is divided into k subsets (folds). The model is trained k times, each time holding out a different fold for validation and training on the remaining folds. This helps safeguard the model against the random bias caused by relying on only one training and validation set.
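A minimal sketch of how the folds might be generated is shown below. The function name `k_fold_splits` is hypothetical, and the example splits indices in order without shuffling, which real workflows often add.

```python
def k_fold_splits(n_samples, k):
    """Return k (train_indices, validation_indices) pairs."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, validation))
        start += size
    return splits

splits = k_fold_splits(n_samples=10, k=3)
```

Every sample appears in exactly one validation fold, so each data point is used for validation once and for training in the other rounds.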
K-Means Clustering: k-means clustering is an unsupervised learning algorithm used for clustering problems. The goal is to partition data points into a pre-specified number k of clusters, with each data point belonging to the cluster with the nearest center.
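The following sketch runs k-means on one-dimensional data, alternating between assigning points to the nearest center and moving each center to the mean of its points. The function names, the fixed iteration count, and the hand-picked initial centers are assumptions for illustration; practical implementations choose initial centers more carefully and stop on convergence.

```python
def assign(points, centers):
    """Group each point with its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    return clusters

def k_means_1d(points, initial_centers, n_iters=10):
    centers = list(initial_centers)
    for _ in range(n_iters):
        clusters = assign(points, centers)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, assign(points, centers)

centers, clusters = k_means_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
```

With this data the centers settle at 2.0 and 11.0, the means of the two visible groups.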
K-Nearest Neighbors: k-nearest neighbors, or k-NN, is a supervised learning algorithm used primarily for classification problems. The goal is to predict the probability that a data point belongs to a certain class, based on which class(es) the k data points nearest to it belong to.
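A small sketch of the idea, assuming Euclidean distance and a simple majority vote: the function name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k nearest points, plus vote shares."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(train_points[i], query))

    nearest = sorted(range(len(train_points)), key=squared_distance)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    probabilities = {label: n / len(nearest) for label, n in votes.items()}
    return max(probabilities, key=probabilities.get), probabilities

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b"]
label, probabilities = knn_predict(points, labels, query=(4, 4), k=3)
```

Here two of the three nearest neighbors belong to class "b", so the query is labeled "b" with an estimated probability of two thirds.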
L
Linear Regression: Linear regression is a supervised learning algorithm used for regression problems. The goal is to identify the line (or, with several features, the hyperplane) that best fits the relationship between one or more input features and a continuous target within a specified dataset, in order to predict new values.
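For the single-feature case of linear regression, the best-fitting line under ordinary least squares has a closed-form solution, sketched below. The function name `fit_line` and the sample data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
prediction = slope * 5.0 + intercept  # predict the value for a new input
```

The sample points lie exactly on y = 2x + 1, so the fit recovers a slope of 2 and an intercept of 1.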
Logistic regression: Logistic regression is a supervised learning algorithm used primarily for classification. The goal is to identify the logistic curve that best predicts the probability that an input belongs to some class, which is then used to map the input to an actual class.
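The two steps in the logistic regression entry, predicting a probability and then mapping it to a class, can be sketched as follows. The weights and bias are assumed to have been learned already, and the function names and the 0.5 threshold are choices made for this example.

```python
import math

def predict_probability(features, weights, bias):
    """Pass a weighted sum of the features through the logistic curve."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(features, weights, bias, threshold=0.5):
    """Map the predicted probability to class 1 or class 0."""
    return int(predict_probability(features, weights, bias) >= threshold)

probability = predict_probability([2.0], weights=[1.0], bias=0.0)
```

An input with a weighted sum of 0 sits exactly on the decision boundary and receives a probability of 0.5.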
M
Meta Learning: Meta-learning ("learning to learn") in neural networks refers to the approach of using a reward and/or error signal to teach a system to solve problems outside its trained domain. Rather than looking directly at the data, however, the system looks at the output of a learning algorithm and trains on making predictions based on that.
﻿
N
Naive Bayes: Naive Bayes classifiers are supervised learning models used for classification problems. A naive Bayes model uses Bayes’ theorem, together with the "naive" assumption that features are independent of one another given the class, to calculate the probability that a data point belongs to each possible class, in order to identify the most probable class.
Neural Network Pruning: One popular approach for reducing resource requirements at test time is neural network pruning. This means systematically removing parameters (neurons, connections, etc.) from an existing network to reduce its size.
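One simple pruning criterion, used here purely as an illustration, is to remove the weights with the smallest absolute values (magnitude pruning). The entry above does not name a specific criterion, so the function name `magnitude_prune` and the choice of criterion are assumptions.

```python
def magnitude_prune(weights, fraction):
    """Set the given fraction of smallest-magnitude weights to zero."""
    n_prune = int(len(weights) * fraction)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in by_magnitude[:n_prune]:
        pruned[i] = 0.0  # a zeroed weight stands in for a removed connection
    return pruned

pruned = magnitude_prune([0.5, -0.1, 0.03, -0.9], fraction=0.5)
```

Half of the connections are removed, and the two largest weights by magnitude are kept unchanged.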
﻿
O
Object Detection: Object detection is a computer vision technique in which software learns to identify and locate objects in a video or digital image. Once an object has been identified and localized, an object detection algorithm can also label it.
Optical Character Recognition (OCR): Optical Character Recognition (OCR) is a computer vision and machine learning technique that extracts the text from images, generally to make it usable by other systems such as image search and software-based receipt processing.
Optimizer: In deep learning, an optimizer is an algorithm that updates a neural network's parameters (its weights and biases). The optimizer modifies these parameters, typically using the gradients of the loss, with the goal of reducing the loss in as few steps as possible.
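As a sketch of what an optimizer does, the example below applies plain gradient descent to a single parameter. The function name `sgd_step`, the learning rate, and the toy loss (w - 3)^2 are assumptions chosen for illustration; optimizers used in practice add refinements such as momentum or adaptive step sizes.

```python
def sgd_step(params, grads, learning_rate=0.1):
    """Move each parameter a small step against its gradient."""
    return [p - learning_rate * g for p, g in zip(params, grads)]

# Toy problem: minimize loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
params = [0.0]
for _ in range(100):
    grads = [2.0 * (params[0] - 3.0)]
    params = sgd_step(params, grads)
```

After repeated updates the parameter approaches 3, the value at which the toy loss is smallest.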
﻿
P
Permutation-Invariance: Permutation-invariance in machine learning refers to a system in which reordering the inputs does not impact the output.
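A short contrast may help with permutation-invariance: summing a set of inputs is permutation-invariant, while a weighted sum that depends on position is not. Both function names are hypothetical and exist only for this example.

```python
def sum_pool(items):
    """Permutation-invariant: the order of the inputs does not matter."""
    return sum(items)

def position_weighted(items):
    """Not permutation-invariant: each input is weighted by its position."""
    return sum(index * value for index, value in enumerate(items))

original = [1, 2, 3]
shuffled = [3, 1, 2]
```

Calling `sum_pool` on both lists gives the same result, whereas `position_weighted` gives different results for the two orderings.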
Policy: In reinforcement learning, a policy is a mapping from the current state of the environment to an action, or to a probability distribution over the possible actions. It is learned from the reward function and the way actions change the state, and it is used to steer the model toward the highest reward.
Principal component analysis (PCA): Principal component analysis (PCA) is an unsupervised dimensionality reduction algorithm. The goal is to compute a dataset’s principal components (PCs): new, uncorrelated features that are linear combinations of the original features, ordered by how much of the data's variance they capture. For visualization, typically only the first two or three PCs are kept, allowing the dataset to be remapped into two or three dimensions.
Q
R
S
Self-Organization: Self-organization in neural networks describes the ability of a self-supervised system to turn local interactions between its initially disorganized parts into a coherent policy.
Sensory Neuron: In a neural network, a sensory neuron (or sensory input neuron) is a node which takes input from "the outside world" and, after processing it through the activation function, passes the resulting value along.
Support Vector Machines (SVMs): Support vector machines (SVMs) are supervised learning models used for classification. An SVM finds the hyperplane that best separates the different classes within a dataset, meaning the one with the widest margin between them. A new data point is classified by identifying which side of the hyperplane (that is, which class) it falls on.
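The classification step of a linear SVM can be sketched as a sign check on which side of the hyperplane a point falls. The weight vector and bias are assumed to come from an already trained model, and the function name `svm_classify` and the example values are hypothetical. Finding the hyperplane itself (the training step) is not shown.

```python
def svm_classify(point, weights, bias):
    """Return +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score >= 0 else -1

# Assumed hyperplane for illustration: x1 + x2 - 5 = 0.
weights, bias = [1.0, 1.0], -5.0
side_a = svm_classify([4.0, 4.0], weights, bias)
side_b = svm_classify([1.0, 1.0], weights, bias)
```

Points above the line x1 + x2 = 5 are assigned to one class and points below it to the other.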
T
Tokens: A token in Natural Language Processing is a representation of a word, word segment (subword) or character. When text is being processed, a tokenizer breaks that text into tokens, which the system can then process, historically with higher efficiency than processing the same text character-by-character.
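To make the difference between word-level and character-level tokens concrete, here is a deliberately simple sketch. The regular expression and both function names are assumptions for this example; production tokenizers usually learn a subword vocabulary instead.

```python
import re

def word_tokenize(text):
    """Split text into word tokens and standalone punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokenize(text):
    """Treat every character, including spaces, as its own token."""
    return list(text)

word_tokens = word_tokenize("Hello, world!")
char_tokens = char_tokenize("Hello, world!")
```

The same sentence becomes 4 word-level tokens but 13 character-level tokens, which illustrates why word and subword tokens give shorter sequences to process.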
U
V
W
Weight Initialization: Weight initialization is the choice of starting values for a neural network's weights before training begins. It was first discussed as a "trick" to prevent certain undesirable behaviours during neural network training, such as vanishing or exploding gradients. The initial values of the weights can have a significant impact on the training process.
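As one example of a weight initialization scheme, the sketch below draws weights from a uniform range scaled by the layer's input and output sizes, following the Xavier (Glorot) uniform rule. The entry above does not name a particular scheme, so this choice, the function name `xavier_uniform`, and the optional `rng` argument are assumptions for illustration.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=None):
    """Sample a fan_in x fan_out weight matrix from U(-limit, limit)."""
    rng = rng or random.Random()
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

weights = xavier_uniform(fan_in=3, fan_out=2, rng=random.Random(0))
```

Scaling the range by the layer size keeps the initial weights small for wide layers, which helps keep activations and gradients in a workable range at the start of training.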
﻿
X
Y
YOLO: YOLO stands for You Only Look Once and is an extremely fast object detection framework using a single convolutional network. YOLO is frequently faster than other object detection systems because it processes the entire image in a single pass, as opposed to running a classifier over many sliding windows or region proposals.
﻿
Z
Zero-Shot Learning: Zero-shot learning is a machine learning term that describes the ability of a model to be applied to a task when it has received no training on that task, but has been trained on tasks of other types.
﻿