Machine learning terminology from A to Z
A
Active learning
Active learning in machine learning refers to approaches in which an algorithm can query a user (also called a teacher or an oracle) to label new data points with the desired outputs.
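A minimal uncertainty-sampling sketch in Python with scikit-learn illustrates the query loop; the synthetic dataset, seed size, and query budget below are arbitrary choices for demonstration.

```python
# Uncertainty-sampling active learning sketch (illustrative values only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, random_state=0)
labeled = np.arange(20)                      # small seed of labeled examples
unlabeled = np.arange(20, len(X))

model = LogisticRegression(max_iter=1000)
for _ in range(5):                           # five query rounds
    model.fit(X[labeled], y[labeled])
    probs = model.predict_proba(X[unlabeled])
    uncertainty = 1 - probs.max(axis=1)      # least-confident sampling
    query = unlabeled[np.argsort(uncertainty)[-10:]]   # send these to the "oracle"
    labeled = np.concatenate([labeled, query])
    unlabeled = np.setdiff1d(unlabeled, query)
```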
Annotation
In the field of AI development, data annotation (also called labeling or tagging) is the process of adding one or more meaningful labels to data.
Artificial Neural Network (ANN)
Artificial neural network (ANN) is a computing system with an architecture inspired by the biological brains of living organisms.
Automated machine learning (AutoML)
AutoML is the process of automating various complicated or time-consuming tasks in machine learning.
B
Backpropagation
In machine learning, backpropagation is an algorithm for computing the gradient of a loss function with respect to a network's weights; combined with gradient descent, it is the standard way to train feedforward neural networks.
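A tiny illustration of the gradient that backpropagation computes, using PyTorch autograd on a one-weight model; the numbers are arbitrary.

```python
# Backpropagation in miniature with PyTorch autograd.
import torch

w = torch.tensor(2.0, requires_grad=True)   # single trainable weight
x, target = torch.tensor(3.0), torch.tensor(10.0)

loss = (w * x - target) ** 2   # prediction 6.0, squared error 16.0
loss.backward()                # backpropagation: d(loss)/dw = 2 * (w*x - target) * x = -24
print(w.grad)                  # tensor(-24.)
```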
Bidirectional Encoder Representations from Transformers (BERT)
BERT is a family of masked-language models based on the transformer architecture.
Bounding box
In data labeling, specifically in object detection tasks, bounding boxes are rectangles used to describe the spatial location of objects in images.
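A sketch of one common bounding-box representation, top-left corner plus width and height in pixels; the coordinates and label below are made up, and different tools use different conventions.

```python
# Top-left + width/height representation, and conversion to corner coordinates.
bbox = {"x": 34, "y": 120, "width": 200, "height": 150, "label": "dog"}

x_min, y_min = bbox["x"], bbox["y"]
x_max, y_max = bbox["x"] + bbox["width"], bbox["y"] + bbox["height"]
print(x_min, y_min, x_max, y_max)   # 34 120 234 270
```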
C
Classification
Classification is a supervised learning task in which a model assigns each input to one of a predefined set of discrete categories (classes).
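A minimal classification sketch with scikit-learn: a decision tree (an arbitrary choice of classifier here) learns to assign iris flowers to one of three species.

```python
# Train a classifier and measure accuracy on held-out data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier().fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy on the test split
```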
Computational linguistics
Computational linguistics is a field of scientific study of natural language and modeling of language via computing systems.
Computer vision
Computer vision is an interdisciplinary field that deals with enabling computers to gain high-level understanding from digital images or videos.
Convolutional neural networks
Convolutional neural networks are a common type of artificial neural network used for analyzing visual images.
D
Data annotation
Data annotation (also called data labeling) is the process of adding meaningful tags to data so that the data can later be used to train ML models.
Data labeling
Data labeling is another term for data annotation: the process of adding meaningful tags to data so that the data can later be used to train ML models.
Data-centric AI
Data-centric AI is an approach to creating AI systems where, instead of tweaking hyperparameters of ML models, you focus on improving your data quality.
Deep learning
Deep learning is a family of machine learning methods based on artificial neural networks with representation learning.
E
Extraction
Extraction (keyword or key-phrase extraction) refers to the task of automatically identifying the terms that best describe the topic of a given document.
F
Facial recognition
Facial recognition is a technology that matches a human face from a digital image or a video frame against a database of faces.
G
Generative adversarial network
Generative adversarial network (GAN) is a type of ML algorithm built using a combination of two neural networks (generator and discriminator).
H
Hyperparameters
Hyperparameters are parameters used to set up and control the learning process of an ML model.
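A sketch of hyperparameter tuning with scikit-learn's grid search; the SVM, the C values, and the kernels below are illustrative choices, not recommendations.

```python
# Hyperparameters (here C and kernel) are set before training, not learned from data.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}, cv=5)
search.fit(X, y)
print(search.best_params_)   # the best-scoring hyperparameter combination
```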
I
Image recognition
Image recognition is a process that enables computer systems to identify and classify objects and patterns in images.
Image segmentation
In machine learning, image segmentation means dividing a digital image into different segments or regions, each made of a group of pixels.
Imbalanced data
Imbalanced data refers to the case when the data in a training dataset contains significantly different numbers of labels for each class.
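A sketch of detecting imbalance and compensating for it with class weights; the 95/5 split is synthetic, and class weighting is only one of several remedies (resampling is another).

```python
# Check class balance, then reweight errors so the minority class is not ignored.
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print(Counter(y))   # roughly 950 examples of one class vs 50 of the other

clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```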
K
K-nearest neighbor
The k-nearest neighbors algorithm (k-NN) is a non-parametric supervised learning method used for classification and regression.
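A minimal k-NN sketch with scikit-learn; k=3 is an arbitrary choice here.

```python
# Each prediction is the majority class among the 3 nearest training points.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict(X[:2]))
```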
L
Large language model (LLM)
A large language model (LLM) is a deep learning model consisting of a neural network with a huge number of parameters (billions of weights or more), trained on large quantities of unlabeled text via self-supervised learning.
M
Machine learning (ML)
Machine learning (ML) is a subset of the artificial intelligence (AI) field. It focuses on developing methods for computers to leverage data to improve their performance on certain tasks.
N
Natural language processing
Natural language processing is a subfield at the intersection of linguistics, computer science, and artificial intelligence.
O
Overfitting in machine learning
Overfitting in machine learning refers to the situation when a model fits its training data too closely, capturing noise along with the underlying pattern, and therefore performs poorly on new, unseen data.
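Overfitting in miniature: under the assumed settings below, an unconstrained decision tree memorizes noisy training data and scores worse on held-out data than a shallower one.

```python
# Compare train vs. test accuracy for a deep and a shallow tree on noisy data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, flip_y=0.2, random_state=0)  # 20% label noise
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(max_depth=None).fit(X_train, y_train)
shallow = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print(deep.score(X_train, y_train), deep.score(X_test, y_test))       # ~1.0 train, lower test
print(shallow.score(X_train, y_train), shallow.score(X_test, y_test)) # scores closer together
```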
P
Part-of-speech tagging
Part-of-speech tagging (POS tagging or POST) is the process of labeling every word in a text corpus with a particular part of speech, based on both its definition and its context.
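A POS-tagging sketch with spaCy (one of several libraries that offer taggers); it assumes the small English model has been downloaded, and the sentence is made up.

```python
# Tag every word with its part of speech.
# Assumes the model was installed first:  python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The annotators labeled ten thousand images yesterday.")
print([(token.text, token.pos_) for token in doc])
# -> a list of (word, part-of-speech) pairs, e.g. ('labeled', 'VERB')
```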
R
Regression algorithms
In machine learning, regression algorithms predict continuous numerical values, such as prices or demand, and are commonly used for forecasting.
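A minimal regression sketch: fitting a line to synthetic noisy points with scikit-learn and recovering the slope and intercept they were generated from.

```python
# Fit a linear model to data generated from y ≈ 3x + 2 plus noise.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + 2 + rng.normal(scale=1.0, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)   # roughly 3 and 2
```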
S
Sentiment analysis
Sentiment analysis is a natural language processing technique used to determine the emotional tone behind a piece of text.
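A sentiment-analysis sketch using the Hugging Face transformers pipeline, which downloads a default English model on first run; the review text is invented.

```python
# Classify the emotional tone of a short text.
from transformers import pipeline

analyzer = pipeline("sentiment-analysis")
print(analyzer("The annotation interface is fast and easy to use."))
# -> e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```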
T
Training set
A training set (training dataset) is the set of examples used to train an ML model.
U
Underfitting in machine learning
Underfitting in machine learning occurs when a model is too simple to capture the underlying structure of the data, so it performs poorly even on its training data.
V
Validation set in machine learning
In machine learning, a validation dataset is a portion of the dataset used to evaluate the performance of a model during the training process.
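A sketch of carving a dataset into training, validation, and test portions with scikit-learn; the 60/20/20 split below is just one common choice.

```python
# Split once into train vs. the rest, then split the rest into validation and test.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)
print(len(X_train), len(X_val), len(X_test))   # 90 30 30 for the 150-sample iris set
```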
X
XGBoost
XGBoost is an optimized, distributed gradient boosting library that implements machine learning algorithms within the Gradient Boosting framework.
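A minimal XGBoost sketch using its scikit-learn-style wrapper; it requires the xgboost package, and the hyperparameter values are illustrative.

```python
# Gradient-boosted trees on a built-in binary classification dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # accuracy on held-out data
```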
Y
YOLO (object detection algorithm)
YOLO ("you only look once") is a popular object detection algorithm known for its speed and accuracy.
Z
Zero-shot learning
In machine learning, zero-shot learning (ZSL) is a problem setup where a model is supposed to predict classes for the samples which were not observed during its training.
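A zero-shot classification sketch with the Hugging Face transformers pipeline; the model name is the pipeline's common default, and the text and candidate labels are invented.

```python
# Classify text into labels the model never saw during training.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The phone arrived with a cracked screen.",
    candidate_labels=["delivery problem", "product defect", "positive review"],
)
print(result["labels"][0])   # the label the model finds most likely
```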