ZeroBasedSkill

crowdkit.aggregation.classification.zero_based_skill.ZeroBasedSkill

ZeroBasedSkill(
self,
n_iter: int = 100,
lr_init: float = 1.0,
lr_steps_to_reduce: int = 20,
lr_reduce_factor: float = 0.5,
eps: float = 1e-05
)

The Zero-Based Skill (ZBS) aggregation model performs weighted majority voting on tasks. After processing a pool of tasks, it re-estimates the workers' skills with a gradient descent step that minimizes the mean squared error between the current skills and the fraction of each worker's responses that match the aggregated labels.

This process is repeated until the labels stop changing or the maximum number of iterations is reached.
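
The loop described above can be sketched in plain Python. This is an illustrative reading of the description, not the Crowd-Kit implementation: the helper names (weighted_vote, agreement, zero_based_skill_sketch), the initial skill of 1.0, the point at which the learning rate is reduced, and the use of eps as a threshold on the largest skill change are all assumptions.

```python
from collections import defaultdict


def weighted_vote(responses, skills):
    """Pick, for every task, the label with the largest total skill weight."""
    scores = defaultdict(lambda: defaultdict(float))
    for task, worker, label in responses:
        scores[task][label] += skills[worker]
    # Ties go to the first label in sorted order (a choice made for this sketch).
    return {
        task: max(sorted(by_label), key=by_label.get)
        for task, by_label in scores.items()
    }


def agreement(responses, labels):
    """Fraction of each worker's responses that match the aggregated labels."""
    hits, totals = defaultdict(int), defaultdict(int)
    for task, worker, label in responses:
        totals[worker] += 1
        hits[worker] += int(labels[task] == label)
    return {worker: hits[worker] / totals[worker] for worker in totals}


def zero_based_skill_sketch(responses, n_iter=100, lr_init=1.0,
                            lr_steps_to_reduce=20, lr_reduce_factor=0.5,
                            eps=1e-5):
    # Assumption: every worker starts with the same skill of 1.0.
    skills = {worker: 1.0 for _, worker, _ in responses}
    lr = lr_init
    for step in range(1, n_iter + 1):
        # Assumption: the learning rate shrinks on every lr_steps_to_reduce-th step.
        if step % lr_steps_to_reduce == 0:
            lr *= lr_reduce_factor
        labels = weighted_vote(responses, skills)
        target = agreement(responses, labels)
        # The gradient of 0.5 * (skill - target)^2 with respect to skill
        # is (skill - target), so one descent step is:
        new_skills = {w: s - lr * (s - target[w]) for w, s in skills.items()}
        change = max(abs(new_skills[w] - skills[w]) for w in skills)
        skills = new_skills
        # Assumption: eps bounds the largest skill change between iterations.
        if change < eps:
            break
    return weighted_vote(responses, skills), skills


# Toy data: (task, worker, label) triples.
responses = [
    ("t1", "w1", "a"), ("t1", "w2", "a"), ("t1", "w3", "b"),
    ("t2", "w1", "b"), ("t2", "w2", "b"), ("t2", "w3", "a"),
    ("t3", "w1", "a"), ("t3", "w2", "b"), ("t3", "w3", "b"),
]
labels, skills = zero_based_skill_sketch(responses)
print(labels)
print(skills)
```

On this toy data the worker who always agrees with the aggregated labels (w2) ends with the highest skill, and the worker who agrees least (w3) ends with the lowest.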

Note

Every worker that appears in the dataset passed to predict must also appear in the responses of the dataset that was passed to fit.
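
A quick pre-check for this requirement might look like the sketch below. The function name unseen_workers is hypothetical and not part of Crowd-Kit; it takes any two iterables of worker identifiers, for example the worker column of each dataset.

```python
def unseen_workers(fit_workers, predict_workers):
    """Return the workers present at predict time but absent at fit time."""
    return sorted(set(predict_workers) - set(fit_workers))


# Toy identifiers; in practice these would come from the two datasets.
fit_workers = ["w1", "w2", "w3", "w1"]
predict_workers = ["w2", "w3", "w4"]

missing = unseen_workers(fit_workers, predict_workers)
if missing:
    print("Workers without a fitted skill:", missing)
```

An empty result means every worker in the prediction data has a skill estimated during fitting.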

Parameters description

Parameters | Type | Description
n_iter | int | The maximum number of iterations.
lr_init | float | The initial learning rate.
lr_steps_to_reduce | int | The number of steps after which the learning rate is reduced.
lr_reduce_factor | float | The factor by which the learning rate is multiplied every lr_steps_to_reduce steps.
eps | float | The convergence threshold.
skills_ | Optional[Series] | The workers' skills. The pandas.Series data is indexed by worker and holds the corresponding worker skill.
labels_ | Optional[Series] | The task labels. The pandas.Series data is indexed by task so that labels.loc[task] is the most likely true label of the task.
probas_ | Optional[DataFrame] | The probability distributions of task labels. The pandas.DataFrame data is indexed by task so that result.loc[task, label] is the probability that the task's true label is equal to label. Each probability is between 0 and 1, and the probabilities for each task sum to 1.
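
To illustrate the shape of the probability output, the sketch below normalizes skill-weighted vote scores so that each task's probabilities sum to 1. Treating probabilities as normalized weighted votes is an assumption made for illustration, and vote_probabilities is a hypothetical helper, not a Crowd-Kit function.

```python
from collections import defaultdict


def vote_probabilities(responses, skills):
    """Turn skill-weighted vote scores into per-task label probabilities."""
    scores = defaultdict(lambda: defaultdict(float))
    for task, worker, label in responses:
        scores[task][label] += skills[worker]

    probas = {}
    for task, by_label in scores.items():
        total = sum(by_label.values())
        if total == 0:
            # All voters have zero skill: fall back to a uniform distribution.
            probas[task] = {label: 1 / len(by_label) for label in by_label}
        else:
            probas[task] = {label: s / total for label, s in by_label.items()}
    return probas


# Toy skills and (task, worker, label) responses.
toy_skills = {"w1": 0.75, "w2": 0.25}
toy_responses = [("t1", "w1", "a"), ("t1", "w2", "b"), ("t2", "w1", "b")]

toy_probas = vote_probabilities(toy_responses, toy_skills)
print(toy_probas)
```

Each inner dictionary plays the role of one row of the DataFrame: keys are labels, values are probabilities for that task.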

Examples:

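from crowdkit.aggregation import ZeroBasedSkill
from crowdkit.datasets import load_dataset

# df holds the crowd responses; gt holds the ground-truth labels.
df, gt = load_dataset('relevance-2')
# Fit the model and get the aggregated label for each task.
result = ZeroBasedSkill().fit_predict(df)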

Methods summary

Method | Description
fit | Fits the model to the training data.
fit_predict | Fits the model to the training data and returns the aggregated results.
fit_predict_proba | Fits the model to the training data and returns probability distributions of labels for each task.
predict | Predicts the true labels of tasks when the model is fitted.
predict_proba | Returns probability distributions of labels for each task when the model is fitted.

Last updated: March 31, 2023
