MACE

crowdkit.aggregation.classification.mace.MACE

MACE(
self,
n_restarts: int = 10,
n_iter: int = 50,
method: str = 'vb',
smoothing: float = 0.1,
default_noise: float = 0.5,
alpha: float = 0.5,
beta: float = 0.5,
random_state: int = 0,
verbose: int = 0
)

The Multi-Annotator Competence Estimation (MACE) model is a probabilistic model that associates each worker with a label probability distribution.

On each task, a worker may or may not be spamming. A worker who is not spamming labels the task correctly. A worker who is spamming answers according to their own label probability distribution.

We assume that the correct label $T_i$ comes from a discrete uniform distribution. When worker $j$ annotates instance $i$, the indicator $S_{ij}$, which specifies whether or not the worker is spamming on that instance, is drawn from $\operatorname{Bernoulli}(1 - \theta_j)$. In other words, worker $j$ is spamming with probability $1 - \theta_j$.

Thus, if the worker is not spamming on the task, i.e. $S_{ij} = 0$, their response is the true label, i.e. $A_{ij} = T_i$. Otherwise, their response $A_{ij}$ is drawn from a multinomial distribution with parameter vector $\xi_j$.
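The generative story above can be simulated directly. The following is a minimal sketch using only the Python standard library; the function name `simulate_mace_annotations` and its arguments are hypothetical and are not part of Crowd-Kit.

```python
import random


def simulate_mace_annotations(n_tasks, n_labels, thetas, strategies, seed=0):
    """Sample true labels and worker responses from the MACE generative story.

    thetas[j]     -- probability that worker j is NOT spamming (theta_j)
    strategies[j] -- worker j's label distribution when spamming (xi_j)
    """
    rng = random.Random(seed)
    labels = list(range(n_labels))
    # T_i is drawn from a discrete uniform distribution over the labels.
    true_labels = [rng.choice(labels) for _ in range(n_tasks)]
    rows = []
    for i, t in enumerate(true_labels):
        for j, (theta, xi) in enumerate(zip(thetas, strategies)):
            # S_ij ~ Bernoulli(1 - theta_j): True means the worker is spamming.
            spamming = rng.random() < 1.0 - theta
            if spamming:
                # A_ij is drawn from the worker's strategy xi_j.
                answer = rng.choices(labels, weights=xi, k=1)[0]
            else:
                # A_ij = T_i: the worker reports the true label.
                answer = t
            rows.append((i, j, answer, spamming))
    return true_labels, rows


# Worker 0 never spams; worker 1 always spams and always answers label 0.
true_labels, rows = simulate_mace_annotations(
    n_tasks=20,
    n_labels=3,
    thetas=[1.0, 0.0],
    strategies=[[1 / 3, 1 / 3, 1 / 3], [1.0, 0.0, 0.0]],
)
```

In this toy setup, every response from worker 0 matches the true label, while worker 1's responses carry no information about it. This is the distinction that the model tries to recover from the observed answers alone.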

MACE latent label model

The model can be enhanced by adding a Beta prior on $\theta_j$ and a Dirichlet prior on $\xi_j$.

The marginal data likelihood is maximized with the Expectation-Maximization (EM) algorithm, with the following additions:

  1. Random restarts. Runs the optimization n_restarts times from random initializations, and keeps the model with the best marginal data likelihood.
  2. M-step. Smooths parameters by adding a fixed value smoothing to the fractional counts before normalizing.
  3. Variational M-step. Employs Variational-Bayes (VB) training with symmetric Beta priors on $\theta_j$ and symmetric Dirichlet priors on the strategy parameters $\xi_j$.
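The smoothing described for the M-step can be sketched as follows. This is an illustrative, standard-library-only example rather than Crowd-Kit's actual code. The helper `normalize_with_smoothing` and the example counts are hypothetical.

```python
def normalize_with_smoothing(fractional_counts, smoothing=0.1):
    """Turn expected (fractional) counts into a probability vector.

    Adding `smoothing` to every count before normalizing keeps outcomes
    with a zero count from receiving exactly zero probability.
    """
    smoothed = [count + smoothing for count in fractional_counts]
    total = sum(smoothed)
    return [count / total for count in smoothed]


# Hypothetical expected counts of one worker's spamming answers over three labels.
xi_j = normalize_with_smoothing([4.0, 0.0, 1.0], smoothing=0.1)
```

Here the second label was never chosen while spamming, yet it still receives a small positive probability after smoothing.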

D. Hovy, T. Berg-Kirkpatrick, A. Vaswani and E. Hovy. Learning Whom to Trust with MACE. In Proceedings of NAACL-HLT, Atlanta, GA, USA (2013), 1120–1130.

https://aclanthology.org/N13-1132.pdf

Parameters description

| Parameters | Type | Description |
| :--- | :--- | :--- |
| n_restarts | int | The number of optimization runs of the algorithm. The final parameters are those that gave the best log likelihood. If one run takes too long, this parameter can be set to 1. Default: 10. |
| n_iter | int | The maximum number of EM iterations for each optimization run. Default: 50. |
| method | str | The method used for the M-step. Either 'vb' or 'em'. 'vb' means optimization with Variational Bayes using priors. 'em' means the standard Expectation-Maximization algorithm. Default: 'vb'. |
| smoothing | float | The smoothing value added to the fractional counts before normalization. Default: 0.1. |
| default_noise | float | The default noise parameter for the initialization. Default: 0.5. |
| alpha | float | The prior parameter $\alpha$ of the Beta distribution on $\theta_j$. Default: 0.5. |
| beta | float | The prior parameter $\beta$ of the Beta distribution on $\theta_j$. Default: 0.5. |
| random_state | int | The state of the random number generator. Default: 0. |
| verbose | int | Specifies whether progress is printed. 0: no progress bar, 1: progress bar for restarts only, 2: progress bars for both restarts and optimization. Default: 0. |
| labels_ | Optional[Series] | The task labels. The pandas.Series data is indexed by task so that labels.loc[task] is the most likely true label of the task. |
| probas_ | Optional[DataFrame] | The probability distributions of task labels. The pandas.DataFrame data is indexed by task so that result.loc[task, label] is the probability that the true label of the task is equal to label. Each probability is in the range from 0 to 1, and the probabilities for each task sum to 1. |
| spamming_ | ... | The posterior distribution of workers' spamming states. |
| thetas_ | ... | The posterior distribution of workers' spamming labels. |
| theta_priors_ | Optional[...] | The prior parameters for the Beta distribution on $\theta_j$. |
| strategy_priors_ | Optional[...] | The prior parameters for the Dirichlet distribution on $\xi_j$. |

Examples:

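from crowdkit.aggregation import MACE
from crowdkit.datasets import load_dataset

# df holds the crowd annotations; gt holds the ground-truth labels.
df, gt = load_dataset('relevance-2')
mace = MACE()
# result holds the aggregated label for each task.
result = mace.fit_predict(df)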

Methods summary

| Method | Description |
| :--- | :--- |
| fit | Fits the model to the training data. |
| fit_predict | Fits the model to the training data and returns the aggregated results. |
| fit_predict_proba | Fits the model to the training data and returns probability distributions of labels for each task. |
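The aggregated labels and the per-task probability distributions are related: the most likely label of a task is the one with the highest probability. The sketch below illustrates this with plain dictionaries instead of pandas objects. The helper `most_likely_labels` and the example data are hypothetical, and the library's exact tie-breaking behavior is not assumed here.

```python
def most_likely_labels(probas):
    """Pick, for each task, the label with the highest probability.

    `probas` maps task -> {label: probability}, mirroring the
    task-by-label layout described for probas_.
    """
    return {task: max(dist, key=dist.get) for task, dist in probas.items()}


# Hypothetical probabilities for two tasks and two labels.
probas = {
    "t1": {"relevant": 0.9, "irrelevant": 0.1},
    "t2": {"relevant": 0.3, "irrelevant": 0.7},
}
labels = most_likely_labels(probas)
```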

Last updated: March 31, 2023
