crowdkit.aggregation.classification.glad.GLAD
GLAD(self, n_iter: int = 100, tol: float = 1e-05, silent: bool = True, labels_priors: Optional[Series] = None, alphas_priors_mean: Optional[Series] = None, betas_priors_mean: Optional[Series] = None, m_step_max_iter: int = 25, m_step_tol: float = 0.01)
Generative model of Labels, Abilities, and Difficulties.
A probabilistic model that parametrizes workers' abilities and tasks' difficulties.
Let's consider a case of $K$-class classification. Let $p$ be a vector of prior class probabilities, $\alpha_i \in (-\infty, +\infty)$ be a worker's ability parameter, $\beta_j \in (0, +\infty)$ be an inverse task difficulty, $z_j$ be a latent variable representing the task's true label, and $y^i_j$ be the worker's response that we observe. According to GLAD, these variables and parameters are related by the following latent label model.

The prior probability of $z_j$ being equal to $c$ is

$$
\operatorname{Pr}(z_j = c) = p[c],
$$

the probability distribution of the worker's responses conditioned on the true label value $c$ follows the single-coin Dawid-Skene model, where the probability of a correct response is a sigmoid function of the product of the worker's ability and the inverse task difficulty:

$$
\operatorname{Pr}(y^i_j = k \mid z_j = c) = \begin{cases} a(i, j), & k = c \\ \frac{1 - a(i, j)}{K - 1}, & k \neq c \end{cases},
$$

where

$$
a(i, j) = \frac{1}{1 + \exp(-\alpha_i \beta_j)}.
$$

Parameters $p$, $\alpha$, $\beta$ and latent variables $z$ are optimized through the Expectation-Maximization algorithm.
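The response model can be sketched in a few lines of plain Python. This is only an illustration, assuming the standard GLAD formulation from the Whitehill et al. paper: a response is correct with probability sigmoid(ability × inverse difficulty), and the remaining probability mass is spread evenly over the other classes. The function names (`correct_prob`, `response_prob`, `label_posterior`) are hypothetical and are not part of crowdkit.

```python
import math


def correct_prob(alpha_i, beta_j):
    """Probability that worker i labels task j correctly: sigmoid(alpha_i * beta_j)."""
    return 1.0 / (1.0 + math.exp(-alpha_i * beta_j))


def response_prob(k, c, alpha_i, beta_j, n_classes):
    """Pr(response = k | true label = c) under the single-coin model."""
    a = correct_prob(alpha_i, beta_j)
    return a if k == c else (1.0 - a) / (n_classes - 1)


def label_posterior(prior, answers, alphas, beta_j):
    """Posterior over the true label of one task via Bayes' rule.

    prior:   dict class -> prior probability
    answers: dict worker -> observed label for this task
    alphas:  dict worker -> ability parameter
    beta_j:  inverse difficulty of this task (must be positive)
    """
    n_classes = len(prior)
    joint = {}
    for c, p_c in prior.items():
        likelihood = p_c
        for worker, k in answers.items():
            likelihood *= response_prob(k, c, alphas[worker], beta_j, n_classes)
        joint[c] = likelihood
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}


# Illustrative values only: two able workers agree, one weaker worker disagrees.
prior = {"cat": 1 / 3, "dog": 1 / 3, "bird": 1 / 3}
answers = {"w1": "cat", "w2": "cat", "w3": "dog"}
alphas = {"w1": 2.0, "w2": 2.0, "w3": 0.5}
posterior = label_posterior(prior, answers, alphas, beta_j=1.0)
```

A worker with zero ability answers correctly half of the time regardless of the task, and a larger inverse difficulty pushes every worker with positive ability closer to certainty.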
J. Whitehill, P. Ruvolo, T. Wu, J. Bergsma, and J. Movellan. Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise. Proceedings of the 22nd International Conference on Neural Information Processing Systems, 2009
https://proceedings.neurips.cc/paper/2009/file/f899139df5e1059396431415e770c6dd-Paper.pdf
Parameters | Type | Description |
---|---|---|
n_iter | int | Maximum number of EM iterations. |
tol | float | Threshold for convergence criterion. |
silent | bool | If false, show progress bar. |
labels_priors | Optional[Series] | Prior label probabilities. |
alphas_priors_mean | Optional[Series] | Prior mean value of alpha parameters. |
betas_priors_mean | Optional[Series] | Prior mean value of beta parameters. |
m_step_max_iter | int | Maximum number of iterations of conjugate gradient method in M-step. |
m_step_tol | float | Tol parameter of conjugate gradient method in M-step. |
labels_ | Optional[Series] | Tasks' labels. A pandas.Series indexed by task. |
probas_ | Optional[DataFrame] | Tasks' label probability distributions. A pandas.DataFrame indexed by task, with one column per label. |
alphas_ | Series | Workers' alpha parameters. A pandas.Series indexed by worker. |
betas_ | Series | Tasks' beta parameters. A pandas.Series indexed by task. |
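To show how the outer iteration limit and tolerance might interact with the inner M-step limit and tolerance, here is a toy EM loop in plain Python. It is a sketch under stated assumptions, not crowdkit's implementation: the class prior is fixed to uniform, the M-step uses plain gradient ascent with a hypothetical step size `lr` instead of the conjugate gradient method, and the outer loop stops when the log-likelihood changes by less than `tol`. The function `toy_glad` and its input format are made up for this example.

```python
import math
from collections import defaultdict


def toy_glad(responses, n_iter=100, tol=1e-5,
             m_step_max_iter=25, m_step_tol=0.01, lr=0.1):
    """Toy GLAD-style EM. `responses` is a list of (task, worker, label) tuples
    with at least two distinct labels."""
    labels = sorted({y for _, _, y in responses})
    K = len(labels)
    prior = {c: 1.0 / K for c in labels}          # assumption: fixed uniform prior
    alpha = {w: 1.0 for _, w, _ in responses}     # worker abilities
    log_beta = {t: 0.0 for t, _, _ in responses}  # log inverse difficulty keeps beta > 0

    def accuracy(t, w):
        return 1.0 / (1.0 + math.exp(-alpha[w] * math.exp(log_beta[t])))

    def e_step():
        # Posterior over true labels per task and the marginal log-likelihood.
        joint = {t: dict(prior) for t in log_beta}
        for t, w, y in responses:
            a = accuracy(t, w)
            for c in labels:
                joint[t][c] *= a if y == c else (1.0 - a) / (K - 1)
        post, loglik = {}, 0.0
        for t, row in joint.items():
            total = sum(row.values())
            post[t] = {c: v / total for c, v in row.items()}
            loglik += math.log(total)
        return post, loglik

    prev = -math.inf
    for _ in range(n_iter):                       # outer EM loop, capped by n_iter
        post, loglik = e_step()
        if abs(loglik - prev) < tol:              # outer convergence check
            break
        prev = loglik
        for _ in range(m_step_max_iter):          # inner M-step loop
            g_alpha, g_beta = defaultdict(float), defaultdict(float)
            for t, w, y in responses:
                beta = math.exp(log_beta[t])
                diff = post[t][y] - accuracy(t, w)
                g_alpha[w] += diff * beta             # dQ/d alpha_w
                g_beta[t] += diff * alpha[w] * beta   # dQ/d log beta_t
            grads = list(g_alpha.values()) + list(g_beta.values())
            if math.sqrt(sum(g * g for g in grads)) < m_step_tol:
                break                             # inner convergence check
            for w, g in g_alpha.items():
                alpha[w] += lr * g
            for t, g in g_beta.items():
                log_beta[t] += lr * g
    post, _ = e_step()
    return {t: max(p, key=p.get) for t, p in post.items()}, alpha


# Illustrative data: worker w3 disagrees with the other two on tasks t1 and t2.
responses = [
    ("t1", "w1", "a"), ("t1", "w2", "a"), ("t1", "w3", "b"),
    ("t2", "w1", "b"), ("t2", "w2", "b"), ("t2", "w3", "a"),
    ("t3", "w1", "a"), ("t3", "w2", "a"), ("t3", "w3", "a"),
]
labels, alpha = toy_glad(responses)
```

In this sketch a looser `m_step_tol` or a smaller `m_step_max_iter` makes each M-step cheaper but less exact, so more outer iterations may be needed before the `tol` check triggers.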
Examples:

    from crowdkit.aggregation import GLAD
    from crowdkit.datasets import load_dataset

    df, gt = load_dataset('relevance-2')
    glad = GLAD()
    result = glad.fit_predict(df)
Method | Description |
---|---|
fit | Fit the model through the EM-algorithm. |
fit_predict | Fit the model and return aggregated results. |
fit_predict_proba | Fit the model and return probability distributions on labels for each task. |
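The relationship between the two prediction methods can be illustrated without the library. The sketch below assumes that the aggregated label of a task is the most probable label under its probability distribution; this is an assumption about how the outputs relate, not a statement taken from this page. A nested dictionary stands in for the returned DataFrame, and `most_probable_labels` is a hypothetical helper. On a pandas DataFrame with one row per task and one column per label, `DataFrame.idxmax(axis="columns")` performs the same reduction.

```python
def most_probable_labels(probas):
    """Pick the label with the highest probability for every task.

    probas: dict task -> dict label -> probability
    """
    return {task: max(dist, key=dist.get) for task, dist in probas.items()}


# Illustrative probabilities only.
probas = {
    "t1": {"relevant": 0.9, "irrelevant": 0.1},
    "t2": {"relevant": 0.3, "irrelevant": 0.7},
}
labels = most_probable_labels(probas)
```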