crowdkit.aggregation.classification.dawid_skene.OneCoinDawidSkene
OneCoinDawidSkene(self, n_iter: int = 100, tol: float = 1e-05)
The one-coin Dawid-Skene aggregation model works exactly the same as the original Dawid-Skene model based on the EM algorithm, except for calculating the workers' errors at the M-step of the algorithm.
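For the one-coin model, a worker confusion (error) matrix is parameterized by a single parameter $s_w$:

$$
e^w_{y^w_j, z_j} = \begin{cases} s_w & y^w_j = z_j \\ \dfrac{1 - s_w}{K - 1} & y^w_j \neq z_j \end{cases}
$$

where $e^w$ is a worker confusion (error) matrix of size $K \times K$ in the case of $K$-class classification, $z_j$ is the true label of task $j$, $y^w_j$ is the response of worker $w$ to task $j$, and $s_w$ is the worker skill (accuracy).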
In other words, worker $w$ uses a single coin flip to decide their assignment. No matter what the true label is, the worker assigns the correct label with probability $s_w$ and, with probability $1 - s_w$, assigns an incorrect label chosen uniformly at random. For the one-coin model, it suffices to estimate $s_w$ for every worker $w$ and $z_j$ for every task $j$. Because it has one parameter per worker instead of a full $K \times K$ matrix, the one-coin model is easier to estimate and enjoys better convergence properties.
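The parameterization above can be illustrated with a small standalone sketch. It is not taken from the Crowd-Kit source: the helper name `one_coin_error_matrix` is hypothetical, and the sketch assumes that the probability of an error is split evenly across the incorrect labels.

```python
from typing import List


def one_coin_error_matrix(skill: float, n_classes: int) -> List[List[float]]:
    """Build a K x K confusion matrix from a single skill value.

    Entry [y][z] is the probability that the worker answers y when the
    true label is z. (Hypothetical helper, for illustration only.)
    """
    if n_classes < 2:
        raise ValueError("the one-coin model needs at least two classes")
    if not 0.0 <= skill <= 1.0:
        raise ValueError("skill must be a probability")
    # Assumption: the error mass 1 - skill is shared equally by the
    # K - 1 incorrect labels.
    off_diagonal = (1.0 - skill) / (n_classes - 1)
    return [
        [skill if y == z else off_diagonal for z in range(n_classes)]
        for y in range(n_classes)
    ]


# A worker with skill 0.8 on a 3-class task.
matrix = one_coin_error_matrix(skill=0.8, n_classes=3)
for row in matrix:
    print(row)
```

Every column of the resulting matrix sums to 1, since for any true label the worker must give exactly one answer.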
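The parameters $p$ (the prior label distribution) and $e^w$, together with the latent variables $z$ (the true task labels), are optimized with the Expectation-Maximization algorithm. The E-step estimates the true label probabilities of each task from the workers' responses, the prior label probabilities, and the workers' error matrices. The M-step computes each worker's skill as their accuracy under those label probabilities and fills the worker's error matrix from that skill.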
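To make the optimization concrete, here is a minimal pure-Python sketch of EM for a one-coin model. It is an illustration under stated assumptions and not the library's implementation: the function `one_coin_em`, the `(task, worker, label)` tuple format, the vote-share initialization, the clipping constant `eps`, and the use of the log-likelihood as the loss are all choices made for this example. The `n_iter` and `tol` arguments only mirror the roles of the constructor parameters.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Response = Tuple[Hashable, Hashable, Hashable]  # (task, worker, label)


def one_coin_em(
    responses: List[Response], n_iter: int = 100, tol: float = 1e-5
) -> Tuple[Dict, Dict, Dict]:
    """Return (label probabilities per task, worker skills, label priors)."""
    labels = sorted({label for _, _, label in responses})
    n_classes = len(labels)
    if n_classes < 2:
        raise ValueError("need at least two distinct labels")
    by_task = defaultdict(list)
    for task, worker, label in responses:
        by_task[task].append((worker, label))

    # Assumed initialization: per-task vote shares (a soft majority vote).
    probas = {
        task: {k: sum(y == k for _, y in answers) / len(answers) for k in labels}
        for task, answers in by_task.items()
    }

    eps = 1e-9  # keeps every logarithm finite
    prev_loss = -math.inf
    skills: Dict = {}
    priors: Dict = {}
    for _ in range(n_iter):
        # M-step: priors are the mean label probabilities; a skill is the
        # worker's expected accuracy under the current label probabilities.
        priors = {k: sum(p[k] for p in probas.values()) / len(probas) for k in labels}
        hits, counts = defaultdict(float), defaultdict(int)
        for task, worker, label in responses:
            hits[worker] += probas[task][label]
            counts[worker] += 1
        skills = {w: min(max(hits[w] / counts[w], eps), 1 - eps) for w in counts}

        # E-step: posterior over each task's true label, in log space.
        loss = 0.0
        for task, answers in by_task.items():
            log_post = {}
            for k in labels:
                total = math.log(max(priors[k], eps))
                for worker, y in answers:
                    s = skills[worker]
                    total += math.log(s if y == k else (1 - s) / (n_classes - 1))
                log_post[k] = total
            peak = max(log_post.values())
            norm = peak + math.log(sum(math.exp(v - peak) for v in log_post.values()))
            probas[task] = {k: math.exp(v - norm) for k, v in log_post.items()}
            loss += norm  # log-likelihood of this task's responses

        # Stop once the loss changes by less than tol.
        if abs(loss - prev_loss) < tol:
            break
        prev_loss = loss
    return probas, skills, priors


# Toy data: workers "a" and "b" always agree, worker "c" always answers 0.
responses = [
    ("t1", "a", 1), ("t1", "b", 1), ("t1", "c", 0),
    ("t2", "a", 0), ("t2", "b", 0), ("t2", "c", 0),
    ("t3", "a", 1), ("t3", "b", 1), ("t3", "c", 0),
]
probas, skills, priors = one_coin_em(responses)
predicted = {task: max(p, key=p.get) for task, p in probas.items()}
print(predicted)
print(skills)
```

Working in log space with a log-sum-exp normalization avoids numerical underflow when a task has many responses, and the clipping of skills prevents a perfectly consistent worker from producing `log(0)`.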
Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan. Spectral methods meet EM: A provably optimal algorithm for crowdsourcing.
Journal of Machine Learning Research. Vol. 17, (2016), 1-44.
https://doi.org/10.48550/arXiv.1406.3824
Parameters | Type | Description |
---|---|---|
n_iter | int | The maximum number of EM iterations. |
tol | float | The tolerance stopping criterion for iterative methods with a variable number of steps. The algorithm converges when the loss change is less than the `tol` parameter. |
labels_ | Optional[Series] | The task labels. |
probas_ | Optional[DataFrame] | The probability distributions of task labels. |
priors_ | Optional[Series] | The prior label distribution. |
errors_ | Optional[DataFrame] | The workers' error matrices. |
skills_ | Optional[Series] | The workers' skills. |
loss_history_ | List[float] | A list of loss values during training. |
Examples:
    from crowdkit.aggregation import OneCoinDawidSkene
    from crowdkit.datasets import load_dataset
    df, gt = load_dataset('relevance-2')
    hds = OneCoinDawidSkene(100)
    result = hds.fit_predict(df)
Method | Description |
---|---|
fit | Fits the model to the training data with the EM algorithm. |
Last updated: March 31, 2023