crowdkit.aggregation.classification.m_msr.MMSR
MMSR( self, n_iter: int = 10000, tol: float = 1e-10, random_state: Optional[int] = 0, observation_matrix: ... = ..., covariation_matrix: ... = ..., n_common_tasks: ... = ..., n_workers: int = 0, n_tasks: int = 0, n_labels: int = 0, labels_mapping: Dict[Any, int] = ..., workers_mapping: Dict[Any, int] = ..., tasks_mapping: Dict[Any, int] = ...)
Matrix Mean-Subsequence-Reduced Algorithm.
The M-MSR model assumes that workers have different levels of expertise, represented by a vector of "skills" $\boldsymbol{s}$ whose entry $s_i$ is the probability that worker $i$ answers a given task correctly. Under this assumption, one can show that

$$\mathbb{E}\left[\frac{M}{M-1}\widetilde{C}-\frac{1}{M-1}\boldsymbol{1}\boldsymbol{1}^T\right] = \boldsymbol{s}\boldsymbol{s}^T,$$

where $M$ is the total number of classes, $\widetilde{C}$ is a covariation matrix between workers, and $\boldsymbol{1}\boldsymbol{1}^T$ is the all-ones matrix of the same size as $\widetilde{C}$.
Recovering the skills vector $\boldsymbol{s}$ is therefore equivalent to a rank-one matrix completion problem. M-MSR is an iterative algorithm for robust rank-one matrix completion, so its output is an estimator of $\boldsymbol{s}$.
The aggregation is then a weighted majority vote with weights equal to $\log \frac{(M-1)s_i}{1-s_i}$.
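The final aggregation step can be illustrated with a small pure-Python sketch. It assumes the skills have already been estimated and that each worker's weight is the log-odds term $\log \frac{(M-1)s_i}{1-s_i}$, with $s_i$ the worker's probability of answering correctly and $M$ the number of classes. The function names `skill_weight` and `weighted_majority_vote` and the toy data are hypothetical; this is not the library's internal code.

```python
import math
from collections import defaultdict


def skill_weight(skill, n_labels):
    """Log-odds weight log((M - 1) * s / (1 - s)) for a skill s in (0, 1)."""
    return math.log((n_labels - 1) * skill / (1.0 - skill))


def weighted_majority_vote(answers, skills, n_labels):
    """Pick, for every task, the label with the largest total worker weight.

    answers: iterable of (task, worker, label) triples.
    skills:  dict mapping worker -> estimated skill in (0, 1).
    """
    scores = defaultdict(lambda: defaultdict(float))
    for task, worker, label in answers:
        scores[task][label] += skill_weight(skills[worker], n_labels)
    return {
        task: max(label_scores, key=label_scores.get)
        for task, label_scores in scores.items()
    }


# Toy data (made up for illustration): one strong worker, two weak ones.
answers = [
    ("t1", "w1", "cat"), ("t1", "w2", "dog"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"),
]
skills = {"w1": 0.95, "w2": 0.6, "w3": 0.55}
labels = weighted_majority_vote(answers, skills, n_labels=2)
```

In this toy case the single high-skill worker outweighs the two near-random workers on task `t1`, which a plain majority vote would not do. A worker with skill exactly $1/M$ gets weight zero and has no influence on the result.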
Qianqian Ma and Alex Olshevsky. Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion. 34th Conference on Neural Information Processing Systems (NeurIPS 2020).
https://arxiv.org/abs/2010.12181
Parameters | Type | Description |
---|---|---|
n_iter | int | The maximum number of iterations of the M-MSR algorithm. |
tol | float | Convergence threshold. |
random_state | Optional[int] | Seed number for the random initialization. |
labels_ | Optional[Series] | Tasks' labels. A pandas.Series indexed by task, holding each task's most likely true label. |
skills_ | Optional[Series] | Workers' skills. A pandas.Series indexed by worker, holding each worker's skill. |
scores_ | Optional[DataFrame] | Tasks' label scores. A pandas.DataFrame indexed by task, with one column per label holding that label's score for the task. |
Examples:

    from crowdkit.aggregation import MMSR
    from crowdkit.datasets import load_dataset
    df, gt = load_dataset('relevance-2')
    mmsr = MMSR()
    result = mmsr.fit_predict(df)

Method | Description |
---|---|
fit | Estimate the workers' skills. |
fit_predict | Fit the model and return aggregated results. |
fit_predict_score | Fit the model and return the total sum of weights for each label. |
predict | Infer the true labels when the model is fitted. |
predict_score | Return total sum of weights for each label when the model is fitted. |