crowdkit.aggregation.classification.m_msr.MMSR
MMSR(self,n_iter: int = 10000,tol: float = 1e-10,random_state: Optional[int] = 0,observation_matrix: ... = _Nothing.NOTHING,covariation_matrix: ... = _Nothing.NOTHING,n_common_tasks: ... = _Nothing.NOTHING,n_workers: int = 0,n_tasks: int = 0,n_labels: int = 0,labels_mapping: Dict[Any, int] = _Nothing.NOTHING,workers_mapping: Dict[Any, int] = _Nothing.NOTHING,tasks_mapping: Dict[Any, int] = _Nothing.NOTHING)
The Matrix Mean-Subsequence-Reduced Algorithm (M-MSR) model assumes that workers have different levels of expertise, represented as a vector of "skills" $\boldsymbol{s}$ whose entries $s_i$ give the probability that worker $i$ answers a given task correctly. The skills can then be estimated by solving a rank-one matrix completion problem:
$\mathbb{E}\left[\frac{M}{M-1}\widetilde{C} - \frac{1}{M-1}\boldsymbol{1}\boldsymbol{1}^T\right] = \boldsymbol{s}\boldsymbol{s}^T$,
where $M$ is the total number of classes, $\widetilde{C}$ is the covariance matrix between workers, and $\boldsymbol{1}\boldsymbol{1}^T$ is the all-ones matrix of the same size as $\widetilde{C}$.
Thus, estimating the skill vector $\boldsymbol{s}$ is equivalent to a rank-one matrix completion problem. M-MSR is an iterative algorithm for robust rank-one matrix completion, so its output is an estimator of $\boldsymbol{s}$.
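To make the input of the completion problem concrete, here is a minimal sketch that builds the observed entries of the rescaled worker-by-worker matrix from raw answers. It assumes that the entry for a pair of workers is the fraction of their common tasks on which they agree, rescaled as M/(M-1) * c - 1/(M-1) for M classes. The function name and the triple-based input format are hypothetical and not part of crowdkit.

```python
from itertools import combinations


def rescaled_agreement(answers, n_labels):
    """Build the observed entries of the rescaled worker-by-worker matrix.

    answers: iterable of (task, worker, label) triples.
    Returns two dicts keyed by (worker_a, worker_b) with worker_a < worker_b:
    the number of common tasks and the rescaled agreement value.
    """
    by_worker = {}
    for task, worker, label in answers:
        by_worker.setdefault(worker, {})[task] = label

    n_common, rescaled = {}, {}
    for a, b in combinations(sorted(by_worker), 2):
        shared = by_worker[a].keys() & by_worker[b].keys()
        if not shared:
            continue  # no common tasks: the entry is missing and has to be completed
        agree = sum(by_worker[a][t] == by_worker[b][t] for t in shared)
        c = agree / len(shared)  # assumed definition of the pairwise entry
        n_common[(a, b)] = len(shared)
        rescaled[(a, b)] = n_labels / (n_labels - 1) * c - 1 / (n_labels - 1)
    return n_common, rescaled


# Toy data: four workers, two tasks, two classes.
answers = [
    ("t1", "w1", "cat"), ("t1", "w2", "dog"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"), ("t2", "w4", "dog"),
]
n_common, rescaled = rescaled_agreement(answers, n_labels=2)
```

Workers `w3` and `w4` share no task, so their pair has no entry. Gaps like this are why the skills are recovered by matrix completion rather than by a direct factorization.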
The aggregation step is then weighted majority voting, with the weight of worker $i$ equal to $\log \frac{(M-1)s_i}{1-s_i}$.
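The voting step can be sketched in a few lines of plain Python. This is an illustration only: it assumes the weights take the log-odds form log((M - 1) * s_i / (1 - s_i)) for M classes, and that the skills are already known. The helper names are hypothetical and not part of crowdkit.

```python
import math
from collections import defaultdict


def skill_weight(skill, n_labels):
    """Assumed log-odds weight log((M - 1) * s / (1 - s)) for a skill s in (0, 1)."""
    return math.log((n_labels - 1) * skill / (1.0 - skill))


def weighted_majority_vote(answers, skills, n_labels):
    """Pick, for every task, the label with the largest total worker weight.

    answers: iterable of (task, worker, label) triples.
    skills: dict mapping each worker to a skill in (0, 1).
    """
    scores = defaultdict(lambda: defaultdict(float))
    for task, worker, label in answers:
        scores[task][label] += skill_weight(skills[worker], n_labels)
    return {
        task: max(label_scores, key=label_scores.get)
        for task, label_scores in scores.items()
    }


answers = [
    ("t1", "w1", "cat"), ("t1", "w2", "dog"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"),
]
skills = {"w1": 0.95, "w2": 0.6, "w3": 0.6}
labels = weighted_majority_vote(answers, skills, n_labels=2)
```

On task `t1` the single highly skilled worker outweighs the two weaker workers who disagree with it, which is the behaviour that separates this scheme from unweighted majority voting.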
Q. Ma and Alex Olshevsky. Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion.
34th Conference on Neural Information Processing Systems (NeurIPS 2020)
|Parameter||Description|
|n_iter||The maximum number of iterations.|
|tol||The tolerance stopping criterion for iterative methods with a variable number of steps. The algorithm converges when the loss change is less than the tol parameter.|
|random_state||The seed number for the random initialization.|
|observation_matrix||The matrix representing which workers give responses to which tasks.|
|covariation_matrix||The matrix representing the covariance between workers.|
|n_common_tasks||The matrix representing workers with tasks in common.|
|n_workers||The number of workers.|
|n_tasks||The number of tasks that are assigned to workers.|
|n_labels||The number of possible labels for a series of classification tasks.|
|labels_mapping||The mapping of labels and integer values.|
|workers_mapping||The mapping of workers and integer values.|
|tasks_mapping||The mapping of tasks and integer values.|
|Attribute||Description|
|labels_||The task labels.|
|skills_||The workers' skills.|
|scores_||The task label scores.|
|loss_history_||A list of loss values during training.|
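How an iteration cap, a tolerance on the loss change, and a recorded loss history interact can be sketched with a plain rank-one fit. This is deliberately not the M-MSR update: it is an ordinary, non-robust coordinate-wise least-squares fit of observed entries X[i][j] to s[i] * s[j], used here only to illustrate the stopping rule. The function name, the all-ones initialization, and the input format are assumptions made for this example.

```python
def fit_rank_one(entries, n_workers, n_iter=10000, tol=1e-10):
    """Plain coordinate-wise least-squares fit of X[i][j] ~ s[i] * s[j].

    entries: dict {(i, j): value} with i < j holding observed off-diagonal values.
    Returns the estimated vector s and the list of loss values (one per sweep,
    plus the initial loss).
    """
    # Symmetric neighbour lists so each worker sees all of its observed pairs.
    neighbours = [[] for _ in range(n_workers)]
    for (i, j), value in entries.items():
        neighbours[i].append((j, value))
        neighbours[j].append((i, value))

    def loss(s):
        return sum((value - s[i] * s[j]) ** 2 for (i, j), value in entries.items())

    s = [1.0] * n_workers  # assumed initialization for this sketch
    loss_history = [loss(s)]
    for _sweep in range(n_iter):
        for i in range(n_workers):
            denom = sum(s[j] ** 2 for j, _value in neighbours[i])
            if denom > 0:
                # Exact minimiser of the loss with respect to s[i] alone.
                s[i] = sum(value * s[j] for j, value in neighbours[i]) / denom
        loss_history.append(loss(s))
        if abs(loss_history[-2] - loss_history[-1]) < tol:
            break  # the loss change fell below the tolerance
    return s, loss_history


# Noise-free toy problem with three workers and all pairs observed.
true_s = [0.9, 0.5, 0.2]
entries = {(i, j): true_s[i] * true_s[j] for i in range(3) for j in range(i + 1, 3)}
s, loss_history = fit_rank_one(entries, n_workers=3)
```

The loop stops as soon as the loss changes by less than `tol` between two sweeps, and `n_iter` only acts as an upper bound. A robust method such as M-MSR differs in how each update treats outlying entries, which this sketch does not attempt to reproduce.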
from crowdkit.aggregation import MMSR
from crowdkit.datasets import load_dataset

df, gt = load_dataset('relevance-2')
mmsr = MMSR()
result = mmsr.fit_predict(df)
|fit||Fits the model to the training data.|
|fit_predict||Fits the model to the training data and returns the aggregated results.|
|fit_predict_score||Fits the model to the training data and returns the total sum of weights for each label.|
|predict||Predicts the true labels of tasks when the model is fitted.|
|predict_score||Returns the total sum of weights for each label when the model is fitted.|
Last updated: March 31, 2023