crowdkit.aggregation.classification.majority_vote.MajorityVote
MajorityVote(self, on_missing_skill: str = 'error', default_skill: Optional[float] = None)
The Majority Vote aggregation algorithm is a straightforward approach to categorical aggregation: for each task, it outputs the label with the largest number of responses. Additionally, Majority Vote can be used when different weights are assigned to workers' votes. In this case, the resulting label is the one with the largest sum of weights.
If two or more labels are tied for the largest number of votes, the tie is broken consistently: all tasks that have the same set of labels with the same number of votes receive the same resulting label.
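The aggregation rule described above can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the helper `weighted_majority_vote` is hypothetical, and its tie-break (prefer the smallest label) is an assumption chosen only to make the result deterministic.

```python
from collections import defaultdict


def weighted_majority_vote(votes, weights=None):
    """Aggregate (task, worker, label) votes into one label per task.

    `weights` maps worker -> weight. When it is None, every vote counts
    as 1.0, which gives the plain (unweighted) majority vote.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for task, worker, label in votes:
        scores[task][label] += 1.0 if weights is None else weights[worker]

    result = {}
    for task, label_scores in scores.items():
        # Highest total weight wins. Ties go to the smallest label:
        # an assumed rule for this sketch, not necessarily crowdkit's.
        result[task] = min(label_scores, key=lambda lbl: (-label_scores[lbl], lbl))
    return result


votes = [
    ('t1', 'p1', 0), ('t1', 'p2', 0), ('t1', 'p3', 1),
    ('t2', 'p1', 1), ('t2', 'p2', 0), ('t2', 'p3', 1),
]

unweighted = weighted_majority_vote(votes)
# Hypothetical weights in which p2 outweighs p1 and p3 combined.
reweighted = weighted_majority_vote(votes, {'p1': 0.2, 'p2': 0.9, 'p3': 0.3})
```

With equal weights, task `t2` resolves to label `1` (two votes against one). With the hypothetical weights, the single vote from `p2` carries 0.9 against a combined 0.5, so `t2` resolves to `0`.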
Parameters | Type | Description |
---|---|---|
default_skill | Optional[float] | Default worker weight value. |
labels_ | Optional[Series] | The task labels. |
skills_ | Optional[Series] | The workers' skills. |
probas_ | Optional[DataFrame] | The probability distributions of task labels. |
on_missing_skill | str | Specifies how to handle assignments performed by workers with an unknown skill. Defaults to 'error'. |
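To illustrate how `on_missing_skill` and `default_skill` might interact, here is a minimal sketch of resolving a worker's weight. Only the `'error'` mode is confirmed by the signature above. The mode names `'ignore'` and `'value'`, and the helper `resolve_weight` itself, are assumptions made for illustration and should be checked against the library.

```python
def resolve_weight(worker, skills, on_missing_skill='error', default_skill=None):
    """Return the vote weight for `worker`, or None if the vote is dropped."""
    if worker in skills:
        return skills[worker]
    if on_missing_skill == 'error':
        # Fail as soon as a vote from a worker with no known skill is seen.
        raise ValueError(f'no skill known for worker {worker!r}')
    if on_missing_skill == 'ignore':  # assumed mode name
        return None  # caller skips this vote
    if on_missing_skill == 'value':  # assumed mode name
        if default_skill is None:
            raise ValueError('default_skill is required in this mode')
        return default_skill
    raise ValueError(f'unknown on_missing_skill: {on_missing_skill!r}')


skills = {'p1': 0.5, 'p2': 0.7}
known = resolve_weight('p1', skills)
fallback = resolve_weight('p3', skills, on_missing_skill='value', default_skill=1.0)
dropped = resolve_weight('p3', skills, on_missing_skill='ignore')
```

Under these assumptions, a fallback weight only makes sense together with `default_skill`, which is why the sketch rejects that mode when `default_skill` is `None`.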
Examples:
Basic Majority Vote:

    from crowdkit.aggregation import MajorityVote
    from crowdkit.datasets import load_dataset

    df, gt = load_dataset('relevance-2')
    result = MajorityVote().fit_predict(df)
Weighted Majority Vote:

    import pandas as pd
    from crowdkit.aggregation import MajorityVote

    df = pd.DataFrame(
        [
            ['t1', 'p1', 0],
            ['t1', 'p2', 0],
            ['t1', 'p3', 1],
            ['t2', 'p1', 1],
            ['t2', 'p2', 0],
            ['t2', 'p3', 1],
        ],
        columns=['task', 'worker', 'label']
    )
    # Per-worker weights, indexed by worker ID.
    skills = pd.Series({'p1': 0.5, 'p2': 0.7, 'p3': 0.4})
    # fit_predict is an instance method, so instantiate MajorityVote first.
    result = MajorityVote().fit_predict(df, skills)
Method | Description |
---|---|
fit | Fits the model to the training data. |
fit_predict | Fits the model to the training data and returns the aggregated results. |
fit_predict_proba | Fits the model to the training data and returns probability distributions of labels for each task. |
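As a rough illustration of what a per-task probability distribution over labels can look like, the sketch below normalizes each task's vote totals so that they sum to 1. This normalization is an assumption for illustration, since the library's exact computation is not described here. The helper `vote_probabilities` is hypothetical and assumes positive weights.

```python
from collections import defaultdict


def vote_probabilities(votes, weights=None):
    """Map each task to {label: share of the task's total vote weight}."""
    totals = defaultdict(lambda: defaultdict(float))
    for task, worker, label in votes:
        totals[task][label] += 1.0 if weights is None else weights[worker]

    probas = {}
    for task, label_totals in totals.items():
        norm = sum(label_totals.values())  # assumed > 0
        probas[task] = {lbl: total / norm for lbl, total in label_totals.items()}
    return probas


votes = [
    ('t1', 'p1', 0), ('t1', 'p2', 0), ('t1', 'p3', 1),
    ('t2', 'p1', 1), ('t2', 'p2', 0), ('t2', 'p3', 1),
]
probas = vote_probabilities(votes)
```

In this sketch, the label that `fit_predict` would return for a task corresponds to the entry with the highest share in that task's distribution.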
Last updated: March 31, 2023