crowdkit.aggregation.classification.gold_majority_vote.GoldMajorityVote.fit_predict_proba

    data: DataFrame,
    true_labels: Series

Fit the model and return probability distributions on labels for each task.

Parameters Description

data DataFrame

Workers' labeling results. A pandas.DataFrame containing task, worker and label columns.

true_labels Series

Tasks' ground truth labels. A pandas.Series indexed by task such that true_labels.loc[task] is the task's ground truth label.
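
A minimal sketch of the two inputs, assuming the column names task, worker and label from the description above. The task ids, worker ids and labels are made-up toy values, and the import path crowdkit.aggregation is an assumption. The helper names gold_mv_proba and looks_like_proba are hypothetical and not part of the library. The second helper only checks the contract documented under Returns (values between 0 and 1, each task's row summing to 1).

```python
import pandas as pd

# Hypothetical toy data: three workers each label two tasks.
data = pd.DataFrame(
    [
        ("t1", "w1", "cat"),
        ("t1", "w2", "cat"),
        ("t1", "w3", "dog"),
        ("t2", "w1", "dog"),
        ("t2", "w2", "cat"),
        ("t2", "w3", "dog"),
    ],
    columns=["task", "worker", "label"],
)

# Ground truth indexed by task, so true_labels.loc[task] is that task's label.
true_labels = pd.Series({"t1": "cat", "t2": "dog"}, name="label")
true_labels.index.name = "task"


def gold_mv_proba(data: pd.DataFrame, true_labels: pd.Series) -> pd.DataFrame:
    """Call the documented method; requires crowd-kit to be installed."""
    # Assumed import path, imported lazily so the rest of the sketch runs without it.
    from crowdkit.aggregation import GoldMajorityVote

    return GoldMajorityVote().fit_predict_proba(data, true_labels)


def looks_like_proba(result: pd.DataFrame, tol: float = 1e-9) -> bool:
    """Check that all values lie in [0, 1] and that each row sums to 1 within tol."""
    in_range = ((result >= 0) & (result <= 1)).all().all()
    rows_sum_to_one = ((result.sum(axis=1) - 1).abs() <= tol).all()
    return bool(in_range and rows_sum_to_one)
```

With crowd-kit installed, looks_like_proba(gold_mv_proba(data, true_labels)) would be one way to sanity-check the returned frame.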

  • Returns:

    Tasks' label probability distributions. A pandas.DataFrame indexed by task such that result.loc[task, label] is the probability that the task's true label equals label. Each probability is between 0 and 1, and each task's probabilities sum to 1.

  • Return type: