NoisyBradleyTerry

crowdkit.aggregation.pairwise.noisy_bt.NoisyBradleyTerry

NoisyBradleyTerry(
    self,
    n_iter: int = 100,
    tol: float = 1e-05,
    regularization_ratio: float = 1e-05,
    random_state: int = 0
)

Bradley-Terry model for pairwise comparisons with additional parameters.

This model is a modification of the Bradley-Terry model with parameters for workers' skills (reliability) and biases.
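The idea of adding a skill and a bias per worker can be illustrated with a small sketch. The parameterization below is an assumption made for illustration and is not taken from the library's source: a worker answers according to the plain Bradley-Terry model with probability sigmoid(skill), and otherwise picks the left item with probability sigmoid(bias). The function names `sigmoid` and `left_choice_probability` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def left_choice_probability(
    score_left: float, score_right: float, skill: float, bias: float
) -> float:
    """Probability that a worker picks the left item.

    ASSUMPTION: this mixture form is an illustrative guess, not necessarily
    the exact formula used by crowdkit's NoisyBradleyTerry.
    """
    reliability = sigmoid(skill)  # chance that the worker answers "honestly"
    bradley_terry = sigmoid(score_left - score_right)  # plain Bradley-Terry term
    left_preference = sigmoid(bias)  # worker's item-independent pull to the left
    return reliability * bradley_terry + (1.0 - reliability) * left_preference


# A very reliable worker follows the item scores; an unreliable one follows the bias.
reliable = left_choice_probability(2.0, 0.0, skill=5.0, bias=-2.0)
unreliable = left_choice_probability(2.0, 0.0, skill=-5.0, bias=-2.0)
print(reliable > 0.5, unreliable < 0.5)
```

Under this sketch, a high skill makes the answer depend almost entirely on the difference between the item scores, while a low skill makes it depend almost entirely on the worker's bias.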

Parameters description

Parameters	Type	Description
`scores_`	Series	Labels' scores. A pandas.Series indexed by labels that holds each label's score.
`skills_`	Series	Workers' skills. A pandas.Series indexed by workers that holds each worker's skill.
`biases_`	Series	Predicted biases for each worker. Indicates the probability that a worker chooses the left item. A series of workers' biases indexed by workers.

Examples:

The following example shows how to aggregate comparison results grouped by a column. Here the two questions, q1 and q2, group the labeled data. The question is appended to each item to build a temporary data structure, and the model is applied to it. The results are then split into two arrays, each containing the scores for one of the original groups.

import pandas as pd
from crowdkit.aggregation import NoisyBradleyTerry
data = pd.DataFrame(
    [
        ['q1', 'w1', 'a', 'b', 'a'],
        ['q1', 'w2', 'a', 'b', 'b'],
        ['q1', 'w3', 'a', 'b', 'a'],
        ['q2', 'w1', 'a', 'b', 'b'],
        ['q2', 'w2', 'a', 'b', 'a'],
        ['q2', 'w3', 'a', 'b', 'b'],
    ],
    columns=['question', 'worker', 'left', 'right', 'label']
)
# Append question to other columns. After that the data looks like:
#   question worker     left    right    label
# 0       q1     w1  (q1, a)  (q1, b)  (q1, a)
for col in 'left', 'right', 'label':
    data[col] = list(zip(data['question'], data[col]))
result = NoisyBradleyTerry(n_iter=10).fit_predict(data)
# Separate results
result.index = pd.MultiIndex.from_tuples(result.index, names=['question', 'label'])
print(result['q1'])      # Scores for all items in the q1 question
print(result['q2']['b']) # Score for the item b in the q2 question

Methods summary

Method	Description
fit	Fits the model to the pairwise comparison data.
fit_predict	Fits the model and returns the aggregated scores.

Last updated: March 31, 2023
