BradleyTerry

crowdkit.aggregation.pairwise.bradley_terry.BradleyTerry

BradleyTerry(
    self,
    n_iter: int,
    tol: float = 1e-05
)

Bradley-Terry model for pairwise comparisons.

The model implements the classic algorithm for aggregating pairwise comparisons. The algorithm constructs a ranking of items from pairwise comparisons. Given a pair of items $i$ and $j$, the probability that $i$ is ranked higher than $j$ is, according to the Bradley-Terry probabilistic model,

$$P(i > j) = \frac{p_i}{p_i + p_j}.$$
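As a minimal illustration of the formula above, the probability can be computed directly from two positive parameters. The function name `win_probability` is hypothetical and is not part of Crowd-Kit.

```python
def win_probability(p_i: float, p_j: float) -> float:
    """Probability that item i is ranked above item j under Bradley-Terry."""
    if p_i <= 0 or p_j <= 0:
        raise ValueError("Bradley-Terry parameters must be positive")
    return p_i / (p_i + p_j)


# An item with twice the score of its opponent wins two times out of three.
prob = win_probability(2.0, 1.0)
```

Only the ratio of the two parameters matters, so multiplying every score by the same positive constant leaves all pairwise probabilities unchanged.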

Here $\boldsymbol{p}$ is a vector of positive real-valued parameters that the algorithm optimizes. The optimization maximizes the log-likelihood of the observed comparison outcomes using the MM algorithm:

$$L(\boldsymbol{p}) = \sum_{i=1}^n\sum_{j=1}^n[w_{ij}\ln p_i - w_{ij}\ln (p_i + p_j)],$$

where $w_{ij}$ denotes the number of comparisons of $i$ and $j$ "won" by $i$.
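The MM update from Hunter (2004) replaces each $p_i$ with the total number of wins of $i$ divided by $\sum_{j \ne i} (w_{ij} + w_{ji}) / (p_i + p_j)$, then renormalizes the vector. The sketch below is a pure-Python illustration of that update and is not Crowd-Kit's actual implementation. The function name `bradley_terry_mm`, the `(winner, loser)` input format, the uniform starting point, and the stopping rule based on `tol` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def bradley_terry_mm(
    comparisons: List[Tuple[str, str]],
    n_iter: int = 100,
    tol: float = 1e-5,
) -> Dict[str, float]:
    """Estimate Bradley-Terry scores from (winner, loser) pairs with MM updates."""
    items = sorted({item for pair in comparisons for item in pair})
    # wins[i]: total number of comparisons won by item i.
    wins = {i: 0 for i in items}
    # games[(i, j)]: number of comparisons between i and j, in either order.
    games: Dict[Tuple[str, str], int] = {}
    for winner, loser in comparisons:
        wins[winner] += 1
        games[(winner, loser)] = games.get((winner, loser), 0) + 1
        games[(loser, winner)] = games.get((loser, winner), 0) + 1

    # Assumed initialization: uniform scores that sum to one.
    p = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        # denom[i] = sum over opponents j of games(i, j) / (p_i + p_j).
        denom = {i: 0.0 for i in items}
        for (i, j), n_ij in games.items():
            denom[i] += n_ij / (p[i] + p[j])
        new_p = {i: wins[i] / denom[i] for i in items}
        total = sum(new_p.values())
        new_p = {i: v / total for i, v in new_p.items()}
        # Assumed stopping rule: total absolute change below tol.
        converged = sum(abs(new_p[i] - p[i]) for i in items) < tol
        p = new_p
        if converged:
            break
    return p


# Every pair is compared four times and the stronger item wins three of them.
data = (
    [("A", "B")] * 3 + [("B", "A")]
    + [("B", "C")] * 3 + [("C", "B")]
    + [("A", "C")] * 3 + [("C", "A")]
)
scores = bradley_terry_mm(data)
```

In this example every item both wins and loses at least once, so all estimated scores stay strictly positive.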

Note

The Bradley-Terry model requires the comparison graph to be strongly connected.
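One way to check this condition before fitting is to build a directed graph with an edge from each winner to each loser and verify that every item can reach, and be reached from, an arbitrary starting item. The helper below is a hypothetical sketch that uses only the standard library; it is not a Crowd-Kit function, and the `(winner, loser)` input format is an assumption.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Set, Tuple


def _reachable(start: str, edges: DefaultDict[str, Set[str]]) -> Set[str]:
    """Return all nodes reachable from start by depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_strongly_connected(comparisons: Iterable[Tuple[str, str]]) -> bool:
    """Check strong connectivity of the winner -> loser comparison graph."""
    forward: DefaultDict[str, Set[str]] = defaultdict(set)
    backward: DefaultDict[str, Set[str]] = defaultdict(set)
    items: Set[str] = set()
    for winner, loser in comparisons:
        forward[winner].add(loser)
        backward[loser].add(winner)
        items.update((winner, loser))
    if not items:
        return True
    start = next(iter(items))
    # Strongly connected: start reaches everyone, and everyone reaches start.
    return (
        _reachable(start, forward) == items
        and _reachable(start, backward) == items
    )


chain = [("item1", "item2"), ("item2", "item3")]
cycle = chain + [("item3", "item1")]
chain_ok = is_strongly_connected(chain)
cycle_ok = is_strongly_connected(cycle)
```

A chain in which one item never loses and another never wins fails the check; adding a single comparison that closes the cycle makes it pass.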

Hunter, D. R. MM algorithms for generalized Bradley-Terry models. Ann. Statist., Vol. 32, No. 1 (2004): 384–406.

Bradley, R. A. and Terry, M. E. Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika, Vol. 39 (1952): 324–345.

Parameters description

Parameters | Type | Description
n_iter | int | The number of optimization iterations.
scores_ | Series | The label scores. A pandas.Series indexed by labels and holding the corresponding label scores.

Examples:

The Bradley-Terry model expects the data as a DataFrame with the columns left, right, and label. The left and right columns contain the identifiers of the left and right items, respectively, and the label column contains the identifier of the item that won each comparison.

import pandas as pd
from crowdkit.aggregation import BradleyTerry
df = pd.DataFrame(
    [
        ['item1', 'item2', 'item1'],
        ['item2', 'item3', 'item2']
    ],
    columns=['left', 'right', 'label']
)
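To make the input format concrete, the following sketch tallies the win counts $w_{ij}$ from rows given in left, right, label order, using only the standard library. The helper `count_wins` is hypothetical and not part of Crowd-Kit; it only illustrates how the label column determines the winner of each pair.

```python
from collections import Counter
from typing import Counter as CounterType, List, Tuple

rows = [
    ['item1', 'item2', 'item1'],
    ['item2', 'item3', 'item2'],
]


def count_wins(rows: List[List[str]]) -> CounterType[Tuple[str, str]]:
    """Map (winner, loser) pairs to the number of comparisons won."""
    wins: CounterType[Tuple[str, str]] = Counter()
    for left, right, label in rows:
        if label not in (left, right):
            raise ValueError(f"label {label!r} is neither {left!r} nor {right!r}")
        # The loser is whichever item in the pair is not named by the label.
        loser = right if label == left else left
        wins[(label, loser)] += 1
    return wins


win_counts = count_wins(rows)
```

Pairs that never occur are simply absent from the counter, which corresponds to $w_{ij} = 0$.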

Methods summary

Method | Description
fit | None
fit_predict | None

Last updated: March 31, 2023
