Implement the Bradley-Terry model, a probabilistic model for ranking items based on pairwise comparison outcomes. This model is widely used in sports rankings, recommendation systems, and LLM evaluation (e.g., Chatbot Arena leaderboards).
The Bradley-Terry model assumes that each item i has a latent strength parameter beta_i. The probability that item i beats item j is modeled as:
P(i beats j) = sigmoid(beta_i - beta_j) = 1 / (1 + exp(-(beta_i - beta_j)))
Given a dataset of pairwise comparison outcomes, your task is to estimate the strength parameters using maximum likelihood estimation via gradient ascent.
Function to implement:
fit_bradley_terry(comparisons, n_items, learning_rate, n_iterations) - Estimate strength parameters from comparison data.
Args:
comparisons: List of (winner_idx, loser_idx) tuples, where each tuple represents one comparison outcomen_items: Total number of items being rankedlearning_rate: Step size for gradient ascentn_iterations: Number of optimization iterations
Returns:
- numpy array of shape (n_items,) containing estimated strength parameters, centered so they sum to 0
Notes:
- Initialize all beta parameters to 0
- After each gradient update, center the parameters by subtracting their mean (this handles the identifiability constraint)
- Use numerically stable sigmoid computation
Examples
comparisons = [(0, 1), (0, 1), (1, 0)] # Item 0 beats item 1 twice, item 1 beats item 0 once
n_items = 2
learning_rate = 0.5
n_iterations = 100[0.3466, -0.3466]Starter Code
import numpy as np
from typing import List, Tuple
def fit_bradley_terry(comparisons: List[Tuple[int, int]], n_items: int,
learning_rate: float = 0.5, n_iterations: int = 100) -> np.ndarray:
"""
Fit Bradley-Terry model parameters using maximum likelihood estimation.
Args:
comparisons: List of (winner_idx, loser_idx) tuples
n_items: Total number of items to rank
learning_rate: Step size for gradient ascent
n_iterations: Number of optimization iterations
Returns:
np.ndarray: Estimated strength parameters of shape (n_items,)
"""
# Your code here
pass