The AI Interview - Master AI/ML Interviews

Implement the Bradley-Terry model, a probabilistic model for ranking items based on pairwise comparison outcomes. This model is widely used in sports rankings, recommendation systems, and LLM evaluation (e.g., Chatbot Arena leaderboards).

The Bradley-Terry model assumes that each item i has a latent strength parameter beta_i. The probability that item i beats item j is modeled as:

P(i beats j) = sigmoid(beta_i - beta_j) = 1 / (1 + exp(-(beta_i - beta_j)))

Given a dataset of pairwise comparison outcomes, your task is to estimate the strength parameters using maximum likelihood estimation via gradient ascent.

Function to implement:

fit_bradley_terry(comparisons, n_items, learning_rate, n_iterations) - Estimate strength parameters from comparison data.

Args:

comparisons: List of (winner_idx, loser_idx) tuples, where each tuple represents one comparison outcome
n_items: Total number of items being ranked
learning_rate: Step size for gradient ascent
n_iterations: Number of optimization iterations

Returns:

numpy array of shape (n_items,) containing estimated strength parameters, centered so they sum to 0

Notes:

Initialize all beta parameters to 0
After each gradient update, center the parameters by subtracting their mean (this handles the identifiability constraint)
Use numerically stable sigmoid computation

Examples

Example 1:

Input:

comparisons = [(0, 1), (0, 1), (1, 0)]  # Item 0 beats item 1 twice, item 1 beats item 0 once
n_items = 2
learning_rate = 0.5
n_iterations = 100

Output: [0.3466, -0.3466]

Explanation: Item 0 won 2 out of 3 comparisons against item 1 (win rate = 2/3). The MLE satisfies sigmoid(beta_0 - beta_1) = 2/3, giving beta_0 - beta_1 = log(2) = 0.693. After centering (subtracting the mean), we get beta_0 = 0.3466 and beta_1 = -0.3466. The positive parameter for item 0 correctly indicates it is the stronger item.

Starter Code

import numpy as np
from typing import List, Tuple

def fit_bradley_terry(comparisons: List[Tuple[int, int]], n_items: int, 
                      learning_rate: float = 0.5, n_iterations: int = 100) -> np.ndarray:
    """
    Fit Bradley-Terry model parameters using maximum likelihood estimation.
    
    Args:
        comparisons: List of (winner_idx, loser_idx) tuples
        n_items: Total number of items to rank
        learning_rate: Step size for gradient ascent
        n_iterations: Number of optimization iterations
    
    Returns:
        np.ndarray: Estimated strength parameters of shape (n_items,)
    """
    # Your code here
    pass

Bradley-Terry Model for Pairwise Rankings

Examples

Starter Code