Implement a function to calculate the perplexity of a language model given a sequence of token probabilities.
Perplexity is a widely used metric for evaluating language models. It quantifies how well a probability distribution predicts a sample: lower perplexity means the model assigns higher probabilities to the actual observed tokens, i.e. it is a better predictor.
Given a list of probabilities where each probability represents the model's predicted probability for the actual next token in a sequence, compute the perplexity score.
Input:
- probabilities: A list of floats where each value represents P(token_i | context) for the i-th token in the sequence. All probabilities are in the range (0, 1].
Output:
- A single float representing the perplexity of the model on the given sequence.
Constraints:
- The input list is non-empty
- All probabilities are greater than 0 and at most 1
- Use natural logarithm (base e) for calculations
Examples
Example 1:
Input:
probabilities = [0.5, 0.5, 0.5, 0.5]
Output:
2.0
Explanation: Each token has probability 0.5. The log probabilities are all ln(0.5) ≈ -0.693. The average log probability is -0.693. Perplexity = exp(-(-0.693)) = exp(0.693) ≈ 2.0. On average, the model is as uncertain as choosing between 2 equally likely options at each step.
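The arithmetic in this explanation can be checked directly with the standard library (a quick sanity check, not part of the expected solution):

```python
import math

probabilities = [0.5, 0.5, 0.5, 0.5]
# Natural log of each probability; each is ln(0.5) ≈ -0.693
log_probs = [math.log(p) for p in probabilities]
avg_log_prob = sum(log_probs) / len(log_probs)
# Perplexity is exp of the negative average log probability
perplexity = math.exp(-avg_log_prob)
print(perplexity)  # ≈ 2.0, up to floating-point rounding
```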
Starter Code
import numpy as np

def calculate_perplexity(probabilities: list[float]) -> float:
    """
    Calculate the perplexity of a language model given token probabilities.

    Args:
        probabilities: List of probabilities P(token_i | context) for each token
            in the sequence, where each probability is in (0, 1]

    Returns:
        Perplexity value as a float
    """
    pass
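One way the starter code could be completed (a sketch, not the official reference solution): average the natural-log probabilities and exponentiate the negation, i.e. perplexity = exp(-(1/N) * Σ ln p_i).

```python
import numpy as np

def calculate_perplexity(probabilities: list[float]) -> float:
    # Natural log of each probability; safe because every value is in (0, 1].
    log_probs = np.log(probabilities)
    # Perplexity is exp of the negative mean log probability.
    return float(np.exp(-np.mean(log_probs)))
```

Working in log space before exponentiating avoids the underflow that multiplying many small probabilities directly would cause on long sequences.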