The AI Interview - Master AI/ML Interviews

Write a Python function that calculates descriptive statistics for a given dataset. The function should take a list or NumPy array of numerical values and return a dictionary containing:

mean: Average of all values
median: Middle value when sorted
mode: Most frequently occurring value
variance: Population variance (divide by N)
standard_deviation: Square root of variance
25th_percentile, 50th_percentile, 75th_percentile: Quartile values
interquartile_range: Difference between 75th and 25th percentiles (IQR)
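
The problem statement does not say how percentiles that fall between two data points should be interpolated. The sketch below assumes linear interpolation between the two nearest sorted values, which is the default behaviour of NumPy's np.percentile. The helper name percentile_linear is hypothetical and only illustrates how the quartiles and the IQR relate; the data is the input from Example 1.

```python
import math


def percentile_linear(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation.

    Assumption: interpolate linearly between the two nearest sorted values.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Fractional position of the percentile within the sorted data.
    position = (q / 100) * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


data = [1, 2, 2, 3, 4, 4, 4, 5]
q25 = percentile_linear(data, 25)
q50 = percentile_linear(data, 50)
q75 = percentile_linear(data, 75)
iqr = q75 - q25  # interquartile range: spread of the middle half of the data
print(q25, q50, q75, iqr)
```

Under this assumption the 50th percentile always equals the median, so the two dictionary entries should agree.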

Examples

Example 1:

Input: [1, 2, 2, 3, 4, 4, 4, 5]

Output: {'mean': 3.125, 'median': 3.5, 'mode': 4, 'variance': 1.6094, 'standard_deviation': 1.2686, ...}

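Explanation: Mean = (1+2+2+3+4+4+4+5)/8 = 25/8 = 3.125. Median = average of the 4th and 5th sorted values = (3+4)/2 = 3.5. Mode = 4 (appears 3 times, more often than any other value). Variance = sum of squared deviations from the mean divided by N = 12.875/8 ≈ 1.6094, and standard deviation = sqrt(variance) ≈ 1.2686. The 25th, 50th, and 75th percentiles divide the sorted data into quarters.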
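
As a cross-check of the arithmetic in Example 1, the values can be reproduced with Python's standard statistics module. This is only a verification aid, not the expected solution. It also shows why the divisor matters: pvariance divides by N, while variance divides by N - 1.

```python
import statistics

data = [1, 2, 2, 3, 4, 4, 4, 5]

mean_value = statistics.mean(data)
median_value = statistics.median(data)
mode_value = statistics.mode(data)

# Population statistics divide by N, as the task requires.
population_variance = statistics.pvariance(data)
population_std = statistics.pstdev(data)

# Sample variance divides by N - 1 and therefore does not match Example 1.
sample_variance = statistics.variance(data)

print(mean_value, median_value, mode_value)
print(round(population_variance, 4), round(population_std, 4))
print(round(sample_variance, 4))
```

The sample variance here is 12.875/7, roughly 1.8393, so a solution that accidentally uses the N - 1 divisor would not reproduce the 1.6094 shown in the expected output.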

Starter Code

import numpy as np

def descriptive_statistics(data: list | np.ndarray) -> dict:
    """
    Calculate various descriptive statistics metrics for a given dataset.
    
    Args:
        data: List or numpy array of numerical values
    
    Returns:
        Dictionary containing mean, median, mode, variance, standard deviation,
        percentiles (25th, 50th, 75th), and interquartile range (IQR)
    """
    # Your code here
    pass
