Descriptive Statistics Calculator

Easy
Machine Learning

Write a Python function that computes descriptive statistics for a given dataset. The function should take a list or NumPy array of numerical values and return a dictionary containing:

  • mean: Average of all values
  • median: Middle value when sorted
  • mode: Most frequently occurring value
  • variance: Population variance (divide by N)
  • standard_deviation: Square root of variance
  • 25th_percentile, 50th_percentile, 75th_percentile: Quartile values
  • interquartile_range: Difference between 75th and 25th percentiles (IQR)
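
Two of the metrics above are easy to get subtly wrong: the variance divisor and the percentile definition. The following is a minimal sketch, not a reference solution. It assumes NumPy's default linear-interpolation percentiles, which the problem statement does not specify, and the variable names are illustrative only.

```python
import numpy as np

values = np.array([1, 2, 2, 3, 4, 4, 4, 5], dtype=float)

# Population variance divides by N; this is NumPy's default (ddof=0).
population_variance = np.var(values)

# Sample variance divides by N - 1; shown only for contrast,
# it is NOT what the task asks for.
sample_variance = np.var(values, ddof=1)

# Assumption: NumPy's default linear interpolation between ranks.
q25, q50, q75 = np.percentile(values, [25, 50, 75])

# IQR is the spread of the middle half of the data.
iqr = q75 - q25
```

With a different interpolation rule the quartiles of small datasets can shift, so it is worth confirming which rule the expected outputs were generated with.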

Examples

Example 1:
Input: [1, 2, 2, 3, 4, 4, 4, 5]
Output: {'mean': 3.125, 'median': 3.5, 'mode': 4, 'variance': 1.6094, 'standard_deviation': 1.2686, ...}
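Explanation: Mean = (1+2+2+3+4+4+4+5)/8 = 25/8 = 3.125. Median = average of the 4th and 5th sorted values = (3+4)/2 = 3.5. Mode = 4 (appears 3 times, more than any other value). Variance = sum of squared deviations from the mean divided by N = 12.875/8 = 1.609375, shown rounded to 1.6094. Standard deviation = sqrt(1.609375) ≈ 1.2686. The 25th, 50th, and 75th percentiles split the sorted data into quarters.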
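
The arithmetic in Example 1 can be checked with a few lines of NumPy. This is a sketch under one assumption that the problem does not state: when several values tie for most frequent, the smallest one is returned as the mode.

```python
import numpy as np

data = np.array([1, 2, 2, 3, 4, 4, 4, 5])

mean = data.mean()
median = np.median(data)

# Mode via value counts. np.unique returns values in sorted order, so
# argmax picks the smallest value on a tie (an assumed tie-break rule).
uniq, counts = np.unique(data, return_counts=True)
mode = uniq[np.argmax(counts)]

# Population variance written out explicitly: mean squared deviation.
variance = np.mean((data - mean) ** 2)
standard_deviation = np.sqrt(variance)

# Round only for display, matching the 4-decimal values in the example.
variance_rounded = round(float(variance), 4)
std_rounded = round(float(standard_deviation), 4)
```

Rounding is applied at the end so that the standard deviation is derived from the unrounded variance.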

Starter Code

import numpy as np

def descriptive_statistics(data: list | np.ndarray) -> dict:
    """
    Calculate various descriptive statistics metrics for a given dataset.
    
    Args:
        data: List or numpy array of numerical values
    
    Returns:
        Dictionary containing mean, median, mode, variance, standard deviation,
        percentiles (25th, 50th, 75th), and interquartile range (IQR)
    """
    # Your code here
    pass