Exponential Weighted Average of Rewards

Easy
Reinforcement Learning

Given an initial value Q_1, a list of k observed rewards R_1, R_2, ..., R_k, and a step size alpha, implement a function to compute the exponentially weighted average as:

(1 - \alpha)^k Q_1 + \sum_{i=1}^{k} \alpha (1 - \alpha)^{k-i} R_i

This weighting gives more importance to recent rewards, while the influence of the initial estimate Q_1 decays over time. Do not use running/incremental updates; instead, compute directly from the formula. (This is called the exponential recency-weighted average.)

Examples

Example 1:
Input:
Q1 = 2.0
rewards = [5.0, 9.0]
alpha = 0.3
result = exp_weighted_average(Q1, rewards, alpha)
print(round(result, 4))
Output: 4.73
Explanation: With k=2, we compute: (1-0.3)^2 × 2.0 + 0.3×(1-0.3)^1 × 5.0 + 0.3×(1-0.3)^0 × 9.0 = 0.49×2.0 + 0.21×5.0 + 0.3×9.0 = 0.98 + 1.05 + 2.7 = 4.73

Starter Code

def exp_weighted_average(Q1, rewards, alpha):
    """
    Q1: float, initial estimate
    rewards: list or array of rewards, R_1 to R_k
    alpha: float, step size (0 < alpha <= 1)
    Returns: float, exponentially weighted average after k rewards
    """
    # Your code here
    pass
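One possible direct implementation, sketched below, evaluates each term of the formula in a single pass rather than applying incremental updates (this is a reference sketch, not the only accepted solution):

```python
def exp_weighted_average(Q1, rewards, alpha):
    """Compute (1-alpha)^k * Q1 + sum over i of alpha * (1-alpha)^(k-i) * R_i."""
    k = len(rewards)
    # Contribution of the initial estimate, decayed over k steps
    total = (1 - alpha) ** k * Q1
    # Each reward R_i is weighted by alpha * (1-alpha)^(k-i),
    # so more recent rewards (larger i) receive larger weights
    for i, r in enumerate(rewards, start=1):
        total += alpha * (1 - alpha) ** (k - i) * r
    return total


print(round(exp_weighted_average(2.0, [5.0, 9.0], 0.3), 4))  # 4.73
```

Note that with k = 0 (an empty rewards list) the function returns Q1 unchanged, which matches the formula.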
The AI Interview - Master AI/ML Interviews