Write a function that computes the discounted return for a sequence of rewards given a discount factor gamma. The function should take a list or NumPy array of rewards and a discount factor gamma (0 < gamma <= 1) and return the total discounted return as a scalar. Use only NumPy.
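For reference, the quantity described above is usually written as the sum below. This assumes the rewards are indexed r_0 through r_{T-1}, as in the starter code's docstring; the symbol G is a label introduced here for illustration and does not appear in the problem statement.

```latex
G = \sum_{t=0}^{T-1} \gamma^{t} r_t
```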
Examples
Example 1:
Input:
rewards = [1, 1, 1]
gamma = 0.5
print(discounted_return(rewards, gamma))
Output:
1.75
Explanation: Discounted return: 1*1 + 1*0.5 + 1*0.25 = 1 + 0.5 + 0.25 = 1.75
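The arithmetic in the explanation above can be checked with a short NumPy sketch. This is one possible approach, not a reference solution: the name `discounted_return_sketch` is hypothetical, chosen so it does not clash with the exercise's `discounted_return`, and the code assumes `rewards` is a one-dimensional sequence.

```python
import numpy as np

def discounted_return_sketch(rewards, gamma):
    # Hypothetical helper for illustration; assumes a 1-D sequence of rewards.
    rewards = np.asarray(rewards, dtype=float)
    # Discount weights gamma**0, gamma**1, ..., gamma**(T-1)
    discounts = gamma ** np.arange(rewards.size)
    # Weighted sum of the rewards, converted from a NumPy scalar to a Python float
    return float(np.dot(rewards, discounts))

print(discounted_return_sketch([1, 1, 1], 0.5))  # 1.75
```

Building the vector of discount weights once and taking a dot product avoids an explicit Python loop, which is probably what the NumPy-only constraint is meant to encourage.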
Starter Code
import numpy as np

def discounted_return(rewards, gamma):
    """
    Compute the total discounted return for a sequence of rewards.

    Args:
        rewards (list or np.ndarray): List or array of rewards [r_0, r_1, ..., r_{T-1}]
        gamma (float): Discount factor (0 < gamma <= 1)

    Returns:
        float: Total discounted return
    """
    # Your code here
    pass