Problem
Implement a Gated Recurrent Unit (GRU) cell forward pass. The GRU is a type of recurrent neural network architecture that uses gating mechanisms to control the flow of information, helping to mitigate the vanishing gradient problem.
A GRU cell computes a new hidden state given an input vector and the previous hidden state using update and reset gates.
Input Parameters:
x: Input vector of shape (input_size,)
h_prev: Previous hidden state of shape (hidden_size,)
W_z, W_r, W_h: Weight matrices for the input, of shape (hidden_size, input_size)
U_z, U_r, U_h: Weight matrices for the hidden state, of shape (hidden_size, hidden_size)
b_z, b_r, b_h: Bias vectors of shape (hidden_size,)
Output:
h_next: New hidden state of shape (hidden_size,)
The GRU uses sigmoid and tanh activation functions. The update gate controls how much of the previous hidden state to retain, while the reset gate controls how much of the previous hidden state to forget when computing the candidate hidden state.
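The gating described above corresponds to the standard GRU update equations (Cho et al.), where * denotes element-wise multiplication; the final line matches the example below, since h_prev = 0 leaves h_next = z * h_tilde:

```
z       = sigmoid(W_z x + U_z h_prev + b_z)          # update gate
r       = sigmoid(W_r x + U_r h_prev + b_r)          # reset gate
h_tilde = tanh(W_h x + U_h (r * h_prev) + b_h)       # candidate hidden state
h_next  = (1 - z) * h_prev + z * h_tilde
```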
Examples
Example 1:
Input:
x = [1.0, 0.5], h_prev = [0.0, 0.0, 0.0], all weight matrices filled with 0.1 or 0.2, all biases are zeros
Output:
[0.1565, 0.1565, 0.1565]
Explanation: Since h_prev = 0, the reset gate has no effect and the (1 - z) * h_prev term vanishes. The update gate z = sigmoid(0.15) = 0.5374 for each unit, and the candidate h_tilde = tanh(0.3) = 0.2913 for each unit. The final hidden state is therefore h_next = z * h_tilde = 0.5374 * 0.2913 = 0.1565 for each unit.
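The arithmetic in the explanation can be checked with a few lines of NumPy. The specific fill values used here (0.1 for the W_z row, 0.2 for the W_h row) are an assumption chosen to reproduce the stated pre-activations of 0.15 and 0.30; the U matrices and the reset gate drop out because h_prev is zero:

```python
import numpy as np

x = np.array([1.0, 0.5])
z_pre = 0.1 * x.sum()             # assumed W_z row of 0.1s dotted with x -> 0.15
h_pre = 0.2 * x.sum()             # assumed W_h row of 0.2s dotted with x -> 0.30
z = 1.0 / (1.0 + np.exp(-z_pre))  # sigmoid(0.15) ~ 0.5374
h_tilde = np.tanh(h_pre)          # tanh(0.30) ~ 0.2913
h_next = z * h_tilde              # one unit of the final hidden state, ~ 0.1565
```

Each hidden unit sees identical weights, so this scalar computation applies to all three units.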
Starter Code
import numpy as np

def gru_cell(x: np.ndarray, h_prev: np.ndarray,
             W_z: np.ndarray, U_z: np.ndarray, b_z: np.ndarray,
             W_r: np.ndarray, U_r: np.ndarray, b_r: np.ndarray,
             W_h: np.ndarray, U_h: np.ndarray, b_h: np.ndarray) -> np.ndarray:
    """
    Implements a single GRU cell forward pass.

    Args:
        x: Input vector of shape (input_size,)
        h_prev: Previous hidden state of shape (hidden_size,)
        W_z, W_r, W_h: Weight matrices for input
        U_z, U_r, U_h: Weight matrices for hidden state
        b_z, b_r, b_h: Bias vectors

    Returns:
        h_next: New hidden state of shape (hidden_size,)
    """
    # Your code here
    pass
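For reference, one way to fill in the body is a direct transcription of the gate equations, assuming the convention h_next = (1 - z) * h_prev + z * h_tilde (which matches the example, where h_prev = 0 leaves h_next = z * h_tilde):

```python
import numpy as np

def sigmoid(v: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-v))

def gru_cell(x: np.ndarray, h_prev: np.ndarray,
             W_z: np.ndarray, U_z: np.ndarray, b_z: np.ndarray,
             W_r: np.ndarray, U_r: np.ndarray, b_r: np.ndarray,
             W_h: np.ndarray, U_h: np.ndarray, b_h: np.ndarray) -> np.ndarray:
    z = sigmoid(W_z @ x + U_z @ h_prev + b_z)               # update gate
    r = sigmoid(W_r @ x + U_r @ h_prev + b_r)               # reset gate
    h_tilde = np.tanh(W_h @ x + U_h @ (r * h_prev) + b_h)   # candidate state
    return (1.0 - z) * h_prev + z * h_tilde                 # interpolate old and new
```

For 1-D arrays, `W @ x` is a matrix-vector product, so each line yields a vector of shape (hidden_size,) with no explicit loops.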