In this problem, you need to implement a single function that can perform three variants of gradient descent: Stochastic Gradient Descent (SGD), Batch Gradient Descent, and Mini-Batch Gradient Descent using Mean Squared Error (MSE) as the loss function. The function will take an additional parameter to specify which variant to use.
Requirements
- Do not shuffle the data; process samples in their original order (index 0, 1, 2, ...)
- For Batch GD: use all samples to compute a single gradient update per epoch
- For Stochastic GD: iterate through each sample sequentially (i.e., process sample 0, then 1, then 2, etc.) — not randomly selected
- For Mini-Batch GD: form batches from consecutive samples without overlap (e.g., for batch_size=2: first batch uses indices [0,1], second batch uses [2,3], etc.)
- The `n_epochs` parameter specifies how many complete passes through the dataset to perform
- For each epoch, process all samples according to the specified method
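Every variant above applies the same MSE gradient, just to different subsets of the data. A minimal sketch of that gradient, assuming the common `2/m` scaling (some conventions drop the factor of 2; either works if the learning rate is adjusted accordingly):

```python
import numpy as np

def mse_gradient(Xb, yb, weights):
    # Gradient of mean((Xb @ w - yb)**2) with respect to w:
    # (2/m) * Xb.T @ (Xb @ w - yb), where m is the batch size.
    m = len(yb)
    return (2 / m) * Xb.T @ (Xb @ weights - yb)

X = np.array([[1.0, 1.0], [2.0, 1.0]])
y = np.array([2.0, 3.0])
g = mse_gradient(X, y, np.zeros(2))  # gradient at w = 0 is [-8, -5]
```

Batch GD calls this once per epoch with all samples, SGD calls it once per sample (`m = 1`), and mini-batch GD calls it once per consecutive slice of `batch_size` samples.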
Examples
Example 1:
Input:
import numpy as np
# Sample data
X = np.array([[1, 1], [2, 1], [3, 1], [4, 1]])
y = np.array([2, 3, 4, 5])
# Parameters
learning_rate = 0.01
n_epochs = 1000
batch_size = 2
# Initialize weights
weights = np.zeros(X.shape[1])
# Test Batch Gradient Descent
final_weights = gradient_descent(X, y, weights, learning_rate, n_epochs, method='batch')
# Test Stochastic Gradient Descent
final_weights = gradient_descent(X, y, weights, learning_rate, n_epochs, method='stochastic')
# Test Mini-Batch Gradient Descent
final_weights = gradient_descent(X, y, weights, learning_rate, n_epochs, batch_size, method='mini_batch')
Output:
[float, float]
[float, float]
[float, float]
Explanation: The function should return the final weights after performing the specified variant of gradient descent for the given number of epochs (complete passes through the data).
Starter Code
import numpy as np
def gradient_descent(X, y, weights, learning_rate, n_epochs, batch_size=1, method='batch'):
"""
Perform gradient descent optimization.
Args:
X: Feature matrix of shape (m, n)
y: Target values of shape (m,)
weights: Initial weights of shape (n,)
learning_rate: Step size for gradient descent
n_epochs: Number of complete passes through the dataset
batch_size: Size of batches for mini-batch gradient descent (default: 1)
method: Type of gradient descent ('batch', 'stochastic', or 'mini_batch')
Returns:
Optimized weights
"""
# Your code here
pass
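One way the starter code could be completed is sketched below. It assumes the `2/m` gradient scaling for MSE and follows the requirements above: no shuffling, sequential sample order for SGD, and consecutive non-overlapping batches for mini-batch.

```python
import numpy as np

def gradient_descent(X, y, weights, learning_rate, n_epochs, batch_size=1, method='batch'):
    m = len(y)
    for _ in range(n_epochs):
        if method == 'batch':
            # Single update per epoch using all samples.
            grad = (2 / m) * X.T @ (X @ weights - y)
            weights = weights - learning_rate * grad
        elif method == 'stochastic':
            # One update per sample, processed in original order (0, 1, 2, ...).
            for i in range(m):
                residual = X[i] @ weights - y[i]
                grad = 2 * residual * X[i]
                weights = weights - learning_rate * grad
        elif method == 'mini_batch':
            # Consecutive, non-overlapping batches of size batch_size.
            for start in range(0, m, batch_size):
                Xb = X[start:start + batch_size]
                yb = y[start:start + batch_size]
                grad = (2 / len(yb)) * Xb.T @ (Xb @ weights - yb)
                weights = weights - learning_rate * grad
        else:
            raise ValueError(f"Unknown method: {method}")
    return weights

# Usage with the data from Example 1 (y = x + 1, so weights approach [1, 1]):
X = np.array([[1, 1], [2, 1], [3, 1], [4, 1]], dtype=float)
y = np.array([2, 3, 4, 5], dtype=float)
w_batch = gradient_descent(X, y, np.zeros(2), 0.01, 1000, method='batch')
w_sgd = gradient_descent(X, y, np.zeros(2), 0.01, 1000, method='stochastic')
w_mb = gradient_descent(X, y, np.zeros(2), 0.01, 1000, batch_size=2, method='mini_batch')
```

Note that `weights = weights - ...` (rather than `-=`) avoids mutating the caller's array in place, so repeated calls with the same initial weights remain independent.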