Implement a function that scans every feature and threshold in a small data set, then returns the split that minimises the weighted Gini impurity. Your implementation should support binary class labels (0 or 1) and handle ties gracefully.
You will write one function:
find_best_split(X: np.ndarray, y: np.ndarray) -> tuple[int, float]
Xis an n×d NumPy array of numeric features.yis a length-n NumPy array of 0/1 labels.- The function returns
(best_feature_index, best_threshold)for the split with the lowest weighted Gini impurity. - If several splits share the same impurity, return the first that you encounter while scanning features and thresholds.
Examples
Example 1:
Input:
import numpy as np
X = np.array([[2.5],[3.5],[1.0],[4.0]])
y = np.array([0,1,0,1])
print(find_best_split(X, y))Output:
(0, 2.5)Explanation: Splitting on feature 0 at threshold 2.5 yields two perfectly pure leaves, producing the minimum possible weighted Gini impurity.
Starter Code
import numpy as np
from typing import Tuple
def find_best_split(X: np.ndarray, y: np.ndarray) -> Tuple[int, float]:
"""Return the (feature_index, threshold) that minimises weighted Gini impurity."""
# ✏️ TODO: implement
passPython3
ReadyLines: 1Characters: 0
Ready