Write a Python function that implements the decision tree learning algorithm for classification. The function should use recursive binary splitting based on entropy and information gain to build a decision tree. It should take a list of examples (each example is a dict of attribute-value pairs) and a list of attribute names as input, and return a nested dictionary representing the decision tree.
Tie-Breaking Rules:
- If multiple attributes have equal information gain, choose the one that appears first in the attributes list.
- If a leaf node has equal counts of different classes, return the class that comes first alphabetically.
- Process attribute values in sorted order to ensure consistent tree structure.
Examples
Example 1:
Input:
examples = [
{'Outlook': 'Sunny', 'Wind': 'Weak', 'PlayTennis': 'No'},
{'Outlook': 'Overcast', 'Wind': 'Strong', 'PlayTennis': 'Yes'},
{'Outlook': 'Rain', 'Wind': 'Weak', 'PlayTennis': 'Yes'},
{'Outlook': 'Sunny', 'Wind': 'Strong', 'PlayTennis': 'No'},
{'Outlook': 'Overcast', 'Wind': 'Weak', 'PlayTennis': 'Yes'},
{'Outlook': 'Rain', 'Wind': 'Strong', 'PlayTennis': 'No'},
{'Outlook': 'Rain', 'Wind': 'Weak', 'PlayTennis': 'Yes'}
],
attributes = ['Outlook', 'Wind'],
target_attr = 'PlayTennis'Output:
{'Outlook': {'Overcast': 'Yes', 'Rain': {'Wind': {'Strong': 'No', 'Weak': 'Yes'}}, 'Sunny': 'No'}}Explanation: The algorithm first calculates information gain for each attribute. 'Outlook' has the highest gain, so it becomes the root. For 'Overcast', all outcomes are 'Yes' (pure leaf). For 'Sunny', all outcomes are 'No' (pure leaf - no need to split further). For 'Rain', the data has mixed labels, so it recurses and splits on 'Wind'.
Starter Code
import math
from collections import Counter
def calculate_entropy(labels: list) -> float:
"""Calculate the entropy of a list of labels."""
# Your code here
pass
def calculate_information_gain(examples: list[dict], attr: str, target_attr: str) -> float:
"""Calculate the information gain of splitting on attr."""
# Your code here
pass
def majority_class(examples: list[dict], target_attr: str) -> str:
"""Return the majority class. Break ties alphabetically."""
# Your code here
pass
def learn_decision_tree(examples: list[dict], attributes: list[str], target_attr: str) -> dict:
"""Build a decision tree using the ID3 algorithm."""
# Your code here
passPython3
ReadyLines: 1Characters: 0
Ready