The AI Interview - Master AI/ML Interviews

Classify tool determinism by empirical testing:

Run tool_func(**inputs) n_trials times for each test input.
Compare the outputs: if any trial differs from the others, the tool is non-deterministic.
Return the classification together with the supporting evidence.

Output Format:

{
    'is_deterministic': True/False,
    'evidence': 'All N trials produced identical results' OR 'Trial 1: X, Trial 2: Y differed',
    'confidence': 0.0-1.0  # 1.0 if deterministic, 0.7 if non-deterministic
}

Constraints:

If the tool raises an exception, classify it as non-deterministic (it may be flaky).
Compare outputs using ==.
Set confidence to 1.0 if the tool is deterministic, or 0.7 if it is non-deterministic (the finding is empirical).
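
The comparison and exception rules above have one consequence worth keeping in mind: a value that is not equal to itself, such as NaN, makes even a pure function look non-deterministic under a plain == check. The sketch below illustrates this, along with one possible way to capture exceptions. The helper names outputs_match and safe_call are hypothetical and not part of the problem statement.

```python
def outputs_match(first, other):
    """Compare two trial outputs the way the constraints ask: with ==."""
    return first == other


def safe_call(tool_func, inputs):
    """Hypothetical helper: return (ok, value); ok is False if the tool raised."""
    try:
        return True, tool_func(**inputs)
    except Exception as exc:
        # Keep a printable description so it can be quoted as evidence.
        return False, repr(exc)


# Ordinary values compare as expected.
print(outputs_match([1, 2], [1, 2]))  # True

# NaN is never equal to itself, so two NaN results count as "different".
print(outputs_match(float('nan'), float('nan')))  # False

# A raising tool is reported as a failed call instead of crashing the run.
ok, detail = safe_call(lambda x: 1 / x, {'x': 0})
print(ok)  # False
```

Whether NaN results should be treated as a mismatch is a design decision; the constraints only ask for ==, so this sketch leaves that behavior as is.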

Examples

Example 1:

Input: classify_tool_determinism(lambda x: x*2, [{'x': 5}], 5)

Output: {'is_deterministic': True, 'evidence': 'All 5 trials produced identical results', 'confidence': 1.0}

Explanation: The lambda is a pure function, so every trial returns the same output (10).
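
For contrast, a tool with hidden state fails the same check. The sketch below uses a hypothetical counter-based tool (make_counter_tool is an illustrative name, not part of the problem) and collects its outputs over three trials on the same input.

```python
def make_counter_tool():
    """Build a hypothetical tool whose output depends on how often it was called."""
    calls = {'n': 0}

    def tool(x):
        calls['n'] += 1
        return x + calls['n']

    return tool


tool = make_counter_tool()
inputs = {'x': 5}

# Run the same input three times and record every output.
outputs = [tool(**inputs) for _ in range(3)]
print(outputs)  # [6, 7, 8]

# Any output that differs from the first trial is evidence of non-determinism.
differs = any(not (out == outputs[0]) for out in outputs[1:])
print(differs)  # True
```

Under the constraints above, such a tool would be reported with 'is_deterministic': False and 'confidence': 0.7.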

Starter Code

def classify_tool_determinism(tool_func, test_inputs, n_trials=3):
    """
    Classify whether a tool is deterministic or non-deterministic
    by running it multiple times with same inputs.
    
    Args:
        tool_func: Function to test
        test_inputs: List of input dicts to test
        n_trials: Number of times to run each input
    
    Returns:
        dict with 'is_deterministic', 'evidence', 'confidence'
    """
    # Your implementation here
    pass
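
One possible way to fill in the starter code is sketched below. It is a minimal version under stated assumptions, not a reference solution: it stops at the first mismatch, compares every trial against the first one, and uses an evidence string for exceptions ('Trial N: raised ...') that is an assumption, since the problem only specifies the wording for differing outputs.

```python
def classify_tool_determinism(tool_func, test_inputs, n_trials=3):
    """Sketch: run each input n_trials times and compare outputs with ==."""
    for inputs in test_inputs:
        first = None
        for trial in range(1, n_trials + 1):
            try:
                result = tool_func(**inputs)
            except Exception as exc:
                # Per the constraints, an exception counts as non-deterministic.
                # The evidence wording here is an assumption.
                return {
                    'is_deterministic': False,
                    'evidence': f'Trial {trial}: raised {type(exc).__name__}',
                    'confidence': 0.7,
                }
            if trial == 1:
                first = result  # Baseline that later trials are compared to.
            elif not (result == first):
                return {
                    'is_deterministic': False,
                    'evidence': f'Trial 1: {first!r}, Trial {trial}: {result!r} differed',
                    'confidence': 0.7,
                }
    # No trial differed for any input.
    return {
        'is_deterministic': True,
        'evidence': f'All {n_trials} trials produced identical results',
        'confidence': 1.0,
    }


print(classify_tool_determinism(lambda x: x * 2, [{'x': 5}], 5))
# {'is_deterministic': True, 'evidence': 'All 5 trials produced identical results', 'confidence': 1.0}
```

Returning early on the first mismatch keeps the evidence short; collecting every differing trial would be an equally valid choice if fuller evidence is wanted.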
