Factuality Checker for Agent Outputs

Medium
Agents

Implement hallucination detection for agent outputs:

  1. add_source(text): Store source document with embedding
  2. check(claim, threshold): Verify claim against sources
    • Embed claim, find most similar source by cosine similarity
    • If similarity >= threshold, claim is supported
  3. check_response(response): Split into claims, check each

Hallucination Ratio: unsupported_claims / total_claims
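The similarity test in step 2 relies on cosine similarity between embedding vectors. A minimal pure-Python helper might look like the sketch below (the name `cosine_similarity` is illustrative, not part of the required API):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors:
    # dot(a, b) / (||a|| * ||b||), with 0.0 for zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Identical-direction vectors score 1.0 and orthogonal vectors score 0.0, which is why a threshold like 0.7 sits between "clearly related" and "unrelated".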

Simulated Embedding: Use hash-based or simple character n-gram vectors for testing
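One way to simulate embeddings for testing is a character n-gram hashing vector: each n-gram increments one of a fixed number of buckets. A sketch, assuming `zlib.crc32` as the bucket hash (chosen because Python's built-in `hash()` is salted per process; the `ngram_embedding` name and the defaults are illustrative):

```python
import zlib

def ngram_embedding(text, n=3, dim=64):
    # Hash each character n-gram into one of `dim` buckets and count hits.
    # Deterministic across runs, unlike the salted built-in hash() for str.
    vec = [0.0] * dim
    t = text.lower()
    for i in range(max(len(t) - n + 1, 1)):
        gram = t[i:i + n]
        vec[zlib.crc32(gram.encode()) % dim] += 1.0
    return vec
```

Texts sharing many n-grams land in overlapping buckets, so their cosine similarity is high; a real system would use a learned embedding model instead.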

Output Format: Individual checks + aggregate statistics

Examples

Example 1:
Input:
def emb(t): return [ord(t[0])/255] if t else [0]
fc = FactualityChecker(emb)
fc.add_source('Paris is capital of France')
r = fc.check('Paris is in France', 0.5)
r['supported']
Output: True
Explanation: Both the claim and the stored source start with 'P', and the toy embedding uses only the first character, so their cosine similarity is 1.0 >= 0.5 and the claim is supported.

Starter Code

class FactualityChecker:
    """
    Check agent outputs for hallucinations against source documents.
    """
    
    def __init__(self, embedding_fn):
        self.embedding_fn = embedding_fn
        self.sources = []  # List of {'text': ..., 'embedding': ...}
    
    def add_source(self, text):
        """Add source document for grounding"""
        # Your implementation here
        pass
    
    def check(self, claim, threshold=0.7):
        """
        Check if claim is supported by sources.
        Returns {'supported': bool, 'confidence': float, 'source': str or None}
        """
        # Your implementation here
        pass
    
    def check_response(self, response, claims_split='. '):
        """
        Split response into claims and check each.
        Returns list of check results and overall 'hallucination_ratio'
        """
        # Your implementation here
        pass
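One possible reference solution for the starter class above is sketched below. It is one of many valid approaches; in particular, returning `source` as `None` for unsupported claims and a `hallucination_ratio` of 0.0 for an empty response are assumptions, not requirements stated by the problem:

```python
import math

class FactualityChecker:
    """Check agent outputs for hallucinations against source documents."""

    def __init__(self, embedding_fn):
        self.embedding_fn = embedding_fn
        self.sources = []  # list of {'text': ..., 'embedding': ...}

    def add_source(self, text):
        """Add a source document for grounding, embedding it once on insert."""
        self.sources.append({'text': text,
                             'embedding': self.embedding_fn(text)})

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity with a 0.0 fallback for zero-norm vectors.
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def check(self, claim, threshold=0.7):
        """Compare the claim against every source; keep the best match."""
        emb = self.embedding_fn(claim)
        best_sim, best_src = 0.0, None
        for s in self.sources:
            sim = self._cosine(emb, s['embedding'])
            if sim > best_sim:
                best_sim, best_src = sim, s['text']
        supported = best_sim >= threshold
        return {'supported': supported,
                'confidence': best_sim,
                'source': best_src if supported else None}

    def check_response(self, response, claims_split='. '):
        """Split the response into claims, check each, and aggregate."""
        claims = [c.strip() for c in response.split(claims_split) if c.strip()]
        results = [self.check(c) for c in claims]
        unsupported = sum(1 for r in results if not r['supported'])
        ratio = unsupported / len(results) if results else 0.0
        return {'results': results, 'hallucination_ratio': ratio}
```

With the one-character embedding from Example 1, `check('Paris is in France', 0.5)` finds the stored source at similarity 1.0 and reports it as supported.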