Full Agent Observability Pipeline with Cost Tracking and Alerts

Hard
Agents
Production AI agents need comprehensive telemetry to debug issues, track costs, and enforce SLAs.

Task

Build an AgentObservabilityPipeline class that:

  1. Computes LLM call cost from pricing table.
  2. Records LLM and tool calls with full metadata.
  3. Fires alerts when cumulative run cost or error rate breaches thresholds.
  4. Provides a cross-run dashboard: p50/p95 latency, avg cost, tool error rate.
  5. Exports all records for a run as JSONL.

Non-Functional Requirements

  • Cost accurate to 6 decimal places.
  • Alert fires synchronously on record_llm_call.
  • Dashboard must handle 10,000+ runs efficiently.
  • JSONL: one JSON object per record per line.
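
The p50/p95 figures in the dashboard need a percentile function. The problem does not mandate a method; a minimal sketch using the nearest-rank definition:

```python
import math

def percentile(values, p):
    """Nearest-rank p-th percentile of values (p in [0, 100])."""
    if not values:
        return 0.0
    ordered = sorted(values)
    # 1-based nearest rank is ceil(p/100 * n); clamp to a valid 0-based index.
    k = max(0, min(len(ordered) - 1, math.ceil(p / 100 * len(ordered)) - 1))
    return ordered[k]

latencies = [120.0, 80.0, 200.0, 95.0]
print(percentile(latencies, 50))   # 95.0
print(percentile(latencies, 95))   # 200.0
```

Sorting per call is O(n log n); for the 10,000+ runs requirement, pre-aggregating latencies per run (or keeping them sorted incrementally) avoids re-sorting the full history on every dashboard call.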

Examples

Example 1:
Input: pipeline.compute_cost('gpt-4o', 1_000_000, 0) # pricing: {'gpt-4o': {'input': 5.0, 'output': 15.0}}
Output: 5.0
Explanation: 1M input tokens × $5.00 per 1M = $5.00
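
The arithmetic above suggests a compute_cost sketch, assuming the pricing table is USD per 1M tokens as in the example (shown as a free function rather than the method, for illustration):

```python
# Sketch of the cost formula, assuming pricing is USD per 1M tokens.
def compute_cost(pricing, model, input_tokens, output_tokens):
    rates = pricing[model]
    cost = (input_tokens / 1_000_000) * rates['input'] \
         + (output_tokens / 1_000_000) * rates['output']
    return round(cost, 6)  # NFR: accurate to 6 decimal places

pricing = {'gpt-4o': {'input': 5.0, 'output': 15.0}}
print(compute_cost(pricing, 'gpt-4o', 1_000_000, 0))      # 5.0
print(compute_cost(pricing, 'gpt-4o', 500_000, 100_000))  # 4.0
```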

Starter Code

from typing import Dict, List, Any, Optional
from dataclasses import dataclass, field
import time, json

@dataclass
class LLMCallRecord:
    call_id: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float
    timestamp: float
    run_id: str
    step: int
    cached: bool = False

@dataclass
class ToolCallRecord:
    call_id: str
    tool_name: str
    latency_ms: float
    success: bool
    error: Optional[str]
    run_id: str
    step: int

class AgentObservabilityPipeline:
    def __init__(self, pricing: Dict[str, Dict]):
        self.pricing = pricing
        self.llm_calls: List[LLMCallRecord] = []
        self.tool_calls: List[ToolCallRecord] = []
        self.run_metadata: Dict[str, Dict] = {}
        self.alerts: List[Dict] = []
        self.alert_thresholds: Dict[str, float] = {}

    def compute_cost(self, model: str, input_tokens: int, output_tokens: int) -> float:
        pass

    def record_llm_call(self, record: LLMCallRecord) -> None:
        pass

    def record_tool_call(self, record: ToolCallRecord) -> None:
        pass

    def start_run(self, run_id: str, metadata: Dict) -> None:
        pass

    def end_run(self, run_id: str, status: str) -> Dict:
        pass

    def set_alert(self, metric: str, threshold: float) -> None:
        pass

    def dashboard(self) -> Dict:
        pass

    def export_jsonl(self, run_id: str) -> str:
        pass
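
For the export_jsonl method, one workable approach is `dataclasses.asdict` plus `json.dumps` per record. A self-contained sketch (the trimmed ToolCallRecord here is for illustration only):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ToolCallRecord:
    call_id: str
    tool_name: str
    success: bool

# One JSON object per record per line, as the NFRs require.
def export_jsonl(records):
    return "\n".join(json.dumps(asdict(r)) for r in records)

records = [ToolCallRecord("c1", "search", True),
           ToolCallRecord("c2", "fetch", False)]
print(export_jsonl(records))
```

Each line parses independently with `json.loads`, which is what makes JSONL convenient for streaming ingestion by log pipelines.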