Agent Observability Pipeline with Cost Tracking
Production AI agents need comprehensive telemetry to debug issues, track costs, and enforce SLAs.
Task
Build AgentObservabilityPipeline that:
- Computes the cost of each LLM call from a pricing table (USD per 1M tokens, as in Example 1).
- Records LLM and tool calls with full metadata.
- Fires alerts when cumulative run cost or error rate breaches thresholds.
- Provides a cross-run dashboard: p50/p95 latency, avg cost, tool error rate.
- Exports all records for a run as JSONL.
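The cost alert in the task list can be sketched as a running total per run that is checked inside the recording call itself. This is a minimal sketch, not the reference solution: the class RunCostAlerter, the method on_llm_call, and the metric name "run_cost_usd" are hypothetical, and treating "breaches" as strictly greater than the threshold is an assumption. An error-rate alert could follow the same pattern.

```python
from collections import defaultdict
from typing import Dict, List


class RunCostAlerter:
    """Hypothetical helper (not part of the starter code)."""

    def __init__(self) -> None:
        self.alert_thresholds: Dict[str, float] = {}
        self.run_cost: Dict[str, float] = defaultdict(float)
        self.alerts: List[Dict] = []

    def set_alert(self, metric: str, threshold: float) -> None:
        self.alert_thresholds[metric] = threshold

    def on_llm_call(self, run_id: str, cost_usd: float) -> None:
        # Update the running total first so the check sees the new call.
        self.run_cost[run_id] += cost_usd
        limit = self.alert_thresholds.get("run_cost_usd")  # assumed metric name
        # Assumption: an alert fires on every call while the run is over budget.
        if limit is not None and self.run_cost[run_id] > limit:
            self.alerts.append({
                "metric": "run_cost_usd",
                "run_id": run_id,
                "value": round(self.run_cost[run_id], 6),
                "threshold": limit,
            })
```

Because the check runs in the same call that records the cost, the alert is visible to the caller as soon as that call returns, and keeping a per-run total avoids rescanning earlier records.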
Non-Functional Requirements
- Cost accurate to 6 decimal places.
- Alert fires synchronously on record_llm_call.
- Dashboard must handle 10,000+ runs efficiently.
- JSONL: one JSON object per record per line.
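The dashboard metrics can be sketched with a single sort plus constant-time lookups. The task does not say which percentile definition to use, so the nearest-rank method below is an assumption, as are the function names percentile and summarize and the keys of the returned dict.

```python
import math
from typing import Dict, List


def percentile(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile over an already sorted list (0 < pct <= 100)."""
    if not sorted_values:
        return 0.0
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms: List[float],
              run_costs: Dict[str, float],
              tool_successes: List[bool]) -> Dict[str, float]:
    """Hypothetical cross-run summary; inputs are pre-aggregated by the caller."""
    ordered = sorted(latencies_ms)  # sort once, reuse for both percentiles
    failures = sum(1 for ok in tool_successes if not ok)
    return {
        "p50_latency_ms": percentile(ordered, 50),
        "p95_latency_ms": percentile(ordered, 95),
        "avg_cost_usd": (round(sum(run_costs.values()) / len(run_costs), 6)
                         if run_costs else 0.0),
        "tool_error_rate": failures / len(tool_successes) if tool_successes else 0.0,
    }
```

Sorting dominates the cost at O(n log n) over all recorded calls, so a dashboard over 10,000+ runs stays cheap as long as per-run cost totals are maintained incrementally rather than recomputed.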
Examples
Example 1:
Input:
pipeline.compute_cost('gpt-4o', 1_000_000, 0) # pricing: {'gpt-4o': {'input': 5.0, 'output': 15.0}}
Output:
5.0
Explanation: 1M input tokens × $5.00 per 1M = $5.00
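The arithmetic in Example 1 can be sketched as a standalone function. It assumes prices are quoted in USD per 1M tokens (which is what the example implies) and that an unknown model should raise KeyError; the free-function form and the constant TOKENS_PER_PRICE_UNIT are illustrative, not part of the starter code.

```python
from typing import Dict

TOKENS_PER_PRICE_UNIT = 1_000_000  # assumption: prices are per 1M tokens


def compute_cost(pricing: Dict[str, Dict[str, float]], model: str,
                 input_tokens: int, output_tokens: int) -> float:
    rates = pricing[model]  # assumption: unknown models raise KeyError
    cost = (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / TOKENS_PER_PRICE_UNIT
    # Round to satisfy the 6-decimal-place accuracy requirement.
    return round(cost, 6)


pricing = {"gpt-4o": {"input": 5.0, "output": 15.0}}
print(compute_cost(pricing, "gpt-4o", 1_000_000, 0))  # 5.0, matching Example 1
```

Multiplying before dividing keeps intermediate values as whole-token products, which tends to limit floating-point drift before the final rounding.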
Starter Code
from typing import Dict, List, Any, Optional
from dataclasses import dataclass, field
import time, json


@dataclass
class LLMCallRecord:
    call_id: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float
    timestamp: float
    run_id: str
    step: int
    cached: bool = False


@dataclass
class ToolCallRecord:
    call_id: str
    tool_name: str
    latency_ms: float
    success: bool
    error: Optional[str]
    run_id: str
    step: int


class AgentObservabilityPipeline:
    def __init__(self, pricing: Dict[str, Dict]):
        self.pricing = pricing
        self.llm_calls: List[LLMCallRecord] = []
        self.tool_calls: List[ToolCallRecord] = []
        self.run_metadata: Dict[str, Dict] = {}
        self.alerts: List[Dict] = []
        self.alert_thresholds: Dict[str, float] = {}

    def compute_cost(self, model: str, input_tokens: int, output_tokens: int) -> float:
        pass

    def record_llm_call(self, record: LLMCallRecord) -> None:
        pass

    def record_tool_call(self, record: ToolCallRecord) -> None:
        pass

    def start_run(self, run_id: str, metadata: Dict) -> None:
        pass

    def end_run(self, run_id: str, status: str) -> Dict:
        pass

    def set_alert(self, metric: str, threshold: float) -> None:
        pass

    def dashboard(self) -> Dict:
        pass

    def export_jsonl(self, run_id: str) -> str:
        pass
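One way to approach the JSONL export is to serialize each dataclass record with dataclasses.asdict and json.dumps, one object per line. The sketch below is a standalone function over a list of records rather than the export_jsonl method itself; the trimmed ToolCallRecord mirrors the starter code so the block runs on its own, and returning an empty string for a run with no records is an assumption.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ToolCallRecord:  # same fields as the starter code
    call_id: str
    tool_name: str
    latency_ms: float
    success: bool
    error: Optional[str]
    run_id: str
    step: int


def export_jsonl(records: List[ToolCallRecord], run_id: str) -> str:
    # json.dumps never emits raw newlines, so each record stays on one line.
    lines = [json.dumps(asdict(r)) for r in records if r.run_id == run_id]
    return "\n".join(lines)  # assumption: empty string when nothing matches


records = [
    ToolCallRecord("c1", "search", 12.5, True, None, "run-1", 0),
    ToolCallRecord("c2", "fetch", 40.0, False, "timeout", "run-2", 1),
]
print(export_jsonl(records, "run-1"))
```

Since asdict works on any dataclass instance, the same loop could cover LLMCallRecord entries as well; how the two record types are ordered or distinguished in the output is left open by the task.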