Agent Latency Optimizer

Hard
Agents

Implement latency optimization for agent systems:

Optimization Strategies:

  1. Caching: Memoize deterministic tool calls
  2. Prefetching: Load embeddings before needed
  3. Parallelization: Run independent calls concurrently
  4. Plan Optimization: Reorder for parallelism

Methods:

  1. cached_tool_call(tool_func, params, ttl): Cache with TTL
    • Hash params as cache key
    • Check expiry before returning
  2. prefetch_embeddings(texts): Async load
  3. parallel_tool_calls(calls): ThreadPool execution
  4. optimize_plan(plan): Dependency analysis, reorder
  5. get_stats(): Hit rates, avg latency

Cache Key: hash(tool_name + canonical_json(params))

Examples

Example 1:
Input: opt = LatencyOptimizer(); result1 = opt.cached_tool_call(lambda: 'expensive', {}, ttl_seconds=60); result2 = opt.cached_tool_call(lambda: 'different', {}, 60); result1 == result2
Output: True
Explanation: Same params hit cache, return cached result

Starter Code

class LatencyOptimizer:
    """
    Optimize agent execution latency through caching, prefetching, and parallelization.
    """
    
    def __init__(self):
        self.tool_cache = {}
        self.embedding_cache = {}
        self.prefetch_queue = []
        self.execution_stats = {}
    
    def cached_tool_call(self, tool_func, params, ttl_seconds=60):
        """
        Execute tool with caching.
        Returns cached result if available and not expired.
        """
        # Your implementation here
        pass
    
    def prefetch_embeddings(self, texts):
        """Prefetch embeddings in background"""
        # Your implementation here
        pass
    
    def parallel_tool_calls(self, tool_calls):
        """
        Execute independent tool calls in parallel.
        tool_calls: list of {'tool': func, 'params': {...}}
        """
        # Your implementation here
        pass
    
    def optimize_plan(self, execution_plan):
        """
        Optimize execution plan for latency.
        Reorder independent operations, add caching hints.
        """
        # Your implementation here
        pass
    
    def get_stats(self):
        """Get cache hit rates and latency stats"""
        # Your implementation here
        pass
Lines: 1Characters: 0
Ready
The AI Interview - Master AI/ML Interviews