Implement latency optimization for agent systems:
Optimization Strategies:
- Caching: Memoize deterministic tool calls
- Prefetching: Load embeddings before needed
- Parallelization: Run independent calls concurrently
- Plan Optimization: Reorder for parallelism
Methods:
- cached_tool_call(tool_func, params, ttl): Cache with TTL
  - Hash params as cache key
  - Check expiry before returning
- prefetch_embeddings(texts): Async load
- parallel_tool_calls(calls): ThreadPool execution
- optimize_plan(plan): Dependency analysis, reorder
- get_stats(): Hit rates, avg latency
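The parallelization method can be sketched with a standard thread pool. This is an illustrative sketch, not the required implementation; the `max_workers` parameter is an assumption, and results are returned in input order:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_tool_calls(tool_calls, max_workers=4):
    """Run independent tool calls concurrently.

    tool_calls: list of {'tool': func, 'params': {...}} dicts.
    Results come back in the same order as the input list.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every call first so they all run concurrently...
        futures = [pool.submit(call['tool'], **call['params'])
                   for call in tool_calls]
        # ...then collect results in submission order.
        return [f.result() for f in futures]
```

Because the futures are collected in submission order, callers can zip results back to their inputs without extra bookkeeping.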
Cache Key:
hash(tool_name + canonical_json(params))
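A sketch of this key scheme, assuming canonical_json means JSON serialization with sorted keys so that dict ordering does not change the key; SHA-256 is an illustrative hash choice, not mandated by the spec:

```python
import hashlib
import json

def cache_key(tool_name, params):
    """hash(tool_name + canonical_json(params)) as a hex digest."""
    # sort_keys makes the serialization canonical: {'a': 1, 'b': 2}
    # and {'b': 2, 'a': 1} produce the same key.
    canonical = json.dumps(params, sort_keys=True, separators=(',', ':'))
    return hashlib.sha256((tool_name + canonical).encode()).hexdigest()
```

The canonical serialization is what makes the cache safe against equivalent-but-reordered params dicts.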
Examples
Example 1:
Input:
opt = LatencyOptimizer(); result1 = opt.cached_tool_call(lambda: 'expensive', {}, ttl_seconds=60); result2 = opt.cached_tool_call(lambda: 'different', {}, 60); result1 == result2
Output:
True
Explanation: Both calls produce the same cache key (same params), so the second call hits the cache and returns the cached result.
Starter Code
class LatencyOptimizer:
    """
    Optimize agent execution latency through caching, prefetching, and parallelization.
    """

    def __init__(self):
        self.tool_cache = {}
        self.embedding_cache = {}
        self.prefetch_queue = []
        self.execution_stats = {}

    def cached_tool_call(self, tool_func, params, ttl_seconds=60):
        """
        Execute tool with caching.
        Returns cached result if available and not expired.
        """
        # Your implementation here
        pass

    def prefetch_embeddings(self, texts):
        """Prefetch embeddings in background"""
        # Your implementation here
        pass

    def parallel_tool_calls(self, tool_calls):
        """
        Execute independent tool calls in parallel.
        tool_calls: list of {'tool': func, 'params': {...}}
        """
        # Your implementation here
        pass

    def optimize_plan(self, execution_plan):
        """
        Optimize execution plan for latency.
        Reorder independent operations, add caching hints.
        """
        # Your implementation here
        pass

    def get_stats(self):
        """Get cache hit rates and latency stats"""
        # Your implementation here
        pass