Memory Compression in Long-Running Agents
Long conversations exceed context windows. Agents must compress history intelligently.
Task
Implement MemoryCompressor that:
- Detects when to compress (token count exceeds threshold).
- Compresses old messages into a summary while keeping recent N messages verbatim.
- Implements hierarchical compression for very long histories.
- Tracks summary history for auditability.
Constraints
- Always keep the last N messages verbatim (default 5).
- Summary message has role 'system' and content 'Previous context: {summary}'.
- Hierarchical compression: compress summaries themselves when needed.
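The constraints above pin down the shape of the summary message and the verbatim tail. A minimal sketch of those two pieces (the helper names `build_summary_message` and `split_history` are illustrative, not part of the required API):

```python
from typing import Dict, List, Tuple


def build_summary_message(summary: str) -> Dict[str, str]:
    # The summary always uses the 'system' role and the required prefix.
    return {"role": "system", "content": f"Previous context: {summary}"}


def split_history(messages: List[Dict], keep_last_n: int = 5) -> Tuple[List[Dict], List[Dict]]:
    # The last N messages stay verbatim; everything before them is summarized.
    if keep_last_n <= 0:
        return list(messages), []
    return messages[:-keep_last_n], messages[-keep_last_n:]
```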
Examples
Example 1:
Input:
compressor.compress([{'role':'user','content':'msg1'}, ...(20 messages)...], keep_last_n=3)
Output:
[{'role':'system','content':'Previous context: ...'}, last_3_messages]
Explanation: Old messages summarized; 3 recent kept verbatim.
Starter Code
from typing import Any, Callable, Dict, List

class MemoryCompressor:
    def __init__(self, llm_fn: Callable, max_tokens: int = 2000, compression_threshold: int = 1500):
        self.llm_fn = llm_fn
        self.max_tokens = max_tokens
        self.compression_threshold = compression_threshold
        self.summary_history: List[str] = []

    def estimate_tokens(self, text: str) -> int:
        # Approximate: 1 token ≈ 4 characters
        return len(text) // 4

    def should_compress(self, messages: List[Dict]) -> bool:
        # TODO: Compare the estimated token total against compression_threshold
        pass

    def compress(self, messages: List[Dict], keep_last_n: int = 5) -> List[Dict]:
        # TODO: Summarize old messages, keep recent ones
        pass

    def hierarchical_compress(self, messages: List[Dict], levels: int = 2) -> Dict:
        # TODO: Multi-level compression for very long histories
        pass