Agent Failure Recovery System

Medium
Agents

Implement failure recovery for resilient agents:

  1. register_handler(failure_type, handler_fn): Add custom handler
  2. handle_failure(operation, error, context): Main entry
    • Classify error type
    • Select recovery strategy
    • Execute and return result

Error Classification:

  • transient: Network timeout, rate limit → retry
  • logic: Invalid args, assertion fail → fallback
  • critical: Auth fail, security → escalate

Strategies:

  • retry: Exponential backoff, max 3 attempts
  • fallback: Use alternative approach
  • escalate: Human handoff

Return: {'recovered': bool, 'result': ..., 'strategy': ..., 'attempts': int}

Examples

Example 1:
Input: fr = FailureRecovery(); result = fr.handle_failure(lambda: 1/0, Exception('fail'), {'error_type': 'transient'}); 'recovered' in result
Output: True
Explanation: Failure handled, recovery attempted

Starter Code

class FailureRecovery:
    """
    Handle and recover from agent execution failures.
    """
    
    def __init__(self):
        self.failure_handlers = {}
        self.recovery_strategies = {
            'retry': self._retry_strategy,
            'fallback': self._fallback_strategy,
            'escalate': self._escalate_strategy
        }
        self.failure_history = []
    
    def register_handler(self, failure_type, handler_fn):
        """Register handler for specific failure type"""
        # Your implementation here
        pass
    
    def handle_failure(self, operation, error, context):
        """
        Handle failure using appropriate strategy.
        Returns {'recovered': bool, 'result': ..., 'strategy': ...}
        """
        # Your implementation here
        pass
    
    def _retry_strategy(self, operation, context):
        """Retry with exponential backoff"""
        # Your implementation here
        pass
    
    def _fallback_strategy(self, operation, context):
        """Use fallback approach"""
        # Your implementation here
        pass
    
    def _escalate_strategy(self, operation, context):
        """Escalate to human operator"""
        # Your implementation here
        pass
Lines: 1Characters: 0
Ready
The AI Interview - Master AI/ML Interviews