Safety Constraint Validator

Difficulty: Medium
Category: Agents

Implement a validator that checks agent actions against registered safety constraints:

  1. add_constraint(name, check_fn, severity): Register a constraint.
    • check_fn(action, context) returns a violation dict, or None if the action is allowed.
  2. validate_action(action, context): Check a single action against all registered constraints.
  3. validate_trajectory(actions): Check a sequence of actions.
    • Some constraints apply only to sequences (e.g., 'no more than 3 deletes in a row').
  4. get_violation_summary(): Return aggregate statistics over the recorded violations.
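
The sequence rule quoted in item 3 ('no more than 3 deletes in a row') can be sketched as a run-length check. This is only an illustration: the helper name check_max_consecutive, its parameters, and the extra 'index' key in the result are assumptions, not part of the problem statement.

```python
def check_max_consecutive(actions, target="delete", limit=3):
    """Return a violation dict if `target` occurs more than `limit` times in a row.

    Hypothetical helper: the name, parameters, and return shape are assumptions.
    """
    run = 0  # length of the current run of `target`
    for index, action in enumerate(actions):
        run = run + 1 if action == target else 0
        if run > limit:
            return {
                "description": f"more than {limit} consecutive '{target}' actions",
                "action": action,
                "index": index,  # position of the first action that exceeds the limit
            }
    return None


print(check_max_consecutive(["delete"] * 3 + ["read"]))  # None
print(check_max_consecutive(["delete"] * 4)["index"])    # 3
```

Resetting the counter on any other action is what makes this a check on consecutive deletes rather than on the total number of deletes.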

Constraint Types:

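  • State-independent: Always apply, regardless of context
  • State-dependent: Whether an action is allowed depends on the context
  • Temporal: Apply to sequences of actions rather than to a single action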
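
To make the first two types concrete, here is a small sketch of two check functions with the check_fn(action, context) signature. The function names and the 'read_only' context key are hypothetical and chosen only for illustration. A temporal check is not shown because it needs the action sequence, which a single (action, context) call does not provide unless the context carries it.

```python
def no_delete(action, context):
    """State-independent: the result depends only on the action."""
    if action == "delete":
        return {"description": "delete actions are never allowed"}
    return None


def no_write_when_read_only(action, context):
    """State-dependent: the same action passes or fails depending on context."""
    # 'read_only' is a hypothetical context key used only in this sketch.
    if action == "write" and context.get("read_only", False):
        return {"description": "write attempted while the context is read-only"}
    return None


print(no_delete("read", {}))                                  # None
print(no_write_when_read_only("write", {}))                   # None
print(no_write_when_read_only("write", {"read_only": True}))  # a violation dict
```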

Violation Format: {'constraint': name, 'severity': ..., 'description': ..., 'action': ...}
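
A check_fn may return a dict that lacks some of these keys, so a validator will probably need to fill them in. The sketch below shows one way to do that. The helper name to_violation and the fallback description are assumptions, not part of the specification.

```python
def to_violation(name, severity, action, raw):
    """Wrap whatever a check_fn returned into the violation format above.

    Hypothetical helper; the generic fallback description is an assumption.
    """
    description = raw.get("description") if isinstance(raw, dict) else None
    return {
        "constraint": name,
        "severity": severity,
        "description": description or f"constraint '{name}' violated",
        "action": action,
    }


violation = to_violation("no_delete", "high", "delete", {"violation": True})
print(sorted(violation))  # ['action', 'constraint', 'description', 'severity']
```

Taking the constraint name and severity from the registration, rather than from the check function's return value, keeps every recorded violation consistent even when check functions return minimal dicts.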

Examples

Example 1:
Input: sv = SafetyConstraintValidator(); sv.add_constraint('no_delete', lambda a,c: {'violation': a=='delete'} if a=='delete' else None, 'high'); sv.validate_action('read', {})['safe']
Output: True
Explanation: The 'read' action does not violate the no_delete constraint, so 'safe' is True.

Starter Code

class SafetyConstraintValidator:
    """
    Validate agent actions against safety constraints.
    """
    
    def __init__(self):
        self.constraints = []
        self.violations = []
    
    def add_constraint(self, name, check_fn, severity='high'):
        """Add safety constraint"""
        # Your implementation here
        pass
    
    def validate_action(self, action, context):
        """
        Validate single action against all constraints.
        Returns {'safe': bool, 'violations': [...], 'mitigations': [...]}
        """
        # Your implementation here
        pass
    
    def validate_trajectory(self, actions):
        """
        Validate sequence of actions.
        Some constraints only apply to sequences.
        """
        # Your implementation here
        pass
    
    def get_violation_summary(self):
        """Get summary of all violations"""
        # Your implementation here
        pass
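
One possible way to complete the starter code is sketched below. It is not a reference solution, and several choices are assumptions that the problem statement leaves open: mitigations are collected from an optional 'mitigation' key in the check function's result, validate_trajectory exposes earlier actions to check functions through a 'history' context key, and the summary reports counts by severity and by constraint.

```python
from collections import Counter


class SafetyConstraintValidator:
    """A sketch of one possible completion, under the assumptions stated above."""

    def __init__(self):
        self.constraints = []  # registered constraints, in insertion order
        self.violations = []   # every violation recorded so far

    def add_constraint(self, name, check_fn, severity="high"):
        self.constraints.append({"name": name, "check_fn": check_fn, "severity": severity})

    def validate_action(self, action, context):
        violations, mitigations = [], []
        for constraint in self.constraints:
            raw = constraint["check_fn"](action, context)
            if raw is None:
                continue  # None means the action passes this constraint
            # Assumption: check_fn may supply 'description' and 'mitigation' keys.
            details = raw if isinstance(raw, dict) else {}
            violations.append({
                "constraint": constraint["name"],
                "severity": constraint["severity"],
                "description": details.get("description", f"{constraint['name']} violated"),
                "action": action,
            })
            if "mitigation" in details:
                mitigations.append(details["mitigation"])
        self.violations.extend(violations)
        return {"safe": not violations, "violations": violations, "mitigations": mitigations}

    def validate_trajectory(self, actions):
        # Assumption: temporal constraints read earlier actions from context['history'].
        all_violations = []
        for index, action in enumerate(actions):
            result = self.validate_action(action, {"history": list(actions[:index])})
            all_violations.extend(result["violations"])
        return {"safe": not all_violations, "violations": all_violations}

    def get_violation_summary(self):
        return {
            "total": len(self.violations),
            "by_severity": dict(Counter(v["severity"] for v in self.violations)),
            "by_constraint": dict(Counter(v["constraint"] for v in self.violations)),
        }


sv = SafetyConstraintValidator()
sv.add_constraint("no_delete", lambda a, c: {"violation": True} if a == "delete" else None, "high")
print(sv.validate_action("read", {})["safe"])              # True
print(sv.validate_trajectory(["read", "delete"])["safe"])  # False
print(sv.get_violation_summary()["total"])                 # 1
```

Passing the history through the context lets temporal constraints reuse the same check_fn(action, context) signature as the other two types, at the cost of copying the action prefix at each step.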
The AI Interview - Master AI/ML Interviews