Data Quality Scoring for ML Pipelines

Difficulty: Medium
Topic: MLOps

In production ML systems, data quality is critical for model performance. Poor-quality data can lead to model degradation, biased predictions, and system failures. Your task is to implement a data quality scoring function that evaluates incoming data against a defined schema.

Given a list of data records (dictionaries) and a schema definition, compute the following quality metrics:

  1. Completeness: Percentage of non-null values across all expected fields
  2. Type Validity: Percentage of values that match their expected data types (including null handling based on nullable flag)
  3. Uniqueness Ratio: Percentage of unique records in the dataset
  4. Overall Score: Weighted combination of metrics (40% completeness, 40% type validity, 20% uniqueness)

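The weighted combination in metric 4 can be sketched as a small helper (the function name `overall_score` is illustrative, not part of the required API; note that the weights are applied to the unrounded percentages before the final rounding):

```python
def overall_score(completeness: float, type_validity: float, uniqueness: float) -> float:
    # Weights from the problem statement: 40% / 40% / 20%.
    return round(0.4 * completeness + 0.4 * type_validity + 0.2 * uniqueness, 2)

# With the unrounded percentages from Example 1 below (10/12 = 83.333...%):
print(overall_score(1000 / 12, 1000 / 12, 100.0))  # -> 86.67
```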
The schema is a dictionary where each key is a column name and the value is a specification with:

  • 'type': One of 'numeric', 'categorical', or 'boolean'
  • 'nullable': Boolean indicating if null values are acceptable

For type validity:

  • Numeric type accepts int and float (but not boolean)
  • Categorical type accepts strings
  • Boolean type accepts True/False only
  • If a value is None and the field is nullable, it counts as type-valid
  • If a value is None and the field is not nullable, it counts as type-invalid

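The type rules above can be expressed as a single predicate. This is a minimal sketch; `is_valid_value` is an illustrative helper name, and the one Python-specific subtlety is that `bool` is a subclass of `int`, so booleans must be excluded explicitly from the numeric check:

```python
def is_valid_value(value, spec: dict) -> bool:
    # None is type-valid only when the field is marked nullable.
    if value is None:
        return spec['nullable']
    if spec['type'] == 'numeric':
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if spec['type'] == 'categorical':
        return isinstance(value, str)
    return isinstance(value, bool)  # 'boolean'
```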
Write a function calculate_data_quality_score(data, schema) that returns a dictionary with all four metrics. Return an empty dictionary if the input data is empty. All returned values should be rounded to 2 decimal places; compute the overall score from the unrounded percentages, then round the result.

Examples

Example 1:
Input: data = [{'age': 25, 'name': 'Alice', 'active': True}, {'age': 'thirty', 'name': 'Bob', 'active': False}, {'age': None, 'name': None, 'active': True}, {'age': 40, 'name': 'Dave', 'active': 'yes'}], schema = {'age': {'type': 'numeric', 'nullable': True}, 'name': {'type': 'categorical', 'nullable': True}, 'active': {'type': 'boolean', 'nullable': False}}
Output: {'completeness': 83.33, 'type_validity': 83.33, 'uniqueness_ratio': 100.0, 'overall_score': 86.67}
Explanation: Total fields = 4 rows x 3 columns = 12. Non-null values = 10 (row 3 has 2 nulls), so completeness = 10/12 = 83.33%. For type validity: row 1 has 3 valid values; row 2 has 2 (age is a string, not numeric); row 3 has 3 (both nulls are in nullable fields); row 4 has 2 (active is a string, not a boolean). Type validity = 10/12 = 83.33%. All 4 rows are unique, so uniqueness = 100%. Overall = 0.4*(10/12) + 0.4*(10/12) + 0.2*1.0 = 86.67% (the weighted sum uses the unrounded percentages; rounding 83.33 first would give 86.66 instead).

Starter Code

def calculate_data_quality_score(data: list, schema: dict) -> dict:
    """
    Calculate data quality metrics for ML pipeline monitoring.
    
    Args:
        data: list of dictionaries representing rows of data
        schema: dictionary defining expected columns and their types
                {'column_name': {'type': 'numeric'|'categorical'|'boolean', 'nullable': True|False}}
    
    Returns:
        dict with keys: 'completeness', 'type_validity', 'uniqueness_ratio', 'overall_score'
        All values as percentages (0-100), rounded to 2 decimal places.
    """
    pass
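One possible reference solution is sketched below. It counts non-null and type-valid cells in a single pass, and compares rows for uniqueness via their sorted key-value pairs so that dictionary key order does not affect the result (a design choice; the problem statement does not specify how records are compared):

```python
def calculate_data_quality_score(data: list, schema: dict) -> dict:
    if not data:
        return {}

    total_fields = len(data) * len(schema)
    non_null = 0
    type_valid = 0

    for row in data:
        for col, spec in schema.items():
            value = row.get(col)
            if value is None:
                # Nulls are type-valid only for nullable fields.
                if spec['nullable']:
                    type_valid += 1
                continue
            non_null += 1
            if spec['type'] == 'numeric':
                # bool is a subclass of int, so exclude it explicitly.
                ok = isinstance(value, (int, float)) and not isinstance(value, bool)
            elif spec['type'] == 'categorical':
                ok = isinstance(value, str)
            else:  # 'boolean'
                ok = isinstance(value, bool)
            if ok:
                type_valid += 1

    completeness = non_null / total_fields * 100
    type_validity = type_valid / total_fields * 100

    # Hash rows by their sorted items so dict ordering does not matter.
    unique_rows = {tuple(sorted(row.items())) for row in data}
    uniqueness = len(unique_rows) / len(data) * 100

    # Weight the unrounded percentages, then round once at the end.
    overall = 0.4 * completeness + 0.4 * type_validity + 0.2 * uniqueness

    return {
        'completeness': round(completeness, 2),
        'type_validity': round(type_validity, 2),
        'uniqueness_ratio': round(uniqueness, 2),
        'overall_score': round(overall, 2),
    }
```

On the Example 1 input this returns {'completeness': 83.33, 'type_validity': 83.33, 'uniqueness_ratio': 100.0, 'overall_score': 86.67}, matching the expected output.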