Develop a function to compute the METEOR score for evaluating machine translation quality. Given a reference translation and a candidate translation, calculate the score based on unigram matches, precision, recall, F-mean, and a penalty for word order fragmentation.
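The quantities named above can be written out as follows. This is a sketch of one common METEOR formulation, assumed here because it is consistent with the starter code's defaults (alpha = 0.9, beta = 3, gamma = 0.5). The symbols m (matched unigrams), |c| and |r| (candidate and reference lengths in words), and ch (number of chunks) are notation introduced for illustration.

```latex
\begin{aligned}
P &= \frac{m}{|c|}, \qquad R = \frac{m}{|r|} \\
F_{\text{mean}} &= \frac{P \cdot R}{\alpha P + (1 - \alpha) R} \\
\text{Penalty} &= \gamma \left( \frac{ch}{m} \right)^{\beta} \\
\text{METEOR} &= F_{\text{mean}} \cdot (1 - \text{Penalty})
\end{aligned}
```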
Examples
Example 1:
Input:
meteor_score('Rain falls gently from the sky', 'Gentle rain drops from the sky')
Output:
0.625
Explanation: The function identifies 4 exact unigram matches ('rain', 'from', 'the', 'sky'); matching is case-insensitive, so 'Rain' matches 'rain'. Note that 'gently' and 'gentle' do NOT match since exact matching is used. Precision = 4/6, Recall = 4/6, giving F-mean ≈ 0.667. The matched reference positions [0, 3, 4, 5] form 2 chunks, so the penalty is 0.5 × (2/4)^3 = 0.0625 and the score is 0.667 × (1 − 0.0625) = 0.625.
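The arithmetic in Example 1 can be checked with a minimal sketch. It is one possible approach, not the reference solution. It assumes lowercasing, whitespace tokenization, a greedy left-to-right alignment (each candidate word takes the first unused identical reference word), and the F-mean and penalty forms implied by the default parameters. The name `meteor_sketch` is hypothetical, chosen to avoid clashing with the starter function.

```python
def meteor_sketch(reference, candidate, alpha=0.9, beta=3, gamma=0.5):
    # Assumption: case-insensitive matching on whitespace-separated tokens.
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0

    # Greedy exact alignment: each candidate word claims the first unused
    # reference position that holds the same word.
    used = [False] * len(ref)
    pairs = []  # (candidate_index, reference_index), in candidate order
    for ci, word in enumerate(cand):
        for ri, ref_word in enumerate(ref):
            if not used[ri] and ref_word == word:
                used[ri] = True
                pairs.append((ci, ri))
                break

    matches = len(pairs)
    if matches == 0:
        return 0.0

    precision = matches / len(cand)
    recall = matches / len(ref)
    # Weighted harmonic mean; with alpha = 0.9 recall dominates.
    f_mean = precision * recall / (alpha * precision + (1 - alpha) * recall)

    # A chunk is a run of matches adjacent in both candidate and reference.
    chunks = 1
    for (c_prev, r_prev), (c_cur, r_cur) in zip(pairs, pairs[1:]):
        if c_cur != c_prev + 1 or r_cur != r_prev + 1:
            chunks += 1

    penalty = gamma * (chunks / matches) ** beta
    return f_mean * (1 - penalty)


score = meteor_sketch('Rain falls gently from the sky',
                      'Gentle rain drops from the sky')
print(round(score, 3))
```

Greedy alignment is the simplest choice and is enough for this example; a fuller implementation might search for the alignment that minimizes the number of chunks when a word occurs more than once.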
Starter Code
import numpy as np
from collections import Counter
def meteor_score(reference, candidate, alpha=0.9, beta=3, gamma=0.5):
    """
    Calculate METEOR score for machine translation evaluation.

    Args:
        reference: Reference translation string
        candidate: Candidate translation string
        alpha: Weight for precision vs recall in F-mean (default 0.9)
        beta: Exponent for fragmentation penalty (default 3)
        gamma: Maximum penalty coefficient (default 0.5)

    Returns:
        METEOR score between 0 and 1
    """
    # Your code here
    pass