RAG (Retrieval-Augmented Generation) Inside Agents
RAG allows agents to answer questions grounded in external knowledge.
Task
Build a RAGAgent that:
- Ingests documents by chunking with overlap.
- Stores chunks in a vector DB.
- On query: retrieves top-k chunks, builds augmented prompt, calls LLM.
- Returns answer, source chunks, and context used.
Constraints
- Chunks must overlap by `overlap` characters to preserve context.
- Retrieved chunks must be deduped before inclusion.
- Augmented prompt template: `Context: {chunks}\nQuestion: {question}\nAnswer:`
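The overlap constraint can be sketched as follows: each chunk starts `chunk_size - overlap` characters after the previous one, so adjacent chunks share `overlap` characters (a minimal sketch assuming `overlap < chunk_size`; the function name is illustrative).

```python
def chunk(text: str, chunk_size: int = 10, overlap: int = 4) -> list:
    # Each chunk begins (chunk_size - overlap) characters after the
    # previous one, so consecutive chunks share `overlap` characters.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk("abcdefghijklmnop", chunk_size=10, overlap=4)
# → ['abcdefghij', 'ghijklmnop', 'mnop']  (chunks 1 and 2 share 'ghij')
```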
Examples
Example 1:
Input:
```python
agent.ingest(['The Eiffel Tower is in Paris. It was built in 1889.'])
agent.query('When was the Eiffel Tower built?')
```
Output:
```python
{'answer': '1889', 'sources': ['...built in 1889...'], 'context_used': '...'}
```
Explanation: The chunk containing '1889' is retrieved and used to ground the answer.
Starter Code
```python
from typing import Callable, Dict, List

class RAGAgent:
    def __init__(self, vector_db, llm_fn: Callable, k: int = 3):
        self.vector_db = vector_db  # has .add(text) and .search(query, k)
        self.llm_fn = llm_fn
        self.k = k
        self.doc_count = 0

    def ingest(self, documents: List[str]) -> None:
        # TODO: Chunk and index documents
        pass

    def _chunk(self, text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
        # TODO: Split text into overlapping chunks
        pass

    def query(self, question: str) -> Dict:
        # TODO: Retrieve → Augment → Generate
        # Return {'answer': str, 'sources': List[str], 'context_used': str}
        pass

    def update_k(self, k: int) -> None:
        self.k = k
```
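One possible solution sketch, assuming a toy in-memory `SimpleVectorDB` (word-overlap scoring stands in for real embeddings) and a stub `llm_fn`; both are hypothetical stand-ins, not part of the task's API:

```python
from typing import Callable, Dict, List

class SimpleVectorDB:
    # Toy stand-in for a real vector DB: ranks chunks by word
    # overlap with the query instead of embedding similarity.
    def __init__(self):
        self.chunks: List[str] = []
    def add(self, text: str) -> None:
        self.chunks.append(text)
    def search(self, query: str, k: int) -> List[str]:
        q = set(query.lower().split())
        ranked = sorted(self.chunks,
                        key=lambda c: len(q & set(c.lower().split())),
                        reverse=True)
        return ranked[:k]

class RAGAgent:
    def __init__(self, vector_db, llm_fn: Callable[[str], str], k: int = 3):
        self.vector_db = vector_db
        self.llm_fn = llm_fn
        self.k = k
        self.doc_count = 0

    def ingest(self, documents: List[str]) -> None:
        for doc in documents:
            for chunk in self._chunk(doc):
                self.vector_db.add(chunk)
            self.doc_count += 1

    def _chunk(self, text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
        # Consecutive chunks share `overlap` characters (assumes overlap < chunk_size).
        step = chunk_size - overlap
        return [text[i:i + chunk_size] for i in range(0, len(text), step)]

    def query(self, question: str) -> Dict:
        retrieved = self.vector_db.search(question, self.k)
        # Dedupe while preserving retrieval order.
        seen, sources = set(), []
        for c in retrieved:
            if c not in seen:
                seen.add(c)
                sources.append(c)
        context = "\n".join(sources)
        prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
        return {"answer": self.llm_fn(prompt),
                "sources": sources,
                "context_used": context}

# Usage with a stub LLM that just looks for the year in the prompt.
def stub_llm(prompt: str) -> str:
    return "1889" if "1889" in prompt else "unknown"

agent = RAGAgent(SimpleVectorDB(), stub_llm, k=2)
agent.ingest(["The Eiffel Tower is in Paris. It was built in 1889."])
result = agent.query("When was the Eiffel Tower built?")
# → result['answer'] == '1889'
```

The dedupe loop preserves retrieval order rather than using `set()` directly, since ranking order matters when chunks are concatenated into the prompt.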