Implement RAG Inside an Agent

Medium
Agents

RAG (Retrieval-Augmented Generation) Inside Agents

RAG allows agents to answer questions grounded in external knowledge.

Task

Build a RAGAgent that:

  1. Ingests documents by chunking with overlap.
  2. Stores chunks in a vector DB.
  3. On query: retrieves top-k chunks, builds augmented prompt, calls LLM.
  4. Returns answer, source chunks, and context used.

Constraints

  • Chunks must overlap by `overlap` characters to preserve context.
  • Retrieved chunks must be deduped before inclusion.
  • Augmented prompt template: `Context: {chunks}\n\nQuestion: {question}\nAnswer:`
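The chunking and dedup constraints above can be sketched as two small helpers (a minimal sketch; `chunk_with_overlap` and `dedupe` are illustrative names, not part of the starter interface):

```python
from typing import List

def chunk_with_overlap(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text so consecutive chunks share `overlap` characters."""
    step = chunk_size - overlap          # advance by less than a full chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break                        # last chunk already covers the tail
    return chunks

def dedupe(chunks: List[str]) -> List[str]:
    """Drop repeated chunks while preserving retrieval order."""
    return list(dict.fromkeys(chunks))
```

For example, `chunk_with_overlap("abcdefghij", chunk_size=4, overlap=2)` yields `['abcd', 'cdef', 'efgh', 'ghij']` — each chunk repeats the last two characters of its predecessor.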

Examples

Example 1:
Input:
agent.ingest(['The Eiffel Tower is in Paris. It was built in 1889.'])
agent.query('When was the Eiffel Tower built?')
Output: {'answer': '1889', 'sources': ['...built in 1889...'], 'context_used': '...'}
Explanation: Chunk containing '1889' retrieved and used to ground the answer.

Starter Code

from typing import Any, Callable, Dict, List

class RAGAgent:
    def __init__(self, vector_db, llm_fn: Callable, k: int = 3):
        self.vector_db = vector_db  # has .add(text) and .search(query, k)
        self.llm_fn = llm_fn
        self.k = k
        self.doc_count = 0

    def ingest(self, documents: List[str]) -> None:
        # TODO: Chunk and index documents
        pass

    def _chunk(self, text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
        # TODO: Split text into overlapping chunks
        pass

    def query(self, question: str) -> Dict[str, Any]:
        # TODO: Retrieve → Augment → Generate
        # Return {'answer': str, 'sources': List[str], 'context_used': str}
        pass

    def update_k(self, k: int) -> None:
        self.k = k
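For reference, one possible end-to-end sketch of the retrieve → augment → generate flow. The `ToyVectorDB` (word-overlap scoring in place of real embeddings) and `stub_llm` are illustrative stand-ins so the flow runs self-contained; a real solution would use an embedding-backed store and an actual LLM call:

```python
from typing import Any, Callable, Dict, List

class ToyVectorDB:
    """Stand-in vector store: ranks chunks by shared words with the query."""
    def __init__(self):
        self.chunks: List[str] = []

    def add(self, text: str) -> None:
        self.chunks.append(text)

    def search(self, query: str, k: int) -> List[str]:
        q = set(query.lower().split())
        ranked = sorted(self.chunks,
                        key=lambda c: len(q & set(c.lower().split())),
                        reverse=True)
        return ranked[:k]

class RAGAgent:
    def __init__(self, vector_db, llm_fn: Callable, k: int = 3):
        self.vector_db = vector_db
        self.llm_fn = llm_fn
        self.k = k
        self.doc_count = 0

    def ingest(self, documents: List[str]) -> None:
        for doc in documents:
            for chunk in self._chunk(doc):
                self.vector_db.add(chunk)
            self.doc_count += 1

    def _chunk(self, text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
        step = chunk_size - overlap      # overlapping windows preserve context
        chunks = []
        for start in range(0, len(text), step):
            chunks.append(text[start:start + chunk_size])
            if start + chunk_size >= len(text):
                break
        return chunks

    def query(self, question: str) -> Dict[str, Any]:
        retrieved = self.vector_db.search(question, self.k)
        unique = list(dict.fromkeys(retrieved))   # dedupe, keep rank order
        context = "\n".join(unique)
        prompt = f"Context: {context}\n\nQuestion: {question}\nAnswer:"
        return {"answer": self.llm_fn(prompt),
                "sources": unique,
                "context_used": context}

# Stub LLM: pattern-matches the prompt so the flow is testable offline.
def stub_llm(prompt: str) -> str:
    return "1889" if "1889" in prompt else "unknown"

agent = RAGAgent(ToyVectorDB(), stub_llm, k=2)
agent.ingest(["The Eiffel Tower is in Paris. It was built in 1889."])
result = agent.query("When was the Eiffel Tower built?")
```

With the stub in place, `result['answer']` is grounded in the retrieved chunk exactly as in Example 1.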
The AI Interview - Master AI/ML Interviews