Skip to content

RAG 2.0

When to use

When domain knowledge is too large or dynamic to fit in context windows, and correctness matters.

Architecture sketch

Ingest → Chunk → Enrich (entities, tables) → Index
             ↓                     ↑
         Policies ← Query → Retrieve → Compose → Evals

Pitfalls

Over-chunking without semantic boundaries
Unenriched tables/figures → hallucinated facts
Missing retrieval policies (filters, recency, diversity)

Checklist

[ ] Ground-truth set covering key intents
[ ] Chunking with structure awareness (headings, tables)
[ ] Enrichment: entities, citations, table extraction
[ ] Retrieval policies + diversity
[ ] Compose with citations and abstain behavior

Good–Better–Best

Good: naive vector search, simple compose
Better: hybrid retrieval, citations, abstain
Best: structured enrichments, policies, evaluators per intent

Tiny eval snippet

def evaluate(query, answer, gold):
    return int(citation_present(answer) and contains(gold, answer))