Haystack Agents
Master Haystack for building production-ready RAG agents and NLP pipelines
Building RAG Agents with Haystack
RAG (Retrieval-Augmented Generation) agents combine semantic search with LLM generation to provide accurate, grounded responses based on your documents. Haystack makes building these agents straightforward.
Complete RAG Pipeline
Full RAG Implementation
from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
# Initialize document store and index documents
document_store = InMemoryDocumentStore()
documents = [
Document(content="RAG combines retrieval with generation for accurate answers."),
Document(content="Haystack is an open-source framework for NLP pipelines."),
# ... more documents
]
# Embed and write documents
doc_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
doc_embedder.warm_up()
docs_with_embeddings = doc_embedder.run(documents)
document_store.write_documents(docs_with_embeddings["documents"])
# Build RAG pipeline
rag_pipeline = Pipeline()
# Add components
rag_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
rag_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=5))
rag_pipeline.add_component("ranker", TransformersSimilarityRanker(top_k=3))
rag_pipeline.add_component("prompt_builder", PromptBuilder(template="""
Answer the question based on the provided context.
Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ question }}
Answer:
"""))
rag_pipeline.add_component("generator", OpenAIGenerator(api_key=api_key, model="gpt-4"))
# Connect pipeline
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "ranker.documents")
rag_pipeline.connect("ranker.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "generator.prompt")
# Run query
query = "What is RAG?"
result = rag_pipeline.run({
    "text_embedder": {"text": query},
    "ranker": {"query": query},  # the ranker needs the query to score documents
    "prompt_builder": {"question": query}
})
print(result["generator"]["replies"][0])Interactive: RAG Pipeline Execution
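To check what an answer was grounded on, Pipeline.run can also return intermediate outputs. A minimal sketch, assuming the include_outputs_from parameter available in recent Haystack 2.x releases:

result = rag_pipeline.run(
    {
        "text_embedder": {"text": query},
        "ranker": {"query": query},
        "prompt_builder": {"question": query},
    },
    include_outputs_from={"ranker"},
)
# Inspect the re-ranked documents that were passed to the prompt
for doc in result["ranker"]["documents"]:
    print(doc.score, doc.content)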
Advanced RAG Patterns
🔄 Conversational RAG
Maintain conversation history and context across multiple turns for chat-like experiences.
# Haystack 2.x has no built-in memory component; one simple approach is
# to keep the history in application code and render it through the prompt
# template (assumes the template contains a {{ history }} variable)
history = []
question = "What is RAG?"
result = pipeline.run({"prompt_builder": {"question": question, "history": history}})
history.append((question, result["generator"]["replies"][0]))
🎯 Filtered Retrieval
Apply metadata filters to retrieve only relevant document subsets (by date, author, category).
# Filter by metadata (Haystack 2.x filter syntax)
result = pipeline.run({
    "retriever": {
        "filters": {"operator": "AND", "conditions": [
            {"field": "meta.author", "operator": "==", "value": "John Doe"},
            {"field": "meta.year", "operator": ">=", "value": 2023},
        ]}
    }
})
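Filters match against each document's meta dict, so those fields have to be attached at indexing time. A quick sketch (the author/year fields are illustrative):

Document(
    content="RAG combines retrieval with generation for accurate answers.",
    meta={"author": "John Doe", "year": 2023},
)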
📊 Hybrid Search
Combine BM25 keyword search with semantic embeddings to get the best of both worlds.
# Use both BM25 and embedding retrieval over the same store
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.joiners import DocumentJoiner
pipeline.add_component("bm25_retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("embedding_retriever", InMemoryEmbeddingRetriever(document_store=document_store))
pipeline.add_component("joiner", DocumentJoiner())
pipeline.add_component("joiner", DocumentJoiner())🎯 RAG Best Practices
- Chunk documents wisely: 200-500 words per chunk balances context and precision (see the splitter sketch below)
- Use re-ranking: improves precision by filtering retrieved docs before they reach the LLM
- Include metadata: date, author, and source help with filtering and attribution
- Monitor retrieval quality: track whether relevant docs are actually being retrieved
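For the chunking guidance above, Haystack's DocumentSplitter can produce word-based chunks. A minimal sketch (the 300/50 sizes are illustrative, not prescriptive):

from haystack.components.preprocessors import DocumentSplitter

# Split into ~300-word chunks with 50 words of overlap between neighbors
splitter = DocumentSplitter(split_by="word", split_length=300, split_overlap=50)
chunks = splitter.run(documents=documents)["documents"]
document_store.write_documents(chunks)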