
Memory Retrieval Strategies

Master how AI agents retrieve relevant memories to support intelligent decision-making and personalized responses

Multi-Factor Ranking

After retrieving candidate memories, agents must rank them to determine which ones to include in the context window. Effective ranking combines multiple factors: semantic relevance, recency, and importance.

Example: Adjusting Ranking Weights

User query: "What ML framework should I use?"
Weights: relevance (R) 50%, recency (T) 30%, importance (I) 20%

| Rank | Memory                                    | Score | R    | T    | I    |
|------|-------------------------------------------|-------|------|------|------|
| 1    | Discussed TensorFlow vs PyTorch yesterday | 0.85  | 0.88 | 0.90 | 0.70 |
| 2    | Working on computer vision project        | 0.84  | 0.85 | 0.80 | 0.85 |
| 3    | User prefers Python for ML projects       | 0.71  | 0.95 | 0.20 | 0.90 |
| 4    | User is senior ML engineer                | 0.59  | 0.72 | 0.10 | 1.00 |
Tip: Weighting recency heavily makes recent conversations dominate the ranking, while weighting relevance heavily prioritizes semantic similarity instead. Balance is key!
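To see where the scores come from, the top result works out by hand with the 50/30/20 weights as:

Score = (0.5 × 0.88) + (0.3 × 0.90) + (0.2 × 0.70) = 0.44 + 0.27 + 0.14 = 0.85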

⚖️ Key Scoring Factors

🎯 Semantic Relevance

Cosine similarity between query and memory embeddings. Captures semantic meaning.

Formula: cos(query_vec, memory_vec)
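A minimal sketch of this computation, assuming the embeddings arrive as NumPy arrays (the embedding model that produces them is out of scope here):

```python
import numpy as np

def cosine_similarity(query_vec: np.ndarray, memory_vec: np.ndarray) -> float:
    """Cosine similarity between a query embedding and a memory embedding."""
    denom = np.linalg.norm(query_vec) * np.linalg.norm(memory_vec)
    # Guard against zero vectors, which would otherwise divide by zero.
    return float(np.dot(query_vec, memory_vec) / denom) if denom else 0.0
```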

⏰ Temporal Recency

Exponential decay based on time since memory creation. Recent = more relevant.

Formula: e^(-λ × age_in_hours)
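The decay formula is a one-liner; λ (decay_rate below) is a tuning knob, and the illustrative value λ = 0.01 halves a memory's recency score roughly every 69 hours (ln 2 / λ):

```python
import math

def recency_score(age_in_hours: float, decay_rate: float = 0.01) -> float:
    """Exponential decay: 1.0 for a brand-new memory, approaching 0 with age."""
    return math.exp(-decay_rate * age_in_hours)
```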

⭐ Memory Importance

User-defined or model-assigned importance score. Core facts rank higher.

Scale: 0.0 (trivial) to 1.0 (critical)

🧮 Combined Ranking Formula

Final Score = (α × relevance) + (β × recency) + (γ × importance)

α (alpha): relevance weight, typically 0.5-0.7
β (beta): recency weight, typically 0.2-0.3
γ (gamma): importance weight, typically 0.1-0.3
Weights must sum to 1.0. Adjust based on your application needs.
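A sketch of the full ranking step under the assumptions above; the Memory fields and helper names are illustrative, not a fixed API, and relevance and recency are assumed to be precomputed as shown earlier:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    relevance: float   # cos(query_vec, memory_vec), precomputed
    recency: float     # e^(-λ × age_in_hours), precomputed
    importance: float  # 0.0 (trivial) to 1.0 (critical)

def rank_memories(memories: list[Memory],
                  alpha: float = 0.5,
                  beta: float = 0.3,
                  gamma: float = 0.2) -> list[Memory]:
    """Sort memories by the weighted score; α + β + γ must equal 1.0."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9, "weights must sum to 1.0"
    def score(m: Memory) -> float:
        return alpha * m.relevance + beta * m.recency + gamma * m.importance
    return sorted(memories, key=score, reverse=True)

# Reproduces the worked example above: the TensorFlow/PyTorch memory
# (0.5×0.88 + 0.3×0.90 + 0.2×0.70 = 0.85) comes out on top.
ranked = rank_memories([
    Memory("User prefers Python for ML projects", 0.95, 0.20, 0.90),
    Memory("Discussed TensorFlow vs PyTorch yesterday", 0.88, 0.90, 0.70),
    Memory("Working on computer vision project", 0.85, 0.80, 0.85),
    Memory("User is senior ML engineer", 0.72, 0.10, 1.00),
])
```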

🚀 Advanced Scoring Techniques

Personalization: Learn user-specific weights over time. If a user frequently references old facts, increase the importance weight.
Context-Aware Scoring: Adjust weights based on query type; factual queries prioritize importance, while conversations prioritize recency.
Diversity Penalty: Reduce the scores of memories too similar to already-selected ones to avoid redundancy (MMR, Maximal Marginal Relevance; see the sketch after this list).
Cross-Encoder Reranking: Use a heavier transformer model to rerank the top candidates with query-memory cross-attention for maximum accuracy.
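A sketch of MMR selection, assuming the same NumPy embedding vectors as above; lambda_param trades relevance (1.0 means pure relevance) against diversity (lower values penalize redundancy more):

```python
import numpy as np

def mmr_select(query_vec: np.ndarray, candidate_vecs: list[np.ndarray],
               k: int = 5, lambda_param: float = 0.7) -> list[int]:
    """Greedily pick k candidates, penalizing similarity to already-picked ones."""
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    remaining = list(range(len(candidate_vecs)))
    selected: list[int] = []
    while remaining and len(selected) < k:
        def mmr_score(i: int) -> float:
            relevance = cos(query_vec, candidate_vecs[i])
            # Redundancy = similarity to the closest already-selected memory.
            redundancy = max((cos(candidate_vecs[i], candidate_vecs[j])
                              for j in selected), default=0.0)
            return lambda_param * relevance - (1 - lambda_param) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected  # indices into candidate_vecs, in selection order
```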