
Memory Types

Understand how AI agents store, retrieve, and manage information across different memory systems

Short-Term Memory: The Context Window

In AI agents, short-term memory is implemented via the context window: the recent conversation history sent with each request. It's fast and immediately accessible, but severely limited by token count.
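A minimal sketch of this pattern (the `Agent` class and `call_llm` callback are illustrative names, not a real API): the agent's entire short-term memory is just a list of messages, resent in full with every request.

```python
class Agent:
    def __init__(self):
        self.messages = []  # the "short-term memory": nothing but this list

    def chat(self, user_text, call_llm):
        self.messages.append({"role": "user", "content": user_text})
        # The entire history travels with every request; forget to resend
        # it and the model has no memory of the conversation at all.
        reply = call_llm(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Because the whole list is resent each turn, token usage grows with conversation length, which is exactly why the management strategies below exist.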

Interactive: Context Window Explorer

Adjust the window size (1K to 128K tokens) and see how it affects conversation capacity. For example, a standard 4,000-token window with ~800 tokens per message can store only ~5 messages. Assessment: good for short interactions.

At those settings, the context window contains:

Message #5 (800 tokens) 👤 User: What are the different types of memory?
Message #4 (800 tokens) 🤖 Agent: Memory types include working, episodic, semantic...
Message #3 (800 tokens) 👤 User: Can you explain working memory in detail?
Message #2 (800 tokens) 🤖 Agent: Working memory is temporary storage for...
Message #1 (800 tokens) 👤 User: Previous interaction content (1 turn ago)

Short-Term Memory Strategies

🔄

Sliding Window

Keep only the N most recent messages. Simple but loses older context.

messages = messages[-N:]
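A runnable sketch of the sliding window, assuming OpenAI-style message dicts with `role` and `content` keys; pinning the system prompt so it never slides out is an added assumption, not part of the one-liner above:

```python
def sliding_window(messages, n=10):
    """Keep only the n most recent messages, but always retain the system prompt."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-n:]
```

Simple and predictable, but anything older than the last n messages is gone for good.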
📝

Summarization

Compress old messages into summaries. Preserves key info while reducing tokens.

summary = summarize(old_messages)
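A sketch of summarization-based compaction; `summarize` is a caller-supplied function (in practice, usually itself an LLM call), and the message shape is an assumption, not a specific API:

```python
def compact(messages, summarize, keep_recent=4):
    """Replace all but the most recent messages with a single summary message."""
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    if not old:
        return messages  # nothing to compact yet
    summary = summarize(old)  # e.g. an LLM prompt: "Summarize these turns..."
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + recent
```

The trade-off: summaries are lossy, so details the summarizer deems unimportant cannot be recovered later.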
🎯

Importance Filtering

Keep important messages regardless of age. Drop mundane exchanges.

if importance_score > threshold: keep
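One way to sketch importance filtering; `score` is a caller-supplied heuristic (keyword-based, embedding-based, or another model call), and unconditionally keeping the most recent messages is an added assumption:

```python
def filter_by_importance(messages, score, threshold=0.5, always_keep=4):
    """Drop low-importance older messages; never drop the most recent ones."""
    older, recent = messages[:-always_keep], messages[-always_keep:]
    kept = [m for m in older if score(m) > threshold]
    return kept + recent
```

Note that this preserves relative order, which matters: shuffled context can confuse the model about what happened when.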
🧩

Hybrid Approach

Combine strategies: recent messages + important older ones + summary.

context = recent + important + summary
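The hybrid formula above can be sketched as follows, with `summarize` and `score` supplied by the caller (both hypothetical helpers, as in the previous examples):

```python
def build_context(messages, summarize, score, keep_recent=4, threshold=0.7):
    """context = summary of mundane old turns + important old turns + recent turns."""
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    important = [m for m in older if score(m) > threshold]
    mundane = [m for m in older if score(m) <= threshold]
    parts = []
    if mundane:
        parts.append({"role": "system",
                      "content": "Summary of earlier turns: " + summarize(mundane)})
    return parts + important + recent
```

This keeps token usage bounded while degrading gracefully: recent turns stay verbatim, key older turns survive intact, and everything else collapses into one summary message.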

Context Window Limitations

💸

Cost Scales with Size

Every token in the context is processed and billed. Large windows mean expensive queries, especially across many requests.
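As a back-of-the-envelope illustration, assume a hypothetical price of $3 per million input tokens (actual rates vary by provider and model):

```python
def request_cost(context_tokens, price_per_mtok=3.00):
    """Input cost of one request; $3/Mtok is a hypothetical example rate."""
    return context_tokens / 1_000_000 * price_per_mtok
```

Whatever the rate, the ratio is fixed: a full 128K-token context costs 32x more per request than a 4K one, and that multiplier applies to every single turn of the conversation.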

🐌

Latency Increases

More tokens to process means longer response times. A 128K context feels noticeably slower than a 4K one.

🧹

Eventually Fills Up

Even large windows have limits. Long conversations or document processing will eventually exceed capacity and require truncation.

🔄

Lost When Session Ends

The context window is stateless: when the conversation ends, everything is forgotten unless explicitly saved to long-term storage.
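A minimal sketch of persisting the context before the session ends, using plain JSON files (a real agent would more likely write to a database or vector store; the filenames here are illustrative):

```python
import json

def save_session(messages, path="session.json"):
    """Persist the message list to long-term storage before the session ends."""
    with open(path, "w") as f:
        json.dump(messages, f)

def load_session(path="session.json"):
    """Restore a previous session's messages, or start fresh if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return []  # no saved session: begin with empty short-term memory
```

Call `save_session` at session end (or after every turn) and seed the next session's context with `load_session`, turning volatile short-term memory into a durable record.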