Short-Term Memory

Master how AI agents manage immediate information through context windows and attention mechanisms

The Immediate Memory Challenge

Imagine trying to hold a conversation while only remembering the last 7 things anyone said. That's essentially what short-term memory is: a temporary workspace where information is held "in mind" just long enough to use it.

For AI agents, short-term memory is implemented through context windows: the maximum amount of text (measured in tokens) that an agent can "see" at once. Everything outside this window is effectively forgotten.

Understanding short-term memory is crucial because it determines how much information an agent can process simultaneously and how well it can maintain conversational coherence.
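
To make the token limit concrete, here is a minimal sketch of counting tokens and checking whether text fits inside a window. It assumes the open-source tiktoken tokenizer and an illustrative 4,096-token limit; a real agent would use whatever tokenizer and limit its underlying model defines.

```python
# A minimal sketch of checking whether text fits in a context window.
# Assumes the open-source `tiktoken` tokenizer; the 4,096-token limit is
# illustrative, not tied to any particular model.
import tiktoken

CONTEXT_WINDOW = 4096  # hypothetical token limit

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Count how many tokens the model would 'see' for this text."""
    return len(enc.encode(text))

def fits_in_context(text: str) -> bool:
    """True if the text fits entirely inside the context window."""
    return count_tokens(text) <= CONTEXT_WINDOW

print(count_tokens("Short-term memory is a temporary workspace."))
print(fits_in_context("Hello, agent!"))  # True: far below the limit
```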

Interactive: Context vs Attention

Context Window: Hard Limit

The context window is a hard boundary on how much text can fit into memory. Think of it as a fixed-size notepad.

Storage Type: Sequential text buffer
Limit Type: Token count (hard ceiling)
When Exceeded: Oldest information is dropped
Example: 4K tokens ≈ 3,000 words

Key Point: Context windows are like a rolling conveyor belt: new information pushes out old information when the limit is reached.
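
The conveyor-belt eviction described above can be sketched in a few lines. This is an illustration rather than any framework's implementation: a hypothetical ShortTermMemory class estimates token counts with a crude heuristic and drops the oldest messages once the budget is exceeded.

```python
# A sketch of the "conveyor belt" behaviour: once the token budget is
# exceeded, the oldest messages are dropped first. The token count is a
# crude words-times-1.3 heuristic (an assumption, not a real tokenizer).
from collections import deque

class ShortTermMemory:
    def __init__(self, max_tokens: int = 4096):
        self.max_tokens = max_tokens
        self.messages = deque()

    def _estimate_tokens(self, text: str) -> int:
        return int(len(text.split()) * 1.3) + 1

    def _total_tokens(self) -> int:
        return sum(self._estimate_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Evict the oldest messages until the buffer fits the budget again.
        while self._total_tokens() > self.max_tokens and len(self.messages) > 1:
            self.messages.popleft()

memory = ShortTermMemory(max_tokens=14)
for turn in ["Hi there!", "Tell me about context windows.", "And what about attention?"]:
    memory.add(turn)

# The first turn has been evicted to stay under the 14-token budget.
print(list(memory.messages))
```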

Interactive: Token Limit Explorer

[Slider from 1K to 128K+ tokens. At the standard 4,096-token setting, suited to most conversations, the explorer reports roughly 3,072 words, a capacity of about 27 messages, and chat apps as the typical use case.]
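
The explorer's numbers follow from two rough rules of thumb: roughly 0.75 words per token and roughly 150 tokens per chat message. A small, hypothetical helper reproduces them:

```python
# Back-of-the-envelope capacity estimates mirroring the explorer above.
# The ratios (about 0.75 words per token, about 150 tokens per chat message)
# are rough rules of thumb, not exact figures.
def window_capacity(tokens: int) -> dict:
    return {
        "approx_words": int(tokens * 0.75),
        "approx_messages": tokens // 150,
    }

print(window_capacity(4_096))    # {'approx_words': 3072, 'approx_messages': 27}
print(window_capacity(128_000))  # a 128K window holds far longer conversations
```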

Why Short-Term Memory Matters

✅ Enables

  • Conversational coherence
  • Multi-turn interactions
  • Context-aware responses
  • Real-time adaptation

⚠️ Limits

  • β€’ How long conversations can last
  • β€’ Amount of information per turn
  • β€’ Ability to reference old messages
  • β€’ Cost per interaction