Memory is a system that remembers information from prior interactions. For AI agents, memory is crucial because it lets them recall previous exchanges, learn from feedback, and adapt to user preferences. As agents take on more complex tasks involving many user interactions, this capability becomes essential for both efficiency and user satisfaction.
Short-term memory lets your application remember previous interactions within a single thread or conversation.
A thread organizes multiple interactions in a session, similar to the way email groups messages into a single conversation.
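As a toy sketch of this isolation property (not the LangChain API — the `ThreadMemory` class below is hypothetical), thread-scoped memory amounts to keying each conversation's history by a thread ID:

```typescript
// Hypothetical illustration of thread-scoped short-term memory:
// each thread ID keys its own message history, so conversations stay isolated.
class ThreadMemory {
  private threads = new Map<string, string[]>();

  append(threadId: string, message: string): void {
    const history = this.threads.get(threadId) ?? [];
    history.push(message);
    this.threads.set(threadId, history);
  }

  history(threadId: string): string[] {
    return this.threads.get(threadId) ?? [];
  }
}
```

In practice, LangChain manages this for you via checkpointers and the `thread_id` config value shown below; the sketch only illustrates the isolation between threads.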
Conversation history is the most common form of short-term memory. Long conversations pose a challenge for today's LLMs: the full history may not fit within the model's context window, resulting in lost context or errors.
Even if your model supports the full context length, most LLMs still perform poorly over long contexts. They get "distracted" by stale or off-topic content, while also suffering slower response times and higher costs.
Chat models accept context as messages, which include instructions (a system message) and inputs (human messages). In chat applications, messages alternate between human input and model responses, producing a list that grows longer over time. Because context windows are limited, many applications benefit from techniques to remove or "forget" stale information.
To add short-term memory (thread-level persistence) to an agent, specify a checkpointer when creating the agent.
LangChain's agents manage short-term memory as part of the agent's state. By storing it in the graph's state, the agent can access the full context of a given conversation while keeping separate threads isolated. State is persisted to a database (or to memory) via a checkpointer so a thread can be resumed at any time. Short-term memory updates when the agent is invoked or a step (such as a tool call) completes, and state is read at the start of each step.
import { createAgent } from "langchain";
import { MemorySaver } from "@langchain/langgraph";
const checkpointer = new MemorySaver();
const agent = createAgent({
model: "claude-sonnet-4-5-20250929",
tools: [],
checkpointer,
});
await agent.invoke(
{ messages: [{ role: "user", content: "hi! i am Bob" }] },
{ configurable: { thread_id: "1" } }
);
In production
In production, use a checkpointer backed by a database:
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";
const DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable";
const checkpointer = PostgresSaver.fromConnString(DB_URI);
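Database-backed checkpointers typically need their tables created before first use; with PostgresSaver this is a one-time `setup()` call. A sketch of the full wiring (the connection string is a placeholder for your own database):

```typescript
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";
import { createAgent } from "langchain";

// Placeholder connection string; point this at your own database.
const DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable";
const checkpointer = PostgresSaver.fromConnString(DB_URI);

// One-time initialization: creates the checkpoint tables if they don't exist.
await checkpointer.setup();

const agent = createAgent({
  model: "claude-sonnet-4-5-20250929",
  tools: [],
  checkpointer,
});
```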
Customizing agent memory
By default, agents use @[AgentState] to manage short-term memory, in particular the conversation history via the messages key.
You can extend @[AgentState] with additional fields. A custom state schema is passed to @[create_agent] via the @[state_schema] parameter.
import * as z from "zod";
import { createAgent, createMiddleware } from "langchain";
import { MessagesZodState, MemorySaver } from "@langchain/langgraph";
const customStateSchema = z.object({
messages: MessagesZodState.shape.messages,
userId: z.string(),
preferences: z.record(z.string(), z.any()),
});
const stateExtensionMiddleware = createMiddleware({
name: "StateExtension",
stateSchema: customStateSchema,
});
const checkpointer = new MemorySaver();
const agent = createAgent({
model: "gpt-5",
tools: [],
middleware: [stateExtensionMiddleware] as const,
checkpointer,
});
// Custom state can be passed in invoke; a thread_id is required
// because the agent is configured with a checkpointer
const result = await agent.invoke(
  {
    messages: [{ role: "user", content: "Hello" }],
    userId: "user_123",
    preferences: { theme: "dark" },
  },
  { configurable: { thread_id: "1" } }
);
Common patterns
With short-term memory enabled, long conversations can exceed the LLM’s context window. Common solutions are to trim, delete, or summarize messages, which lets the agent keep track of the conversation without exceeding the LLM’s context window.
Trim messages
Most LLMs have a maximum supported context window (denominated in tokens).
One way to decide when to truncate is to count the tokens in the message history and truncate whenever the count approaches the limit. If you’re using LangChain, you can use the trimMessages utility to specify how many tokens to keep, as well as the strategy (e.g., keep the last maxTokens tokens) for handling the boundary.
To trim message history in an agent, pass a pre-model hook that applies the trimMessages function:
import {
createAgent,
trimMessages,
type AgentState,
} from "langchain";
import { MemorySaver } from "@langchain/langgraph";
// This hook runs before every model call to trim the message history
const stateModifier = async (state: AgentState) => {
return {
messages: await trimMessages(state.messages, {
strategy: "last",
maxTokens: 384,
startOn: "human",
endOn: ["human", "tool"],
tokenCounter: (msgs) => msgs.length, // counts messages, not tokens, for simplicity
}),
};
};
const checkpointer = new MemorySaver();
const agent = createAgent({
model: "gpt-5",
tools: [],
preModelHook: stateModifier,
checkpointer,
});
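The `tokenCounter` above counts messages rather than tokens for simplicity. A rough, self-contained alternative can estimate tokens from character length; the ~4 characters per token ratio is an assumption, so use a real tokenizer for production accuracy:

```typescript
// Rough token estimate assuming ~4 characters per token (an approximation;
// use a real tokenizer for accurate counts).
const approxTokenCounter = (msgs: Array<{ content: unknown }>): number =>
  msgs.reduce((total, m) => {
    const text =
      typeof m.content === "string" ? m.content : JSON.stringify(m.content);
    return total + Math.ceil(text.length / 4);
  }, 0);
```

This function has the same shape as the message-count stand-in and can be passed as trimMessages's `tokenCounter` instead.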
Delete messages
You can delete messages from the graph state to manage the message history.
This is useful when you want to remove specific messages or clear the entire message history.
To delete messages from the graph state, use RemoveMessage. For RemoveMessage to work, the state key must use the messagesStateReducer reducer, as MessagesZodState does.
To remove specific messages:
import { RemoveMessage } from "@langchain/core/messages";
const deleteMessages = (state) => {
const messages = state.messages;
if (messages.length > 2) {
// remove the earliest two messages
return {
messages: messages
.slice(0, 2)
.map((m) => new RemoveMessage({ id: m.id })),
};
}
};
When deleting messages, make sure that the resulting message history is valid. Check the limitations of the LLM provider you’re using. For example:
- Some providers expect the message history to start with a user message.
- Most providers require assistant messages with tool calls to be followed by the corresponding tool result messages.
import { RemoveMessage } from "@langchain/core/messages";
import { AgentState, createAgent } from "langchain";
import { MemorySaver } from "@langchain/langgraph";
const deleteMessages = (state: AgentState) => {
const messages = state.messages;
if (messages.length > 2) {
// remove the earliest two messages
return {
messages: messages
.slice(0, 2)
.map((m) => new RemoveMessage({ id: m.id! })),
};
}
return {};
};
const agent = createAgent({
model: "gpt-5-nano",
tools: [],
prompt: "Please be concise and to the point.",
postModelHook: deleteMessages,
checkpointer: new MemorySaver(),
});
const config = { configurable: { thread_id: "1" } };
const streamA = await agent.stream(
{ messages: [{ role: "user", content: "hi! I'm bob" }] },
{ ...config, streamMode: "values" }
);
for await (const event of streamA) {
const messageDetails = event.messages.map((message) => [
message.getType(),
message.content,
]);
console.log(messageDetails);
}
const streamB = await agent.stream(
{
messages: [{ role: "user", content: "what's my name?" }],
},
{ ...config, streamMode: "values" }
);
for await (const event of streamB) {
const messageDetails = event.messages.map((message) => [
message.getType(),
message.content,
]);
console.log(messageDetails);
}
[['human', "hi! I'm bob"]]
[['human', "hi! I'm bob"], ['ai', 'Hi Bob! How are you doing today? Is there anything I can help you with?']]
[['human', "hi! I'm bob"], ['ai', 'Hi Bob! How are you doing today? Is there anything I can help you with?'], ['human', "what's my name?"]]
[['human', "hi! I'm bob"], ['ai', 'Hi Bob! How are you doing today? Is there anything I can help you with?'], ['human', "what's my name?"], ['ai', 'Your name is Bob.']]
[['human', "what's my name?"], ['ai', 'Your name is Bob.']]
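A pruned history can be sanity-checked against the provider constraints listed earlier. A hedged sketch over a simplified message shape (the `Msg` type is illustrative, not LangChain's BaseMessage):

```typescript
// Simplified message shape for illustration; real BaseMessage objects differ.
type Msg = { type: "human" | "ai" | "tool"; toolCallIds?: string[]; toolCallId?: string };

// Checks two common provider constraints: the history starts with a human
// message, and every AI tool call is answered by a later tool message.
const isValidHistory = (messages: Msg[]): boolean => {
  if (messages.length === 0 || messages[0].type !== "human") return false;
  for (let i = 0; i < messages.length; i++) {
    const m = messages[i];
    if (m.type === "ai" && m.toolCallIds && m.toolCallIds.length > 0) {
      const answered = new Set(
        messages
          .slice(i + 1)
          .filter((x) => x.type === "tool")
          .map((x) => x.toolCallId)
      );
      if (!m.toolCallIds.every((id) => answered.has(id))) return false;
    }
  }
  return true;
};
```

Running such a check after deletion can catch histories that would be rejected by the provider before the next model call.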
Summarize messages
The problem with trimming or removing messages, as shown above, is that culling the message history can lose information.
Because of this, some applications benefit from a more sophisticated approach: summarizing the message history with a chat model.
To summarize message history in an agent, use the built-in summarizationMiddleware:
import { createAgent, summarizationMiddleware } from "langchain";
import { MemorySaver } from "@langchain/langgraph";
const checkpointer = new MemorySaver();
const agent = createAgent({
model: "gpt-4o",
tools: [],
middleware: [
summarizationMiddleware({
model: "gpt-4o-mini",
maxTokensBeforeSummary: 4000,
messagesToKeep: 20,
}),
],
checkpointer,
});
const config = { configurable: { thread_id: "1" } };
await agent.invoke({ messages: [{ role: "user", content: "hi, my name is bob" }] }, config);
await agent.invoke({ messages: [{ role: "user", content: "write a short poem about cats" }] }, config);
await agent.invoke({ messages: [{ role: "user", content: "now do the same but for dogs" }] }, config);
const finalResponse = await agent.invoke({ messages: [{ role: "user", content: "what's my name?" }] }, config);
console.log(finalResponse.messages.at(-1)?.content);
// Your name is Bob!
See summarizationMiddleware for more configuration options.
Access memory
You can access and modify the short-term memory (state) of an agent in several ways:
Access short-term memory (state) in a tool through the tool's runtime config parameter.
The config parameter is hidden from the tool's input schema (so the model doesn't see it), but the tool can read state and runtime context through it.
import * as z from "zod";
import { createAgent, tool } from "langchain";
const contextSchema = z.object({
  userId: z.string(),
});
const getUserInfo = tool(
  async (_, config) => {
    const userId = config.context?.userId;
    return { userId };
  },
  {
    name: "get_user_info",
    description: "Get user info",
    schema: z.object({}),
  }
);
const agent = createAgent({
  model: "gpt-5-nano",
  tools: [getUserInfo],
  contextSchema,
});
const result = await agent.invoke(
{
messages: [{ role: "user", content: "what's my name?" }],
},
{
context: {
userId: "user_123",
},
}
);
console.log(result.messages.at(-1)?.content);
// The response is derived from the userId ("user_123") the tool returned
To modify the agent’s short-term memory (state) during execution, you can return state updates directly from the tools.
This is useful for persisting intermediate results or making information accessible to subsequent tools or prompts.
import * as z from "zod";
import { tool, createAgent } from "langchain";
import { MessagesZodState, Command, getCurrentTaskInput } from "@langchain/langgraph";
const CustomState = z.object({
  messages: MessagesZodState.shape.messages,
  userName: z.string().optional(),
});
const updateUserInfo = tool(
  async (_, config) => {
    const userId = config.context?.userId;
    const name = userId === "user_123" ? "John Smith" : "Unknown user";
    return new Command({
      update: {
        userName: name,
        // update the message history
        messages: [
          {
            role: "tool",
            content: "Successfully looked up user information",
            tool_call_id: config.toolCall?.id,
          },
        ],
      },
    });
  },
  {
    name: "update_user_info",
    description: "Look up and update user info.",
    schema: z.object({}),
  }
);
const greet = tool(
  async () => {
    // Read userName from the agent state updated by update_user_info
    const { userName } = getCurrentTaskInput() as z.infer<typeof CustomState>;
    return `Hello ${userName}!`;
  },
  {
    name: "greet",
    description: "Use this to greet the user once you found their info.",
    schema: z.object({}),
  }
);
const agent = createAgent({
model: "gpt-5-nano",
tools: [updateUserInfo, greet],
stateSchema: CustomState,
});
await agent.invoke(
{ messages: [{ role: "user", content: "greet the user" }] },
{ context: { userId: "user_123" } }
);
Prompt
Access short-term memory (state) in a dynamic prompt function to build prompts from conversation history or custom state fields.
import * as z from "zod";
import { createAgent, tool, SystemMessage } from "langchain";
const contextSchema = z.object({
userName: z.string(),
});
const getWeather = tool(
async ({ city }, config) => {
return `The weather in ${city} is always sunny!`;
},
{
name: "get_weather",
description: "Get the weather for a given city",
schema: z.object({
city: z.string(),
}),
}
);
const agent = createAgent({
model: "gpt-5-nano",
tools: [getWeather],
contextSchema,
prompt: (state, config) => {
return [
new SystemMessage(
`You are a helpful assistant. Address the user as ${config.context?.userName}.`
),
...state.messages,
];
},
});
const result = await agent.invoke(
{
messages: [{ role: "user", content: "What is the weather in SF?" }],
},
{
context: {
userName: "John Smith",
},
}
);
for (const message of result.messages) {
console.log(message);
}
/**
* HumanMessage {
* "content": "What is the weather in SF?",
* // ...
* }
* AIMessage {
* // ...
* "tool_calls": [
* {
* "name": "get_weather",
* "args": {
* "city": "San Francisco"
* },
* "type": "tool_call",
* "id": "call_tCidbv0apTpQpEWb3O2zQ4Yx"
* }
* ],
* // ...
* }
* ToolMessage {
* "content": "The weather in San Francisco is always sunny!",
* "tool_call_id": "call_tCidbv0apTpQpEWb3O2zQ4Yx"
* // ...
* }
* AIMessage {
* "content": "John Smith, here's the latest: The weather in San Francisco is always sunny!\n\nIf you'd like more details (temperature, wind, humidity) or a forecast for the next few days, I can pull that up. What would you like?",
* // ...
* }
*/
Before model
Access short term memory (state) in @[@before_model] middleware to process messages before model calls.
import { createAgent, createMiddleware, trimMessages } from "langchain";
const trimMessageHistory = createMiddleware({
name: "TrimMessages",
beforeModel: async (state) => {
const trimmed = await trimMessages(state.messages, {
maxTokens: 384,
strategy: "last",
startOn: "human",
endOn: ["human", "tool"],
tokenCounter: (msgs) => msgs.length,
});
return { messages: trimmed };
},
});
const agent = createAgent({
model: "gpt-5-nano",
tools: [],
middleware: [trimMessageHistory],
});
After model
Access short term memory (state) in @[@after_model] middleware to process messages after model calls.
import { RemoveMessage } from "@langchain/core/messages";
import { createAgent, createMiddleware } from "langchain";
const validateResponse = createMiddleware({
name: "ValidateResponse",
afterModel: (state) => {
const lastMessage = state.messages.at(-1)?.content;
if (typeof lastMessage === "string" && lastMessage.toLowerCase().includes("confidential")) {
return {
messages: [new RemoveMessage({ id: "all" }), ...state.messages],
};
}
return;
},
});
const agent = createAgent({
model: "gpt-5-nano",
tools: [],
middleware: [validateResponse],
});