Large language models (LLMs) are powerful, but they have two key limitations:
  • Finite context — they cannot ingest an entire corpus at once.
  • Static knowledge — their training data is frozen at a point in time.
Retrieval addresses both problems by fetching relevant external knowledge at query time. This is the foundation of **retrieval-augmented generation (RAG)**: augmenting an LLM's answers with context-specific information.

Building a knowledge base

A knowledge base is a repository of documents or structured data that is consulted during retrieval. If you need a custom knowledge base, you can build one from your own data using LangChain's document loaders and vector stores.
If you already have a knowledge base (e.g., a SQL database, CRM, or internal documentation system), you do not need to rebuild it. You can:
  • Connect it as a tool in agentic RAG.
  • Query it and supply the retrieved content as context to the LLM (2-step RAG).
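The second option can be sketched in a few lines. This is a toy illustration, not a prescribed API: SQLite stands in here for an existing SQL database or CRM, and `call_llm` is a hypothetical stub for a real chat-model call.

```python
import sqlite3

# Sketch of 2-step RAG over an existing knowledge base.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faq (topic TEXT, answer TEXT)")
conn.execute("INSERT INTO faq VALUES ('pricing', 'The Pro plan costs $20/month.')")

def retrieve(topic: str) -> str:
    # Step 1: query the existing knowledge base directly.
    row = conn.execute(
        "SELECT answer FROM faq WHERE topic = ?", (topic,)
    ).fetchone()
    return row[0] if row else ""

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real chat-model call.
    return f"[grounded answer] {prompt}"

def answer(question: str, topic: str) -> str:
    # Step 2: supply the retrieved content as context to the LLM.
    context = retrieve(topic)
    return call_llm(f"Context: {context}\nQuestion: {question}")

print(answer("How much does the Pro plan cost?", "pricing"))
```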
See the following tutorial to build a searchable knowledge base and a minimal RAG workflow:

Tutorial: Semantic Search

Learn how to create a searchable knowledge base from your own data using LangChain's document loaders, embeddings, and vector stores. In this tutorial, you will build a search engine over a PDF, enabling retrieval of passages relevant to a query. You will also implement a minimal RAG workflow on top of this engine to see how external knowledge can be integrated into LLM reasoning.

From retrieval to RAG

Retrieval gives an LLM access to relevant context at runtime. But most real applications go a step further: they integrate retrieval with generation to produce grounded, context-aware answers. This is the core idea behind **retrieval-augmented generation (RAG)**. The retrieval pipeline becomes the foundation of a broader system that combines search with generation.

The retrieval pipeline

A typical retrieval workflow looks like this: documents are loaded, split into chunks, embedded, and indexed in a vector store, which is then queried at runtime. Each component is modular: you can swap the loader, splitter, embeddings, or vector store without rewriting your application's logic.
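To make that modularity concrete, here is a toy pipeline (not LangChain's actual API) in which every stage — loader, splitter, embedding model, vector store — is a plain function or class that can be replaced independently; `embed` is a bag-of-words stand-in for a real embedding model.

```python
import math
from collections import Counter

def load(texts):                      # "document loader"
    return list(texts)

def split(doc, size=50):              # "text splitter": fixed-size chunks
    return [doc[i:i + size] for i in range(0, len(doc), size)]

def embed(text):                      # "embedding model": bag-of-words vector
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:                    # "vector store": brute-force search
    def __init__(self):
        self.items = []
    def add(self, chunks):
        self.items += [(c, embed(c)) for c in chunks]
    def search(self, query, k=1):
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [c for c, _ in ranked[:k]]

store = VectorStore()
for doc in load(["Retrieval fetches relevant context at query time."]):
    store.add(split(doc))
print(store.search("relevant context"))
```

Swapping in a different splitter or embedding changes one function, not the surrounding application, which is the property the real library components preserve.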

Building blocks

RAG Architectures

RAG can be implemented in multiple ways, depending on your system’s needs. We outline each type in the sections below.
| Architecture | Description | Control | Flexibility | Latency | Example Use Case |
| --- | --- | --- | --- | --- | --- |
| 2-Step RAG | Retrieval always happens before generation. Simple and predictable. | ✅ High | ❌ Low | ⚡ Fast | FAQs, documentation bots |
| Agentic RAG | An LLM-powered agent decides when and how to retrieve during reasoning. | ❌ Low | ✅ High | ⏳ Variable | Research assistants with access to multiple tools |
| Hybrid | Combines characteristics of both approaches with validation steps. | ⚖️ Medium | ⚖️ Medium | ⏳ Variable | Domain-specific Q&A with quality validation |
Latency: In 2-Step RAG, latency is generally more predictable, as the maximum number of LLM calls is known and capped. This predictability assumes that LLM inference time is the dominant factor. Real-world latency may also be affected by the performance of the retrieval steps themselves, such as API response times, network delays, or database queries, which vary with the tools and infrastructure in use.

2-step RAG

In 2-Step RAG, the retrieval step is always executed before the generation step. This architecture is straightforward and predictable, making it suitable for many applications where the retrieval of relevant documents is a clear prerequisite for generating an answer.
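That predictability can be seen by counting model calls: a 2-step chain makes exactly one generation call per query, regardless of the question. The retriever and LLM below are hypothetical stubs used only to count calls.

```python
# Sketch showing why 2-Step RAG latency is predictable: the chain makes
# exactly one LLM call per query. Both helpers are hypothetical stubs.
llm_calls = 0

def fake_retriever(question: str) -> list[str]:
    return ["doc snippet about " + question]

def fake_llm(prompt: str) -> str:
    global llm_calls
    llm_calls += 1
    return "answer"

def two_step_rag(question: str) -> str:
    docs = fake_retriever(question)           # step 1: always retrieve first
    return fake_llm(f"{docs}\n{question}")    # step 2: exactly one generation

for q in ["refunds", "shipping", "warranty"]:
    two_step_rag(q)
print(llm_calls)  # one call per query
```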

Tutorial: Retrieval-Augmented Generation (RAG)

See how to build a Q&A chatbot that can answer questions grounded in your data using Retrieval-Augmented Generation. This tutorial walks through two approaches:
  • A RAG agent that runs searches with a flexible tool—great for general-purpose use.
  • A 2-step RAG chain that requires just one LLM call per query—fast and efficient for simpler tasks.

Agentic RAG

Agentic retrieval-augmented generation (RAG) combines the strengths of RAG with agent-based reasoning. Rather than retrieving documents once before answering, an LLM-powered agent reasons step by step and decides when and how to retrieve information over the course of the interaction.
The only thing an agent needs to enable RAG behavior is access to one or more tools that can fetch external knowledge — such as documentation loaders, web APIs, or database queries.
```python
import requests
from langchain.tools import tool
from langchain.agents import create_agent


@tool
def fetch_url(url: str) -> str:
    """Fetch text content from a URL."""
    response = requests.get(url, timeout=10.0)
    response.raise_for_status()
    return response.text


system_prompt = """\
Use fetch_url when you need to fetch information from a web page; quote relevant snippets.
"""

agent = create_agent(
    model="claude-sonnet-4-5-20250929",
    tools=[fetch_url],  # A tool for retrieval
    system_prompt=system_prompt,
)
```


Hybrid RAG

Hybrid RAG combines characteristics of 2-step RAG and agentic RAG. It introduces intermediate steps such as query preprocessing, retrieval validation, and post-generation checks. These systems offer more flexibility than a fixed pipeline while retaining some control over execution. Typical components include:
  • Query enhancement: Modify the input question to improve retrieval quality. This can involve rewriting unclear queries, generating multiple variations, or expanding queries with additional context.
  • Retrieval validation: Evaluate whether retrieved documents are relevant and sufficient. If not, the system may refine the query and retrieve again.
  • Answer validation: Check the generated answer for accuracy, completeness, and alignment with source content. If needed, the system can regenerate or revise the answer.
The architecture often supports multiple iterations between these steps. It is well suited to:
  • Applications with ambiguous or underspecified queries
  • Systems that require validation or quality control steps
  • Workflows involving multiple sources or iterative refinement
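The iteration loop can be sketched as follows. All helpers here are hypothetical stubs; a real system would back query enhancement and both validation steps with an LLM and a proper retriever.

```python
# Minimal sketch of a Hybrid RAG loop: query enhancement, retrieval
# validation, generation, and answer validation, with bounded retries.
KB = {"llm context window": "An LLM's context window caps how much text it can ingest at once."}

def enhance_query(query: str) -> str:
    # Query enhancement (stub): normalize the query.
    return query.lower().strip()

def retrieve(query: str) -> list[str]:
    return [text for key, text in KB.items() if key in query]

def docs_are_relevant(docs: list[str]) -> bool:
    # Retrieval validation (stub): did we get anything back?
    return len(docs) > 0

def generate(query: str, docs: list[str]) -> str:
    return f"Based on {len(docs)} source(s): {docs[0]}"

def answer_is_grounded(answer: str, docs: list[str]) -> bool:
    # Answer validation (stub): does the answer cite source content?
    return any(d in answer for d in docs)

def hybrid_rag(query: str, max_iters: int = 3) -> str:
    q = query
    for _ in range(max_iters):
        q = enhance_query(q)
        docs = retrieve(q)
        if not docs_are_relevant(docs):
            q = q + " llm context window"  # stub refinement: broaden the query
            continue
        ans = generate(q, docs)
        if answer_is_grounded(ans, docs):
            return ans
    return "Could not produce a validated answer."

print(hybrid_rag("  What limits an LLM Context Window?  "))
```

The `max_iters` bound is what keeps latency from growing without limit, which is why latency for this architecture is listed as variable rather than unbounded.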

Tutorial: Agentic RAG with Self-Correction

An example of Hybrid RAG that combines agentic reasoning with retrieval and self-correction.
