- Limited context — they cannot ingest an entire corpus at once.
- Static knowledge — their training data is frozen at a point in time.
Building a knowledge base
A knowledge base is a repository of documents or structured data that is consulted during retrieval. If you need a custom knowledge base, you can build one from your own data using LangChain's document loaders and vector stores. If you already have one (for example, a SQL database, a CRM, or an internal documentation system), you do not need to rebuild it. You can:
- Connect it as a tool in agentic RAG.
- Query it yourself and supply the retrieved content to the LLM as context (2-step RAG).
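For the second option, a minimal sketch using an in-memory SQLite table as a stand-in for an existing SQL knowledge base (the `faq` table, `fetch_context`, and the final model call are illustrative assumptions, not a prescribed API):

```python
import sqlite3

# Stand-in for an existing SQL knowledge base: an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faq (question TEXT, answer TEXT)")
conn.execute(
    "INSERT INTO faq VALUES (?, ?)",
    ("refund policy", "Refunds are processed within 5 business days."),
)

def fetch_context(topic: str) -> str:
    # Query the existing system directly; no rebuild needed.
    rows = conn.execute(
        "SELECT answer FROM faq WHERE question LIKE ?", (f"%{topic}%",)
    ).fetchall()
    return "\n".join(r[0] for r in rows)

context = fetch_context("refund")
prompt = f"Context:\n{context}\n\nQuestion: How long do refunds take?"
# Pass `prompt` to your LLM of choice as the final step.
```

The knowledge base stays exactly where it is; only the retrieved rows travel to the model.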
Tutorial: Semantic search
Learn how to create a searchable knowledge base from your own data using LangChain's document loaders, embeddings, and vector stores.
In this tutorial, you will build a search engine over a PDF, making it possible to retrieve the passages relevant to a query. You will also implement a minimal RAG workflow on top of this engine to see how external knowledge can be integrated into LLM reasoning.
From retrieval to RAG
Retrieval gives LLMs access to relevant context at runtime. Most practical applications go a step further: they integrate retrieval with generation to produce grounded, context-aware answers. This is the core idea behind **retrieval-augmented generation (RAG)**. The retrieval pipeline becomes the foundation of a broader system that combines search with generation.
Retrieval pipeline
A typical retrieval workflow looks as follows. Each component is modular: you can swap out the loader, splitter, embeddings, or vector store without rewriting your application's logic.
Building blocks
Document loaders
Ingest data from external sources (Google Drive, Slack, Notion, etc.), returning standardized Document objects.
Text splitters
Break large documents into smaller chunks that can be retrieved individually and that fit within a model's context window.
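A toy character-based splitter shows the core idea of chunking with overlap. Real splitters (such as LangChain's `RecursiveCharacterTextSplitter`) also prefer natural boundaries like paragraphs and sentences; this sketch deliberately skips that:

```python
# Toy splitter: fixed-size chunks with overlap, so that context
# straddling a chunk boundary is not lost entirely.

def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

text = "0123456789" * 25  # 250 characters
chunks = split_text(text, chunk_size=100, overlap=20)
# The last 20 characters of one chunk equal the first 20 of the next.
```

Overlap is a trade-off: larger overlap preserves more cross-boundary context but stores and embeds more redundant text.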
Embedding models
Embedding models convert text into numeric vectors, so that texts with similar meanings land close to one another in the vector space.
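The "similar meaning, nearby vectors" property is usually measured with cosine similarity. The vectors below are hand-made for illustration; a real embedding model would produce them from text:

```python
import math

# Cosine similarity: 1.0 for identical directions, near 0 for unrelated ones.
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors standing in for real embeddings.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.0]
car = [0.0, 0.1, 0.95]

# "cat" should sit closer to "kitten" than to "car" in this space.
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, car)
```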
Vector stores
Specialized databases for storing and searching embeddings.
Retrievers
A retriever is an interface that returns documents given an unstructured query.
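A minimal sketch of both pieces together: an in-memory store with a retriever-style `similarity_search` method, built on a toy keyword-count `embed` function. All names here are illustrative stand-ins, not LangChain's actual API:

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding: counts of a few keywords. A real embedding model
    # produces dense semantic vectors instead.
    vocab = ["cat", "dog", "car"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

class InMemoryStore:
    """Stores (vector, text) pairs and retrieves by cosine similarity."""

    def __init__(self):
        self.docs: list[tuple[list[float], str]] = []

    def add(self, text: str) -> None:
        self.docs.append((embed(text), text))

    def similarity_search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)

        def score(vec: list[float]) -> float:
            dot = sum(x * y for x, y in zip(q, vec))
            norms = math.sqrt(sum(x * x for x in q)) * math.sqrt(sum(x * x for x in vec))
            return dot / norms if norms else 0.0

        ranked = sorted(self.docs, key=lambda d: score(d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = InMemoryStore()
store.add("the cat sat on the mat")
store.add("the car would not start")
results = store.similarity_search("a cat and a dog", k=1)
```

The retriever interface is what the rest of the application depends on, which is why the store behind it can be swapped freely.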
RAG Architectures
RAG can be implemented in multiple ways, depending on your system's needs. We outline each type in the sections below.
| Architecture | Description | Control | Flexibility | Latency | Example Use Case |
|---|---|---|---|---|---|
| 2-Step RAG | Retrieval always happens before generation. Simple and predictable | ✅ High | ❌ Low | ⚡ Fast | FAQs, documentation bots |
| Agentic RAG | An LLM-powered agent decides when and how to retrieve during reasoning | ❌ Low | ✅ High | ⏳ Variable | Research assistants with access to multiple tools |
| Hybrid | Combines characteristics of both approaches with validation steps | ⚖️ Medium | ⚖️ Medium | ⏳ Variable | Domain-specific Q&A with quality validation |
Latency: Latency is generally more predictable in 2-Step RAG, as the maximum number of LLM calls is known and capped. This predictability assumes that LLM inference time is the dominant factor. However, real-world latency may also be affected by the performance of retrieval steps—such as API response times, network delays, or database queries—which can vary based on the tools and infrastructure in use.
2-step RAG
In 2-Step RAG, the retrieval step is always executed before the generation step. This architecture is straightforward and predictable, making it suitable for many applications where the retrieval of relevant documents is a clear prerequisite for generating an answer.
Tutorial: Retrieval-Augmented Generation (RAG)
See how to build a Q&A chatbot that can answer questions grounded in your data using Retrieval-Augmented Generation.
This tutorial walks through two approaches:
- A RAG agent that runs searches with a flexible tool—great for general-purpose use.
- A 2-step RAG chain that requires just one LLM call per query—fast and efficient for simpler tasks.
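The 2-step flow can be sketched end to end. `retrieve` and `generate` below are hypothetical stand-ins for a real retriever and a single LLM call; the fixed shape of the chain is the point:

```python
def retrieve(question: str) -> list[str]:
    # Step 1: retrieval always runs first in 2-step RAG.
    return ["LangChain ships document loaders and vector stores."]

def generate(prompt: str) -> str:
    # Step 2: stand-in for the one (and only) LLM call in the chain.
    return f"[LLM answer based on a prompt of {len(prompt)} chars]"

def rag_chain(question: str) -> str:
    docs = retrieve(question)
    context = "\n\n".join(docs)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)

answer = rag_chain("What does LangChain provide?")
```

Because the number of LLM calls is fixed at one, latency and cost are easy to predict, which is exactly the trade-off the table above describes.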
Agentic RAG
Agentic retrieval-augmented generation (RAG) combines the strengths of retrieval-augmented generation with agent-based reasoning. Instead of retrieving documents before answering, an LLM-powered agent reasons step by step and decides when and how to retrieve information over the course of the interaction.
Tutorial: Retrieval-Augmented Generation (RAG)
See how to build a Q&A chatbot that can answer questions grounded in your data using Retrieval-Augmented Generation.
This tutorial walks through two approaches:
- A RAG agent that runs searches with a flexible tool—great for general-purpose use.
- A 2-step RAG chain that requires just one LLM call per query—fast and efficient for simpler tasks.
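The agentic pattern can be sketched with an explicit decision step standing in for the agent's tool-calling reasoning. `needs_retrieval` and `search_tool` are hypothetical; in a real agent, the LLM itself decides by emitting a tool call:

```python
def needs_retrieval(question: str) -> bool:
    # Stand-in for the agent's reasoning; a real agent lets the LLM
    # decide whether to call the retrieval tool.
    return "docs" in question.lower()

def search_tool(query: str) -> str:
    # Stand-in for a retrieval tool the agent can invoke.
    return f"[retrieved passages for: {query}]"

def agent(question: str) -> str:
    if needs_retrieval(question):  # the agent chooses to retrieve
        evidence = search_tool(question)
        return f"Answer grounded in {evidence}"
    return "Answer from model knowledge alone"

a1 = agent("What do the docs say about retries?")
a2 = agent("What is 2 + 2?")
```

Because retrieval is conditional and may repeat, the number of LLM calls (and thus latency) varies per query, matching the "Variable" entry in the table above.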
Hybrid RAG
Hybrid RAG combines characteristics of 2-step RAG and agentic RAG. It introduces intermediate steps such as query preprocessing, retrieval validation, and post-generation checks. These systems offer more flexibility than a fixed pipeline while retaining some control over execution. Typical components include:
- Query enhancement: Modify the input question to improve retrieval quality. This can involve rewriting unclear queries, generating multiple variations, or expanding queries with additional context.
- Retrieval validation: Evaluate whether retrieved documents are relevant and sufficient. If not, the system may refine the query and retrieve again.
- Answer validation: Check the generated answer for accuracy, completeness, and alignment with source content. If needed, the system can regenerate or revise the answer.
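The first two components can be sketched as a loop with one refinement round. `retrieve`, `is_sufficient`, and `refine` are hypothetical stand-ins for what would normally be LLM-backed steps:

```python
def retrieve(query: str) -> list[str]:
    # Stand-in for a retriever over a tiny fixed knowledge base.
    kb = {"llm agents": ["Agents call tools in a loop."]}
    return kb.get(query, [])

def is_sufficient(docs: list[str]) -> bool:
    # Retrieval validation, reduced here to "did we get anything back?"
    return len(docs) > 0

def refine(query: str) -> str:
    # Query enhancement: rewrite the query (a real system would use an LLM).
    return "llm agents"

def hybrid_rag(query: str) -> list[str]:
    docs = retrieve(query)
    if not is_sufficient(docs):         # validation step
        docs = retrieve(refine(query))  # one bounded refinement round
    return docs

docs = hybrid_rag("agents???")
```

Bounding the number of refinement rounds is what keeps hybrid systems more controlled than a fully open-ended agent loop.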
Hybrid RAG is a good fit for:
- Applications with ambiguous or underspecified queries
- Systems that require validation or quality control steps
- Workflows involving multiple sources or iterative refinement
Tutorial: Agentic RAG with Self-Correction
An example of Hybrid RAG that combines agentic reasoning with retrieval and self-correction.