Setup
To follow along with this guide, you need Docker and Python 3.x installed. To quickly run Memgraph Platform (Memgraph database + MAGE library + Memgraph Lab) for the first time, follow the Memgraph quick-start instructions for your operating system; on Linux/MacOS, they start the memgraph-mage and memgraph-lab services. With that, Memgraph is up and running! For more information about the installation process, refer to the Memgraph documentation.
To use LangChain, install and import all the required packages. We'll use the pip package manager with the --user flag to ensure proper permissions. If you have Python 3.4 or later installed, pip is included by default. Run the following command to install all the dependencies:
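A minimal install sketch; the exact package list is an assumption based on the imports used later in this guide, so adjust it to whatever your LangChain version actually provides:

```python
# Run in a notebook cell; drop the leading "%" if installing from a shell.
%pip install --user langchain langchain-openai langchain-community langchain-experimental neo4j
```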
Natural language querying
The integration of Memgraph with LangChain supports natural language querying. To use it, first make the necessary imports; we will explain them as we go along in the code. First, instantiate MemgraphGraph. This object holds the connection to the running Memgraph instance. Make sure your environment variables are set up properly.
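A sketch of the instantiation, assuming MemgraphGraph is imported from langchain_community and that the environment variable names and defaults below are placeholders for your own setup:

```python
import os

from langchain_community.graphs import MemgraphGraph

# Connection details for the running Memgraph instance (placeholder values).
url = os.environ.get("MEMGRAPH_URI", "bolt://localhost:7687")
username = os.environ.get("MEMGRAPH_USERNAME", "")
password = os.environ.get("MEMGRAPH_PASSWORD", "")

graph = MemgraphGraph(
    url=url, username=username, password=password, refresh_schema=False
)
```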
refresh_schema is initially set to False because there is still no data in the database and we want to avoid unnecessary database calls.
Populating the database
Before populating the database, make sure it's empty. The most efficient way to do that is to switch to the in-memory analytical storage mode, drop the graph, and switch back to the in-memory transactional mode. Learn more about Memgraph storage modes. We will add data about video games of different genres, available on multiple platforms and related to publishers. The graph object has a query method that executes queries in Memgraph; it is also used by MemgraphQAChain to query the database.
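A sketch of emptying the database and loading a small dataset through graph.query; the Cypher below, including the game, platform, genre, and publisher names, is illustrative data made up for this example:

```python
# Empty the database: switch to analytical mode, drop the graph, switch back.
graph.query("STORAGE MODE IN_MEMORY_ANALYTICAL")
graph.query("DROP GRAPH")
graph.query("STORAGE MODE IN_MEMORY_TRANSACTIONAL")

# Create a small video-game dataset (illustrative data).
graph.query(
    """
    MERGE (g:Game {name: "Baldur's Gate 3"})
    WITH g,
         ["PlayStation 5", "Mac OS", "Windows", "Xbox Series X/S"] AS platforms,
         ["Adventure", "Role-Playing Game", "Strategy"] AS genres
    FOREACH (platform IN platforms |
        MERGE (p:Platform {name: platform})
        MERGE (g)-[:AVAILABLE_ON]->(p)
    )
    FOREACH (genre IN genres |
        MERGE (gn:Genre {name: genre})
        MERGE (g)-[:HAS_GENRE]->(gn)
    )
    MERGE (pub:Publisher {name: "Larian Studios"})
    MERGE (g)-[:PUBLISHED_BY]->(pub);
    """
)
```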
Refresh graph schema
Since new data was written to Memgraph, the schema needs to be refreshed. The generated schema will be used by MemgraphQAChain to instruct the LLM to generate better Cypher queries.
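A short sketch of refreshing and inspecting the schema (get_schema is assumed to be the property the graph object exposes in this integration):

```python
# Re-read the schema from Memgraph and print it for inspection.
graph.refresh_schema()
print(graph.get_schema)
```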
Querying the database
To interact with the OpenAI API, you must configure your API key as an environment variable so that requests are authorized. You can find instructions on how to obtain a key here. To configure the key, use the Python os package. Then create MemgraphQAChain, which will be used in the question-answering process based on your graph data. Set temperature to 0 to get predictable and consistent answers. You can set verbose to True to receive more detailed messages regarding query generation.
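A sketch that sets the key and builds the chain; the MemgraphQAChain import path, the allow_dangerous_requests flag, and the example question are assumptions tied to the LangChain version you run and to the illustrative dataset above:

```python
import os

from langchain_community.chains.graph_qa.memgraph import MemgraphQAChain
from langchain_openai import ChatOpenAI

os.environ["OPENAI_API_KEY"] = "your-key-here"  # placeholder

chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,  # recent LangChain versions require this opt-in
)

response = chain.invoke("Which platforms is Baldur's Gate 3 available on?")
print(response["result"])
```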
Chain modifiers
To modify the behavior of your chain and obtain more context or additional information, you can modify the chain's parameters.

Return direct query results
The return_direct modifier specifies whether to return the direct results of the executed Cypher query or the processed natural language response.
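A quick sketch, assuming return_direct is accepted as a constructor argument through from_llm (the question is illustrative):

```python
# Return the raw Cypher query results instead of a natural language answer.
chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    return_direct=True,
    allow_dangerous_requests=True,
)
print(chain.invoke("Which studio published Baldur's Gate 3?")["result"])
```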
Return query intermediate steps
The return_intermediate_steps chain modifier enhances the returned response by including the intermediate steps of the query in addition to the initial query result.
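A quick sketch, assuming return_intermediate_steps is accepted the same way:

```python
# Include the generated Cypher query and its context in the response.
chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    return_intermediate_steps=True,
    allow_dangerous_requests=True,
)
response = chain.invoke("Is Baldur's Gate 3 an Adventure game?")
print(response["intermediate_steps"])
print(response["result"])
```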
Limit the number of query results
The top_k modifier can be used when you want to restrict the maximum number of query results.
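A quick sketch, assuming top_k is accepted the same way:

```python
# Cap the number of query results passed to the LLM at 2.
chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    top_k=2,
    allow_dangerous_requests=True,
)
print(chain.invoke("What platforms is Baldur's Gate 3 available on?")["result"])
```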
Advanced querying
As the complexity of your solution grows, you might encounter different use-cases that require careful handling. Ensuring your application's scalability is essential to maintain a smooth user flow without any hitches. Let's instantiate our chain once again and attempt to ask some questions that users might potentially ask.

Prompt refinement
A question may fail if the user refers to a platform in a way that doesn't match the stored data, such as PS5 in our case. To address this, we can adjust the initial Cypher prompt of the QA chain by adding guidance to the LLM on how users may refer to specific platforms. We achieve this using the LangChain PromptTemplate, creating a modified initial prompt. This modified prompt is then supplied as an argument to our refined MemgraphQAChain instance.
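A sketch of such a refinement; the template wording is illustrative, and cypher_prompt is assumed to be the argument the chain accepts for a custom Cypher-generation prompt:

```python
from langchain_core.prompts import PromptTemplate

# Illustrative prompt: adds guidance on how users may refer to platforms.
CYPHER_GENERATION_TEMPLATE = """Task: Generate a Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Users may refer to platforms informally, e.g. "PS5" means "PlayStation 5".
Schema:
{schema}
Do not include any explanations or apologies in your response.
The question is:
{question}"""

CYPHER_GENERATION_PROMPT = PromptTemplate(
    input_variables=["schema", "question"],
    template=CYPHER_GENERATION_TEMPLATE,
)

chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    cypher_prompt=CYPHER_GENERATION_PROMPT,  # assumed parameter name
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,
)
```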
Constructing a knowledge graph
Transforming unstructured data into structured data is not an easy or straightforward task. This guide shows how LLMs can be utilized to help us there and how to construct a knowledge graph in Memgraph. Once the knowledge graph is created, you can use it for your GraphRAG application. The steps of constructing a knowledge graph from text are:
- Extracting structured information from text: an LLM is used to extract structured graph information from text in the form of nodes and relationships.
- Storing into Memgraph: Storing the extracted structured graph information into Memgraph.
Extracting structured information from text
Besides all the imports from the setup section, import LLMGraphTransformer and Document, which will be used to extract structured information from text.
Then, create an instance of LLMGraphTransformer from the desired LLM and convert the document to the graph structure.
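A sketch of the extraction step, assuming the experimental graph transformer package and an illustrative input text:

```python
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)
llm_transformer = LLMGraphTransformer(llm=llm)

# Illustrative unstructured text to turn into a graph.
text = """
Charles Robert Darwin was an English naturalist, geologist, and biologist,
widely known for his contributions to evolutionary biology.
"""
documents = [Document(page_content=text)]

# Extract nodes and relationships as GraphDocument objects.
graph_documents = llm_transformer.convert_to_graph_documents(documents)
```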
Storing into Memgraph
Once you have the data ready in the form of GraphDocument objects, that is, nodes and relationships, you can use the add_graph_documents method to import it into Memgraph. That method transforms the list of graph_documents into the appropriate Cypher queries and executes them in Memgraph. Once that's done, the knowledge graph is stored in Memgraph.
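A minimal sketch of the import step, reusing the graph object and the graph_documents produced above:

```python
# Convert the GraphDocument objects into Cypher and execute it in Memgraph.
graph.add_graph_documents(graph_documents)
```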
The graph construction process is non-deterministic, since the LLM used to generate nodes and relationships from unstructured data is itself non-deterministic.
Additional options
Additionally, you have the flexibility to define specific types of nodes and relationships for extraction according to your requirements. You can also choose to add an __Entity__ label to all nodes, which will be indexed for faster retrieval.
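A sketch of restricting extraction to specific types via the allowed_nodes and allowed_relationships parameters of LLMGraphTransformer; the chosen type names are illustrative:

```python
# Only extract the listed node and relationship types.
llm_transformer_filtered = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=["Person", "Nationality", "Concept"],
    allowed_relationships=["NATIONALITY", "INVOLVED_IN"],
)
graph_documents_filtered = llm_transformer_filtered.convert_to_graph_documents(
    documents
)
```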
You can also set include_source to True; the source document is then stored and linked to the nodes in the graph using the MENTIONS relationship.
An id property is generated for the source document, since the document didn't have any id of its own.
You can combine having both the __Entity__ label and the document source. Still, be aware that both take up memory, especially the included source, due to the long content strings.
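A sketch of the import with both options enabled; baseEntityLabel and include_source are assumed to be the parameter names accepted by add_graph_documents in this integration:

```python
# Add the __Entity__ label to every node and link nodes to their source
# document via MENTIONS relationships.
graph.add_graph_documents(
    graph_documents,
    baseEntityLabel=True,
    include_source=True,
)
```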
In the end, you can query the knowledge graph, as explained in the section before:
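For example, reusing the chain pattern from the querying section (the question refers to the illustrative Darwin text above):

```python
# Refresh the schema so the chain sees the newly constructed graph.
graph.refresh_schema()

chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    allow_dangerous_requests=True,
)
print(chain.invoke("Who is Charles Darwin?")["result"])
```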