Azure Cosmos DB for Apache Gremlin

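Azure Cosmos DB for Apache Gremlin is a graph database service that can store massive graphs with billions of vertices and edges. You can query the graphs with millisecond latency and evolve the graph structure easily. Gremlin is a graph traversal language and virtual machine developed by Apache TinkerPop, a project of the Apache Software Foundation.

This notebook shows how to use LLMs to provide a natural language interface to a graph database that you can query with Gremlin.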

Setup

Install the dependency:

!pip3 install gremlinpython

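You will need an Azure Cosmos DB Graph database instance. One option is to create a free Cosmos DB Graph database instance in Azure. When you create your Cosmos DB account and graph, use /type as the partition key.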

cosmosdb_name = "mycosmosdb"
cosmosdb_db_id = "graphtesting"
cosmosdb_db_graph_id = "mygraph"
cosmosdb_access_Key = "longstring=="

import nest_asyncio
from langchain_community.chains.graph_qa.gremlin import GremlinQAChain
from langchain_community.graphs import GremlinGraph
from langchain_community.graphs.graph_document import GraphDocument, Node, Relationship
from langchain_core.documents import Document
from langchain_openai import AzureChatOpenAI

graph = GremlinGraph(
    url=f"wss://{cosmosdb_name}.gremlin.cosmos.azure.com:443/",
    username=f"/dbs/{cosmosdb_db_id}/colls/{cosmosdb_db_graph_id}",
    password=cosmosdb_access_Key,
)
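Hard-coding the access key in a notebook makes it easy to leak. A minimal sketch, assuming the settings are supplied through environment variables, builds the same three arguments that are passed to `GremlinGraph` above. The variable names (`COSMOSDB_NAME`, `COSMOSDB_DB_ID`, `COSMOSDB_GRAPH_ID`, `COSMOSDB_ACCESS_KEY`) and the helper `gremlin_connection_args` are hypothetical and not part of LangChain.

```python
import os


def gremlin_connection_args(env=None):
    """Build the url/username/password arguments used for GremlinGraph above.

    The environment variable names are assumptions made for this sketch.
    """
    # Fall back to the process environment when no mapping is supplied.
    env = os.environ if env is None else env
    name = env["COSMOSDB_NAME"]
    db_id = env["COSMOSDB_DB_ID"]
    graph_id = env["COSMOSDB_GRAPH_ID"]
    return {
        "url": f"wss://{name}.gremlin.cosmos.azure.com:443/",
        "username": f"/dbs/{db_id}/colls/{graph_id}",
        "password": env["COSMOSDB_ACCESS_KEY"],
    }
```

With those variables set, the connection could then be created as `GremlinGraph(**gremlin_connection_args())`.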

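Populating the database

Assuming your database is empty, you can populate it using GraphDocument objects. For Gremlin, always add a property called label to each node; if no label is set, Node.type is used as the label. For Cosmos DB, natural IDs make sense because they are visible in the graph explorer.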
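The label rule above can be stated as a one-line fallback. The helper below is only an illustration of that rule in plain Python; `effective_label` is a hypothetical name, and this is not the code LangChain itself runs.

```python
def effective_label(node_type, properties):
    """Return the label a node ends up with under the rule described above."""
    # An explicit "label" property wins; otherwise fall back to the node type.
    return properties.get("label", node_type)


# A node with an explicit label keeps it.
labelled = effective_label("Node", {"label": "movie", "title": "The Matrix"})
# A node without a label property falls back to its type.
unlabelled = effective_label("actor", {"name": "Keanu Reeves"})
```

Setting the label explicitly on every node, as the sample data does, avoids depending on the fallback.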

source_doc = Document(
    page_content="Matrix is a movie where Keanu Reeves, Laurence Fishburne and Carrie-Anne Moss acted."
)
movie = Node(id="The Matrix", properties={"label": "movie", "title": "The Matrix"})
actor1 = Node(id="Keanu Reeves", properties={"label": "actor", "name": "Keanu Reeves"})
actor2 = Node(
    id="Laurence Fishburne", properties={"label": "actor", "name": "Laurence Fishburne"}
)
actor3 = Node(
    id="Carrie-Anne Moss", properties={"label": "actor", "name": "Carrie-Anne Moss"}
)
rel1 = Relationship(
    id=5, type="ActedIn", source=actor1, target=movie, properties={"label": "ActedIn"}
)
rel2 = Relationship(
    id=6, type="ActedIn", source=actor2, target=movie, properties={"label": "ActedIn"}
)
rel3 = Relationship(
    id=7, type="ActedIn", source=actor3, target=movie, properties={"label": "ActedIn"}
)
rel4 = Relationship(
    id=8,
    type="Starring",
    source=movie,
    target=actor1,
    properties={"label": "Starring"},
)
rel5 = Relationship(
    id=9,
    type="Starring",
    source=movie,
    target=actor2,
    properties={"label": "Starring"},
)
rel6 = Relationship(
    id=10,
    type="Starring",
    source=movie,
    target=actor3,
    properties={"label": "Starring"},
)
graph_doc = GraphDocument(
    nodes=[movie, actor1, actor2, actor3],
    relationships=[rel1, rel2, rel3, rel4, rel5, rel6],
    source=source_doc,
)

# gremlinpython has problems when running inside a notebook;
# the following line works around them
nest_asyncio.apply()

# Add the document to the Cosmos DB graph
graph.add_graph_documents([graph_doc])

Refresh graph schema information

If the schema of the database changes, you can refresh the schema information.

graph.refresh_schema()

print(graph.schema)

Querying the graph

You can now use the Gremlin QA chain to ask questions of the graph:

chain = GremlinQAChain.from_llm(
    AzureChatOpenAI(
        temperature=0,
        azure_deployment="gpt-4-turbo",
    ),
    graph=graph,
    verbose=True,
)

chain.invoke("Who played in The Matrix?")

chain.invoke("How many people played in The Matrix?")
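The chain asks the LLM to translate each question into a Gremlin traversal and then runs it against the graph. As a rough illustration only (the actual query is generated by the model and may differ), the first question could correspond to a traversal like the one built by this hypothetical helper. The labels and property keys are the ones used in the sample data above.

```python
def actors_in_movie_query(title):
    """Return a Gremlin traversal string listing the names of actors in a movie.

    Hypothetical helper: labels and property keys match the sample data above.
    """
    # Escape backslashes and single quotes so the title stays a valid
    # Gremlin string literal.
    safe_title = title.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "g.V().hasLabel('movie')"
        f".has('title', '{safe_title}')"
        ".in('ActedIn').values('name')"
    )


query = actors_in_movie_query("The Matrix")
```

Comparing a hand-written traversal like this with the query printed by `verbose=True` is a simple way to check that the generated Gremlin matches the intent of the question.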
