How to evaluate an existing experiment (Python only)

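Evaluating an existing experiment is currently supported only in the Python SDK. If you have already run an experiment and want to add more evaluation metrics, you can apply any evaluators to it with the evaluate() / aevaluate() methods, just as before. Pass the experiment name or ID in place of a target function: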

from langsmith import evaluate

def always_half(inputs: dict, outputs: dict) -> float:
    # Placeholder evaluator: ignores its arguments and returns a constant score.
    return 0.5

experiment_name = "my-experiment:abc"  # Replace with an actual experiment name or ID

evaluate(experiment_name, evaluators=[always_half])
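
In practice the added metric usually inspects the run's outputs instead of returning a constant. The following is a minimal sketch under stated assumptions, not the documented recipe: the "answer" key and the add_metric_to_experiment helper are hypothetical names chosen for illustration, and the reference_outputs evaluator argument is assumed to be available in your installed SDK version.

```python
def exact_match(outputs: dict, reference_outputs: dict) -> float:
    """Return 1.0 if the run's answer equals the reference answer, else 0.0.

    Assumption: both dicts store the text under a hypothetical "answer" key.
    """
    predicted = str(outputs.get("answer", "")).strip().lower()
    expected = str(reference_outputs.get("answer", "")).strip().lower()
    return 1.0 if predicted == expected else 0.0


def add_metric_to_experiment(experiment_name: str):
    """Apply exact_match to an experiment that has already been run.

    Hypothetical helper: langsmith is imported lazily so the evaluator above
    can be checked locally without the SDK or API credentials.
    """
    from langsmith import evaluate

    # Pass the experiment name or ID where the target function would normally go.
    return evaluate(experiment_name, evaluators=[exact_match])
```

Keeping the evaluator a plain function makes it easy to check on sample dictionaries before attaching it to a real experiment with add_metric_to_experiment("my-experiment:abc").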

