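This section applies only to Kubernetes deployments.
Receivers
Logs
The following example shows a Sidecar collector that reads logs from its own pod and excludes logs from non-application containers. A Sidecar is practical here because the collector needs access to each container's filesystem; you can also use a DaemonSet instead.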
filelog:
  exclude:
    - "**/otc-container/*.log"
  include:
    - /var/log/pods/${POD_NAMESPACE}_${POD_NAME}_${POD_UID}/*/*.log
  include_file_name: false
  include_file_path: true
  operators:
    - id: container-parser
      type: container
  retry_on_failure:
    enabled: true
  start_at: end
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: POD_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid
volumes:
  - name: varlogpods
    hostPath:
      path: /var/log/pods
volumeMounts:
  - name: varlogpods
    mountPath: /var/log/pods
    readOnly: true
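This configuration requires get, list, and watch permissions on pods in the target namespace.
Metrics
Metrics can be scraped from Prometheus endpoints. To avoid scraping the same target more than once, deploy a single-instance Gateway collector. The following configuration scrapes every LangSmith service that follows the default naming: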
prometheus:
  config:
    scrape_configs:
      - job_name: langsmith-services
        metrics_path: /metrics
        scrape_interval: 15s
        # Only scrape endpoints in the LangSmith namespace
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names: [<langsmith-namespace>]
        relabel_configs:
          # Only scrape services with the name langsmith-.*
          - source_labels: [__meta_kubernetes_service_name]
            regex: "langsmith-.*"
            action: keep
          # Only scrape ports with the following names
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            regex: "(backend|platform|playground|redis-metrics|postgres-metrics|metrics)"
            action: keep
          # Promote useful metadata into regular labels
          - source_labels: [__meta_kubernetes_service_name]
            target_label: k8s_service
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: k8s_pod
          # Replace the default "host:port" as Prom's instance label
          - source_labels: [__address__]
            target_label: instance
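This configuration requires get, list, and watch permissions on pods, services, and endpoints in the target namespace.
Traces
To collect traces, enable the OTLP receiver. The following configuration listens for HTTP on port 4318 and for gRPC on port 4317: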
otlp:
  protocols:
    grpc:
      endpoint: 0.0.0.0:4317
    http:
      endpoint: 0.0.0.0:4318
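To illustrate what sends data to these ports, here is a minimal sketch of a container env stanza for a workload instrumented with an OpenTelemetry SDK. The two variable names are the standard OTLP exporter settings from the OpenTelemetry specification. The service name otel-gateway and the namespace observability are hypothetical placeholders, and whether a given LangSmith service reads these variables is an assumption to check against its own documentation.

```yaml
# Hypothetical env stanza for an OpenTelemetry-instrumented container.
env:
  # Base URL of the collector's OTLP/HTTP listener (port 4318 in the receiver above).
  # "otel-gateway" and "observability" are placeholder names, not values from this guide.
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://otel-gateway.observability.svc.cluster.local:4318
  # OTLP over HTTP with protobuf payloads. To use gRPC instead, set this to
  # "grpc" and point the endpoint at port 4317.
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    value: http/protobuf
```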
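Processors
Recommended OTel processors
When using the OTel Collector, we recommend enabling the following processors:
- Batch Processor: groups data into batches before sending it to the exporters.
- Memory Limiter: keeps the collector from crashing due to excessive memory use; once the soft limit is exceeded, it stops accepting new data.
- Kubernetes Attributes Processor: enriches telemetry with Kubernetes metadata such as the pod name.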
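The three processors listed above could be wired together roughly as follows. This is a sketch rather than a recommended production setup: the batch and memory_limiter values are copied from the example configurations in this guide, while the k8sattributes block (the metadata fields and the pod_association rules) is an assumption chosen for illustration. The k8sattributes processor ships in the contrib distribution and needs its own permissions to read pod metadata from the Kubernetes API.

```yaml
processors:
  memory_limiter:
    check_interval: 1m
    limit_percentage: 90
    spike_limit_percentage: 80
  k8sattributes:
    # Authenticate to the Kubernetes API with the collector's service account.
    auth_type: serviceAccount
    extract:
      # Illustrative selection of metadata to attach as resource attributes.
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.node.name
    pod_association:
      # Match telemetry to a pod by its UID attribute when present,
      # otherwise fall back to the source IP of the connection.
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
  batch:
    send_batch_size: 8192
    timeout: 10s
service:
  pipelines:
    traces/langsmith:
      receivers: [otlp]
      # The upstream memory_limiter documentation recommends placing it first,
      # with batch after any processor that adds or drops data.
      processors: [memory_limiter, k8sattributes, batch]
      exporters: [otlphttp/traces]
```

Processors run in the order listed in the pipeline, so the order of that list matters more than the order of the definitions under processors.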
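Exporters
Exporters only need to point at the target external endpoint. The following example shows how to set separate endpoints for logs, metrics, and traces: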
otlphttp/logs:
  endpoint: <your_logs_endpoint>
otlphttp/metrics:
  endpoint: <your_metrics_endpoint>
otlphttp/traces:
  endpoint: <your_traces_endpoint>
The OTel Collector can also export directly to Datadog endpoints.
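As a rough sketch of the Datadog option, the contrib distribution includes a datadog exporter that can replace an otlphttp exporter in a pipeline. The site value and the DD_API_KEY variable name below are placeholders, and the pipeline simply reuses the metrics pipeline name from this guide's gateway example.

```yaml
exporters:
  datadog:
    api:
      # Datadog site for your account, for example datadoghq.com or datadoghq.eu.
      site: datadoghq.com
      # API key read from the collector's environment; DD_API_KEY is a placeholder name.
      key: ${env:DD_API_KEY}
service:
  pipelines:
    metrics/langsmith:
      receivers: [prometheus]
      processors: [batch, memory_limiter]
      exporters: [datadog]
```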
Example collector configuration: logs Sidecar
mode: sidecar
image: otel/opentelemetry-collector-contrib
config:
  receivers:
    filelog:
      exclude:
        - "**/otc-container/*.log"
      include:
        - /var/log/pods/${POD_NAMESPACE}_${POD_NAME}_${POD_UID}/*/*.log
      include_file_name: false
      include_file_path: true
      operators:
        - id: container-parser
          type: container
      retry_on_failure:
        enabled: true
      start_at: end
  processors:
    batch:
      send_batch_size: 8192
      timeout: 10s
    memory_limiter:
      check_interval: 1m
      limit_percentage: 90
      spike_limit_percentage: 80
  exporters:
    otlphttp/logs:
      endpoint: <your-endpoint>
  service:
    pipelines:
      logs/langsmith:
        receivers: [filelog]
        processors: [batch, memory_limiter]
        exporters: [otlphttp/logs]
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: POD_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid
volumes:
  - name: varlogpods
    hostPath:
      path: /var/log/pods
volumeMounts:
  - name: varlogpods
    mountPath: /var/log/pods
    readOnly: true
Example collector configuration: metrics and traces Gateway
mode: deployment
image: otel/opentelemetry-collector-contrib
config:
  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: langsmith-services
            metrics_path: /metrics
            scrape_interval: 15s
            # Only scrape endpoints in the LangSmith namespace
            kubernetes_sd_configs:
              - role: endpoints
                namespaces:
                  names: [<langsmith-namespace>]
            relabel_configs:
              # Only scrape services with the name langsmith-.*
              - source_labels: [__meta_kubernetes_service_name]
                regex: "langsmith-.*"
                action: keep
              # Only scrape ports with the following names
              - source_labels: [__meta_kubernetes_endpoint_port_name]
                regex: "(backend|platform|playground|redis-metrics|postgres-metrics|metrics)"
                action: keep
              # Promote useful metadata into regular labels
              - source_labels: [__meta_kubernetes_service_name]
                target_label: k8s_service
              - source_labels: [__meta_kubernetes_pod_name]
                target_label: k8s_pod
              # Replace the default "host:port" as Prom's instance label
              - source_labels: [__address__]
                target_label: instance
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
  processors:
    batch:
      send_batch_size: 8192
      timeout: 10s
    memory_limiter:
      check_interval: 1m
      limit_percentage: 90
      spike_limit_percentage: 80
  exporters:
    otlphttp/metrics:
      endpoint: <metrics_endpoint>
    otlphttp/traces:
      endpoint: <traces_endpoint>
  service:
    pipelines:
      metrics/langsmith:
        receivers: [prometheus]
        processors: [batch, memory_limiter]
        exporters: [otlphttp/metrics]
      traces/langsmith:
        receivers: [otlp]
        processors: [batch, memory_limiter]
        exporters: [otlphttp/traces]
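The Prometheus receiver in this gateway discovers its targets through the Kubernetes API, which is why the metrics section calls for get, list, and watch on pods, services, and endpoints. A minimal sketch of RBAC objects that would grant those permissions is shown below. The names otel-gateway and otel-gateway-discovery are hypothetical, and the sketch assumes the collector runs in the LangSmith namespace itself; if it runs elsewhere, the subject's namespace in the RoleBinding should be the collector's namespace instead.

```yaml
# Service account the gateway collector pod would run as (hypothetical name).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-gateway
  namespace: <langsmith-namespace>
---
# Namespaced read access to the objects used by endpoints-role service discovery.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otel-gateway-discovery
  namespace: <langsmith-namespace>
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "endpoints"]
    verbs: ["get", "list", "watch"]
---
# Bind the Role to the collector's service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: otel-gateway-discovery
  namespace: <langsmith-namespace>
subjects:
  - kind: ServiceAccount
    name: otel-gateway
    namespace: <langsmith-namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: otel-gateway-discovery
```

A namespaced Role is enough here because the scrape job restricts discovery to a single namespace; a ClusterRole would only be needed if discovery spanned namespaces.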