Tracing
cycgraph includes opt-in OpenTelemetry tracing that gives you full visibility into workflow execution — node timings, LLM calls, supervisor decisions, and tool invocations. When tracing is disabled (the default), all tracing code is a no-op with zero overhead.
Quick start
Section titled “Quick start”1. Initialize at startup
Section titled “1. Initialize at startup”Call initTracing() once before any traced code runs:
import { initTracing } from '@cycgraph/orchestrator';
await initTracing('my-app');2. Set the endpoint
Section titled “2. Set the endpoint”Tracing activates when the OTEL_EXPORTER_OTLP_ENDPOINT environment variable is set:
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 node app.jsWhen the variable is unset, initTracing() returns immediately — OpenTelemetry modules are never imported (dynamic imports), so there is zero bundle or runtime cost.
3. View traces
Section titled “3. View traces”cycgraph ships with a Jaeger service in Docker Compose:
docker compose up jaegeropen http://localhost:16686Any OTLP-compatible collector works: Jaeger, Axiom, Honeycomb, Grafana Tempo, LangFuse, or your own.
Span hierarchy
Section titled “Span hierarchy”Every workflow run produces a tree of spans that maps directly to the execution flow:
workflow.run├── node.execute.supervisor│ └── supervisor.route (one per routing decision)├── node.execute.agent│ └── agent.execute (one per LLM call)├── node.execute.evolution│ └── evaluator.evaluate (one per candidate evaluation)└── node.execute.toolEach node.execute.* span captures the node ID and type. Child spans add execution-specific detail.
Span attributes
Section titled “Span attributes”workflow.run
Section titled “workflow.run”| Attribute | Type | Description |
|---|---|---|
workflow.id | string | Workflow ID |
graph.id | string | Graph definition ID |
graph.name | string | Graph name |
run.id | string | Unique run ID |
workflow.duration_ms | number | Total wall-clock duration |
workflow.status | string | Final status (completed, failed, etc.) |
workflow.iterations | number | Total graph iterations executed |
agent.execute
Section titled “agent.execute”| Attribute | Type | Description |
|---|---|---|
agent.id | string | Agent UUID |
agent.model | string | Model ID (e.g. claude-sonnet-4-20250514) |
agent.provider | string | Provider name (e.g. anthropic) |
agent.attempt | number | Retry attempt (1 = first try) |
agent.duration_ms | number | LLM call duration |
agent.tokens.input | number | Input tokens consumed |
agent.tokens.output | number | Output tokens generated |
agent.tokens.total | number | Total tokens |
agent.tools_called | number | Number of tool invocations |
agent.error | string | Error message (on failure only) |
supervisor.route
Section titled “supervisor.route”| Attribute | Type | Description |
|---|---|---|
supervisor.id | string | Supervisor node ID |
supervisor.decision | string | Chosen next node (or __done__) |
supervisor.reasoning | string | LLM’s explanation for the routing choice |
supervisor.iteration | number | Current supervisor iteration |
supervisor.input_tokens | number | Input tokens consumed |
supervisor.output_tokens | number | Output tokens generated |
evaluator.evaluate
Section titled “evaluator.evaluate”| Attribute | Type | Description |
|---|---|---|
evaluator.agent_id | string | Evaluator agent UUID |
evaluator.score | number | Quality score (0.0–1.0) |
evaluator.tokens | number | Total tokens consumed |
Using tracers in custom code
Section titled “Using tracers in custom code”If you build custom node executors or utilities, you can create spans using the exported helpers:
import { getTracer, withSpan } from '@cycgraph/orchestrator';
const tracer = getTracer('my-custom-module');
const result = await withSpan(tracer, 'my.operation', async (span) => { span.setAttribute('my.custom_attr', 'value'); // ... your logic ... return someResult;});withSpan automatically:
- Creates a child span under the current async context
- Sets span status to
OKon success - Sets span status to
ERRORand records the exception on failure - Ends the span in a
finallyblock (guaranteed cleanup)
getTracer() returns a no-op tracer when OpenTelemetry is not initialized, so your code works identically with or without tracing enabled.
Metrics
Section titled “Metrics”initTracing() also initializes an optional metrics subsystem (gated separately by METRICS_ENABLED=true). Built-in metric recording functions:
| Function | What it records |
|---|---|
recordWorkflowDuration(ms) | Workflow wall-clock time |
recordTokensUsed(count) | Token consumption |
recordCostUsd(amount) | Dollar cost |
recordAgentDuration(ms) | Per-agent LLM call time |
incrementWorkflowsStarted() | Workflow start counter |
incrementWorkflowsCompleted() | Workflow completion counter |
incrementWorkflowsFailed() | Workflow failure counter |
All metric functions accept optional labels and are zero-cost no-ops when metrics are disabled.
Graceful shutdown
Section titled “Graceful shutdown”initTracing() registers SIGTERM and SIGINT handlers that flush pending spans and shut down the SDK cleanly. No additional cleanup code is needed.
Next steps
Section titled “Next steps”- Evaluations — verify agent behavior with automated eval suites
- Streaming — real-time event observability (alternative to spans)