Human-in-the-Loop
The Human-in-the-Loop (HITL) pattern allows workflows to pause mid-execution, wait for human input or approval, and resume exactly where they left off without losing state or context.
How it works
Section titled “How it works”flowchart TB
Agent["Writer Agent"] --> Approval{"Approval Gate"}
Approval --> |"Workflow pauses"| Wait(["⏸ Waiting for human"])
Wait --> |"Human reviews"| Decision{Approved?}
Decision --> |"Yes"| Resume["Workflow resumes"]
Decision --> |"No"| Reject["Route to rejection node/retry"]
Resume --> Next["Publisher Agent"]
- Execution: A workflow proceeds normally until it hits a node of type
approval. - Pause: The orchestrator completely halts execution, persists the current state to the database, and emits a
workflow:waitingevent. - Review: The system waits. Human operators can take seconds or days to review the data via a UI or ChatOps integration.
- Resume: An API call is made back to the orchestrator supplying the human’s decision (e.g., approved/rejected) and optional feedback.
- Continuation: The workflow wakes up, injects the human’s response into the state memory, and traverses the next edge.
When to use this pattern
Section titled “When to use this pattern”- High-stakes actions: an agent proposes a production deployment, financial transaction, or email blast, but a human must sign off before execution.
- Content publication: a writer agent produces a draft, and a human editor reviews and approves before publishing.
- Compliance and auditing: automated analysis that requires a mandatory human compliance review before proceeding.
- Iterative feedback: a human provides specific, nuanced feedback during the pause, which is fed back to the agent for revision (a HITL + Self-Annealing hybrid).
Implementation example
Section titled “Implementation example”This example demonstrates a classic approval gate: A Writer agent drafts an article, execution pauses for a human to review the draft, and then upon approval, a Publisher agent finalizes it.
See the full runnable code.
1. The Approval Node
Section titled “1. The Approval Node”Instead of managing complex pausing logic in code, you simply declare an approval node in your createGraph definition.
import { createGraph } from '@cycgraph/orchestrator';
const graph = createGraph({ name: 'Human-in-the-Loop', nodes: [ // ... writer agent node ... { id: 'review', type: 'approval', read_keys: ['*'], write_keys: ['*', 'control_flow'], approval_config: { approval_type: 'human_review', prompt_message: 'Please review the draft before publication.', review_keys: ['draft'], // The specific memory keys the human needs to see timeout_ms: 300_000, // Hard timeout if the human never responds }, }, // ... publisher agent node ... ], edges: [ { source: 'write', target: 'review' }, { source: 'review', target: 'publish' }, ], start_node: 'write', end_nodes: ['publish'],});2. The Initial Run
Section titled “2. The Initial Run”When you execute the workflow, it will automatically pause when it reaches the review node.
const runner1 = new GraphRunner(graph, initialState, { persistStateFn: async (s) => persistence.saveWorkflowSnapshot(s),});
// The run() promise resolves early with a 'waiting' statusconst pausedState = await runner1.run();
if (pausedState.status === 'waiting') { const pending = pausedState.memory._pending_approval;
// E.g. Send to Slack: // "Please review the draft before publication." // "\nDraft content: " + pending.review_data.draft console.log(pending.prompt_message); console.log(pending.review_data.draft);}3. Resuming the Workflow
Section titled “3. Resuming the Workflow”Later, when your user clicks “Approve” or “Reject” in your UI, you instantiate a new GraphRunner with the persisted state, apply their response, and run it again.
// 1. Fetch the paused state from your DB (latest version for the run)const stateFromDB = await persistence.loadLatestWorkflowState(runId);
// 2. Create a fresh runnerconst runner2 = new GraphRunner(graph, stateFromDB, { persistStateFn: async (s) => persistence.saveWorkflowSnapshot(s),});
// 3. Inject the human's decisionrunner2.applyHumanResponse({ decision: 'approved', data: 'Looks great, but make the headline punchier.', // Optional feedback});
// 4. Resume executionconst finalState = await runner2.run();When the workflow resumes:
- The human’s
decisionstring is saved to state memory underhuman_decision. - The human’s
datastring is saved to state memory underhuman_response. - The downstream agents (like the Publisher) can read these fields to incorporate the feedback into their final output.
Approval gate timeouts
Section titled “Approval gate timeouts”When timeout_ms is set on an approval node, the engine sets waiting_timeout_at on the workflow state to the current time plus the timeout duration. This creates a hard deadline for human response.
If the workflow is resumed after the deadline has expired, the engine transitions the workflow to timeout status immediately with a WorkflowTimeoutError — the human’s response is discarded and execution does not continue. This prevents workflows from waiting indefinitely for human approval that may never come.
{ id: 'review', type: 'approval', approval_config: { approval_type: 'human_review', prompt_message: 'Approve deployment to production?', review_keys: ['deployment_plan'], timeout_ms: 600_000, // 10-minute deadline },}If no human responds within 10 minutes and a resume is attempted after that window, the workflow fails with WorkflowTimeoutError. To handle timeouts gracefully, check state.status === 'timeout' in your application code and trigger appropriate fallback logic.
Worker-based HITL
Section titled “Worker-based HITL”When using the WorkflowWorker for distributed execution, HITL is handled without blocking any worker:
- The API enqueues a
{ type: 'start' }job - A worker runs the workflow until it hits an approval node → returns
status: 'waiting' - The worker calls
queue.release(jobId)— transitions the job topausedstatus, freeing the slot immediately (no blocking). The paused job is not re-claimable bydequeue, so the worker won’t re-execute the approval gate. - Later, the API acks the original paused job (cleanup) and enqueues a
{ type: 'resume', human_response: { decision: 'approved' } }job - A worker (same or different) picks up the resume job, recovers via event log, applies the response, and continues
import { InMemoryWorkflowQueue } from '@cycgraph/orchestrator';
const queue = new InMemoryWorkflowQueue();
// 1. Start the workflowawait queue.enqueue({ type: 'start', run_id: runId, graph_id: graph.id, initial_state: { goal: 'Write an article' },});
// 2. Worker runs it, hits approval node, releases the job (→ paused status)
// 3. Later, when the human approves:await queue.ack(startJobId); // Clean up the paused original jobawait queue.enqueue({ type: 'resume', run_id: runId, // Same run graph_id: graph.id, human_response: { decision: 'approved', data: 'Looks great!' },});The key distinction: release (not nack) transitions the job to paused status without counting it as a failure. Paused jobs are not re-claimable by dequeue — this prevents the worker from re-executing the approval gate in a loop while awaiting a human response. A separate resume job carries the human’s response.