I built a sample that I think captures something important: AI agents that interact with the real world need workflows that pause, and Durable Functions make this much easier than current alternatives.
The Problem
Say you’re building a support agent. A customer asks for a refund. The agent can look up the order, check the return policy, and decide a refund is warranted — but it can’t just issue the refund. A human needs to approve it.
[Screenshot: user in Teams]

[Screenshot: supervisor dashboard]

So now you need to:
- Save the pending request somewhere
- Pause the workflow
- Wait for a supervisor to approve or reject (could be hours or days)
- Resume exactly where you left off
- Process the refund and notify the customer
The typical approach? A state machine. You model every state (pending_approval, approved, processing, completed), every transition, and wire up polling or webhooks to detect when things change. You write a bunch of glue code to serialize context, handle edge cases, and coordinate between services.
It works. It’s also tedious, error-prone, and obscures what’s actually a simple workflow.
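To make the comparison concrete, here is a minimal sketch of what the hand-rolled version tends to look like. The state names come from the paragraph above; the `rejected` state, the event names, and the `transition` helper are hypothetical, invented for illustration rather than taken from any real codebase.

```typescript
// Hypothetical hand-rolled state machine for the refund flow.
// `rejected`, the event names, and the helper are assumptions for illustration.
type CaseState = 'pending_approval' | 'approved' | 'rejected' | 'processing' | 'completed';
type CaseEvent = 'approve' | 'reject' | 'refund_started' | 'refund_done';

// Every legal transition has to be listed by hand.
const transitions: Record<CaseState, Partial<Record<CaseEvent, CaseState>>> = {
  pending_approval: { approve: 'approved', reject: 'rejected' },
  approved: { refund_started: 'processing' },
  processing: { refund_done: 'completed' },
  rejected: {},
  completed: {},
};

function transition(state: CaseState, ev: CaseEvent): CaseState {
  const next = transitions[state][ev];
  if (next === undefined) {
    throw new Error(`Illegal transition: ${ev} while ${state}`);
  }
  return next;
}

// Walk one case through the happy path.
let state: CaseState = 'pending_approval';
for (const ev of ['approve', 'refund_started', 'refund_done'] as CaseEvent[]) {
  state = transition(state, ev);
}
console.log(state); // completed
```

And this is only the transition table. Persisting `state` between steps, detecting that an approval arrived, and retrying a failed refund are all still left to write.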
The Durable Functions Approach
Let’s start with the diagram. A customer asks the bot for a refund. The bot uses AI to look up the order, creates a case, and starts a Durable Functions orchestration that pauses until a supervisor approves or rejects it. Once approved, the orchestrator processes the refund and notifies the customer, all without polling or a state machine.

Here’s the entire approval workflow in my sample:
    import { OrchestrationHandler } from 'durable-functions';

    export const supportCaseOrchestrator: OrchestrationHandler = function* (context) {
      const { caseId, action } = context.df.getInput();

      // Mark as pending
      yield context.df.callActivity('updateCase', { caseId, status: 'pending_approval' });

      // Deadline derived from the replay-safe clock, never from Date.now()
      const sevenDaysFromNow = new Date(
        context.df.currentUtcDateTime.getTime() + 7 * 24 * 60 * 60 * 1000
      );

      // Wait for a human; costs nothing while paused
      const approvalTask = context.df.waitForExternalEvent('Approval');
      const timeoutTask = context.df.createTimer(sevenDaysFromNow);
      const winner = yield context.df.Task.any([approvalTask, timeoutTask]);

      if (winner === approvalTask) {
        // A still-pending timer would keep the instance from completing
        timeoutTask.cancel();
      }

      if (winner === approvalTask && approvalTask.result.approved) {
        yield context.df.callActivity('updateCase', { caseId, status: 'approved' });
        if (action === 'refund') {
          yield context.df.callActivity('issueRefund', { caseId });
        }
        yield context.df.callActivity('notifyBot', { caseId, message: 'Approved!' });
      } else {
        // Rejected by the supervisor, or no decision before the deadline
        yield context.df.callActivity('updateCase', { caseId, status: 'rejected' });
        yield context.df.callActivity('notifyBot', { caseId, message: 'Rejected.' });
      }
    };
That’s it. Read it top to bottom: it’s just the workflow. No state machine. No polling. No webhook plumbing. The orchestrator pauses at waitForExternalEvent, checkpoints its progress to storage, and stops executing entirely.
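If it seems odd that a function can stop executing and later pick up mid-body, a toy model may help. The sketch below is not the Durable Functions implementation. It is a deliberately simplified illustration of the checkpoint-and-replay idea: each result is recorded, and state is rebuilt by re-running the generator against that record. All names in it (`run`, `refundFlow`, `Action`) are made up for this example.

```typescript
// Toy model of checkpoint-and-replay. Illustrative only; every name is hypothetical.
type Action =
  | { kind: 'activity'; name: string }
  | { kind: 'waitForEvent'; name: string };

type Orchestrator = () => Generator<Action, string, unknown>;

const executed: string[] = []; // side effects that actually ran

function run(orchestrator: Orchestrator, history: unknown[]): string {
  const gen = orchestrator(); // always start from the top
  let step = gen.next();
  let index = 0;
  while (!step.done) {
    const action: Action = step.value;
    if (index < history.length) {
      // Replay: feed back the recorded result instead of redoing the work.
      step = gen.next(history[index++]);
    } else if (action.kind === 'activity') {
      executed.push(action.name);        // run the activity once...
      history.push(`${action.name}:ok`); // ...and checkpoint its result
    } else {
      return 'WAITING'; // nothing stays in memory; only `history` survives
    }
  }
  return step.value;
}

const refundFlow: Orchestrator = function* () {
  yield { kind: 'activity', name: 'markPending' };
  const approval = (yield { kind: 'waitForEvent', name: 'Approval' }) as { approved: boolean };
  if (!approval.approved) return 'rejected';
  yield { kind: 'activity', name: 'issueRefund' };
  return 'refunded';
};

const caseHistory: unknown[] = [];
console.log(run(refundFlow, caseHistory)); // WAITING
caseHistory.push({ approved: true });      // stands in for raiseEvent
console.log(run(refundFlow, caseHistory)); // refunded
console.log(executed);                     // markPending appears once, not twice
```

The second `run` starts the generator from the top, yet `markPending` is not executed again because its result is already in the history. This is also why real orchestrator code has to be deterministic: replay only works if the function takes the same path every time it sees the same history.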
When a supervisor clicks “Approve” in the dashboard, the dashboard calls the Durable Functions HTTP API with:
raiseEvent('Approval', { approved: true })
passing the case ID. The framework matches this to the paused orchestration instance, replays the orchestrator against its recorded history, and resumes execution at the exact yield where it was waiting. The orchestrator then runs the remaining steps (update the case, process the refund, notify the customer) as if no time had passed.
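For a sense of what that call looks like on the wire, here is a hedged sketch that builds the request for the raise-event route of the Durable Functions HTTP API. The route shape follows the documented API. The host, the key handling, the `buildApprovalRequest` helper, and the assumption that the orchestration's instance ID is the case ID are all illustrative, and the sample's dashboard may do this differently.

```typescript
// Sketch: build the HTTP request a dashboard could send to raise the 'Approval' event.
// Host, key, helper name, and "instance ID == case ID" are assumptions for illustration.
interface RaiseEventRequest {
  method: 'POST';
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildApprovalRequest(
  baseUrl: string,     // e.g. a local Functions host (assumed)
  instanceId: string,  // assumed to be the case ID
  approved: boolean,
  systemKey?: string,  // needed when deployed; often not when running locally
): RaiseEventRequest {
  const path =
    `/runtime/webhooks/durabletask/instances/${encodeURIComponent(instanceId)}/raiseEvent/Approval`;
  const query = systemKey ? `?code=${encodeURIComponent(systemKey)}` : '';
  return {
    method: 'POST',
    url: `${baseUrl}${path}${query}`,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ approved }), // arrives as approvalTask.result
  };
}

const request = buildApprovalRequest('http://localhost:7071', 'case-123', true);
console.log(request.url);
// http://localhost:7071/runtime/webhooks/durabletask/instances/case-123/raiseEvent/Approval
```

Passing the result to `fetch(request.url, request)` would send it. The API answers with 202 Accepted once the event is queued for the instance, so the dashboard does not wait for the refund itself to finish.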
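Key: waitForExternalEvent costs no compute while waiting. No process is running and nothing is polling; on a consumption-based plan you are not billed for the time the orchestration sits idle. The seven-day timeout is a durable timer persisted by the framework, not a loop in your code. Each customer’s case gets its own orchestration instance, waiting independently.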
Why This Matters for AI Agents
As we build agents that do more than just answer questions, agents that take actions, trigger workflows, and interact with external systems, we’re going to hit this pattern constantly:
- Refund approvals: agent submits, human approves
- Deployment requests: agent prepares a change, human confirms
- Escalations: agent triages, human takes over
- Multi-step processes: agent starts, waits for external data, continues
Every one of these is a “pause and wait” problem. You could solve each one with a state machine, a database, and some glue code. Or you could write the workflow as a straight-line function and let the infrastructure handle the rest.
What About the Alternatives?
| Approach | How it works | Why it hurts |
|---|---|---|
| Polling loop | Bot checks a “pending” flag in a database every N seconds | Wastes compute. 1,000 pending cases = 1,000 polling loops. Latency depends on poll interval. |
| Queue + worker | Bot writes to a queue; worker picks up after approval | You build the state machine yourself: track which step each case is on, handle retries, deal with poison messages. “Wait for approval” doesn’t map naturally to a queue. |
| Webhook callback | Bot registers a callback URL; approval service calls it | Bot must be running when the callback arrives hours later. If it restarts, the callback URL may be stale. No built-in retry or state tracking. |
| Database + cron | Store pending cases in DB, cron job checks for approved ones | Same polling problem. Cron frequency = latency floor. State machine lives in application code. Error handling is manual. |
| Durable Functions | waitForExternalEvent pauses at zero cost; raiseEvent resumes instantly | Requires Azure Functions runtime. But: no polling, no state machine code, built-in retry, scales to thousands of concurrent cases. |
Durable Functions win here because:
- Zero-cost waiting: a case pending for 3 days uses no compute until approved
- No state machine: the orchestrator reads like a sequential function, but the framework handles checkpointing, replay, and fault tolerance
- Parallel independence: Alice’s refund and Bob’s escalation are separate instances; approving one doesn’t affect the other
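To put a rough number on the polling row of the table, here is a back-of-the-envelope sketch. The 1,000 cases and the 3-day wait come from the text above; the 30-second poll interval is an assumption picked for illustration.

```typescript
// Back-of-the-envelope cost of polling.
// Case count and wait time come from the post; the 30-second interval is an assumption.
function pollsWhileWaiting(waitSeconds: number, intervalSeconds: number): number {
  return Math.floor(waitSeconds / intervalSeconds);
}

const threeDays = 3 * 24 * 60 * 60; // 259,200 seconds
const perCase = pollsWhileWaiting(threeDays, 30);
const total = perCase * 1000;

console.log(perCase); // 8640
console.log(total);   // 8640000
```

Nearly all of those checks find nothing new, and an approval can still sit unnoticed for up to one full interval. With an event-driven wait, each case wakes up once, when the decision arrives.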
The Full Sample
The durable-support-agent sample has three pieces:
- A Teams bot that uses GPT-4o with tool calling to handle customer support — order lookups, knowledge base search, refund requests, escalations
- Azure Durable Functions that orchestrate the approval workflow with zero-cost pausing
- A Next.js dashboard where supervisors approve or reject pending cases
The whole thing runs locally. The bot creates cases, the orchestrator pauses, the dashboard lets you approve, and the customer gets notified, all coordinated through a workflow you can read in 30 lines.
If you’re building agents that need human-in-the-loop workflows, give Durable Functions a look.
Learn More
- Azure Durable Functions overview: what they are and how they work
- Human interaction pattern: the exact pattern used in this sample (waitForExternalEvent + raiseEvent)
- Durable Functions for JavaScript/TypeScript: quickstart for the Node.js SDK
- Orchestrator function constraints: rules for deterministic replay (important to understand before writing orchestrators)
- Timers in Durable Functions: how createTimer works for timeouts and deadlines
- durable-support-agent sample: the full source code for this post
