Stop Building State Machines for Your AI Agents (use Durable Functions instead)

I built a sample that I think captures something important: AI agents that interact with the real world need workflows that pause, and Durable Functions make this much easier than current alternatives.

The Problem

Say you’re building a support agent. A customer asks for a refund. The agent can look up the order, check the return policy, and decide a refund is warranted — but it can’t just issue the refund. A human needs to approve it.

[Screenshot: user in Teams]

[Screenshot: supervisor dashboard]

So now you need to:

  1. Save the pending request somewhere
  2. Pause the workflow
  3. Wait for a supervisor to approve or reject (could be hours or days)
  4. Resume exactly where you left off
  5. Process the refund and notify the customer

The typical approach? A state machine. You model every state (pending_approval → approved → processing → completed), every transition, and wire up polling or webhooks to detect when things change. You write a bunch of glue code to serialize context, handle edge cases, and coordinate between services.
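To make that concrete, here is a minimal sketch of the kind of transition table you end up maintaining by hand. The event names (approve, reject, start_refund, refund_done) and the transition helper are hypothetical, made up for illustration; only the state names come from the workflow described here.

```typescript
// Hypothetical states and events for a hand-rolled refund workflow.
type CaseState = 'pending_approval' | 'approved' | 'processing' | 'completed' | 'rejected';
type CaseEvent = 'approve' | 'reject' | 'start_refund' | 'refund_done';

// Every legal transition has to be listed explicitly.
const transitions: Record<CaseState, Partial<Record<CaseEvent, CaseState>>> = {
  pending_approval: { approve: 'approved', reject: 'rejected' },
  approved: { start_refund: 'processing' },
  processing: { refund_done: 'completed' },
  completed: {},
  rejected: {},
};

// Returns the next state, or throws if the event is not valid in the current state.
function transition(state: CaseState, event: CaseEvent): CaseState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${event} in state ${state}`);
  }
  return next;
}
```

And this is only the table. Persisting the current state, detecting the approval, and retrying failed steps all still have to be written around it.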

It works. It’s also tedious, error-prone, and obscures what’s actually a simple workflow.

The Durable Functions Approach

Let’s start with the diagram. A customer asks the bot for a refund. The bot uses AI to look up the order, creates a case, and starts a Durable Functions orchestration that pauses until a supervisor approves or rejects it. Once approved, the orchestrator processes the refund and notifies the customer, all without polling or a state machine. 

[Figure: Sequence diagram showing the full refund workflow, from customer message through AI tool calling, Durable Functions orchestration, supervisor approval, and proactive notification]

Here’s the entire approval workflow in my sample:

import { OrchestrationHandler } from 'durable-functions';

export const supportCaseOrchestrator: OrchestrationHandler = function* (context) {
  const { caseId, action } = context.df.getInput();

  // Mark as pending
  yield context.df.callActivity('updateCase', { caseId, status: 'pending_approval' });

  // Wait for a human; costs nothing while paused
  const approvalTask = context.df.waitForExternalEvent('Approval');
  // Derive the deadline from the orchestration clock (not Date.now()) so replay stays deterministic
  const sevenDaysFromNow = new Date(context.df.currentUtcDateTime.getTime() + 7 * 24 * 60 * 60 * 1000);
  const timeoutTask = context.df.createTimer(sevenDaysFromNow);
  const winner = yield context.df.Task.any([approvalTask, timeoutTask]);

  // If the decision arrived first, cancel the timer so the instance can finish
  if (winner === approvalTask) {
    timeoutTask.cancel();
  }

  if (winner === approvalTask && approvalTask.result.approved) {
    yield context.df.callActivity('updateCase', { caseId, status: 'approved' });
    if (action === 'refund') {
      yield context.df.callActivity('issueRefund', { caseId });
    }
    yield context.df.callActivity('notifyBot', { caseId, message: 'Approved!' });
  } else {
    // Rejected by the supervisor, or the seven-day timer fired first
    yield context.df.callActivity('updateCase', { caseId, status: 'rejected' });
    yield context.df.callActivity('notifyBot', { caseId, message: 'Rejected.' });
  }
};

That’s it. Read it top to bottom: it’s just the workflow. No state machine. No polling. No webhook plumbing. The orchestrator pauses at waitForExternalEvent, checkpoints its progress to storage, and stops executing entirely.
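If it seems odd that a function can stop running and pick up later, a toy model may help. The sketch below is not how Durable Functions is implemented; it is a simplified illustration of the replay idea, using a plain generator and a recorded list of results. The names workflow, replay, Step, and ReplayResult are all made up for this example.

```typescript
// A step the toy workflow can ask the host to perform or wait for.
type Step = { kind: 'activity'; name: string } | { kind: 'event'; name: string };

type ReplayResult =
  | { done: true; value: string }
  | { done: false; waitingOn: Step };

// A tiny workflow: run one activity, then wait for an approval event.
function* workflow(): Generator<Step, string, unknown> {
  yield { kind: 'activity', name: 'updateCase' };
  const decision = (yield { kind: 'event', name: 'Approval' }) as { approved: boolean };
  return decision.approved ? 'approved' : 'rejected';
}

// Rebuilds the workflow's position by feeding recorded results into a fresh generator.
// Nothing stays in memory between calls; the history alone is enough.
function replay(history: unknown[]): ReplayResult {
  const gen = workflow();
  let next = gen.next();
  for (const result of history) {
    if (next.done) break;
    next = gen.next(result);
  }
  if (next.done) {
    return { done: true, value: next.value };
  }
  return { done: false, waitingOn: next.value };
}
```

Calling replay([undefined]) reports that the workflow is waiting on the Approval event, and replay([undefined, { approved: true }]) runs it to completion. Between those two calls, no generator needs to exist anywhere.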

When a supervisor clicks “Approve” in the dashboard, the dashboard calls the Durable Functions HTTP API with:

raiseEvent('Approval', { approved: true })

passing the case ID. The framework matches this to the paused orchestration instance, replays its recorded history to rebuild its state, and resumes execution from the exact yield where it was waiting. The orchestrator then runs the remaining steps (update the case, process the refund, notify the customer) as if no time had passed.
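For a rough idea of what the approving side looks like in code, here is a sketch that goes through a client object instead of the raw HTTP endpoint. The EventRaiser interface is a hand-written slice that mirrors the raiseEvent method on the durable-functions client; submitDecision is a hypothetical helper, and the sketch assumes the orchestration's instance ID is the case ID.

```typescript
// Minimal slice of the Durable Functions client needed here. The real client
// (obtained via df.getClient in the durable-functions package) has a raiseEvent
// method that takes these arguments.
interface EventRaiser {
  raiseEvent(instanceId: string, eventName: string, eventData: unknown): Promise<void>;
}

// Payload the orchestrator reads from approvalTask.result.
interface Decision {
  approved: boolean;
}

// Hypothetical helper a dashboard backend could call when a supervisor clicks a button.
// Assumption: the orchestration instance ID is the same as the case ID.
async function submitDecision(client: EventRaiser, caseId: string, approved: boolean): Promise<void> {
  const decision: Decision = { approved };
  await client.raiseEvent(caseId, 'Approval', decision);
}
```

The event name has to match the string passed to waitForExternalEvent, which is 'Approval' in the orchestrator.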

Key: waitForExternalEvent costs nothing while waiting. No process running. No timer ticking. No compute billed. Each customer’s case gets its own orchestration instance, waiting independently.
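One way to get an instance per case is to start the orchestration with the case ID as its instance ID. The sketch below assumes that convention. OrchestrationStarter is a hand-written interface shaped like startNew in version 3 of the durable-functions package (older versions take positional arguments instead), and startCaseWorkflow and CaseInput are hypothetical names.

```typescript
// Minimal slice of the Durable Functions client, modeled on startNew in
// durable-functions v3: an orchestrator name plus optional instanceId and input.
interface OrchestrationStarter {
  startNew(
    orchestratorName: string,
    options?: { instanceId?: string; input?: unknown }
  ): Promise<string>;
}

// Input the orchestrator reads via context.df.getInput().
interface CaseInput {
  caseId: string;
  action: string;
}

// Hypothetical helper: starts one orchestration per case.
// Assumption: reusing the case ID as the instance ID, so events can be routed by case.
async function startCaseWorkflow(client: OrchestrationStarter, input: CaseInput): Promise<string> {
  return client.startNew('supportCaseOrchestrator', { instanceId: input.caseId, input });
}
```

With that convention, raising an event for one case ID can only ever wake up that case's orchestration.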

Why This Matters for AI Agents

As we build agents that do more than just answer questions, agents that take actions, trigger workflows, and interact with external systems, we’re going to hit this pattern constantly:

  • Refund approvals: agent submits, human approves
  • Deployment requests: agent prepares a change, human confirms
  • Escalations: agent triages, human takes over
  • Multi-step processes: agent starts, waits for external data, continues

Every one of these is a “pause and wait” problem. You could solve each one with a state machine, a database, and some glue code. Or you could write the workflow as a straight-line function and let the infrastructure handle the rest.

What About the Alternatives?

| Approach | How it works | Why it hurts |
| --- | --- | --- |
| Polling loop | Bot checks a “pending” flag in a database every N seconds | Wastes compute. 1,000 pending cases = 1,000 polling loops. Latency depends on poll interval. |
| Queue + worker | Bot writes to a queue; worker picks up after approval | You build the state machine yourself: track which step each case is on, handle retries, deal with poison messages. “Wait for approval” doesn’t map naturally to a queue. |
| Webhook callback | Bot registers a callback URL; approval service calls it | Bot must be running when the callback arrives hours later. If it restarts, the callback URL may be stale. No built-in retry or state tracking. |
| Database + cron | Store pending cases in DB, cron job checks for approved ones | Same polling problem. Cron frequency = latency floor. State machine lives in application code. Error handling is manual. |
| Durable Functions | waitForExternalEvent pauses at zero cost; raiseEvent resumes instantly | Requires Azure Functions runtime. But: no polling, no state machine code, built-in retry, scales to thousands of concurrent cases. |
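To put the polling row in rough numbers, here is a back-of-the-envelope calculation. The 30-second poll interval and the three-day wait are assumptions picked for illustration, and pollCount is a hypothetical helper.

```typescript
// Number of database checks made by per-case polling loops over a waiting period.
function pollCount(pendingCases: number, waitSeconds: number, pollIntervalSeconds: number): number {
  return pendingCases * Math.floor(waitSeconds / pollIntervalSeconds);
}

const threeDaysInSeconds = 3 * 24 * 60 * 60; // 259,200 seconds

// 1,000 cases, each polled every 30 seconds for three days.
const checks = pollCount(1000, threeDaysInSeconds, 30); // 8,640,000 checks

// With an event-driven resume, each case needs a single raiseEvent call.
const events = 1000;
```

Under those assumptions that is 8,640,000 database checks versus 1,000 events, and every one of those checks except the last per case finds nothing new.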

Durable Functions win here because:

  • Zero-cost waiting: a case pending for 3 days uses no compute until approved
  • No state machine: the orchestrator reads like a sequential function, but the framework handles checkpointing, replay, and fault tolerance
  • Parallel independence: Alice’s refund and Bob’s escalation are separate instances; approving one doesn’t affect the other

The Full Sample

The durable-support-agent sample has three pieces:

  1. A Teams bot that uses GPT-4o with tool calling to handle customer support — order lookups, knowledge base search, refund requests, escalations
  2. Azure Durable Functions that orchestrate the approval workflow with zero-cost pausing
  3. A Next.js dashboard where supervisors approve or reject pending cases

The whole thing runs locally. The bot creates cases, the orchestrator pauses, the dashboard lets you approve, and the customer gets notified, all coordinated through a workflow you can read in 30 lines.

If you’re building agents that need human-in-the-loop workflows, give Durable Functions a look.
