AI agent prototypes are becoming easier to build. A model can reason about a task, call tools, retrieve context, and generate useful results with surprisingly little code. The tooling is improving quickly, and many demonstrations are genuinely impressive.

Getting an agent into production is a very different challenge. In a prototype, the primary question is simple: Can the agent complete the task?

In production, the questions change:

  • What happens if the agent completes three of five steps and then fails?
  • Can a human approve a sensitive action before it executes?
  • Can we retry safely without creating duplicate side effects?
  • How do we maintain consistency when the agent acts across multiple systems of record?
  • What should happen when the state of a downstream system is not failed, but unknown?

These are not purely AI problems. They are workflow, distributed systems, runtime control, and operational problems.

This article shares some of the engineering lessons I have been working through while building agentic execution capabilities in Oracle MicroTx Workflows.

Agent Frameworks and Workflow Engines Solve Different Layers

Agent frameworks are valuable. They help developers build prompts, manage memory, coordinate tools, and implement reasoning loops.

Workflow engines solve a different problem. Once an agent participates in a long-running business process, especially one that touches enterprise systems, the runtime must handle durability, approvals, observability, access controls, recovery, and consistency. Those concerns typically sit outside the core scope of an agent framework.

This is where workflow architecture becomes important. Agent frameworks help an agent decide what to do. Workflow platforms help ensure the business process completes reliably, even when failures occur.

In production, the agent is not the whole application. It becomes one participant in a larger execution flow.

In production, the agent becomes one participant in a durable workflow that coordinates tools, approvals, systems of record, observability, and transaction behavior.

The Tool Call Problem – Side Effects Do Not Roll Back

Tool calling is where agentic systems become truly powerful. An agent can update records, initiate transactions, retrieve operational data, trigger downstream processes, or ask other systems to perform work. But that power introduces risk.

Consider an order exception workflow:

  • The agent checks order status.
  • The agent reserves inventory.
  • The agent queues a payment adjustment.
  • The agent updates shipping instructions – this call fails.
  • The agent submits the exception for approval – never reached.

The inventory reservation already exists. The payment adjustment has already been queued. Real business actions have already occurred. Simply retrying the failed step does not undo those actions. Restarting the workflow from the beginning may reserve inventory twice or create duplicate payment adjustments.

There is another layer to this problem. A timeout does not necessarily mean the downstream system failed.

Timeout is not a rollback. It only means the caller stopped hearing back.

The downstream system may still be processing. It may have completed the operation but failed to return a response. A callback may arrive later. The difficult state is not always failure. Often, the difficult state is unknown.
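One way to make the unknown state explicit is to give it its own outcome value instead of folding it into failure. The sketch below is illustrative only: Outcome, ToolTimeout, ToolRejected, call_tool, reconcile, and the status strings returned by lookup_status are all hypothetical names, not part of any MicroTx API.

```python
from enum import Enum


class Outcome(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    UNKNOWN = "unknown"


class ToolTimeout(Exception):
    """Hypothetical error: the caller stopped waiting for a response."""


class ToolRejected(Exception):
    """Hypothetical error: the downstream system explicitly refused the request."""


def call_tool(tool, request):
    """Invoke a tool and classify the result without treating a timeout as a failure."""
    try:
        tool(request)
    except ToolTimeout:
        # The downstream system may still have applied the change.
        return Outcome.UNKNOWN
    except ToolRejected:
        return Outcome.FAILED
    return Outcome.SUCCEEDED


def reconcile(outcome, lookup_status, request_id):
    """Resolve an UNKNOWN outcome by asking the system of record what happened."""
    if outcome is not Outcome.UNKNOWN:
        return outcome
    status = lookup_status(request_id)  # hypothetical status query
    if status == "applied":
        return Outcome.SUCCEEDED
    if status == "not_found":
        return Outcome.FAILED
    return Outcome.UNKNOWN  # still in flight: keep waiting or escalate
```

The design choice here is that only an explicit rejection counts as failure. Anything ambiguous stays UNKNOWN until a status lookup, a late callback, or an operator resolves it.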

As AI agents start calling tools that update systems of record, they become participants in distributed workflows. Their actions need the same discipline as microservice actions: durable state, authorization, idempotency, retry policy, compensation, observability, and auditability.
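Idempotency is the piece of that list that most directly answers the duplicate side effect question. A minimal sketch, assuming the downstream system accepts a caller-supplied idempotency key: the idempotency_key function and the in-memory InventoryService below are hypothetical stand-ins, not a real service or a MicroTx API.

```python
import hashlib
import json


def idempotency_key(workflow_id, step_name, payload):
    """Derive a stable key so a retried step maps to the same logical request."""
    body = json.dumps(payload, sort_keys=True)
    raw = f"{workflow_id}:{step_name}:{body}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class InventoryService:
    """Stand-in for a downstream system that deduplicates by idempotency key."""

    def __init__(self):
        self._results = {}
        self.reservations = 0

    def reserve(self, key, sku, quantity):
        if key in self._results:
            # Replay the stored result instead of creating a second reservation.
            return self._results[key]
        self.reservations += 1
        result = {
            "reservation_id": f"res-{self.reservations}",
            "sku": sku,
            "quantity": quantity,
        }
        self._results[key] = result
        return result
```

Because the key is derived from the workflow instance, the step name, and the payload, a retry of the same step reuses the same key, while a different workflow instance gets a new one.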

This is where workflow orchestration and distributed transaction patterns must work together.

Durable Execution and Compensation Solve Different Problems

Durable execution helps a workflow survive infrastructure failure. It allows a process to continue from a known point instead of losing all progress when a worker, service, or container fails. That is necessary, but not always sufficient.
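To make "continue from a known point" concrete, here is a deliberately small sketch that persists a journal of completed step names to a JSON file and skips them on restart. The function name run_with_checkpoints and the journal file format are assumptions for illustration; a real workflow engine persists far richer state.

```python
import json
import os


def run_with_checkpoints(steps, journal_path):
    """Run (name, action) steps in order, skipping any already recorded as completed."""
    completed = []
    if os.path.exists(journal_path):
        with open(journal_path, "r", encoding="utf-8") as handle:
            completed = json.load(handle)
    for name, action in steps:
        if name in completed:
            continue  # finished before the crash; do not repeat its side effect
        action()
        completed.append(name)
        with open(journal_path, "w", encoding="utf-8") as handle:
            json.dump(completed, handle)  # persist progress after every step
    return completed
```

Note the gap this sketch leaves open: if the worker dies after action() returns but before the journal is written, the step runs again on restart. That gap is one reason durable execution alone is not enough and tool calls still need idempotency.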

When workflow steps create business side effects, the process also needs a recovery model for partial completion. Inventory may have been reserved. Payment may have been authorized. A shipping request may be unresolved. A customer record may have been updated. At that point, the transaction boundary is no longer limited to a single database or service. In many enterprise processes, the transaction boundary moves to the business process itself.

One model MicroTx supports for this class of problem is the Saga pattern. Each step that produces a side effect can have a corresponding compensation: an explicit business action defined as part of the workflow. If a later step fails, compensating actions can return the process to a known recovery state.

Inventory reservations can be released. Payment adjustments can be canceled. The workflow can recover predictably instead of leaving the business process partially complete. The key point is that developers define the business compensation logic. MicroTx coordinates execution, tracks completed steps, and orchestrates recovery.
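The shape of the pattern can be sketched in a few lines. This is a generic, in-memory illustration of Saga-style compensation, not the MicroTx API: run_saga and the step tuple layout are hypothetical, and a real coordinator would persist progress and handle failures of the compensations themselves.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; on failure, compensate in reverse order."""
    completed = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as error:
            undone = []
            # Undo the most recent side effect first.
            for done_name, done_compensation in reversed(completed):
                if done_compensation is not None:
                    done_compensation()
                    undone.append(done_name)
            return {
                "status": "compensated",
                "failed_step": name,
                "error": str(error),
                "compensated": undone,
            }
        completed.append((name, compensation))
    return {"status": "completed", "steps": [name for name, _ in completed]}
```

Applied to the order exception example, a failure in the shipping update would cancel the queued payment adjustment and then release the inventory reservation. Read-only steps such as the order status check carry no compensation (None) and are skipped.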

Durable execution helps workflows survive infrastructure failures.
Compensation patterns help business processes survive partial side effects.

Both are necessary for reliable enterprise automation.

Put another way:

Workflow orchestration gives distributed transactions durable memory.

It gives the process a durable record of what ran, what completed, what failed, what is uncertain, what can be retried, and what must be compensated or escalated.

Durable execution preserves workflow progress. Compensation patterns help recover business state after partial side effects.

Human Approval Is a Control Point, Not a Limitation

Many discussions about agentic AI assume human approvals will eventually disappear as models improve. Enterprise systems often require a different approach.

Some actions should always include accountability: high-value refunds, payment releases, compliance overrides, production remediations, entitlement changes, or actions that affect regulated records. These actions may require a named approver regardless of how capable the model becomes.

The agent can still add significant value. It can gather context, retrieve policies, summarize evidence, and recommend a course of action. The workflow then pauses while a human reviews the recommendation.

In practice, the approval itself is often the easy part. The harder questions are:

  • What happens if nobody responds?
  • Should the workflow escalate?
  • Should it time out and compensate?
  • Should it remain paused until an operator intervenes?
  • Should the approval decision become part of the audit record?

These are business decisions, and they vary by organization. MicroTx supports human intervention tasks as workflow steps, allowing workflows to pause for external input and resume when a decision is provided. Timeout, escalation, and routing behavior can be modeled as part of the workflow design.
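As an illustration of how those answers might be encoded, the sketch below turns a timeout and escalation policy into a pure decision function. ApprovalPolicy, next_action, the action names, and the choice to compensate on rejection or expiry are all assumptions made for this example; an organization could just as reasonably choose to stay paused.

```python
from typing import NamedTuple, Optional


class ApprovalPolicy(NamedTuple):
    escalate_after_s: float  # route to a backup approver after this long
    expire_after_s: float    # give up waiting after this long


def next_action(policy: ApprovalPolicy, waited_s: float, decision: Optional[str] = None) -> str:
    """Decide what a paused workflow should do next."""
    if decision is not None:
        # A recorded human decision always wins over timers.
        return "proceed" if decision == "approved" else "compensate"
    if waited_s >= policy.expire_after_s:
        return "compensate"
    if waited_s >= policy.escalate_after_s:
        return "escalate"
    return "wait"


policy = ApprovalPolicy(escalate_after_s=3600, expire_after_s=86400)
print(next_action(policy, waited_s=7200))  # escalate
```

Keeping the policy as data makes it something the business can review and change without touching agent prompts.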

The pattern is straightforward:

Agents accelerate decision-making. Humans provide accountability. Workflows connect the two. 

Retrieval Becomes Part of the Business Process

Retrieval-Augmented Generation, or RAG, is often viewed as a way to improve model responses. In enterprise workflows, retrieval becomes much more than that.

A workflow may retrieve policy documents before approving an invoice, customer history before recommending a support action, compliance rules before authorizing a transaction, or previous case history before escalating an issue. The retrieved information is not simply prompt context. It becomes evidence.

That evidence may influence an agent’s recommendation, be reviewed by a human approver, and ultimately become part of the workflow’s execution history. In regulated environments, organizations may later need to answer a critical question: What information was used to make this decision?

For that reason, retrieval can become part of the audit trail. Production-grade RAG is not just about better answers. It is about traceable, observable, and controlled execution.
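One possible way to capture that evidence is to store, alongside the step, a record of which documents were retrieved and a content hash of each. The record layout and the evidence_record function below are hypothetical, shown only to make the idea concrete.

```python
import hashlib
from datetime import datetime, timezone


def evidence_record(workflow_id, step_name, query, documents):
    """Build an audit entry describing exactly which documents a step retrieved."""
    return {
        "workflow_id": workflow_id,
        "step": step_name,
        "query": query,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "documents": [
            {
                "source": doc["source"],
                "version": doc.get("version"),
                # The hash lets an auditor confirm later which text was used.
                "sha256": hashlib.sha256(doc["text"].encode("utf-8")).hexdigest(),
            }
            for doc in documents
        ],
    }
```

Hashing the retrieved text rather than storing only a document name means the record still identifies the exact content even if the policy document is revised later.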

Runtime Controls Are Different from Prompt Guardrails

Most AI safety discussions focus on prompt guardrails and output filtering. Those controls are important, but enterprise agentic systems require another layer:

Runtime controls.

Prompt guardrails control what the model can say. Runtime controls define what the agent can do during workflow execution. For example, a prompt guardrail may prevent harmful output. Runtime controls determine whether an agent is allowed to invoke a payment API, update customer data, call an MCP tool, or execute a sensitive workflow step without approval.

Runtime controls include tool access controls; MCP access policies; human approval requirements; workflow-level audit records; execution history; connector-level configuration; pause, retry, and termination controls; and visibility into agent, tool, and transaction activity.
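A minimal sketch of the first few of those controls, assuming a simple per-profile policy table: the policy layout, authorize, and guarded_call are hypothetical names for illustration and do not reflect how MicroTx configures access.

```python
def authorize(policy, agent_profile, tool_name, has_approval=False):
    """Return (allowed, reason) for a tool call; unknown tools are denied by default."""
    rule = policy.get(agent_profile, {}).get(tool_name)
    if rule is None:
        return False, "tool not granted to this agent profile"
    if rule == "requires_approval" and not has_approval:
        return False, "human approval required"
    return True, "allowed"


def guarded_call(policy, audit_log, agent_profile, tool_name, tool, has_approval=False):
    """Check the policy, record the decision, and only then invoke the tool."""
    allowed, reason = authorize(policy, agent_profile, tool_name, has_approval)
    audit_log.append(
        {"agent": agent_profile, "tool": tool_name, "allowed": allowed, "reason": reason}
    )
    if not allowed:
        raise PermissionError(reason)
    return tool()


policy = {"order_agent": {"get_order_status": "allow", "issue_refund": "requires_approval"}}
```

The check runs outside the model: whatever the prompt says, a tool that is not in the policy table cannot be invoked, and both allowed and denied attempts land in the audit log.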

These controls become the foundation for operating agentic systems in regulated and mission-critical environments. Runtime controls for agentic systems are still maturing across the industry. But as agents move closer to systems of record, execution-level controls and auditability become foundational requirements.

A Practical Mental Model

For teams working through this shift, I have found this mental model useful:

| Prototype concern | Production workflow concern |
| --- | --- |
| Can the model reason? | Can the workflow execute reliably? |
| Can the agent call a tool? | Can tool calls be authorized and audited? |
| Can the task be completed once? | Can it recover from partial failure? |
| Can we retry? | Can we retry without duplicate side effects? |
| Can the agent act? | Can sensitive actions require approval? |
| Can the agent retrieve context? | Can retrieved context become part of the workflow record? |
| Can the agent select a tool? | Can tool access be scoped, logged, and restricted? |
| Can we update a system? | Can we maintain consistency if something fails midway? |
| Can we detect failure? | Can we handle unknown states, late responses, and reconciliation? |

Every row on the right is an engineering problem. Agent frameworks typically focus on the left side of the table. Production enterprise workflows require the right side as well.

Because these execution patterns are still maturing across the industry, workflow platforms need to bring established enterprise disciplines to this new frontier: state, recovery, approvals, auditability, and transaction-aware execution.

Putting These Patterns into Practice with Oracle MicroTx Workflows

Oracle MicroTx Workflows brings these concerns together: workflow orchestration, agentic AI execution, and distributed transaction management. It builds on an established workflow execution foundation and extends it for enterprise agentic AI use cases with agentic tasks, LLM connectors, reusable agent profiles, tool and MCP integrations, RAG retrieval, human approvals, access and execution controls, and distributed transaction support, including XA, Saga/LRA, and TCC patterns.

In my experience, the hard part is usually not getting the model to produce a useful recommendation. The hard part is making that recommendation part of a workflow that can safely act across people, services, data, and transactions, and recover cleanly when something goes wrong.

Developers can evaluate MicroTx locally using the available developer setup and sample applications. Oracle Database 23ai Free can be used as the state store in Docker Compose-based environments, making local evaluation self-contained. For more details about MicroTx Free, see the Oracle MicroTx product page.

The Agent Is Not the Whole Application

Moving from prototype to production is not simply a deployment exercise. It changes the engineering problem.

Production-grade agentic workflows require durable state, scoped tool access, human approval checkpoints, compensation and recovery patterns, operational visibility, transaction-aware execution, and a way to handle uncertainty when distributed systems do not give clear answers.

These are runtime concerns, not model concerns. As AI agents move closer to core business processes, these capabilities become increasingly important.

The agent is not the entire application.

In production, the agent becomes one participant in a larger workflow that must remain durable, observable, controlled, and consistent, even when individual steps fail or when the outcome of a distributed action is temporarily unknown.

In the next post, I will cover an architecture decision that is often overlooked: why building on an established workflow execution foundation matters, and how existing workflow patterns can evolve to support agentic capabilities without requiring teams to rewrite everything from scratch.
