AI agents are becoming core components of enterprise applications, automating workflows, coordinating tools, and orchestrating complex tasks. However, a key challenge persists: fragmentation. Developers often build agents around a specific framework, then attach observability, evaluation, testing, and deployment workflows to that same framework.
This can make it difficult to compare runtimes, migrate agents, or standardize development practices across teams. A framework selected early in development can become a long-term architectural constraint, even as agentic frameworks continue to evolve.
To address this challenge, Opik is integrating with Oracle's Open Agent Specification (Agent Spec for short). Together, Agent Spec and Opik enable developers to define agents once, run them across compatible frameworks, and observe and evaluate their behavior through a consistent workflow.
Portable agent definitions with Agent Spec
Agent Spec is an open-source, framework-agnostic configuration language for defining AI agents and workflows. It captures the core components of an agent (LLM settings, prompts, tools, and flow structure) in a portable representation that can be executed across compatible runtimes such as LangGraph, AutoGen, and WayFlow.
With Agent Spec, developers can separate the agent definition from the execution framework. This enables teams to preserve prompts, tool schemas, and orchestration logic while testing or migrating across different runtimes.
For enterprises, this portability helps reduce dependency on a single framework and supports shared development patterns across teams, projects, and deployment environments.
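As a rough illustration of separating the definition from the runtime, the sketch below serializes one agent definition to JSON and hands the same document to two stand-in runtimes. It uses only the Python standard library. `AgentDefinition`, `ToolDefinition`, `to_portable_json`, and `load_for_runtime` are hypothetical names, and the JSON layout is an assumption for illustration, not the Agent Spec schema or the PyAgentSpec API.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolDefinition:
    # Hypothetical tool schema: a name, a description, and parameter types.
    name: str
    description: str
    parameters: dict = field(default_factory=dict)


@dataclass
class AgentDefinition:
    # Hypothetical agent definition; NOT the real Agent Spec format.
    name: str
    model: str
    system_prompt: str
    tools: list = field(default_factory=list)


def to_portable_json(agent: AgentDefinition) -> str:
    """Serialize the definition to a framework-neutral JSON document."""
    return json.dumps(asdict(agent), indent=2, sort_keys=True)


def load_for_runtime(serialized: str, runtime: str) -> dict:
    """Stand-in for a runtime adapter: parse the shared definition
    and pair it with the name of the runtime that will execute it."""
    return {"runtime": runtime, "definition": json.loads(serialized)}


agent = AgentDefinition(
    name="support_agent",
    model="example-model",
    system_prompt="You answer billing questions.",
    tools=[
        ToolDefinition(
            name="lookup_invoice",
            description="Fetch an invoice by ID.",
            parameters={"invoice_id": "string"},
        )
    ],
)

serialized = to_portable_json(agent)
runtime_a = load_for_runtime(serialized, "runtime_a")
runtime_b = load_for_runtime(serialized, "runtime_b")

# Both stand-in runtimes receive an identical prompt and tool schema.
print(runtime_a["definition"] == runtime_b["definition"])
```

The point of the sketch is only that prompts and tool schemas live in one serialized document, so switching the runtime does not require rewriting them.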
Opik for observability and evaluation
Opik provides tracing, debugging, and evaluation for LLM applications and agents. With the new Agent Spec integration, Opik can capture executions of Agent Spec-defined agents in a consistent way across supported runtimes.
This enables developers to inspect agent behavior across frameworks, including:
- LLM calls, tool calls, and intermediate steps
- Inputs, outputs, metadata, and final responses
- Runtime-specific differences in behavior, latency, and cost
- Evaluation results for the same agent definition across runtimes
Because traces follow a consistent structure, teams can compare frameworks and agent changes without rebuilding observability or evaluation workflows for each runtime.
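To make the benefit of a consistent trace structure concrete, here is a minimal sketch that groups trace records by runtime and summarizes latency and cost. The record fields (`runtime`, `latency_ms`, `cost_usd`), the sample values, and the `summarize_by_runtime` helper are assumptions for illustration; they are not Opik's trace schema or SDK, and the numbers are made up.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_runtime(traces):
    """Group trace records by runtime and compute simple aggregates."""
    grouped = defaultdict(list)
    for trace in traces:
        grouped[trace["runtime"]].append(trace)

    summary = {}
    for runtime, items in grouped.items():
        summary[runtime] = {
            "runs": len(items),
            "mean_latency_ms": mean(item["latency_ms"] for item in items),
            # Round to avoid floating-point noise in the summed cost.
            "total_cost_usd": round(sum(item["cost_usd"] for item in items), 6),
        }
    return summary


# Hypothetical records for the same agent definition on two runtimes.
traces = [
    {"runtime": "runtime_a", "latency_ms": 100, "cost_usd": 0.002},
    {"runtime": "runtime_a", "latency_ms": 200, "cost_usd": 0.004},
    {"runtime": "runtime_b", "latency_ms": 300, "cost_usd": 0.003},
]

summary = summarize_by_runtime(traces)
for runtime, stats in sorted(summary.items()):
    print(runtime, stats)
```

Because every record has the same shape regardless of where it was produced, the comparison needs no runtime-specific branches.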

What this integration enables
The Opik and Agent Spec integration provides a modular foundation for agent development:
- Define once, run across frameworks: Share and reuse Agent Spec configurations across compatible runtimes.
- Observe consistently: Trace LLM calls, tool executions, intermediate steps, and outputs in Opik across frameworks.
- Evaluate repeatedly: Run the same Opik evaluations and test suites across runtimes without framework-specific evaluation code.
- Compare implementations: Benchmark latency, cost, and output quality when changing frameworks, prompts, tools, or LLMs.
- Support enterprise workflows: Build shared CI/CD, testing, and governance processes across agents developed by different teams.
This separation of concerns allows the agent definition, runtime, and observability layer to evolve independently.
Evaluation and benchmarking
Once Agent Spec executions are captured in Opik, teams can apply repeatable evaluations across runtimes. Typical workflows include deterministic checks for output structure, required fields, and tool usage, as well as LLM-as-judge evaluations for qualities such as correctness, helpfulness, completeness, and relevance.
The same evaluation setup can be used to validate prompt changes, model swaps, tool updates, or runtime migrations. This helps teams identify regressions earlier and compare agent behavior using a common evaluation harness.
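The deterministic checks mentioned above could look something like the following sketch, which validates output structure, required fields, and tool usage for a captured run. The run record shape (`output` as a JSON string, `tool_calls` as a list of tool names) and the `check_run` helper are hypothetical; a real setup would more likely express these as Opik evaluation metrics rather than a standalone function.

```python
import json


def check_run(run, required_fields, expected_tools):
    """Return a list of failure messages; an empty list means the run passed."""
    try:
        output = json.loads(run["output"])
    except (TypeError, ValueError):
        return ["output is not valid JSON"]
    if not isinstance(output, dict):
        return ["output is not a JSON object"]

    failures = []
    # Required-field check on the structured output.
    for field_name in required_fields:
        if field_name not in output:
            failures.append(f"missing field: {field_name}")

    # Tool-usage check: every expected tool must have been called.
    called = set(run.get("tool_calls", []))
    for tool in expected_tools:
        if tool not in called:
            failures.append(f"tool not called: {tool}")
    return failures


# Hypothetical runs of the same agent definition on two runtimes.
run_a = {
    "runtime": "runtime_a",
    "output": '{"answer": "Invoice 42 is paid.", "invoice_id": "42"}',
    "tool_calls": ["lookup_invoice"],
}
run_b = {
    "runtime": "runtime_b",
    "output": '{"answer": "Invoice 42 is paid."}',
    "tool_calls": [],
}

results = {
    run["runtime"]: check_run(run, ["answer", "invoice_id"], ["lookup_invoice"])
    for run in (run_a, run_b)
}
print(results)
```

Applying one check function to runs from every runtime is what makes a regression after a migration or model swap show up as a difference in results rather than a difference in test code.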
Getting started
You can start using Agent Spec agents with Opik in three steps:
- Define your agent using Oracle’s PyAgentSpec SDK.
- Load it into a runtime using the adapter for your framework of choice.
- Wrap execution with Opik’s AgentSpecInstrumentor to capture traces in Opik.



