Recently, Oracle introduced the Oracle AI Agent Memory Python package, a model- and framework-agnostic memory solution that gives enterprise AI teams a governed memory core on Oracle AI Database: short-term threads with summaries and context cards, long-term durable memories with vector search, automatic LLM-based memory extraction, and the governance and isolation production agents require.

Diagram titled “Oracle Agent Memory” showing a layered architecture. At the top, a framework layer includes LangGraph, Claude Agent SDK, OpenAI Agent SDK, WayFlow, and custom integrations feeding into a unified Oracle Agent Memory client. The client manages working, semantic, episodic, and procedural memory types. Below, the system is powered by Oracle AI Database, which provides governed, isolated, audited, encrypted, and highly available infrastructure with vector search, graph traversal, and relational query capabilities.
Oracle Agent Memory: a unified agent memory layer built on Oracle AI Database.

Oracle AI Agent Memory is available now via PyPI as oracleagentmemory and documented in the Oracle Help Center. It is designed to replace the patchwork memory stack most production agents inherit with a single governed memory substrate built on Oracle AI Database. 

This is the difference between an agent that is memory-augmented (given a vector store to consult) and one that is memory-aware (responsible for reading from and writing to its own governed, durable state on one enterprise-grade backend).

Agent memory has shifted from a research curiosity to a production requirement in under two years. Teams shipping serious agentic systems need a backend that handles vectors, structured data, and transactional consistency in one place, not three stitched together. Oracle AI Database is one of the few platforms that delivers all of that natively, which is why we built Hindsight to run on it as a first-class backend.

Chris Latimer, Co-Founder & CEO of Vectorize 


Why Agent Memory, Why Now 

Diagram titled “The four types of agent memory” showing an Oracle Agent Memory taxonomy. A central “Agent memory” layer branches into four categories: working memory (active state like current conversation and in-flight tasks), semantic memory (durable facts such as user preferences and entity data), episodic memory (specific past experiences like prior sessions and resolved tasks), and procedural memory (behavioral rules and tool preferences). Each category includes a short description and example elements.
The four types of agent memory: working, semantic, episodic, and procedural.

Most agent implementations treat memory as a bolt-on: a vector store consulted at retrieval time, a chat history table glued on beside it, and whatever hand-written extraction logic the team can maintain.

That stack holds together for a demo. It falls apart the moment an enterprise AI team asks questions that actually matter. Who owns the memory? Where is it governed? How do we isolate tenants? How do we audit what the agent learned, and how do we forget it on request?

Context windows have grown over the years, but no context window is large enough to hold the full state of a long-running agent: weeks of user preferences, accumulated domain knowledge, prior tool outcomes, evolving task state, and the reasoning history that makes each decision defensible.  

Agents need memory for the same reasons people do: to hold an active state while working on a problem, to retain facts learned over time, to recall specific past experiences, and to encode behavioral rules and procedures. 

A practical taxonomy for agent memory commonly used in agent design covers four types: 

  1. Working memory is the active state the agent is reasoning over right now, the running conversation and the scratchpad the model sees at inference time.
  2. Semantic memory is the durable facts and knowledge the agent accumulates about users, entities, and the world: preferences, canonical definitions, structured reference data.
  3. Episodic memory is specific past experiences the agent can recall, what happened on a prior session, what the user asked three weeks ago, how a similar task resolved last time.
  4. Procedural memory is the behavioral rules, guidelines, and learned procedures that shape how the agent acts, how to handle customers, which tools to prefer, what not to do.

These are not four different systems. They are four access patterns over the same underlying state, which is what makes a unified memory core the right architectural answer rather than four bolted-together services. 
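To make "four access patterns over the same underlying state" concrete, here is a minimal, self-contained sketch (not the package's actual API; the record type and query method are hypothetical) where each memory type is just a scoped query over one store:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MemoryRecord:
    kind: str      # "working" | "semantic" | "episodic" | "procedural"
    content: str
    user_id: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class MemoryStore:
    """One backing store; the four memory types are scoped queries over it."""

    def __init__(self) -> None:
        self.records: list[MemoryRecord] = []

    def add(self, record: MemoryRecord) -> None:
        self.records.append(record)

    def query(self, user_id: str, kind: str) -> list[MemoryRecord]:
        # Working memory = current thread state; semantic = durable facts;
        # episodic = past sessions; procedural = behavioral rules.
        return [r for r in self.records if r.user_id == user_id and r.kind == kind]

store = MemoryStore()
store.add(MemoryRecord("semantic", "User prefers vegan meals.", "user_123"))
store.add(MemoryRecord("procedural", "Escalate billing disputes to a human.", "user_123"))

facts = store.query("user_123", "semantic")
```

The point of the sketch is architectural: because all four kinds share one record shape and one store, governance, isolation, and audit apply once rather than once per service.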


Oracle AI Database as the Memory Core

Reference architecture diagram titled “Oracle AI Agent Memory on Oracle AI Database.” It shows the flow from a customer-owned application tier (end users interacting via natural language with an AI agent built using frameworks like LangGraph or OpenAI Agent SDK) to the Oracle-owned memory SDK and Oracle AI Database. The Oracle AI Agent Memory layer provides APIs for search, message handling, and memory extraction with tenant isolation and governance. It connects to Oracle AI Database, which supports vector search, relational queries, graph traversal, and JSON storage. The diagram also highlights enterprise capabilities like backup, replication, high availability, encryption, access control, and auditing, with arrows indicating request and response flow.
Reference architecture for Oracle AI Agent Memory on Oracle AI Database.

Oracle AI Database combines vector similarity search, relational querying, and graph-aware data access in one governed engine, enabling semantic recall alongside precise transactional and relationship-centric retrieval. Combined with Oracle’s operational story, backups, replication, high availability, encryption, fine-grained access control, and audit, teams get a path from notebook to regulated production without swapping storage layers, rewriting compliance reviews, or stitching together bespoke isolation logic along the way. 

Memory engineering, as a discipline, demands substrate choices that hold up under the access patterns a real enterprise agent actually has: concurrent writes, per-user and per-tenant scoping, full audit, and semantic retrieval at scale.

“Enterprise agents need an agent memory solution with robust security guarantees, strong governance controls, sophisticated workload isolation, and deep integration within the enterprise data platform. Oracle AI Agent Memory greatly simplifies building agent memory solutions by consolidating what are usually multiple separate and fragmented services within the converged database architecture that customers already trust for their most critical data.”

Tirthankar Lahiri, SVP, Mission-Critical Data and AI Engines, Oracle Database

Production agent memory carries two loads. Developers wire the stack together: vector store, chat log, extraction scripts, isolation logic, governance per piece. Agents reason over the fragments, deciding what to retrieve from where and fitting the relevant world into a finite context window each turn. 

Oracle AI Agent Memory lifts both. One governed client replaces the four-service stack, with one set of credentials, one compliance review, and one backup story. Working, semantic, episodic, and procedural memory share one substrate and one retrieval surface, so the model reasons over a coherent view of its state. Summarization and scoped retrieval put the right subset into context at the right moment, freeing the model to spend its reasoning budget on the task rather than memory bookkeeping.

Automatic LLM-based extraction turns conversation into durable memories without hand-rolled prompt chains. Multi-tenant isolation is enforced at the store layer, so a single schema can host multiple deployments without cross-tenant leakage. And because the SDK is framework-agnostic, integrating with LangGraph, Claude Agent SDK, OpenAI Agent SDK, WayFlow, and custom harnesses, teams aren’t locked into a single runtime to get the substrate. 


Key Benefits For AI Workloads

Bar chart showing LongMemEval results with 93.8% overall accuracy (469/500). Per-category scores: single-session assistant 100%, temporal reasoning 96.2%, knowledge update 94.9%, single-session user 94.3%, single-session preference 93.3%, and multi-session 88.0%. Configuration notes include gpt-5.5, nomic-embed-text-v1.5 embeddings, and an HNSW index.

Configuration · gpt-5.5 (reasoning effort xhigh) · nomic-embed-text-v1.5 embeddings
Local HNSW index · top-K = 200
X-axis truncated; all categories scored above 88%.

Oracle AI Agent Memory is built for the operational realities of running AI agents in production. 

Production-grade recall on long-horizon memory benchmarks. On LongMemEval, the standard academic benchmark for long-context agent memory, Oracle AI Agent Memory scores 93.8% (469 of 500), with the strongest results on the categories that matter most for production agents: 100% on single-session assistant recall, 96% on temporal reasoning, and 95% on knowledge-update tasks. Multi-session recall, the hardest category in the benchmark, lands at 88%. Configuration: OpenAI gpt-5.5 (reasoning effort xhigh), nomic-embed-text-v1.5 embeddings, local HNSW index, top-K = 200.

Bounded per-turn cost as sessions extend. Periodic thread summarization, durable memory extraction, and prompt-time message compaction keep the working context bounded as conversations grow. In an 80-turn scripted conversation, Oracle AI Agent Memory held per-request input around 1,300 tokens for the full run while a flat-history baseline grew linearly past 13,900 — roughly 9.5× more tokens per request by the final turn, and a much steeper bill across the full conversation. Teams shipping long-running agents trade a linear-in-history cost curve for a flat one.

Line chart comparing tokens per request over 80 conversation turns. A gray line (no memory management) rises steadily to ~13,900 tokens, while a red line (Oracle AI Agent Memory) stays flat around ~1,300 tokens. The chart highlights ~9.5× lower token usage with memory, showing stable context size as conversations grow.

80-turn ChromAtlas-ND scripted conversation · gpt-5.4 (raw OpenAI client, no framework).
Token estimate: chars / 4 (notebook convention)
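The cost dynamic behind the chart can be sketched with the notebook's own chars / 4 token convention. The turn sizes, summary budget, and recent-turn window below are illustrative assumptions, not the measured benchmark data:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the notebook convention: chars / 4."""
    return len(text) // 4

# Simulate 80 turns of ~700 characters each (illustrative sizes only).
turns = ["x" * 700 for _ in range(80)]

# Flat history: every request re-sends the whole transcript, so per-request
# input grows linearly with conversation length.
flat_cost_final_turn = estimate_tokens("".join(turns))

# Compacted context: a bounded summary plus the last few verbatim turns keeps
# per-request input roughly flat regardless of conversation length.
summary_budget, recent_turns = 2_000, 4
compacted_cost_final_turn = summary_budget + estimate_tokens("".join(turns[-recent_turns:]))

print(flat_cost_final_turn, compacted_cost_final_turn)
```

Even with these toy numbers, the shape of the curves matches the chart: the flat-history cost scales with the number of turns, while the compacted cost is a function of the summary budget and the recent-turn window, not the conversation length.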

Better answers than a flat-history baseline. A flat-history agent has the entire verbatim conversation in its prompt — every fact ever mentioned, in order. By rights it should be hard to beat on recall. Across the same 80-turn conversation, evaluated by an impartial gpt-5.4 judge on accuracy, completeness, relevance, and coherence, Oracle AI Agent Memory won 48 turns to flat history’s 13, with 19 ties: 3.7× more wins despite the baseline’s information advantage. A retrieved context card focuses the model on what matters; a sprawling transcript dilutes attention across noise.

Bar chart showing agent performance over 80 turns. Oracle AI Agent Memory wins 48 turns (60%), compared to 13 wins (16%) for a naive flat history

80-turn ChromAtlas-ND scripted conversation · judge: gpt-5.4
scored on accuracy, completeness, relevance, coherence

Cost is a tunable knob, not a fixed value. The summarization trigger controls how aggressively the package compacts thread context, and it moves the cost-fidelity trade-off directly. In an 8-query demo conversation (five runs per threshold), a 10,000-token trigger landed at a mean of 121,268 total tokens, about 60% under the 306,823-token flat-history baseline. As the trigger rises, the package compacts less often and preserves more raw context per turn; by a 50–70k trigger, mean total tokens approach or exceed the baseline, and run-to-run variance widens. Teams pick the threshold that matches their answer-quality requirements and lock in the cost envelope they want, rather than accepting whatever curve a fragmented stack produces.

Line chart titled “Oracle Agent Memory Threshold Sweep” showing total tokens vs. summarization trigger (10k–70k). Mean tokens rise as the threshold increases, with shaded min–max and standard deviation bands. The lowest mean (~121k tokens) occurs at 10k, while higher thresholds approach or exceed a dashed naive baseline (~306k tokens).

Memory Agent Efficiency vs Summarization Threshold on Demo Conversation
(Num Queries = 8, Num Runs Per Threshold = 5)
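The mechanism behind the threshold sweep can be illustrated with a minimal compaction loop. The function below is a hypothetical sketch of the trade-off, not the package's implementation: a lower trigger compacts more often (cheaper, less raw context), a higher trigger preserves more verbatim history per turn (costlier, higher fidelity).

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: chars / 4 (notebook convention)."""
    return len(text) // 4

def compact_if_needed(history: list[str], trigger_tokens: int) -> list[str]:
    """When the thread exceeds the trigger, fold older turns into a summary stub."""
    total = sum(estimate_tokens(turn) for turn in history)
    if total <= trigger_tokens:
        return history  # under the trigger: keep the full verbatim thread
    keep = 4  # keep the most recent turns verbatim (illustrative choice)
    summary = f"[summary of {len(history) - keep} earlier turns]"
    return [summary] + history[-keep:]

history = [f"turn {i}: " + "x" * 400 for i in range(50)]
compacted = compact_if_needed(history, trigger_tokens=2_000)
```

In a real summarizer the stub would be an LLM-generated summary rather than a placeholder string, but the cost envelope is set the same way: by where the trigger sits relative to typical thread length.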

One backend, every Python runtime. LangGraph, the Claude Agent SDK, the OpenAI Agents SDK, WayFlow, and custom Python harnesses all instantiate the same OracleAgentMemory client and read and write the same Oracle Database store. Teams running more than one framework no longer rebuild memory per runtime, and migrations between frameworks no longer mean migrating memory.

Primitives for audit and erasure on a single substrate. Every record carries user, agent, thread, and timestamp scoping fields, and the SDK exposes search, list, and per-record delete operations across memories, threads, and messages, so callers can locate records for a subject and remove them on request. Oracle Database’s native auditing covers the storage layer underneath. Compliance reviews land on a single substrate (one database with audit, retention, and access controls already in the data plane) rather than four services with four reviews.
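A self-contained sketch of the erasure flow this enables, assuming only what the paragraph describes (scoping fields on every record plus search, list, and per-record delete); the class and method names here are hypothetical, not the SDK's documented surface:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class ScopedRecord:
    """Every record carries user, agent, and thread scoping fields."""
    content: str
    user_id: str
    agent_id: str
    thread_id: str
    record_id: str = field(default_factory=lambda: str(uuid.uuid4()))

class GovernedStore:
    def __init__(self) -> None:
        self._records: dict[str, ScopedRecord] = {}

    def add(self, rec: ScopedRecord) -> str:
        self._records[rec.record_id] = rec
        return rec.record_id

    def list_for_user(self, user_id: str) -> list[ScopedRecord]:
        # Locate every record belonging to the subject of an erasure request.
        return [r for r in self._records.values() if r.user_id == user_id]

    def delete(self, record_id: str) -> None:
        # Per-record delete; in production the database audit trail records it.
        self._records.pop(record_id, None)

store = GovernedStore()
store.add(ScopedRecord("prefers vegan meals", "user_123", "agent_a", "thread_1"))
store.add(ScopedRecord("unrelated fact", "user_456", "agent_a", "thread_2"))

# Erasure request for user_123: list their records, then delete each one.
for rec in store.list_for_user("user_123"):
    store.delete(rec.record_id)

remaining = [r.user_id for r in store._records.values()]
```

Because the scoping fields live on every record, "find everything we know about this subject" and "remove it" are ordinary queries against one substrate rather than a fan-out across four services.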

One vendor relationship for production agent memory. A single Oracle AI Database instance carries vector search, structured state, JSON document retrieval, transactional consistency, and database-native audit. No second vector database to license, no third service to monitor and scale, no fourth backup pipeline to maintain.

Who Oracle AI Agent Memory Is For

Oracle AI Agent Memory is designed for:

  • AI Developers and engineers building production agents who need durable short-term and long-term memory in one place, with enterprise security and isolation.
  • Teams already running Oracle AI Database who want their agents to write to the same governed backend as the rest of the business.
  • Technical leaders evaluating Oracle AI Database for agent memory infrastructure at scale, with compliance and audit requirements.

Getting Started

Install the Oracle AI Agent Memory package:

pip install oracleagentmemory 

A minimal end-to-end loop in Python looks like this:

from oracleagentmemory import AgentMemory 
 
memory = AgentMemory.from_connection( 
    connection_string="...", 
    user_id="user_123", 
) 
 
# Add conversation turns to a short-term thread 
thread_id = memory.create_thread(user_id="user_123") 
memory.add_messages(thread_id, messages=[ 
    {"role": "user", "content": "I prefer vegan meals."}, 
    {"role": "assistant", "content": "Noted."}, 
]) 
 
# Extract durable long-term memories from the thread 
memory.extract_memories(thread_id) 
 
# Scoped search over long-term memory, enforced per-user 
results = memory.search( 
    user_id="user_123", 
    query="dietary preferences", 
    limit=5, 
) 

Code samples are illustrative; the final API surface is documented in the Oracle Help Center.

Documentation, the quickstart notebook, how-to guides for each framework integration, and the full API reference are all available in the Oracle Help Center.  

Oracle AI Agent Memory is the first release of a broader commitment to a governed memory substrate for enterprise agents. Memory engineering is still an emerging discipline. The infrastructure behind it should not be.