As we adopt more AI-based tooling, I wanted to share a concept that we’ve found incredibly powerful and helpful in our AI agent journey. We call these Recipes. Dan Volz and I came up with them while struggling to get agents to predictably follow simple workflows.

Recipes are multi-step AI conversation workflows defined in Markdown files: structured prompts that guide an AI agent through a series of steps, ensuring consistency and completeness.

The only required syntax is --- (triple dash) to separate sections. A recipe can be as simple as:

Explain what this code does.

---

Suggest improvements.

Through simple Markdown prompts, this method breaks a complex task into smaller individual steps, each of which becomes an observable checkpoint along the way. Recipes improve consistency, traceability, and step coverage.
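The splitting itself is simple enough to sketch in a few lines of Python; the `parse_recipe` helper here is illustrative, mirroring the parsing done by the sample plugin at the end of this post:

```python
import re

# Steps are separated by lines containing only `---` (triple dash),
# optionally surrounded by whitespace.
RECIPE_SPLIT_RE = re.compile(r"(?m)^\s*---\s*$")

def parse_recipe(text: str) -> list[str]:
    """Split recipe text into a list of non-empty, stripped steps."""
    return [step.strip() for step in RECIPE_SPLIT_RE.split(text) if step.strip()]

recipe = """Explain what this code does.

---

Suggest improvements.
"""
print(parse_recipe(recipe))
# → ['Explain what this code does.', 'Suggest improvements.']
```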

We struggled to find a portable way to enforce consistency and repeatability for workflows like automatic commit message generation, code review checklists, documentation evaluation, and structured reporting. We found that with standard single-shot prompting, agents tended to skip the “boring parts” and go straight to the answer; Recipes make sure the agents do the boring parts too.

How we use Recipes

Today, we use Recipes for many complex workflows, from bug analysis to code review suggestions to filling in missing Fixes: tags for Linux kernel commits. In each of these cases, we’ve found that without Recipes, agents skip steps, project unearned confidence, and leave no paper trail to verify that the intermediate work was done.

To illustrate where Recipes fit in, consider this example. A non-Recipe single-shot prompt might look like this:

Identify buffer overflows in this code, and explain how each of those buffer overflows could be exploitable as a security vulnerability.

Presented as a Recipe, we break the identification and analysis into discrete steps:

Identify buffer overflows in this code.
---
Review those buffer overflows and explain how each of those buffer overflows could be exploitable as a security vulnerability.

By breaking up the job like this, we can review the output of each step and quickly verify that the required intermediate work was done, without needing to introspect the model’s internal state. Instruction adherence also varies between models: some instruct-tuned models follow directions more faithfully than others, and Recipes help force a measure of consistency across them.

Why this works

In our experience, agents tend to take the shortest path towards solving a problem, especially if the answer can be gleaned from the question itself. For us, the formative example was validating a human review of a security bug: if you give the model the human’s answer, it will often skip straight to it. If you want an ‘independent’ verification from first principles, don’t let the model peek at your answer key!

Beyond independent verification, workflows with multiple required intermediate steps pose another challenge: ensuring that the model has actually consumed the necessary information. I’ve often seen people ask an agent, “Did you actually review the data at step 3?”, but they can’t trust that answer either. Recipes address both of these limitations.

A Cooking Metaphor

Like their culinary namesake, Recipes divide a complex process into discrete, ordered steps. With single-shot prompting, a model presented with everything at once is likely to omit several steps; following a recipe helps guarantee that all the ingredients were added, and added in the correct order.

We’re using Recipes across domains, from better patch reviews to vulnerability impact analysis. This brings a little bit of determinism to an inherently non-deterministic workflow.

Advanced Control Flows

Sometimes it’s useful to end a recipe early, especially if a critical ingredient is not available. To enable this, we allow certain specific requests to be returned from the model that affect the execution of the recipe, such as conditionally skipping certain steps or asking the user for input; those are excluded from the sample recipes provided here. The only control flow available in the sample script is @@ABORT, which stops recipe execution immediately. The directive is returned as a response from the model, not written into the recipe itself, though the recipe should tell the model when to use it. For example, try “Return @@ABORT if the requested data is incomplete or not available” and watch the model stop executing after that step. It’s a good validation step to prevent unnecessary computation. The following toy recipe demonstrates this:

You are running a simple coin-flip demonstration.
First, briefly explain that a coin flip can land on heads or tails, and that this recipe stops immediately if the result is tails.

---

Flip the coin once.
If the result is tails, reply with only `@@ABORT`.
Otherwise, reply with only `WINNER`.

---

Announce the successful result in one short sentence.
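The driver loop that honors @@ABORT is simple. Here is a minimal, self-contained sketch in Python; `run_steps` and the canned responses are illustrative stand-ins for a real model call:

```python
ABORT_DIRECTIVE = "@@ABORT"

def run_steps(steps, ask):
    """Run each step through `ask` (a callable standing in for the model);
    stop as soon as a reply consists solely of the abort directive."""
    outputs = []
    for step in steps:
        reply = ask(step)
        outputs.append(reply)
        if reply.strip() == ABORT_DIRECTIVE:
            break  # skip all remaining steps
    return outputs

# Canned replies simulate the coin flip coming up tails at step 2:
canned = iter(["A coin flip lands on heads or tails.", "@@ABORT", "unreachable"])
print(run_steps(["explain", "flip", "announce"], lambda _: next(canned)))
# → ['A coin flip lands on heads or tails.', '@@ABORT']
```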

Second-Order Effects: Comparing Recipes and Agents

Recipes are not Agentic: each recipe is strictly followed in order, exactly as written. Agents are less constrained, often choosing their own path through the solution space, but agent exploration can also fail in unpredictable ways.

Agents and Recipes can also complement each other when the agent itself is used to generate the Recipe. We’ve used agents to create a specific, pre-planned workflow that is then used to enforce the agent’s own analysis. We get better, more consistent results from that added specificity; compare “analyze the null pointer dereference in fs/inode.c at line 143” with “look for null pointer dereferences in the code and analyze them.”

Look back at the buffer overflow prompt from above, and consider what happens if we ask the agent to write its own recipe: the agent plans the concrete steps once, and the Recipe then holds it to that plan.

Is this all necessary?

In the age of SKILL.md and sophisticated “thinking” models, how much value is there in hand-holding a model’s execution? While models have gotten much better at following single-shot prompts, even ones that provide sequential instructions, Recipes still have a place when you want to be absolutely certain that the model has integrated all the necessary inputs for solving a problem. Recipes can also help avoid problems with premature compaction by ensuring that the right information is part of the context window. It’s a simple concept, but one with many potential applications.

Recipes can also play a key role in separating “data gathering” steps from “inference” steps. Consider writing a postmortem assessment of a particular scenario: it’s often beneficial to gather data impartially, without drawing conclusions from that data. There, I find it useful to separately ask, “Collect factual evidence only,” then “Build a timeline of events,” and finally, “Identify root cause and contributing factors.” Separating those steps prevents the model from jumping to conclusions prematurely.
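Written as a Recipe, that separation might look like this (the wording of each step is illustrative):

Collect factual evidence only. List log entries, alerts, and timestamps verbatim; do not interpret them.

---

Build a timeline of events from the evidence gathered in the previous step.

---

Identify root cause and contributing factors, citing specific entries from the timeline.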

Trying it out

Curious to see how this works? We’ve provided some sample code that can be plugged into Simon Willison’s llm client.

"""
** llm-recipes-plugin
** https://blogs.oracle.com/linux/llm-recipes
**
** Copyright (c) 2026 Oracle and/or its affiliates.
** Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/

Install instructions:
  mkdir src/llm_recipes
  put this file as src/llm_recipes/__init__.py and run
  llm install -e .

CLI usage:
  llm recipe workflow.md
  cat file.txt | llm recipe workflow.md

Recipe behavior:
  - A recipe is split into sections by lines containing exactly `---`.
  - If stdin is piped, it is read once and prepended to the
    first step's prompt.

Model output controls - the model can return these directives:
  @@ABORT          Stop execution immediately.
"""

from __future__ import annotations

import pathlib
import re
import sys

import click
import llm
import llm.cli as llm_cli

ABORT_DIRECTIVE = "@@ABORT"
RECIPE_SPLIT_RE = re.compile(r"(?m)^\s*---\s*$")
UNKNOWN_MODEL_ERROR = getattr(llm, "UnknownModelError", Exception)


def _parse_recipe(text: str) -> list[str]:
    return [step.strip() for step in RECIPE_SPLIT_RE.split(text) if step.strip()]


def _read_stdin() -> str | None:
    if sys.stdin.isatty():
        return None
    text = sys.stdin.read()
    return text if text else None


def _load_model(model_id: str | None):
    try:
        return llm_cli.get_model(model_id or llm_cli.get_default_model())
    except UNKNOWN_MODEL_ERROR as ex:
        raise click.ClickException(str(ex)) from ex


def _run_recipe(
    recipe_path: pathlib.Path,
    *,
    model_id: str | None,
    system: str | None,
    stdin_text: str | None,
) -> None:
    steps = _parse_recipe(recipe_path.read_text(encoding="utf-8"))
    if not steps:
        raise click.ClickException(f"{recipe_path} does not contain any recipe steps")

    if stdin_text is not None:
        steps[0] = f"{stdin_text.rstrip()}\n\n{steps[0]}"

    model = _load_model(model_id)
    try:
        conversation = model.conversation()
    except (AttributeError, NotImplementedError) as ex:
        raise click.ClickException(
            f"Model '{model.model_id}' does not support multi-turn recipe execution"
        ) from ex

    for index, step in enumerate(steps, start=1):
        prompt_kwargs = {}
        if system is not None:
            prompt_kwargs["system"] = system
        response = conversation.prompt(step, **prompt_kwargs)
        text = response.text()
        if index > 1:
            click.echo()
        click.echo(f"Step {index}:")
        click.echo(text)
        if text.strip() == ABORT_DIRECTIVE:
            return


@llm.hookimpl
def register_commands(cli):
    @cli.command(name="recipe")
    @click.argument(
        "recipe_path",
        type=click.Path(exists=True, dir_okay=False, path_type=pathlib.Path),
    )
    @click.option("-m", "--model", "model_id", help="Model to use")
    @click.option("-s", "--system", help="System prompt")
    def recipe_command(recipe_path: pathlib.Path, model_id: str | None, system: str | None):
        _run_recipe(
            recipe_path,
            model_id=model_id,
            system=system,
            stdin_text=_read_stdin(),
        )