Run
Running an agent means wiring user input, model responses, and tool executions into a feedback loop. Each call to `run` (or `run_stream`) starts with the `AgentRequest` you provide, then the library drives the following cycle until it produces a final answer or hits a guard rail such as `max_turns`.
```mermaid
sequenceDiagram
    participant Caller
    participant Agent
    participant RunSession
    participant Model
    participant Tools
    Caller->>Agent: run({ input, context })
    Agent->>RunSession: bind context + params
    loop each turn
        RunSession->>Model: generate(messages, tool list)
        Model-->>RunSession: content + optional tool_calls
        alt tool calls present
            RunSession->>Tools: execute(tool_call.args)
            Tools-->>RunSession: tool results (Part[])
            RunSession->>Model: append tool results
            Model-->>RunSession: content (should now exclude tool_calls)
        else no tool calls
            Note over RunSession: content already final
        end
    end
    RunSession-->>Caller: AgentResponse
```
Behind the scenes the session tracks every intermediate item so you can append them to the next request and resume the conversation.
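In TypeScript-flavored pseudocode, each turn of that loop reduces to roughly the following. This is a conceptual sketch, not the library's actual internals: `generate`, `extractToolCalls`, and `executeTool` are stand-ins for the model and tool layers, and the model response is assumed to expose its final `Part[]` as `content`.

```ts
// Stand-ins for the model and tool layers (not real library APIs):
declare function generate(items: AgentItem[]): Promise<AgentItemModelResponse>;
declare function extractToolCalls(response: AgentItemModelResponse): unknown[];
declare function executeTool(call: unknown): Promise<Omit<AgentItemTool, "type">>;

// Conceptual sketch of the turn loop described above.
async function runLoop(
  request: AgentRequest<unknown>,
  maxTurns: number,
): Promise<AgentResponse> {
  const output: AgentItem[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    // Each turn, the model sees the original input plus everything produced so far.
    const modelItem = await generate([...request.input, ...output]);
    output.push(modelItem);

    const calls = extractToolCalls(modelItem);
    if (calls.length === 0) {
      // No tool calls: this content is the final answer.
      return { output, content: modelItem.content };
    }
    for (const call of calls) {
      // Run each requested tool and record the full invocation as an AgentItem.
      output.push({ type: "tool", ...(await executeTool(call)) });
    }
  }
  throw new Error("max_turns exceeded");
}
```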
Shaping the request
```ts
interface AgentRequest<TContext> {
  /**
   * The input items for this run, such as LLM messages.
   */
  input: AgentItem[];
  /**
   * The context used to resolve instructions and passed to tool executions.
   */
  context: TContext;
}
```
```rust
pub struct AgentRequest<TCtx> {
    /// The input items for this run, such as LLM messages.
    pub input: Vec<AgentItem>,
    /// The context used to resolve instructions and passed to tool executions.
    pub context: TCtx,
}
```
```go
type AgentRequest[C any] struct {
    // Input contains the items for this run, such as LLM messages.
    Input []AgentItem `json:"input"`
    // Context is the value used to resolve instructions and passed to tool executions.
    Context C `json:"context"`
}
```
An `AgentRequest` combines two pieces:

- `input`: an ordered list of `AgentItem`s that represent the conversation you want the model to see.
- `context`: the per-run value passed to instructions, toolkits, and tools.
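For a first turn, a request typically wraps a single user message. A minimal sketch; the exact `Message` fields come from the SDK's message types, so the `role`/`content` shape below is illustrative:

```ts
// Illustrative only: AgentItemMessage is `{ type: "message" } & Message`,
// and we assume a user message carries a role plus text parts.
const request: AgentRequest<{ userId: string }> = {
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What's the weather in Paris?" }],
    },
  ],
  context: { userId: "user-123" },
};
```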
Agent items
All inputs and outputs share the same `AgentItem` union, making it easy to stitch runs together. Each variant captures a different kind of turn artifact.
```ts
type AgentItem =
  | AgentItemMessage
  | AgentItemModelResponse
  | AgentItemTool;

type AgentItemMessage = { type: "message" } & Message;

interface AgentItemModelResponse extends ModelResponse {
  type: "model";
}

interface AgentItemTool {
  type: "tool";
  /**
   * A unique ID for the tool call
   */
  tool_call_id: string;
  /**
   * The name of the tool called
   */
  tool_name: string;
  /**
   * The input provided to the tool
   */
  input: Record<string, unknown>;
  /**
   * The result of the tool call
   */
  output: Part[];
  /**
   * Whether the tool call resulted in an error
   */
  is_error: boolean;
}
```
```rust
pub enum AgentItem {
    /// A LLM message used in the run
    Message(Message),
    /// A model response generated in the run
    Model(ModelResponse),
    /// Tool call input and output generated during the run
    Tool(AgentItemTool),
}

pub struct AgentItemTool {
    /// A unique ID for the tool call
    pub tool_call_id: String,
    /// The name of the tool that was called
    pub tool_name: String,
    /// The input provided to the tool
    pub input: Value,
    /// The result content of the tool call
    pub output: Vec<Part>,
    /// Whether the tool call resulted in an error
    pub is_error: bool,
}
```
```go
type AgentItem struct {
    // A LLM message used in the run
    Message *llmsdk.Message `json:"-"`
    // A model response generated in the run
    Model *AgentItemModelResponse `json:"-"`
    // A tool call result generated during the run
    Tool *AgentItemTool `json:"-"`
}

type AgentItemTool struct {
    ToolCallID string          `json:"tool_call_id"`
    ToolName   string          `json:"tool_name"`
    Input      json.RawMessage `json:"input"`
    Output     []llmsdk.Part   `json:"output"`
    IsError    bool            `json:"is_error"`
}

type AgentItemModelResponse struct {
    *llmsdk.ModelResponse
}
```
- `AgentItemMessage`: user messages, assistant messages, and tool messages you collected before calling `run`. Use these to seed the conversation or to replay prior turns.
- `AgentItemModelResponse`: the model output produced during a run. It mirrors an assistant message but also carries usage and cost data. You can pass this back as input and it will be interpreted the same as an `AgentItemMessage` from the assistant, yet the extra metadata is preserved for analytics.
- `AgentItemTool`: the full record of a tool invocation for the turn, including the call ID, arguments, model-selected tool name, tool output, and the `is_error` flag. Conceptually it corresponds to a tool message, but the richer shape makes it easier to audit or replay tool interactions when you resume a session.
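Because each variant is plain data, post-run auditing is a matter of narrowing the union. A small sketch using only the `AgentItemTool` shape documented above; `failedToolCalls` is a hypothetical helper, and `response` stands for a prior run's result:

```ts
// Given the items produced by a previous run:
declare const response: AgentResponse;

// Hypothetical helper: collect the tool invocations that ended in an error,
// relying only on the discriminated-union shape shown above.
function failedToolCalls(items: AgentItem[]): AgentItemTool[] {
  return items.filter(
    (item): item is AgentItemTool => item.type === "tool" && item.is_error,
  );
}

for (const call of failedToolCalls(response.output)) {
  console.warn(`${call.tool_name} (${call.tool_call_id}) failed`, call.output);
}
```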
Agent response
The `RunState` created per run (not exported directly) stitches these items together, constructs the `messages` payload for the model each turn, enforces `max_turns`, and ultimately produces the `AgentResponse`.
```ts
interface AgentResponse {
  /**
   * The items generated during the run, such as new tool and assistant messages.
   */
  output: AgentItem[];

  /**
   * The final output content generated by the agent.
   */
  content: Part[];
}
```
```rust
pub struct AgentResponse {
    /// The items generated during the run, such as new tool and assistant
    /// messages.
    pub output: Vec<AgentItem>,

    /// The last assistant output content generated by the agent.
    pub content: Vec<Part>,
}
```
```go
type AgentResponse struct {
    // Output contains the items generated during the run, such as new tool and assistant messages.
    Output []AgentItem `json:"output"`

    // Content is the final output content generated by the agent.
    Content []llmsdk.Part `json:"content"`
}
```
The `content` field is what you typically show to the user; `output` is the full list of items generated during the run and is the piece you should append to the next request's `input` when continuing the conversation.
```mermaid
flowchart TD
    H[Persisted history] --> A[Compose AgentRequest input]
    A --> B[run]
    B --> C[AgentResponse output]
    C --> D[Append output to history]
    D -->|Append into history| H
```
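In code, that hand-off might look like the sketch below. The `agent.run({ input, context })` call mirrors the diagram above; the history hooks are hypothetical application code, not part of the library:

```ts
// Hypothetical application hooks:
declare function loadHistory(id: string): Promise<AgentItem[]>;
declare function saveHistory(id: string, items: AgentItem[]): Promise<void>;
declare function render(parts: Part[]): void;

// `agent`, `context`, `sessionId`, and `newUserMessageItem` are assumed in scope.
let history: AgentItem[] = await loadHistory(sessionId);
history.push(newUserMessageItem); // the fresh user turn, shaped as above

const response = await agent.run({ input: history, context });

history = [...history, ...response.output]; // append output, not just content
await saveHistory(sessionId, history);

render(response.content); // the parts to show the user
```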
Streaming runs
`run_stream` emits progress as soon as the model responds or a tool finishes. You receive three event shapes, which align with the non-streaming artifacts:
```ts
type AgentStreamEvent =
  | AgentStreamEventPartial
  | AgentStreamItemEvent
  | AgentStreamResponseEvent;

interface AgentStreamEventPartial extends PartialModelResponse {
  event: "partial";
}

interface AgentStreamItemEvent {
  event: "item";
  index: number;
  item: AgentItem;
}

interface AgentStreamResponseEvent extends AgentResponse {
  event: "response";
}
```
```rust
pub enum AgentStreamEvent {
    Partial(PartialModelResponse),
    Item(AgentStreamItemEvent),
    Response(AgentResponse),
}

pub struct AgentStreamItemEvent {
    pub index: usize,
    pub item: AgentItem,
}
```
```go
type AgentStreamEvent struct {
    Partial  *llmsdk.PartialModelResponse `json:"-"`
    Item     *AgentStreamItemEvent        `json:"-"`
    Response *AgentResponse               `json:"-"`
}

type AgentStreamItemEvent struct {
    Index int       `json:"index"`
    Item  AgentItem `json:"item"`
}
```
- `AgentStreamEventPartial`: incremental deltas from the language model (for example, text chunks or audio samples). Use them to render streaming output.
- `AgentStreamItemEvent`: emitted whenever the run state records a new item (a model response or a tool call/result). This mirrors what will later show up in the `output` array of the final response.
- `AgentStreamResponseEvent`: fired once the run ends, bundling the same `AgentResponse` shape you get from non-streaming runs.
Use the stream when you want to surface partial answers, display tool progress, or fan out work while the run continues. The same items emitted during streaming will appear in the final `AgentResponse.output`, so you can rely on stream events for real-time UX and use the terminal response to persist state.
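For example, a consumer can switch on the `event` tag. This sketch assumes the TypeScript SDK exposes the streaming entry point as an async iterable (named `runStream` here); the rendering and persistence hooks are hypothetical:

```ts
// Hypothetical application hooks:
declare function renderDelta(partial: AgentStreamEventPartial): void;
declare function persistRun(output: AgentItem[], content: Part[]): Promise<void>;

// `agent`, `input`, and `context` are assumed in scope.
for await (const event of agent.runStream({ input, context })) {
  switch (event.event) {
    case "partial":
      renderDelta(event); // stream chunks to the UI as they arrive
      break;
    case "item":
      console.log(`item #${event.index}:`, event.item.type);
      break;
    case "response":
      await persistRun(event.output, event.content); // save state once, at the end
      break;
  }
}
```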