Run

Running an agent means wiring user input, model responses, and tool executions into a feedback loop. Each call to run (or run_stream) starts with the AgentRequest you provide; the library then drives the following cycle until it produces a final answer or hits a guardrail such as max_turns.

sequenceDiagram
  participant Caller
  participant Agent
  participant RunSession
  participant Model
  participant Tools

  Caller->>Agent: run({ input, context })
  Agent->>RunSession: bind context + params
  loop each turn
    RunSession->>Model: generate(messages, tool list)
    Model-->>RunSession: content + optional tool_calls
    alt tool calls present
      RunSession->>Tools: execute(tool_call.args)
      Tools-->>RunSession: tool results (Part[])
      RunSession->>Model: append tool results
      Model-->>RunSession: content (should now exclude tool_calls)
    else no tool calls
      Note over RunSession: content already final
    end
  end
  RunSession-->>Caller: AgentResponse
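
Conceptually, the loop in the diagram reduces to a few lines. The sketch below mirrors the diagram only; it is not the library's internals, and the SketchToolCall/SketchModelTurn shapes plus all three declared helpers are hypothetical stand-ins.

// Sketch of the turn loop in the diagram above; NOT the library's actual
// internals. The shapes and helpers below are hypothetical stand-ins.
interface SketchToolCall {
  tool_name: string;
  args: Record<string, unknown>;
}
interface SketchModelTurn {
  content: Part[];
  tool_calls?: SketchToolCall[]; // assumed field, per the diagram
}
declare function generate(messages: Message[]): Promise<SketchModelTurn>;
declare function executeTool(call: SketchToolCall): Promise<Part[]>;
declare function toToolMessage(call: SketchToolCall, output: Part[]): Message;

async function turnLoop(messages: Message[], maxTurns: number): Promise<Part[]> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const turnResult = await generate(messages);
    const calls = turnResult.tool_calls ?? [];
    if (calls.length === 0) {
      return turnResult.content; // no tool calls: content is already final
    }
    for (const call of calls) {
      const output = await executeTool(call);     // run the requested tool
      messages.push(toToolMessage(call, output)); // feed the result back
    }
  }
  throw new Error("max_turns exceeded"); // the guardrail from the intro
}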

Behind the scenes, the session tracks every intermediate item so you can append those items to the next request and resume the conversation.

types.ts
interface AgentRequest<TContext> {
  /**
   * The input items for this run, such as LLM messages.
   */
  input: AgentItem[];
  /**
   * The context used to resolve instructions and passed to tool executions.
   */
  context: TContext;
}

An AgentRequest combines two pieces:

  • input: an ordered list of AgentItems that represent the conversation you want the model to see.
  • context: the per-run value passed to instructions, toolkits, and tools.
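
Put together, a single-turn run might look like the sketch below. Only the AgentRequest and AgentResponse shapes come from the types above; the agent value, the MyContext type, and the exact Message/Part literals are assumptions.

// Sketch: a single-turn run. The agent value and the message/part literals
// are assumptions; only AgentRequest/AgentResponse come from types.ts above.
interface MyContext {
  userId: string; // hypothetical per-run context
}

declare const agent: {
  run(request: AgentRequest<MyContext>): Promise<AgentResponse>;
};

const response = await agent.run({
  input: [
    {
      type: "message",
      role: "user", // assumed Message shape: a role plus content parts
      content: [{ type: "text", text: "What is the weather in Paris?" }],
    },
  ],
  context: { userId: "u_123" },
});

console.log(response.content); // the final Part[] to show the user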

All inputs and outputs share the same AgentItem union, making it easy to stitch runs together. Each variant captures a different kind of turn artifact.

types.ts
type AgentItem =
  | AgentItemMessage
  | AgentItemModelResponse
  | AgentItemTool;

type AgentItemMessage = { type: "message" } & Message;

interface AgentItemModelResponse extends ModelResponse {
  type: "model";
}

interface AgentItemTool {
  type: "tool";
  /**
   * A unique ID for the tool call
   */
  tool_call_id: string;
  /**
   * The name of the tool called
   */
  tool_name: string;
  /**
   * The input provided to the tool
   */
  input: Record<string, unknown>;
  /**
   * The result of the tool call
   */
  output: Part[];
  /**
   * Whether the tool call resulted in an error
   */
  is_error: boolean;
}
  • AgentItemMessage: user messages, assistant messages, and tool messages you collected before calling run. Use these to seed the conversation or to replay prior turns.
  • AgentItemModelResponse: the model output produced during a run. It mirrors an assistant message but also carries usage and cost data. You can pass this back as input and it will be interpreted the same as an AgentItemMessage from the assistant, yet the extra metadata is preserved for analytics.
  • AgentItemTool: the full record of a tool invocation for the turn, including the call ID, the tool name the model selected, the input arguments, the tool output, and the is_error flag. Conceptually it corresponds to a tool message, but the richer shape makes it easier to audit or replay tool interactions when you resume a session.
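
Because the union is discriminated on type, inspecting items is a simple switch. A sketch; only the AgentItem union above is assumed, while the role access and the log strings are illustrative:

// Sketch: summarizing items from a run. The `role` field access and the
// wording of each string are illustrative, not documented API.
function describeItem(item: AgentItem): string {
  switch (item.type) {
    case "message":
      return `message (role: ${item.role})`; // assumed Message field
    case "model":
      return "model response (carries usage and cost metadata)";
    case "tool":
      return item.is_error
        ? `tool ${item.tool_name} failed (call ${item.tool_call_id})`
        : `tool ${item.tool_name} returned ${item.output.length} part(s)`;
  }
}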

The RunState created per run (not exported directly) stitches these items together, constructs the messages payload for the model each turn, enforces max_turns, and ultimately produces the AgentResponse.

types.ts
interface AgentResponse {
  /**
   * The items generated during the run, such as new tool and assistant messages.
   */
  output: AgentItem[];
  /**
   * The final output content generated by the agent.
   */
  content: Part[];
}

The content field is what you typically show to the user; output is the full list of items generated during the run and is the piece you should append to the next request’s input when continuing the conversation.

flowchart TD
  H[Persisted history] --> A[Compose AgentRequest input]
  A --> B[run]
  B --> C[AgentResponse output]
  C --> D[Append output to history]
  D --> H
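
In application code, that loop is just an append. A sketch reusing the hypothetical agent and MyContext from the earlier example; how you persist history is up to you:

// Sketch of the history loop in the flowchart. Storage and the `agent`
// value (from the earlier sketch) are application-side assumptions.
let history: AgentItem[] = []; // load from your store in practice

async function ask(userMessage: AgentItemMessage, context: MyContext): Promise<Part[]> {
  const response = await agent.run({
    input: [...history, userMessage],
    context,
  });
  // Persist the user message plus everything the run generated so the
  // next request replays the full conversation.
  history = [...history, userMessage, ...response.output];
  return response.content;
}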

run_stream emits progress as soon as the model responds or a tool finishes. You receive three event shapes, which align with the non-streaming artifacts:

types.ts
type AgentStreamEvent =
  | AgentStreamEventPartial
  | AgentStreamItemEvent
  | AgentStreamResponseEvent;

interface AgentStreamEventPartial extends PartialModelResponse {
  event: "partial";
}

interface AgentStreamItemEvent {
  event: "item";
  index: number;
  item: AgentItem;
}

interface AgentStreamResponseEvent extends AgentResponse {
  event: "response";
}
  • AgentStreamEventPartial: incremental deltas from the language model (for example, text chunks or audio samples). Use them to render streaming output.
  • AgentStreamItemEvent: emitted whenever the run state records a new item (a model response or a tool call/result). This mirrors what will later show up in the output array of the final response.
  • AgentStreamResponseEvent: fired once the run ends, bundling the same AgentResponse shape you get from non-streaming runs.

Use the stream when you want to surface partial answers, display tool progress, or fan out work while the run continues. The same items emitted during streaming will appear in the final AgentResponse.output, so you can rely on stream events for real-time UX and use the terminal response to persist state.
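
A consumption sketch, assuming run_stream returns an async iterable of AgentStreamEvent (not spelled out above); the handler functions are placeholders:

// Sketch: consuming run_stream. The async-iterable return type and the
// declared helpers are assumptions, not documented API.
declare const agent: {
  run_stream(request: AgentRequest<unknown>): AsyncIterable<AgentStreamEvent>;
};
declare const request: AgentRequest<unknown>;
declare function renderDelta(partial: AgentStreamEventPartial): void;
declare function logItem(index: number, item: AgentItem): void;
declare function persist(items: AgentItem[]): Promise<void>;

for await (const event of agent.run_stream(request)) {
  switch (event.event) {
    case "partial":
      renderDelta(event); // incremental model deltas: render immediately
      break;
    case "item":
      logItem(event.index, event.item); // will also appear in the final output
      break;
    case "response":
      await persist(event.output); // same AgentResponse shape as run
      break;
  }
}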