Run
Running an agent means wiring user input, model responses, and tool executions into a feedback loop. Each call to `run` (or `run_stream`) starts with the `AgentRequest` you provide, then the library drives the following cycle until it produces a final answer or hits a guard rail such as `max_turns`.
```mermaid
sequenceDiagram
    participant Caller
    participant Agent
    participant RunSession
    participant Model
    participant Tools
    Caller->>Agent: run({ input, context })
    Agent->>RunSession: bind context + params
    loop each turn
        RunSession->>Model: generate(messages, tool list)
        Model-->>RunSession: content + optional tool_calls
        alt tool calls present
            RunSession->>Tools: execute(tool_call.args)
            Tools-->>RunSession: tool results (Part[])
            RunSession->>Model: append tool results
            Model-->>RunSession: content (should now exclude tool_calls)
        else no tool calls
            Note over RunSession: content already final
        end
    end
    RunSession-->>Caller: AgentResponse
```
Behind the scenes the session tracks every intermediate item so you can append them to the next request and resume the conversation.
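In TypeScript-flavored pseudocode, each turn of that loop reduces to roughly the following. This is a conceptual sketch, not the library's actual internals: `generate`, `extractToolCalls`, and `executeTool` are stand-ins for the model and tool layers, and the model response is assumed to expose its final `Part[]` as `content`.

```ts
// Stand-ins for the model and tool layers (not real library APIs):
declare function generate(items: AgentItem[]): Promise<AgentItemModelResponse>;
declare function extractToolCalls(response: AgentItemModelResponse): unknown[];
declare function executeTool(call: unknown): Promise<Omit<AgentItemTool, "type">>;

// Conceptual sketch of the turn loop described above.
async function runLoop(
  request: AgentRequest<unknown>,
  maxTurns: number,
): Promise<AgentResponse> {
  const output: AgentItem[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    // Each turn, the model sees the original input plus everything produced so far.
    const modelItem = await generate([...request.input, ...output]);
    output.push(modelItem);

    const calls = extractToolCalls(modelItem);
    if (calls.length === 0) {
      // No tool calls: this content is the final answer.
      return { output, content: modelItem.content };
    }
    for (const call of calls) {
      // Run each requested tool and record the full invocation as an AgentItem.
      output.push({ type: "tool", ...(await executeTool(call)) });
    }
  }
  throw new Error("max_turns exceeded");
}
```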
Shaping the request
```ts
interface AgentRequest<TContext> {
  /**
   * The input items for this run, such as LLM messages.
   */
  input: AgentItem[];
  /**
   * The context used to resolve instructions and passed to tool executions.
   */
  context: TContext;
}
```
```rust
pub struct AgentRequest<TCtx> {
    /// The input items for this run, such as LLM messages.
    pub input: Vec<AgentItem>,
    /// The context used to resolve instructions and passed to tool executions.
    pub context: TCtx,
}
```
```go
type AgentRequest[C any] struct {
    // Input contains the items for this run, such as LLM messages.
    Input []AgentItem `json:"input"`
    // Context is the value used to resolve instructions and passed to tool executions.
    Context C `json:"context"`
}
```
An `AgentRequest` combines two pieces:

- `input`: an ordered list of `AgentItem`s that represent the conversation you want the model to see.
- `context`: the per-run value passed to instructions, toolkits, and tools.
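For a first turn, a request typically wraps a single user message. A minimal sketch; the exact `Message` fields come from the SDK's message types, so the `role`/`content` shape below is illustrative:

```ts
// Illustrative only: AgentItemMessage is `{ type: "message" } & Message`,
// and we assume a user message carries a role plus text parts.
const request: AgentRequest<{ userId: string }> = {
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "text", text: "What's the weather in Paris?" }],
    },
  ],
  context: { userId: "user-123" },
};
```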
Agent items
All inputs and outputs share the same `AgentItem` union, making it easy to stitch runs together. Each variant captures a different kind of turn artifact.
```ts
type AgentItem =
  | AgentItemMessage
  | AgentItemModelResponse
  | AgentItemTool;

type AgentItemMessage = { type: "message" } & Message;

interface AgentItemModelResponse extends ModelResponse {
  type: "model";
}

interface AgentItemTool {
  type: "tool";
  /**
   * A unique ID for the tool call
   */
  tool_call_id: string;
  /**
   * The name of the tool called
   */
  tool_name: string;
  /**
   * The input provided to the tool
   */
  input: Record<string, unknown>;
  /**
   * The result of the tool call
   */
  output: Part[];
  /**
   * Whether the tool call resulted in an error
   */
  is_error: boolean;
}
```
```rust
pub enum AgentItem {
    /// A LLM message used in the run
    Message(Message),
    /// A model response generated in the run
    Model(ModelResponse),
    /// Tool call input and output generated during the run
    Tool(AgentItemTool),
}

pub struct AgentItemTool {
    /// A unique ID for the tool call
    pub tool_call_id: String,
    /// The name of the tool that was called
    pub tool_name: String,
    /// The input provided to the tool
    pub input: Value,
    /// The result content of the tool call
    pub output: Vec<Part>,
    /// Whether the tool call resulted in an error
    pub is_error: bool,
}
```
```go
type AgentItem struct {
    // A LLM message used in the run
    Message *llmsdk.Message `json:"-"`
    // A model response generated in the run
    Model *AgentItemModelResponse `json:"-"`
    // A tool call result generated during the run
    Tool *AgentItemTool `json:"-"`
}

type AgentItemTool struct {
    ToolCallID string          `json:"tool_call_id"`
    ToolName   string          `json:"tool_name"`
    Input      json.RawMessage `json:"input"`
    Output     []llmsdk.Part   `json:"output"`
    IsError    bool            `json:"is_error"`
}

type AgentItemModelResponse struct {
    *llmsdk.ModelResponse
}
```
- `AgentItemMessage`: user messages, assistant messages, and tool messages you collected before calling `run`. Use these to seed the conversation or to replay prior turns.
- `AgentItemModelResponse`: the model output produced during a run. It mirrors an assistant message but also carries usage and cost data. You can pass this back as input and it will be interpreted the same as an `AgentItemMessage` from the assistant, yet the extra metadata is preserved for analytics.
- `AgentItemTool`: the full record of a tool invocation for the turn, including the call ID, arguments, model-selected tool name, tool output, and the `is_error` flag. Conceptually it corresponds to a tool message, but the richer shape makes it easier to audit or replay tool interactions when you resume a session.
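Because each variant is plain data, post-run auditing is a matter of narrowing the union. A small sketch using only the `AgentItemTool` shape documented above; `failedToolCalls` is a hypothetical helper, and `response` stands for a prior run's result:

```ts
// Given the items produced by a previous run:
declare const response: AgentResponse;

// Hypothetical helper: collect the tool invocations that ended in an error,
// relying only on the discriminated-union shape shown above.
function failedToolCalls(items: AgentItem[]): AgentItemTool[] {
  return items.filter(
    (item): item is AgentItemTool => item.type === "tool" && item.is_error,
  );
}

for (const call of failedToolCalls(response.output)) {
  console.warn(`${call.tool_name} (${call.tool_call_id}) failed`, call.output);
}
```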
Agent response
The `RunState` created per run (not exported directly) stitches these items together, constructs the `messages` payload for the model each turn, enforces `max_turns`, and ultimately produces the `AgentResponse`.
```ts
interface AgentResponse {
  /**
   * The items generated during the run, such as new tool and assistant messages.
   */
  output: AgentItem[];

  /**
   * The final output content generated by the agent.
   */
  content: Part[];
}
```
```rust
pub struct AgentResponse {
    /// The items generated during the run, such as new tool and assistant
    /// messages.
    pub output: Vec<AgentItem>,

    /// The last assistant output content generated by the agent.
    pub content: Vec<Part>,
}
```
```go
type AgentResponse struct {
    // Output contains the items generated during the run, such as new tool and assistant messages.
    Output []AgentItem `json:"output"`

    // Content is the final output content generated by the agent.
    Content []llmsdk.Part `json:"content"`
}
```
The `content` field is what you typically show to the user; `output` is the full list of items generated during the run and is the piece you should append to the next request's `input` when continuing the conversation.
```mermaid
flowchart TD
    H[Persisted history] --> A[Compose AgentRequest input]
    A --> B[run]
    B --> C[AgentResponse output]
    C --> D[Append output to history]
    D -->|Append into history| H
```
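In code, that hand-off might look like the sketch below. The `agent.run({ input, context })` call mirrors the diagram above; the history hooks are hypothetical application code, not part of the library:

```ts
// Hypothetical application hooks:
declare function loadHistory(id: string): Promise<AgentItem[]>;
declare function saveHistory(id: string, items: AgentItem[]): Promise<void>;
declare function render(parts: Part[]): void;

// `agent`, `context`, `sessionId`, and `newUserMessageItem` are assumed in scope.
let history: AgentItem[] = await loadHistory(sessionId);
history.push(newUserMessageItem); // the fresh user turn, shaped as above

const response = await agent.run({ input: history, context });

history = [...history, ...response.output]; // append output, not just content
await saveHistory(sessionId, history);

render(response.content); // the parts to show the user
```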
Streaming runs
`run_stream` emits progress as soon as the model responds or a tool finishes. You receive three event shapes, which align with the non-streaming artifacts:
```ts
type AgentStreamEvent =
  | AgentStreamEventPartial
  | AgentStreamItemEvent
  | AgentStreamResponseEvent;

interface AgentStreamEventPartial extends PartialModelResponse {
  event: "partial";
}

interface AgentStreamItemEvent {
  event: "item";
  index: number;
  item: AgentItem;
}

interface AgentStreamResponseEvent extends AgentResponse {
  event: "response";
}
```
```rust
pub enum AgentStreamEvent {
    Partial(PartialModelResponse),
    Item(AgentStreamItemEvent),
    Response(AgentResponse),
}

pub struct AgentStreamItemEvent {
    pub index: usize,
    pub item: AgentItem,
}
```
```go
type AgentStreamEvent struct {
    Partial  *llmsdk.PartialModelResponse `json:"-"`
    Item     *AgentStreamItemEvent        `json:"-"`
    Response *AgentResponse               `json:"-"`
}

type AgentStreamItemEvent struct {
    Index int       `json:"index"`
    Item  AgentItem `json:"item"`
}
```
- `AgentStreamEventPartial`: incremental deltas from the language model (for example, text chunks or audio samples). Use them to render streaming output.
- `AgentStreamItemEvent`: emitted whenever the run state records a new item (a model response or a tool call/result). This mirrors what will later show up in the `output` array of the final response.
- `AgentStreamResponseEvent`: fired once the run ends, bundling the same `AgentResponse` shape you get from non-streaming runs.
Use the stream when you want to surface partial answers, display tool progress, or fan out work while the run continues. The same items emitted during streaming will appear in the final `AgentResponse.output`, so you can rely on stream events for real-time UX and use the terminal response to persist state.
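For example, a consumer can switch on the `event` tag. This sketch assumes the TypeScript SDK exposes the streaming entry point as an async iterable (named `runStream` here); the rendering and persistence hooks are hypothetical:

```ts
// Hypothetical application hooks:
declare function renderDelta(partial: AgentStreamEventPartial): void;
declare function persistRun(output: AgentItem[], content: Part[]): Promise<void>;

// `agent`, `input`, and `context` are assumed in scope.
for await (const event of agent.runStream({ input, context })) {
  switch (event.event) {
    case "partial":
      renderDelta(event); // stream chunks to the UI as they arrive
      break;
    case "item":
      console.log(`item #${event.index}:`, event.item.type);
      break;
    case "response":
      await persistRun(event.output, event.content); // save state once, at the end
      break;
  }
}
```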