
Model Context Protocol (MCP)

Model Context Protocol (MCP) is an open standard for wiring assistants to external data sources, automation, and services without baking those integrations into your agent. MCP servers expose tools over a shared protocol, while clients (like this agent library) negotiate capabilities and relay tool calls. The MCP integration lets each run session attach to an MCP server, discover the tools it offers, and route tool invocations through the protocol using per-user credentials or transports.

Under the hood this integration is just an implementation of the Toolkit primitive, so it shares the same lifecycle and composition rules as your other toolkits. The MCP connection opens during session creation and closes when the session ends.

mcp/types.ts
type MCPInit<TContext> =
  | MCPParams
  | ((context: TContext) => MCPParams | Promise<MCPParams>);
type MCPParams = MCPStdioParams | MCPStreamableHTTPParams;
interface MCPStdioParams {
  type: "stdio";
  // Executable that implements the MCP server.
  command: string;
  // Optional arguments passed to the command.
  args?: string[];
}
interface MCPStreamableHTTPParams {
  type: "streamable-http";
  // Base URL for the MCP server.
  url: string;
  // Authorization header value; OAuth2 flows are not handled automatically so
  // callers must provide a token when required.
  authorization?: string;
}
  • MCPInit lets you pass static parameters or derive them from the per-run context. Use synchronous or asynchronous resolvers to fetch tokens, endpoints, or feature flags on demand.
  • MCPParams chooses the transport (stdio for local processes, streamable-http for hosted servers).
  • authorization is forwarded as provided. Supply ready-to-use credentials because the library does not negotiate OAuth flows on your behalf.
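As a rough illustration of the two shapes MCPInit accepts, the sketch below redeclares the types from mcp/types.ts locally and resolves either form to concrete parameters. Note that resolveInit is a hypothetical helper written for this example, not a function exported by the library, and the command, URL, and token values are placeholders.

```typescript
// Types copied from mcp/types.ts so the sketch is self-contained.
interface MCPStdioParams {
  type: "stdio";
  command: string;
  args?: string[];
}
interface MCPStreamableHTTPParams {
  type: "streamable-http";
  url: string;
  authorization?: string;
}
type MCPParams = MCPStdioParams | MCPStreamableHTTPParams;
type MCPInit<TContext> =
  | MCPParams
  | ((context: TContext) => MCPParams | Promise<MCPParams>);

// Hypothetical helper (not part of the library): accept a static value or a
// resolver function and always hand back a promise of concrete parameters.
async function resolveInit<TContext>(
  init: MCPInit<TContext>,
  context: TContext,
): Promise<MCPParams> {
  return typeof init === "function" ? init(context) : init;
}

interface DemoContext {
  userId: string;
}

// Static form: every session launches the same local process (placeholder command).
const staticInit: MCPInit<DemoContext> = {
  type: "stdio",
  command: "node",
  args: ["./my-mcp-server.js"],
};

// Resolver form: parameters are derived from the per-session context.
// The URL and token format are placeholders for illustration only.
const dynamicInit: MCPInit<DemoContext> = async (context) => ({
  type: "streamable-http",
  url: "https://mcp.example.com",
  authorization: `token-for-${context.userId}`,
});
```

Either value could be passed wherever an MCPInit is expected; the resolver form is the one to reach for when credentials or endpoints differ per user.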

This library is typically deployed as a single agent service that serves many users. The agent itself stays stateless; the run context carries caller-specific identifiers so Toolkit.create_session can resolve per-session details before the conversation starts. Use that resolver to look up OAuth tokens, choose the correct MCP endpoint, or apply tenant-level feature flags. It keeps the agent reusable while still routing each session through the appropriate integration, matching the lifecycle in Agent vs Run session.

The snippet below shows an init function that loads an OAuth token from storage based on the user ID in context and returns MCP parameters for that session.

agent.ts
import { Agent } from "@hoangvvo/llm-agent";
import { mcpToolkit, type MCPParams } from "@hoangvvo/llm-agent/mcp";
// fetchMcpTokenForUser, resolveMcpEndpointForTenant, and model are supplied by
// your application and are not shown here.
interface SessionContext {
  tenantId: string;
  userId: string;
}
// Resolve the MCP transport parameters for one session from its context.
async function resolveMcpParams(context: SessionContext): Promise<MCPParams> {
  const token = await fetchMcpTokenForUser(context.userId);
  const url = await resolveMcpEndpointForTenant(context.tenantId);
  return {
    type: "streamable-http",
    url,
    authorization: token,
  };
}
const agent = new Agent<SessionContext>({
  name: "Transit Concierge",
  model,
  toolkits: [mcpToolkit(resolveMcpParams)],
});
  • At session start the toolkit calls list_tools and subscribes to tool_list_changed, so updates published by the server flow back automatically.
  • Each remote definition becomes an AgentTool, letting the agent runtime evaluate tool selection and error handling exactly the same way it does for local tools.
  • Tool responses are converted into SDK parts (text, image, audio) before being appended to the transcript.
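To make the last bullet more concrete, here is a minimal sketch of mapping MCP tool result content to transcript parts. The McpContent union covers only the text, image, and audio items from the MCP specification, and the Part shapes and the toParts function are assumptions made for this illustration; the SDK's actual part types and conversion code may differ.

```typescript
// Subset of the content items an MCP tool result may carry.
type McpContent =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string }
  | { type: "audio"; data: string; mimeType: string };

// Hypothetical part shapes for illustration; not the SDK's real definitions.
type Part =
  | { type: "text"; text: string }
  | { type: "image"; mimeType: string; data: string }
  | { type: "audio"; mimeType: string; data: string };

// Hypothetical converter: map each MCP content item to one part,
// preserving the order in which the server returned them.
function toParts(content: McpContent[]): Part[] {
  return content.map((item): Part => {
    switch (item.type) {
      case "text":
        return { type: "text", text: item.text };
      case "image":
        return { type: "image", mimeType: item.mimeType, data: item.data };
      case "audio":
        return { type: "audio", mimeType: item.mimeType, data: item.data };
    }
  });
}
```

A real converter would also need a policy for content kinds outside this subset, such as embedded resources.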

The full example below stands up a minimal shuttle-planning MCP server, registers it through the toolkit, and runs a single conversation turn against it.

examples/mcp.ts
import { Agent } from "@hoangvvo/llm-agent";
import { mcpToolkit } from "@hoangvvo/llm-agent/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { randomUUID } from "node:crypto";
import { once } from "node:events";
import { createServer, type IncomingMessage } from "node:http";
import { z } from "zod";
import { getModel } from "./get-model.ts";
// This example demonstrates:
// 1. Launching a minimal streamable HTTP MCP server using the official TypeScript SDK.
// 2. Registering that server through the MCP toolkit primitive.
// 3. Having the agent call the remote tool during a conversation.
const PORT = 39813;
const SERVER_URL = `http://127.0.0.1:${PORT}`;
const AUTH_TOKEN = "transit-hub-secret";
interface SessionContext {
  riderName: string;
  authorization: string;
}
async function main(): Promise<void> {
  const stopServer = await startStubMcpServer();
  try {
    const model = getModel("openai", "gpt-4o-mini");
    const agent = new Agent<SessionContext>({
      name: "Sage",
      model,
      instructions: [
        "You are Sage, the shuttle concierge for the Transit Hub.",
        "Lean on connected transit systems before guessing, and tailor advice to the rider's shift.",
        (context) =>
          `You are assisting ${context.riderName} with tonight's shuttle planning.`,
      ],
      // The MCP toolkit primitive resolves transport params per session. Here we pull the rider-specific
      // authorization token from context so each agent session connects with the correct credentials.
      toolkits: [
        mcpToolkit((context) => ({
          type: "streamable-http",
          url: SERVER_URL,
          authorization: context.authorization,
        })),
      ],
    });
    const session = await agent.createSession({
      riderName: "Avery",
      authorization: AUTH_TOKEN,
    });
    try {
      const turn = await session.run({
        input: [
          {
            type: "message",
            role: "user",
            content: [
              { type: "text", text: "What shuttles are running tonight?" },
            ],
          },
        ],
      });
      console.log("=== Agent Response ===");
      const replyText = turn.content
        .filter(
          (part): part is { type: "text"; text: string } =>
            part.type === "text",
        )
        .map((part) => part.text)
        .join("\n");
      console.log(replyText || JSON.stringify(turn.content, null, 2));
    } finally {
      await session.close();
    }
  } finally {
    await stopServer();
  }
}
await main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});
function createShuttleServer(): McpServer {
  const server = new McpServer({
    name: "shuttle-scheduler",
    version: "1.0.0",
  });
  server.registerTool(
    "list_shuttles",
    {
      description: "List active shuttle routes for the selected shift",
      inputSchema: {
        shift: z
          .enum(["evening", "overnight"])
          .describe(
            "Which operating window to query. OpenAI requires `additionalProperties: false` and every property listed in `required`, so this schema keeps a single required field.",
          ),
      },
    },
    async ({ shift }) => ({
      content: [
        {
          type: "text",
          text:
            shift === "overnight"
              ? "Harbor Express and Dawn Flyer are staged for the overnight shift."
              : "Midnight Loop and Harbor Express are on duty tonight.",
        },
      ],
    }),
  );
  return server;
}
function isAuthorized(req: IncomingMessage): boolean {
  const header = req.headers.authorization;
  return typeof header === "string" && header === `Bearer ${AUTH_TOKEN}`;
}
async function startStubMcpServer(): Promise<() => Promise<void>> {
  const server = createShuttleServer();
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: () => randomUUID(),
    enableJsonResponse: true,
  });
  await server.connect(transport);
  const httpServer = createServer((req, res) => {
    if (req.url === "/status" && req.method === "GET") {
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ status: "ok" }));
      return;
    }
    if (!isAuthorized(req)) {
      res.writeHead(401, { "Content-Type": "application/json" });
      res.end(
        JSON.stringify({
          error: "unauthorized",
          message: "Provide the shuttle access token.",
        }),
      );
      return;
    }
    if (req.url === "/" && req.method === "POST") {
      const chunks: Buffer[] = [];
      req.on("data", (chunk) => chunks.push(chunk as Buffer));
      req.on("end", async () => {
        const body = Buffer.concat(chunks);
        await transport.handleRequest(req, res, JSON.parse(body.toString()));
      });
      return;
    }
    res.writeHead(404);
    res.end();
  });
  httpServer.listen(PORT);
  await once(httpServer, "listening");
  return async () => {
    await new Promise<void>((resolve, reject) => {
      httpServer.close((err) => {
        if (err) reject(err);
        else resolve();
      });
    });
  };
}

When you create a run session manually, call its close method after the conversation so the MCP connection and any spawned processes shut down cleanly. The one-shot run and run_stream helpers already manage that lifecycle for you; see Agent vs Run session for a refresher.
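One way to make manual cleanup hard to forget is to wrap session use in a small helper that always closes the session. The sketch below assumes only that a session exposes an async close method, as in the example above; withSession and the Closable interface are hypothetical names introduced for this illustration and are not part of the library.

```typescript
// Minimal shape the helper relies on: anything with an async close().
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper (not part of the library): create a session, run the
// supplied work against it, and close the session whether the work succeeds
// or throws.
async function withSession<S extends Closable, R>(
  create: () => Promise<S>,
  work: (session: S) => Promise<R>,
): Promise<R> {
  const session = await create();
  try {
    return await work(session);
  } finally {
    // Runs on both success and failure so the MCP connection is released.
    await session.close();
  }
}
```

With the agent from the example, a call might look like withSession(() => agent.createSession(context), (session) => session.run({ input })), which keeps the try/finally in one place instead of repeating it at every call site.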