Image understanding

The image modality is represented by the ImagePart interface.

types.ts
interface ImagePart {
  type: "image";
  /**
   * The MIME type of the image. E.g. "image/jpeg", "image/png".
   */
  mime_type: string;
  /**
   * The base64-encoded image data.
   */
  image_data: string;
  /**
   * The width of the image in pixels.
   */
  width?: number;
  /**
   * The height of the image in pixels.
   */
  height?: number;
  /**
   * ID of the image part, if applicable.
   */
  id?: string;
}
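
As a minimal sketch of how the required fields fit together, the helper below wraps raw bytes in an ImagePart. The name toImagePart is hypothetical and not part of the library, the interface is repeated only so the snippet is self-contained, and the use of Buffer assumes a Node.js runtime.

```typescript
import { Buffer } from "node:buffer";

// Mirrors the ImagePart interface from types.ts so the sketch is self-contained.
interface ImagePart {
  type: "image";
  mime_type: string;
  image_data: string;
  width?: number;
  height?: number;
  id?: string;
}

// Hypothetical helper (not part of the library): wraps raw bytes in an ImagePart.
function toImagePart(bytes: Uint8Array, mimeType: string): ImagePart {
  if (!mimeType.startsWith("image/")) {
    throw new Error(`Expected an image MIME type, got "${mimeType}"`);
  }
  return {
    type: "image",
    mime_type: mimeType,
    // image_data must be base64 text, not raw bytes.
    image_data: Buffer.from(bytes).toString("base64"),
  };
}

// The first four bytes of the PNG signature, used here only as sample input.
const part = toImagePart(new Uint8Array([0x89, 0x50, 0x4e, 0x47]), "image/png");
console.log(part.image_data); // prints "iVBORw=="
```

The optional width, height, and id fields are left unset here; only type, mime_type, and image_data are required by the interface.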

To send an image to the model, include an ImagePart in the content array of a message, alongside any text parts.

describe-image.ts
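import { getModel } from "./get-model.ts";

// Download the image and read its raw bytes.
const imageUrl = "https://images.unsplash.com/photo-1464809142576-df63ca4ed7f0";
const imageRes = await fetch(imageUrl);
if (!imageRes.ok) {
  throw new Error(`Failed to fetch image: HTTP ${imageRes.status}`);
}
const image = await imageRes.arrayBuffer();

const model = getModel("openai", "gpt-4o");

// Send the text prompt and the image together in a single user message.
const response = await model.generate({
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Describe this image",
        },
        {
          type: "image",
          // ImagePart expects base64-encoded data.
          image_data: Buffer.from(image).toString("base64"),
          // Fall back to JPEG if the server sends no Content-Type header.
          mime_type: imageRes.headers.get("content-type") ?? "image/jpeg",
        },
      ],
    },
  ],
});

console.dir(response, { depth: null });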
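
The example above passes the Content-Type header straight through as mime_type. Some servers append parameters to that header or return a non-image type for error pages, so a slightly more defensive variant might normalize the value first. The sketch below assumes a helper named mimeTypeFromHeader, which is hypothetical and not part of the library.

```typescript
// Hypothetical helper (not part of the library): derives a mime_type value
// from a Content-Type header, falling back when the header is missing or
// does not describe an image.
function mimeTypeFromHeader(
  header: string | null,
  fallback = "image/jpeg",
): string {
  // Drop any parameters after ";" and normalize case and whitespace.
  const mime = header?.split(";")[0].trim().toLowerCase() ?? "";
  return mime.startsWith("image/") ? mime : fallback;
}

console.log(mimeTypeFromHeader("image/png; charset=binary")); // prints "image/png"
console.log(mimeTypeFromHeader(null)); // prints "image/jpeg"
console.log(mimeTypeFromHeader("text/html")); // prints "image/jpeg"
```

In the request, this would replace the inline fallback with mimeTypeFromHeader(imageRes.headers.get("content-type")). Whether silently falling back is preferable to throwing on a non-image response depends on the application.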