ctx.ai()

LLM interface with structured output, streaming, and automatic rate limiting

The ctx.ai() method provides a unified interface to LLMs with support for structured output via Zod schemas, streaming responses, and automatic rate limit handling with exponential backoff.

Basic Usage

agent.reasoner('analyze', async (ctx) => {
  // Simple text generation
  const response = await ctx.ai('Explain quantum computing in simple terms');
  return { explanation: response };
});

With System Prompt

agent.reasoner('translate', async (ctx) => {
  const { text, targetLanguage } = ctx.input;

  const translation = await ctx.ai(
    `Translate to ${targetLanguage}: ${text}`,
    { system: 'You are a professional translator.' }
  );

  return { translation };
});

Structured Output with Zod

Get type-safe structured responses using Zod schemas:

import { z } from 'zod';

const SentimentSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  keywords: z.array(z.string()),
  reasoning: z.string()
});

agent.reasoner('analyze_sentiment', async (ctx) => {
  const result = await ctx.ai(
    `Analyze the sentiment: ${ctx.input.text}`,
    { schema: SentimentSchema }
  );

  // result is fully typed as:
  // { sentiment: 'positive'|'negative'|'neutral', confidence: number, keywords: string[], reasoning: string }
  return result;
});

When using a schema, the response is automatically parsed and validated. Invalid responses trigger automatic retries.

Options

Prop           Type
system         string
schema         ZodSchema
model          string
temperature    number

Tool Calling

Let LLMs automatically discover and invoke agent capabilities. Use ctx.aiWithTools() to enter a tool-call loop where the LLM discovers available tools, decides which to call, and receives results until it produces a final answer.
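Conceptually, the loop alternates between model turns and tool executions until the model emits a final answer. A simplified sketch, with illustrative names rather than the SDK's internals:

```typescript
// Sketch of a tool-call loop: on each turn the model either requests a
// tool call (which is executed and fed back) or returns final text.
type ToolCall = { toolName: string; arguments: Record<string, unknown> };
type ModelTurn = { toolCall?: ToolCall; text?: string };

async function toolCallLoop(
  model: (history: string[]) => Promise<ModelTurn>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<unknown>>,
  prompt: string,
  maxTurns = 5
): Promise<{ text: string; totalToolCalls: number }> {
  const history = [prompt];
  let totalToolCalls = 0;
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(history);
    if (reply.text !== undefined) return { text: reply.text, totalToolCalls };
    if (reply.toolCall) {
      const tool = tools[reply.toolCall.toolName];
      const result = tool
        ? await tool(reply.toolCall.arguments)
        : { error: `unknown tool ${reply.toolCall.toolName}` };
      totalToolCalls++;
      history.push(JSON.stringify({ toolResult: result })); // feed result back
    }
  }
  throw new Error('maxTurns exceeded without a final answer');
}
```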

Simple Auto-Discovery

agent.reasoner('ask_with_tools', async (ctx) => {
  const { text, trace } = await ctx.aiWithTools(ctx.input.question, {
    tools: 'discover',
    system: 'You are a helpful assistant. Use the available tools to answer accurately.',
  });

  console.log(`Tool calls: ${trace.totalToolCalls}, Turns: ${trace.totalTurns}`);
  for (const call of trace.calls) {
    console.log(`  ${call.toolName}(${JSON.stringify(call.arguments)}) => ${call.latencyMs.toFixed(0)}ms`);
  }

  return { answer: text };
});

Filtered Discovery

import type { ToolCallConfig } from '@agentfield/sdk';

// Filter by tags
agent.reasoner('weather_report', async (ctx) => {
  const { text } = await ctx.aiWithTools(
    `What's the weather in: ${ctx.input.cities}?`,
    {
      tools: { tags: ['weather'] } satisfies ToolCallConfig,
      system: 'You are a weather reporter.',
    }
  );
  return { report: text };
});

With Guardrails

agent.reasoner('guarded', async (ctx) => {
  const { text, trace } = await ctx.aiWithTools(ctx.input.question, {
    tools: 'discover',
    system: 'You are a helpful assistant. Be efficient with tool usage.',
    maxTurns: 3,
    maxToolCalls: 5,
  });
  return { answer: text, trace };
});

Progressive Discovery (Lazy Hydration)

For large capability catalogs, lazy hydration sends only tool names and descriptions first. When the LLM selects tools, their full schemas are hydrated on demand.

agent.reasoner('smart_query', async (ctx) => {
  const { text } = await ctx.aiWithTools(ctx.input.question, {
    tools: {
      schemaHydration: 'lazy',
      maxCandidateTools: 30,
      maxHydratedTools: 8,
    } satisfies ToolCallConfig,
    system: 'You are a helpful assistant with access to tools.',
  });
  return { answer: text };
});
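Conceptually, lazy hydration is a two-phase selection over the tool catalog. The sketch below is illustrative — `candidateList` and `hydrate` are not SDK functions:

```typescript
// Phase 1: send only names/descriptions for up to maxCandidateTools.
// Phase 2: hydrate full schemas for the tools the model selects,
// capped at maxHydratedTools.
type ToolSummary = { name: string; description: string };
type ToolDef = ToolSummary & { schema: object };

function candidateList(catalog: ToolDef[], maxCandidateTools: number): ToolSummary[] {
  return catalog
    .slice(0, maxCandidateTools)
    .map(({ name, description }) => ({ name, description })); // schemas withheld
}

function hydrate(catalog: ToolDef[], selected: string[], maxHydratedTools: number): ToolDef[] {
  const picked = new Set(selected.slice(0, maxHydratedTools));
  return catalog.filter((t) => picked.has(t.name)); // full defs, schema included
}
```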

ToolCallConfig Options

Prop               Type
tags               string[]
schemaHydration    'lazy'
maxCandidateTools  number
maxHydratedTools   number

Response Shape

ctx.aiWithTools() returns { text: string; trace: ToolCallTrace }:

Prop    Type
text    string
trace   ToolCallTrace

Each ToolCallRecord includes toolName, arguments, result, error, latencyMs, and turn for full per-call observability.

Streaming

Use ctx.aiStream() for real-time streaming responses:

agent.reasoner('generate_story', async (ctx) => {
  const { prompt } = ctx.input;

  const stream = await ctx.aiStream(
    `Write a short story about: ${prompt}`,
    { system: 'You are a creative storyteller.' }
  );

  let fullResponse = '';
  for await (const chunk of stream) {
    fullResponse += chunk;
    // Each chunk is a string fragment
  }

  return { story: fullResponse };
});

Streaming to HTTP Response

agent.reasoner('stream_response', async (ctx) => {
  const stream = await ctx.aiStream(ctx.input.prompt);

  ctx.res.setHeader('Content-Type', 'text/event-stream');
  ctx.res.setHeader('Cache-Control', 'no-cache');

  for await (const chunk of stream) {
    ctx.res.write(`data: ${JSON.stringify({ chunk })}\n\n`);
  }

  ctx.res.end();
  return null; // Response already sent
});
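Each chunk above is framed as a Server-Sent Events message: a `data:` line terminated by a blank line. A minimal framing helper (illustrative, not part of the SDK):

```typescript
// Frame a JSON payload as a single SSE message. Per the SSE format,
// each message is a "data:" line followed by an empty line.
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}
```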

Rate Limiting

The SDK includes automatic rate limit handling with exponential backoff:

const agent = new Agent({
  nodeId: 'my-agent',
  aiConfig: {
    model: 'gpt-4o',
    enableRateLimitRetry: true,      // Enable automatic retries
    rateLimitMaxRetries: 20,          // Maximum retry attempts
    rateLimitBaseDelay: 1.0,          // Initial delay (seconds)
    rateLimitMaxDelay: 300.0,         // Maximum delay (seconds)
    rateLimitJitterFactor: 0.25,      // ±25% jitter
    rateLimitCircuitBreakerThreshold: 10,  // Open circuit after 10 failures
    rateLimitCircuitBreakerTimeout: 300    // Reset after 5 minutes
  }
});

The rate limiter automatically detects 429 responses and Retry-After headers, applying exponential backoff with jitter to prevent thundering herd problems.
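The delay for a given attempt follows the usual shape: exponential growth capped at the maximum, plus random jitter. A sketch of that computation using the config names above (the SDK's exact formula may differ; `rand` is injectable here only to make the sketch testable):

```typescript
// Retry delay for a 0-based attempt number: exponential backoff capped
// at maxDelay, with +/- jitterFactor proportional random jitter.
function backoffDelay(
  attempt: number,
  baseDelay: number,
  maxDelay: number,
  jitterFactor: number,
  rand: () => number = Math.random
): number {
  const exp = Math.min(maxDelay, baseDelay * 2 ** attempt);
  const jitter = exp * jitterFactor * (2 * rand() - 1); // uniform in [-f, +f] of exp
  return Math.max(0, exp + jitter);
}
```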

Direct AIClient Access

Access the AIClient directly for advanced usage:

agent.reasoner('advanced', async (ctx) => {
  const aiClient = ctx.aiClient;

  // Generate with full control
  const response = await aiClient.generate('Hello', {
    model: 'gpt-4o-mini',
    temperature: 0.5
  });

  return response;
});

AIClient Methods

Method                       Returns
generate(prompt, options?)   Promise<string>
embed(text)                  Promise<number[]>
embedMany(texts)             Promise<number[][]>

Embeddings

Generate embeddings for semantic search:

agent.reasoner('semantic_search', async (ctx) => {
  const { query, documents } = ctx.input;

  // Embed query
  const queryEmbedding = await ctx.aiClient.embed(query);

  // Embed documents
  const docEmbeddings = await ctx.aiClient.embedMany(documents);

  // Find most similar (cosine similarity)
  const similarities = docEmbeddings.map((emb, i) => ({
    index: i,
    score: cosineSimilarity(queryEmbedding, emb)
  }));

  similarities.sort((a, b) => b.score - a.score);

  return {
    query,
    topMatches: similarities.slice(0, 5).map(s => ({
      document: documents[s.index],
      score: s.score
    }))
  };
});
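The example above calls a `cosineSimilarity` helper that is not shown; a minimal implementation you could supply:

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|), with 0 returned for zero-magnitude inputs.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}
```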

Examples

Multi-step Analysis

import { z } from 'zod';

const ExtractSchema = z.object({
  entities: z.array(z.object({
    name: z.string(),
    type: z.string()
  })),
  topics: z.array(z.string())
});

const SummarySchema = z.object({
  summary: z.string(),
  keyPoints: z.array(z.string())
});

agent.reasoner('deep_analyze', async (ctx) => {
  const { document } = ctx.input;

  // Step 1: Extract entities
  const extracted = await ctx.ai(
    `Extract entities and topics from:\n${document}`,
    { schema: ExtractSchema }
  );

  // Step 2: Generate summary
  const summary = await ctx.ai(
    `Summarize focusing on: ${extracted.topics.join(', ')}\n\nDocument:\n${document}`,
    { schema: SummarySchema }
  );

  return {
    ...extracted,
    ...summary
  };
});

Different Models for Different Tasks

agent.reasoner('smart_routing', async (ctx) => {
  const { task, content } = ctx.input;

  // Simple tasks use cheaper model
  if (task === 'classify') {
    return await ctx.ai(
      `Classify: ${content}`,
      { model: 'gpt-4o-mini', temperature: 0 }
    );
  }

  // Complex tasks use powerful model
  return await ctx.ai(
    `Analyze deeply: ${content}`,
    { model: 'gpt-4o', temperature: 0.7 }
  );
});