Name: Agentfield
Rating: 5 (1 review)

Agentfield provides unified methods for generating images, video, and audio, and for transcribing speech. Each method routes to the correct provider automatically, based on the prefix of the model name.

Setup

Set your API keys:

# For DALL-E and OpenAI TTS
export OPENAI_API_KEY="sk-..."

# For Fal.ai (Flux images, video, Whisper)
export FAL_KEY="..."
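
Before making any calls, it can help to confirm that the keys are actually visible to the process. The following is a minimal sketch using only the standard library. `missing_keys` and `REQUIRED_KEYS` are hypothetical names, not part of Agentfield, and the descriptions simply mirror the comments above.

```python
import os

# Environment variables named in the setup above, with what each one is for.
REQUIRED_KEYS = {
    "OPENAI_API_KEY": "DALL-E and OpenAI TTS",
    "FAL_KEY": "Fal.ai (Flux images, video, Whisper)",
}


def missing_keys(env=None):
    """Return the names of the listed keys that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    for name in missing_keys():
        print(f"{name} is not set (needed for {REQUIRED_KEYS[name]})")
```

If you only use one provider, a missing key for the other is presumably harmless; the check is just a quick way to catch a typo in a variable name.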

Or configure in code:

from agentfield import Agent, AIConfig

app = Agent(
    node_id="media-agent",
    ai_config=AIConfig(
        fal_api_key="...",  # Optional - falls back to FAL_KEY env var
        video_model="fal-ai/minimax-video/image-to-video"  # Default video model
    )
)
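
The fallback mentioned in the comment above can be pictured as a simple precedence rule: an explicitly configured key wins, otherwise the environment variable is used. The sketch below only illustrates that rule. `resolve_fal_key` is a hypothetical helper, and Agentfield's actual implementation may differ.

```python
import os


def resolve_fal_key(explicit=None, env=None):
    """Prefer an explicitly passed key; otherwise fall back to FAL_KEY."""
    env = os.environ if env is None else env
    # An empty or missing explicit value falls through to the environment.
    return explicit or env.get("FAL_KEY")


if __name__ == "__main__":
    key = resolve_fal_key()
    print("FAL key found" if key else "no FAL key configured")
```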

Generate Images

# Fal.ai - Flux (fast, high quality)
result = await app.ai_with_vision(
    "A cyberpunk city at night",
    model="fal-ai/flux/schnell"  # Fast
)
result.images[0].save("city.png")

# DALL-E 3 (via LiteLLM)
result = await app.ai_with_vision(
    "A serene mountain landscape",
    model="dall-e-3",
    size="1792x1024",
    quality="hd"
)

# OpenRouter
result = await app.ai_with_vision(
    "Abstract art",
    model="openrouter/google/gemini-2.5-flash-image-preview"
)
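
The snippets in this guide use `await` at the top level, which works in a notebook or inside an async function but not in a plain script. One common way to run them from a script is to wrap the calls in a coroutine and hand it to `asyncio.run`. The sketch below uses a stand-in class (`StandInApp`, hypothetical) instead of a real `Agent` so that it runs without Agentfield or API keys; only the call shape is taken from the examples above.

```python
import asyncio


class StandInApp:
    """Placeholder for an Agentfield Agent; it only echoes its arguments."""

    async def ai_with_vision(self, prompt, **options):
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return {"prompt": prompt, **options}


async def main(app):
    # Same call shape as the image examples above.
    return await app.ai_with_vision(
        "A cyberpunk city at night",
        model="fal-ai/flux/schnell",
    )


result = asyncio.run(main(StandInApp()))
print(result["model"])
```

With a real agent, you would pass your `app` to `main` instead of the stand-in. If your code already runs inside one of the agent's own async handlers, no wrapper should be needed.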

Generate Audio (TTS)

# OpenAI TTS
result = await app.ai_with_audio(
    "Hello, welcome to the presentation.",
    voice="nova",  # alloy, echo, fable, onyx, nova, shimmer
    model="tts-1-hd"
)
result.audio.save("greeting.mp3")
result.audio.play()  # Requires pygame
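
Since playback requires pygame, you may want to check for it before calling `play()`. This is a small sketch using the standard library's `importlib.util.find_spec`; `can_play_audio` is a hypothetical helper name, not an Agentfield function.

```python
import importlib.util


def can_play_audio():
    """Return True if the pygame package is importable in this environment."""
    return importlib.util.find_spec("pygame") is not None


if can_play_audio():
    print("pygame found; playback should be available")
else:
    print("pygame not installed; save the file and open it with another player")
```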

Generate Video

# Image-to-video (default model)
result = await app.ai_generate_video(
    "Camera slowly zooms in on the landscape",
    image_url="https://example.com/image.jpg"
)
result.files[0].save("video.mp4")

# Text-to-video
result = await app.ai_generate_video(
    "A cat playing with yarn",
    model="fal-ai/kling-video/v1/standard"
)

Transcribe Audio (STT)

# Basic transcription
result = await app.ai_transcribe_audio(
    "https://example.com/recording.mp3"
)
print(result.text)

# Faster transcription with language hint
result = await app.ai_transcribe_audio(
    "https://example.com/spanish.mp3",
    model="fal-ai/wizper",  # 2x faster than whisper
    language="es"
)

Provider Routing

Methods automatically route to providers based on model prefix:

Model Prefix	Provider	Methods
`fal-ai/`	Fal.ai	Image, Video, Audio, Transcription
`openrouter/`	OpenRouter	Image
(default)	LiteLLM	Image, Audio (TTS)
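
The table above amounts to a prefix check on the model name. The function below is an illustrative sketch of that rule, not Agentfield's internal router. `provider_for` is a hypothetical name, and the returned labels are just the provider names from the table.

```python
def provider_for(model):
    """Map a model name to a provider label using the prefixes in the table."""
    if model.startswith("fal-ai/"):
        return "Fal.ai"
    if model.startswith("openrouter/"):
        return "OpenRouter"
    return "LiteLLM"  # no recognized prefix: the default row


for name in (
    "fal-ai/flux/schnell",
    "openrouter/google/gemini-2.5-flash-image-preview",
    "dall-e-3",
    "tts-1-hd",
):
    print(name, "->", provider_for(name))
```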

Related

app.ai() - Full API reference
AIConfig - Configuration options
