Generate Media
Generate images, video, and audio with AI
Agentfield provides unified methods for generating images, video, and audio, and for transcribing speech. Each method routes to the correct provider automatically, based on the model prefix.
Setup
Set your API keys:
# For DALL-E and OpenAI TTS
export OPENAI_API_KEY="sk-..."
# For Fal.ai (Flux images, video, Whisper)
export FAL_KEY="..."

Or configure in code:
from agentfield import Agent, AIConfig
app = Agent(
    node_id="media-agent",
    ai_config=AIConfig(
        fal_api_key="...",  # Optional - falls back to FAL_KEY env var
        video_model="fal-ai/minimax-video/image-to-video",  # Default video model
    ),
)

Generate Images
# Fal.ai - Flux (fast, high quality)
result = await app.ai_with_vision(
    "A cyberpunk city at night",
    model="fal-ai/flux/schnell",  # Fast
)
result.images[0].save("city.png")
# DALL-E 3 (via LiteLLM)
result = await app.ai_with_vision(
    "A serene mountain landscape",
    model="dall-e-3",
    size="1792x1024",
    quality="hd",
)
# OpenRouter
result = await app.ai_with_vision(
    "Abstract art",
    model="openrouter/google/gemini-2.5-flash-image-preview",
)

Generate Audio (TTS)
# OpenAI TTS
result = await app.ai_with_audio(
    "Hello, welcome to the presentation.",
    voice="nova",  # alloy, echo, fable, onyx, nova, shimmer
    model="tts-1-hd",
)
result.audio.save("greeting.mp3")
result.audio.play()  # Requires pygame

Generate Video
# Image-to-video (default model)
result = await app.ai_generate_video(
    "Camera slowly zooms in on the landscape",
    image_url="https://example.com/image.jpg",
)
result.files[0].save("video.mp4")
# Text-to-video
result = await app.ai_generate_video(
    "A cat playing with yarn",
    model="fal-ai/kling-video/v1/standard",
)

Transcribe Audio (STT)
# Basic transcription
result = await app.ai_transcribe_audio(
    "https://example.com/recording.mp3"
)
print(result.text)
# Faster transcription with language hint
result = await app.ai_transcribe_audio(
    "https://example.com/spanish.mp3",
    model="fal-ai/wizper",  # 2x faster than whisper
    language="es",
)

Provider Routing
Methods automatically route to providers based on model prefix:
| Model Prefix | Provider | Methods |
|---|---|---|
| `fal-ai/` | Fal.ai | Image, Video, Audio, Transcription |
| `openrouter/` | OpenRouter | Image |
| (default) | LiteLLM | Image, Audio (TTS) |
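
To make the routing rule concrete, here is a minimal standalone sketch of prefix-based dispatch that mirrors the table above. It is not Agentfield's actual implementation: `Route`, `PREFIX_ROUTES`, `DEFAULT_ROUTE`, and `resolve_provider` are hypothetical names introduced only for illustration, and the real library may resolve providers differently.

```python
from typing import NamedTuple


class Route(NamedTuple):
    """Hypothetical record: a provider name and the media types it handles."""
    provider: str
    capabilities: frozenset


# Assumed mapping, copied from the routing table above (illustrative only).
PREFIX_ROUTES = {
    "fal-ai/": Route("Fal.ai", frozenset({"image", "video", "audio", "transcription"})),
    "openrouter/": Route("OpenRouter", frozenset({"image"})),
}

# Any model name without a known prefix falls through to LiteLLM.
DEFAULT_ROUTE = Route("LiteLLM", frozenset({"image", "audio"}))


def resolve_provider(model: str) -> Route:
    """Return the route whose prefix matches the model name, else the default."""
    for prefix, route in PREFIX_ROUTES.items():
        if model.startswith(prefix):
            return route
    return DEFAULT_ROUTE


if __name__ == "__main__":
    for name in ("fal-ai/flux/schnell", "openrouter/google/gemini-2.5-flash-image-preview", "dall-e-3"):
        print(name, "->", resolve_provider(name).provider)
```

Under these assumptions, a model such as `dall-e-3` or `tts-1-hd` has no recognized prefix and so resolves to the default LiteLLM route, while anything beginning with `fal-ai/` goes to Fal.ai.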