Deployment Overview
Understanding Agentfield's deployment architecture and patterns: how to deploy from localhost to enterprise scale
Traditional agent deployments require Redis for state, Kafka for async jobs, load balancers for routing, and service meshes for observability.
Agentfield deployments need PostgreSQL. That's it.
The architecture eliminates infrastructure complexity while enabling enterprise-scale patterns: horizontal scaling, independent agent deployment, and complete observability—all built into the control plane.
The Architecture Advantage
Stateless Control Plane
Written in Go. Shares nothing but Postgres. Scale from 1 to N instances behind a load balancer with zero coordination overhead. Each instance handles requests independently.
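Because instances share nothing but Postgres, the load balancer needs no session affinity. As a sketch, fronting several instances can be as simple as an nginx upstream; the hostnames and TLS details below are placeholders, not Agentfield defaults:

```nginx
# Illustrative only: three stateless control-plane instances behind nginx.
# Any instance can serve any request, so plain round-robin is sufficient.
upstream agentfield_cp {
    server cp-1.internal:8080;
    server cp-2.internal:8080;
    server cp-3.internal:8080;
}

server {
    listen 443 ssl;
    # ssl_certificate / ssl_certificate_key omitted for brevity
    location / {
        proxy_pass http://agentfield_cp;
    }
}
```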
Independent Agent Scaling
Each agent node is a separate service. Marketing team deploys their agents. Support team deploys theirs. No monolithic coordination. Control plane discovers and routes automatically.
Routing Through Control Plane
Agents never call each other directly. All app.call() requests route through the control plane. This enables: automatic load balancing, complete DAG visibility, circuit breaking, context propagation.
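To make the routing concrete, here is a small illustrative helper (not part of the SDK) that maps a call target like "support.classify" onto the control plane's reasoner endpoint, following the gateway URL shape used later in this guide:

```python
# Hypothetical helper for illustration: shows how a cross-agent call target
# resolves to a control-plane route instead of a direct agent-to-agent URL.

def reasoner_url(control_plane: str, target: str) -> str:
    """Build the gateway URL for a call target like '<node>.<function>'."""
    node, _, fn = target.partition(".")
    if not node or not fn:
        raise ValueError(f"target must look like '<node>.<function>', got {target!r}")
    return f"{control_plane.rstrip('/')}/api/v1/reasoners/{node}.{fn}"

print(reasoner_url("http://localhost:8080", "support.classify"))
# → http://localhost:8080/api/v1/reasoners/support.classify
```

Every such call lands on the control plane first, which is what makes the load balancing, DAG capture, and circuit breaking possible.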
Database-Backed Everything
Async execution queue: Postgres. Distributed locks: Postgres. Memory fabric: Postgres. Webhook state: Postgres. No Redis, Kafka, or RabbitMQ. One dependency.
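A common way to build such a queue on plain Postgres is `SELECT ... FOR UPDATE SKIP LOCKED`, which lets many workers claim jobs concurrently without a broker. The sketch below illustrates the pattern with a hypothetical table; it is not Agentfield's actual schema:

```sql
-- Illustrative only: claim one pending job without blocking other workers.
-- Table and column names are hypothetical, not Agentfield's real schema.
BEGIN;
SELECT id, payload
FROM job_queue
WHERE status = 'pending'
ORDER BY enqueued_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ...mark the claimed row as running, process it, then COMMIT...
COMMIT;
```

Rows locked by one worker are simply skipped by the others, which is why a single Postgres instance can stand in for a dedicated message broker at most scales.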
Deployment Patterns
Choose based on your scale and team structure:
Local Development
When: Solo prototyping, rapid iteration
Database: BoltDB (embedded, zero-config)
Scale: Single machine
```shell
# Start control plane with embedded database
af server

# Run your agent in another terminal
cd my-agent
af run
```

What you get:
- Zero infrastructure setup
- Hot reload for agent code
- Local web UI at http://localhost:8080
- BoltDB auto-initializes in ~/.agentfield/

Limitations: Single writer (BoltDB constraint), no horizontal scaling
Docker Compose
When: Team development, CI/CD testing
Database: PostgreSQL (containerized)
Scale: Multiple agents, single control plane
```yaml
# docker-compose.yml (available on GitHub)
services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: agentfield
      POSTGRES_USER: agentfield
      POSTGRES_PASSWORD: change-in-production

  control-plane:
    image: agentfield/control-plane:latest
    ports: ["8080:8080"]
    environment:
      AGENTFIELD_POSTGRES_URL: postgres://agentfield:change-in-production@postgres:5432/agentfield
    depends_on: [postgres]

  agent-support:
    build: ./agents/support
    environment:
      AGENTFIELD_SERVER: http://control-plane:8080

  agent-analytics:
    build: ./agents/analytics
    environment:
      AGENTFIELD_SERVER: http://control-plane:8080
```

What you get:
- Isolated environments per team
- Multiple agent nodes in one stack
- PostgreSQL persistence
- Reproducible deployments
Official compose file: github.com/Agent-Field/agentfield/deployments/docker/docker-compose.yml
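If the control plane races the database on startup, Compose can gate it on a healthy Postgres. The fragment below is a sketch to merge into the services above; `pg_isready` ships in the official postgres image:

```yaml
# Illustrative hardening: start the control plane only once Postgres is ready.
services:
  postgres:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U agentfield"]
      interval: 5s
      timeout: 3s
      retries: 10

  control-plane:
    depends_on:
      postgres:
        condition: service_healthy
```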
Managed Platforms
When: Production without Kubernetes complexity
Database: Managed Postgres (add-on)
Scale: Horizontal, auto-scaling
Supported platforms:
- Railway (auto-detected via RAILWAY_ENVIRONMENT)
- Render (auto-detected via RENDER_SERVICE_NAME)
- Heroku (auto-detected via DYNO)
- Fly.io (manual config)
```shell
# Railway example
railway init
railway add postgresql  # Managed Postgres add-on
railway up              # Deploys control plane

# Deploy agent (separate service)
cd agents/support
railway up
```

What you get:
- Zero ops: platform handles scaling, SSL, monitoring
- Persistent agent processes (not serverless cold starts)
- Cost-effective for most workloads
- Auto-detected by Agentfield SDK
Why not default to Lambda/Cloud Functions? Standard agents maintain HTTP servers and heartbeat connections, so managed platforms (Railway, Render) run these persistent processes at lower cost than Lambda with provisioned concurrency. For bursty workloads, see the serverless pattern below.
Serverless (Lambda, Cloud Functions, Cloud Run)
When: Bursty workloads, pay-per-use agents, zero ops runtime
Database: Managed Postgres for the control plane (persistent state still lives there)
Scale: Auto-scales with function provider; cold starts apply
```shell
# 1) Build a handler with the SDK + an adapter for your platform's event shape
#    Python: handler = lambda event, ctx: app.handle_serverless(event, adapter)
#    TS:     export const handler = agent.handler((event) => normalize(event))
#    Go:     Handler() for HTTP, or HandleServerlessEvent(ctx, event, adapter) for raw payloads

# 2) Deploy the function (Lambda/Cloud Run/Cloud Functions)

# 3) Register the function URL with the control plane
af nodes register-serverless --url https://<function-url>

# 4) Invoke normally through the gateway
curl -X POST "$CP/api/v1/reasoners/<node_id>.<reasoner>" -d '{"input":{...}}'
```

What you get:
- No heartbeats or always-on servers; control plane calls /discover then /execute on demand
- Full workflow/run IDs propagated into serverless executions; adapters let you align any event blob
- Cross-agent calls still route through the control plane (app.call("other-node.fn")) even in serverless
- Great for spiky or low-duty-cycle agents; consider managed (non-serverless) for steady traffic
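As a sketch of what an adapter might do, the function below normalizes a hypothetical API-Gateway-style event into an input payload plus workflow ID. The event fields and the output shape here are assumptions for illustration, not the SDK's actual contract:

```python
import json

# Illustrative adapter: align a raw serverless event blob with the payload
# the agent expects. Field names ("body", "X-Workflow-Id") are assumptions.

def normalize(event: dict) -> dict:
    """Extract the invocation input and tracing ID from a raw event."""
    body = event.get("body") or "{}"
    if isinstance(body, str):
        body = json.loads(body)
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    return {
        "input": body.get("input", {}),
        "workflow_id": headers.get("x-workflow-id"),
    }
```

Whatever the provider's event envelope looks like, the adapter's only job is to recover the `input` object and the propagated workflow/run identifiers before handing off to the reasoner.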
Kubernetes
When: Enterprise scale, multi-region, compliance
Database: Managed Postgres (AWS RDS, Cloud SQL, Azure) or operator
Scale: Unlimited horizontal, HPA-ready
```yaml
# Control plane: stateless deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agentfield-control-plane
spec:
  replicas: 3  # Scale horizontally
  selector:
    matchLabels:
      app: agentfield-cp
  template:
    metadata:
      labels:
        app: agentfield-cp  # Must match the selector above
    spec:
      containers:
        - name: control-plane
          image: agentfield/control-plane:latest
          env:
            - name: AGENTFIELD_POSTGRES_URL
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: url
---
# Agent: independent deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: support-agent
spec:
  replicas: 5  # Scale based on agent load
  selector:
    matchLabels:
      app: support-agent
  template:
    metadata:
      labels:
        app: support-agent
    spec:
      containers:
        - name: agent
          image: your-org/support-agent:latest
          env:
            - name: AGENTFIELD_SERVER
              value: "http://agentfield-cp-service:8080"
```

What you get:
- Control plane scales independently of agents
- Each agent type scales based on its load
- Standard K8s patterns (HPA, ingress, service mesh)
- Multi-region with shared Postgres
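The HPA pattern mentioned above can look like this for the support-agent deployment. The thresholds are illustrative examples, not recommended values; tune them to your workload:

```yaml
# Illustrative HPA: scale support-agent replicas on CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: support-agent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: support-agent
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Because agents register with the control plane rather than with each other, scaling one agent type never requires reconfiguring the others.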
Database Setup
Agentfield uses your database as the single source of truth—all state, workflow executions, and agent registry data live there.
Local Development: BoltDB
```shell
af server  # Auto-creates ~/.agentfield/agentfield.bolt
```

Zero-config embedded database for solo development. Limitation: single process only.
Production: PostgreSQL
```shell
export AGENTFIELD_DATABASE_URL="postgres://user:pass@host:5432/agentfield"
af server  # Auto-runs migrations
```

Required for team development and production. Enables horizontal scaling and multi-writer support.
Managed Postgres providers:
- AWS RDS, Google Cloud SQL, Azure Database
- Railway, Render, Heroku (platform add-ons)
For advanced database tuning (connection pools, indexes, read replicas), see Database Performance Guide.
Environment Variables for Production
Control Plane
Essential:
```shell
# Storage
AGENTFIELD_STORAGE_MODE=postgres
AGENTFIELD_DATABASE_URL="postgres://user:pass@db.example.com:5432/agentfield?sslmode=require"

# Server
AGENTFIELD_PORT=8080
AGENTFIELD_MODE=cloud

# Performance
AGENTFIELD_STORAGE_POSTGRES_MAX_CONNECTIONS=100
AGENTFIELD_STORAGE_POSTGRES_MAX_IDLE_CONNECTIONS=25

# Security
AGENTFIELD_API_CORS_ALLOWED_ORIGINS=https://app.company.com,https://admin.company.com
```

Optional (with sensible defaults):

```shell
AGENTFIELD_UI_ENABLED=true
AGENTFIELD_STORAGE_POSTGRES_CONNECTION_TIMEOUT=60s
AGENTFIELD_STORAGE_POSTGRES_QUERY_TIMEOUT=60s
```

Python Agents
Essential:
```shell
# Control plane connectivity
AGENTFIELD_SERVER=https://agentfield.company.com
AGENT_CALLBACK_URL=http://agent-service:8000  # Critical for containers
PORT=8000

# AI providers
OPENAI_API_KEY=sk-proj-...
# OR
ANTHROPIC_API_KEY=sk-ant-...
```

Performance tuning:
```shell
# Multi-core scaling
UVICORN_WORKERS=4  # Set to (2 × CPU cores) + 1

# Async execution
AGENTFIELD_ASYNC_MAX_CONCURRENT_EXECUTIONS=2048
AGENTFIELD_ASYNC_CONNECTION_POOL_SIZE=128
AGENTFIELD_ASYNC_BATCH_SIZE=200

# Logging (set to INFO or WARNING in production)
AGENTFIELD_LOG_LEVEL=INFO
AGENTFIELD_LOG_TRUNCATE=500
```

Go Agents
Essential:
```shell
# AI integration
OPENAI_API_KEY=sk-proj-...
# OR for OpenRouter (multi-provider access)
OPENROUTER_API_KEY=sk-or-v1-...

# Model selection
AI_MODEL=gpt-4o  # or anthropic/claude-3-5-sonnet for OpenRouter
```

See Environment Variables Reference for the complete list.
Production Checklist
Before deploying to production, verify:
Infrastructure
- PostgreSQL 15+ provisioned (managed or self-hosted)
- Database connection pooling configured (min 25 connections)
- Postgres backups enabled (point-in-time recovery)
- Load balancer in front of control plane (if running multiple instances)
Security
- TLS enabled for control plane endpoints
- Database credentials in secrets (not environment variables)
- Agent callback URLs use HTTPS (not HTTP)
- DID/VC feature enabled for audit trails (features.did_vc.enabled: true)
- Webhook HMAC secrets configured for callback authentication
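For the webhook HMAC item, verification on the receiving side typically looks like the following Python sketch. The SHA-256 hex-digest scheme is an assumption for illustration; check your webhook configuration for the actual header name and signature format:

```python
import hashlib
import hmac

# Illustrative HMAC check for webhook callbacks. The signature scheme
# (SHA-256, hex digest) is an assumption, not Agentfield's documented format.

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time check that the callback body matches its HMAC signature."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

Always compare with `hmac.compare_digest` rather than `==` so signature checks do not leak timing information.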
Configuration
- AGENTFIELD_POSTGRES_URL set on control plane
- AGENTFIELD_SERVER set on all agent nodes
- AGENT_CALLBACK_URL explicitly set (don't rely on auto-detection in production)
- Queue worker count tuned for workload (execution_queue.workers)
- Execution retention configured (cleanup.retention_hours)
Observability
- Metrics endpoint exposed (/api/v1/metrics)
- Health check configured on load balancer (/health)
- Agent heartbeat interval tuned (agent_registry.heartbeat_interval)
- Workflow DAG visualization accessible (control plane UI)
- Log aggregation configured (stdout/stderr to ELK, Datadog, etc.)
Testing
- Load test async execution queue (verify backpressure)
- Test control plane failover (kill instance, verify requests route to others)
- Test agent scaling (verify load balancing across replicas)
- Verify webhook delivery and retries
- Test database failover (if using HA Postgres)
Related Documentation
- Docker Deployment - Container deployment patterns
- Kubernetes Deployment - Enterprise Kubernetes deployment
- Managed Platforms - Railway, Render, Heroku, Fly.io
- Database Performance Tuning - Connection pools, indexes, optimization
- Production Hardening - Security, resources, best practices
- Advanced Deployment Patterns - Multi-region, team isolation, hybrid cloud
- Monitoring - Production observability and metrics
- Environment Variables Reference - Complete variable list
Next Steps
Start Local
Begin with local development using BoltDB. Zero infrastructure setup.
Deploy with Docker
Team development with Docker Compose. PostgreSQL included.
Scale with Kubernetes
Enterprise deployment patterns with horizontal scaling.
Managed Platforms
Production without Kubernetes. Railway, Render, Heroku.
The architecture eliminates complexity. One database. Stateless services. Independent scaling. Deploy with confidence.