Kubernetes Deployment
Enterprise-grade orchestration for distributed agent systems
Kubernetes Deployment
Deploy stateless control plane and independent agent nodes at enterprise scale
Agentfield's architecture is Kubernetes-native: Stateless control plane services scale horizontally. Agent nodes deploy independently. All state lives in PostgreSQL. No complex service mesh required—the control plane handles routing.
This guide provides production-ready Kubernetes manifests for deploying Agentfield at enterprise scale.
Architecture on Kubernetes
┌─────────────────────────────────────────────────────┐
│ Ingress Controller (nginx, traefik) │
│ TLS termination + routing │
└────────────────┬────────────────────────────────────┘
│
┌───────┴───────┐
▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Control │ │ Control │ │ Control │
│ Plane Pod #1 │ │ Plane Pod #2 │ │ Plane Pod #3 │
│ (Stateless) │ │ (Stateless) │ │ (Stateless) │
└──────────────┘ └──────────────┘ └──────────────┘
│ │ │
└───────┬───────┴───────┬───────┘
▼ ▼
┌──────────────┐ ┌──────────────┐
│ PostgreSQL │ │ Agent Pods │
│ (StatefulSet │ │ (Deployments)│
│ or managed) │ │ • support │
└──────────────┘ │ • analytics │
│ • email │
└──────────────┘Key principles:
- Control plane: Deployment (stateless, horizontally scalable)
- Agents: Separate Deployments (scale independently)
- Database: StatefulSet or managed service (AWS RDS, Cloud SQL, Azure Database)
- No service mesh needed: Control plane routes all agent traffic
Prerequisites
Kubernetes Cluster
Minimum requirements:
- Kubernetes 1.24+
- 3+ worker nodes (for HA control plane)
- Ingress controller (nginx, traefik, or cloud-native)
Tested on:
- Google Kubernetes Engine (GKE)
- Amazon Elastic Kubernetes Service (EKS)
- Azure Kubernetes Service (AKS)
- Self-managed clusters (kubeadm, k3s, Rancher)
PostgreSQL Database
Options:
-
Managed service (recommended for production):
- AWS RDS PostgreSQL 15+
- Google Cloud SQL for PostgreSQL
- Azure Database for PostgreSQL
-
Postgres Operator (self-managed):
- Zalando Postgres Operator
- CrunchyData Postgres Operator
- CloudNativePG
-
StatefulSet (development/testing):
- Included in manifests below
kubectl and Cluster Access
# Verify access
kubectl cluster-info
kubectl get nodesQuick Start: Complete Stack
Deploy Agentfield with PostgreSQL StatefulSet:
Create Namespace
kubectl create namespace agentfield
kubectl config set-context --current --namespace=agentfieldCreate Secrets
# Database credentials
kubectl create secret generic postgres-secret \
--from-literal=username=agentfield \
--from-literal=password=change-in-production \
--from-literal=database=agentfield
# Agent API keys
kubectl create secret generic agent-secrets \
--from-literal=openai-api-key=sk-your-key-hereApply Manifests
Save and apply the manifests from sections below, or use our example repository:
# Clone example manifests
git clone https://github.com/Agent-Field/agentfield
cd agentfield/deployments/kubernetes
# Apply in order
kubectl apply -f postgres-statefulset.yaml
kubectl apply -f control-plane-deployment.yaml
kubectl apply -f control-plane-service.yaml
kubectl apply -f agent-support-deployment.yaml
kubectl apply -f ingress.yamlVerify Deployment
# Check all pods are running
kubectl get pods
# Check control plane logs
kubectl logs -l app=agentfield-control-plane
# Check agent logs
kubectl logs -l app=support-agentPostgreSQL StatefulSet
For development/testing only. Production should use managed databases (AWS RDS, Cloud SQL, Azure Database).
# Get complete manifests from GitHub
git clone https://github.com/Agent-Field/agentfield
kubectl apply -f agentfield/deployments/kubernetes/postgres-statefulset.yamlKey configuration:
- Image:
postgres:15-alpine - Resources: 256Mi-1Gi memory, 250m-1000m CPU
- Volume: 10Gi persistent storage
- Health checks:
pg_isreadyfor liveness/readiness
Control Plane Deployment
Stateless, horizontally scalable deployment:
# Get complete manifests from GitHub
kubectl apply -f k8s-manifests/control-plane-deployment.yaml
kubectl apply -f k8s-manifests/control-plane-service.yamlCritical configuration:
- Image:
agentfield/control-plane:latest - Replicas: 3 (scale horizontally)
- Resources: 256Mi-1Gi memory, 250m-1000m CPU
- Environment:
AGENTFIELD_POSTGRES_URLfrom secret - Health:
/healthendpoint for liveness/readiness
Scaling:
# Manual
kubectl scale deployment agentfield-control-plane --replicas=5
# Auto-scale (HPA)
kubectl autoscale deployment agentfield-control-plane --cpu-percent=70 --min=3 --max=10Agent Deployment
Each agent type gets its own Deployment, scaled independently:
# Get agent manifests from GitHub
kubectl apply -f k8s-manifests/agent-support-deployment.yaml
kubectl apply -f k8s-manifests/agent-support-service.yamlCritical configuration:
- Environment:
AGENTFIELD_SERVER=http://agentfield-control-plane:8080 - Environment:
AGENT_CALLBACK_URL=http://support-agent:8001(service name) - API keys from secrets (
OPENAI_API_KEY, etc.) - Resources: 512Mi-2Gi memory, 500m-2000m CPU
Deploy multiple agent types:
kubectl apply -f k8s-manifests/agent-support.yaml # High traffic → 5 replicas
kubectl apply -f k8s-manifests/agent-analytics.yaml # CPU-intensive → 3 replicas
kubectl apply -f k8s-manifests/agent-email.yaml # Low traffic → 1 replicaEach agent type scales independently based on its workload.
Ingress Configuration
Expose control plane publicly with TLS:
kubectl apply -f k8s-manifests/ingress.yamlKey configuration:
- Host:
agentfield.yourdomain.com - TLS: cert-manager with Let's Encrypt
- Backend:
agentfield-control-plane:8080 - Ingress class: nginx (or traefik, etc.)
Setup cert-manager for SSL:
# Install cert-manager (one-time setup)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
# Configure Let's Encrypt issuer (see GitHub manifests)
kubectl apply -f k8s-manifests/cert-issuer.yamlHealth Checks
All manifests include health checks via the /health endpoint:
- Liveness: Restarts unhealthy pods (every 10-30s)
- Readiness: Removes unready pods from load balancer (every 5-10s)
Both control plane and agents expose /health with connection status.
Monitoring
Control plane exposes Prometheus metrics at /metrics:
- Execution counts and latency
- Queue depth
- Active agents
- HTTP traffic
For complete monitoring setup (Prometheus, Grafana, logging), see the Monitoring Guide.
Troubleshooting
Pod Crashes on Startup
Symptom: CrashLoopBackOff
Check logs:
kubectl logs -l app=agentfield-control-plane --tail=100Common causes:
- Database connection failed (check
AGENTFIELD_POSTGRES_URL) - Secrets not found (verify
kubectl get secrets) - Resource limits too low (increase memory/CPU)
Agents Can't Connect to Control Plane
Symptom: Agent logs show "connection refused"
Verify service:
kubectl get svc agentfield-control-plane
# Test from another pod
kubectl run curl --image=curlimages/curl -it --rm -- curl http://agentfield-control-plane:8080/healthSolution:
- Ensure control plane service exists
- Verify agent
AGENTFIELD_SERVERenvironment variable - Check network policies (if using Calico, Cilium, etc.)
Database Connection Pool Exhausted
Symptom: "too many connections" errors
Check current connections:
kubectl exec -it postgres-0 -- psql -U agentfield -c "SELECT count(*) FROM pg_stat_activity;"Solutions:
- Reduce control plane replicas
- Configure
AGENTFIELD_POSTGRES_MAX_CONNSenvironment variable - Use PgBouncer as connection pooler
Production Best Practices
For complete production hardening (RBAC, network policies, security contexts, resource limits), see the Production Hardening Guide.
Quick checklist:
- Use managed PostgreSQL (not StatefulSet)
- Enable HPA for auto-scaling
- Configure resource limits
- Use TLS for all external traffic
- Store secrets in Kubernetes Secrets (not env vars)
- Enable pod security policies
- Set up monitoring and alerting
Related Documentation
- Deployment Overview - Architecture and deployment patterns
- Production Hardening - Security, RBAC, resource limits
- Advanced Patterns - Multi-region, blue-green, namespace isolation
- Monitoring - Prometheus, Grafana, logging setup
- Managed Platforms - Railway, Render (simpler alternative)
- Complete Manifests on GitHub - All YAML files
Kubernetes deploys Agentfield at enterprise scale. Stateless control plane. Independent agents. Standard patterns. Production-ready.