Kubernetes Deployment

Enterprise-grade orchestration for distributed agent systems

Deploy stateless control plane and independent agent nodes at enterprise scale

Agentfield's architecture is Kubernetes-native: stateless control plane services scale horizontally, agent nodes deploy independently, and all state lives in PostgreSQL. No service mesh is required, because the control plane handles routing.

This guide provides production-ready Kubernetes manifests for deploying Agentfield at enterprise scale.


Architecture on Kubernetes

┌─────────────────────────────────────────────────────┐
│ Ingress Controller (nginx, traefik)                │
│ TLS termination + routing                          │
└────────────────┬────────────────────────────────────┘

         ┌───────┴───────┐
         ▼               ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Control      │  │ Control      │  │ Control      │
│ Plane Pod #1 │  │ Plane Pod #2 │  │ Plane Pod #3 │
│ (Stateless)  │  │ (Stateless)  │  │ (Stateless)  │
└──────────────┘  └──────────────┘  └──────────────┘
         │               │               │
         └───────┬───────┴───────┬───────┘
                 ▼               ▼
         ┌──────────────┐  ┌──────────────┐
         │ PostgreSQL   │  │ Agent Pods   │
         │ (StatefulSet │  │ (Deployments)│
         │  or managed) │  │ • support    │
         └──────────────┘  │ • analytics  │
                           │ • email      │
                           └──────────────┘

Key principles:

  • Control plane: Deployment (stateless, horizontally scalable)
  • Agents: Separate Deployments (scale independently)
  • Database: StatefulSet or managed service (AWS RDS, Cloud SQL, Azure Database)
  • No service mesh needed: Control plane routes all agent traffic

Prerequisites

Kubernetes Cluster

Minimum requirements:

  • Kubernetes 1.24+
  • 3+ worker nodes (for HA control plane)
  • Ingress controller (nginx, traefik, or cloud-native)

Tested on:

  • Google Kubernetes Engine (GKE)
  • Amazon Elastic Kubernetes Service (EKS)
  • Azure Kubernetes Service (AKS)
  • Self-managed clusters (kubeadm, k3s, Rancher)

PostgreSQL Database

Options:

  1. Managed service (recommended for production):

    • AWS RDS PostgreSQL 15+
    • Google Cloud SQL for PostgreSQL
    • Azure Database for PostgreSQL
  2. Postgres Operator (self-managed):

    • Zalando Postgres Operator
    • CrunchyData Postgres Operator
    • CloudNativePG
  3. StatefulSet (development/testing):

    • Included in manifests below

kubectl and Cluster Access

# Verify access
kubectl cluster-info
kubectl get nodes

Quick Start: Complete Stack

Deploy Agentfield with PostgreSQL StatefulSet:

Create Namespace

kubectl create namespace agentfield
kubectl config set-context --current --namespace=agentfield

Create Secrets

# Database credentials
kubectl create secret generic postgres-secret \
  --from-literal=username=agentfield \
  --from-literal=password=change-in-production \
  --from-literal=database=agentfield

# Agent API keys
kubectl create secret generic agent-secrets \
  --from-literal=openai-api-key=sk-your-key-here
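
If you prefer declarative manifests over imperative commands, the same two secrets can be sketched as YAML. The secret names and keys mirror the commands above; the values are placeholders, and a file like this should stay out of version control.

```yaml
# Declarative equivalent of the two `kubectl create secret` commands.
# Replace the placeholder values before applying.
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
  namespace: agentfield
type: Opaque
stringData:              # plain-text input; the API server stores it base64-encoded
  username: agentfield
  password: change-in-production
  database: agentfield
---
apiVersion: v1
kind: Secret
metadata:
  name: agent-secrets
  namespace: agentfield
type: Opaque
stringData:
  openai-api-key: sk-your-key-here
```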

Apply Manifests

Save and apply the manifests from sections below, or use our example repository:

# Clone example manifests
git clone https://github.com/Agent-Field/agentfield
cd agentfield/deployments/kubernetes

# Apply in order
kubectl apply -f postgres-statefulset.yaml
kubectl apply -f control-plane-deployment.yaml
kubectl apply -f control-plane-service.yaml
kubectl apply -f agent-support-deployment.yaml
kubectl apply -f ingress.yaml

Verify Deployment

# Check all pods are running
kubectl get pods

# Check control plane logs
kubectl logs -l app=agentfield-control-plane

# Check agent logs
kubectl logs -l app=support-agent

PostgreSQL StatefulSet

For development/testing only. Production should use managed databases (AWS RDS, Cloud SQL, Azure Database).

# Get complete manifests from GitHub
git clone https://github.com/Agent-Field/agentfield
kubectl apply -f agentfield/deployments/kubernetes/postgres-statefulset.yaml

Key configuration:

  • Image: postgres:15-alpine
  • Resources: 256Mi-1Gi memory, 250m-1000m CPU
  • Volume: 10Gi persistent storage
  • Health checks: pg_isready for liveness/readiness
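
As a rough illustration of how the settings above fit together, here is a minimal StatefulSet sketch. It is not the repository manifest: the `app: postgres` label, the headless Service, the `PGDATA` subdirectory, the probe timings, and the reading of "256Mi-1Gi" as request and limit are all assumptions.

```yaml
# Sketch only; names, labels, and probe timings are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: agentfield
spec:
  clusterIP: None          # headless Service gives the pod a stable DNS name
  selector:
    app: postgres
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres           # the single pod is named postgres-0
  namespace: agentfield
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef: {name: postgres-secret, key: username}
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef: {name: postgres-secret, key: password}
            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef: {name: postgres-secret, key: database}
            - name: PGDATA   # keep data in a subdirectory of the mounted volume
              value: /var/lib/postgresql/data/pgdata
          resources:
            requests: {memory: 256Mi, cpu: 250m}
            limits: {memory: 1Gi, cpu: 1000m}
          readinessProbe:
            exec:
              command: ["pg_isready", "-U", "agentfield"]
            periodSeconds: 10
          livenessProbe:
            exec:
              command: ["pg_isready", "-U", "agentfield"]
            initialDelaySeconds: 30
            periodSeconds: 20
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```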

Control Plane Deployment

Stateless, horizontally scalable deployment:

# Apply the control plane manifests (paths assume a local k8s-manifests/ directory)
kubectl apply -f k8s-manifests/control-plane-deployment.yaml
kubectl apply -f k8s-manifests/control-plane-service.yaml

Critical configuration:

  • Image: agentfield/control-plane:latest (pin a specific version tag in production)
  • Replicas: 3 (scale horizontally)
  • Resources: 256Mi-1Gi memory, 250m-1000m CPU
  • Environment: AGENTFIELD_POSTGRES_URL from secret
  • Health: /health endpoint for liveness/readiness
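
The configuration above might translate into a Deployment and Service roughly like the following sketch. Several details are assumptions rather than facts from the Agentfield manifests: the connection URL format, the `postgres` host name and port 5432, the probe timings, and the reading of the resource ranges as request and limit. The URL is assembled with Kubernetes `$(VAR)` expansion from the `postgres-secret` keys; a password containing URL-reserved characters would need to be percent-encoded first.

```yaml
# Sketch only; URL format, database host, and probe timings are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agentfield-control-plane
  namespace: agentfield
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agentfield-control-plane
  template:
    metadata:
      labels:
        app: agentfield-control-plane
    spec:
      containers:
        - name: control-plane
          image: agentfield/control-plane:latest
          ports:
            - containerPort: 8080
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef: {name: postgres-secret, key: username}
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef: {name: postgres-secret, key: password}
            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef: {name: postgres-secret, key: database}
            # $(VAR) expands variables defined earlier in this list.
            - name: AGENTFIELD_POSTGRES_URL
              value: "postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@postgres:5432/$(POSTGRES_DB)"
          resources:
            requests: {memory: 256Mi, cpu: 250m}
            limits: {memory: 1Gi, cpu: 1000m}
          readinessProbe:
            httpGet: {path: /health, port: 8080}
            periodSeconds: 5
          livenessProbe:
            httpGet: {path: /health, port: 8080}
            initialDelaySeconds: 15
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: agentfield-control-plane
  namespace: agentfield
spec:
  selector:
    app: agentfield-control-plane
  ports:
    - port: 8080
      targetPort: 8080
```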

Scaling:

# Manual
kubectl scale deployment agentfield-control-plane --replicas=5

# Auto-scale (HPA)
kubectl autoscale deployment agentfield-control-plane --cpu-percent=70 --min=3 --max=10
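
The `kubectl autoscale` command can also be expressed as a manifest, which is easier to keep in version control. The sketch below uses the same target, bounds, and CPU threshold. CPU-based autoscaling needs a metrics source such as metrics-server in the cluster and CPU requests set on the target containers.

```yaml
# Declarative form of the autoscale command: 3 to 10 replicas at 70% average CPU.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: agentfield-control-plane
  namespace: agentfield
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: agentfield-control-plane
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of each pod's CPU request
```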

Agent Deployment

Each agent type gets its own Deployment, scaled independently:

# Apply the agent manifests (paths assume a local k8s-manifests/ directory)
kubectl apply -f k8s-manifests/agent-support-deployment.yaml
kubectl apply -f k8s-manifests/agent-support-service.yaml

Critical configuration:

  • Environment: AGENTFIELD_SERVER=http://agentfield-control-plane:8080
  • Environment: AGENT_CALLBACK_URL=http://support-agent:8001 (service name)
  • API keys from secrets (OPENAI_API_KEY, etc.)
  • Resources: 512Mi-2Gi memory, 500m-2000m CPU
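
A minimal sketch of an agent Deployment and its Service, wired with the environment variables listed above. The image name is a placeholder for your own agent image, and the `app: support-agent` label, the replica count, and the reading of the resource ranges as request and limit are assumptions.

```yaml
# Sketch only; the image is a hypothetical placeholder.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: support-agent
  namespace: agentfield
spec:
  replicas: 5
  selector:
    matchLabels:
      app: support-agent
  template:
    metadata:
      labels:
        app: support-agent
    spec:
      containers:
        - name: agent
          image: your-registry/support-agent:1.0.0   # hypothetical image name
          ports:
            - containerPort: 8001
          env:
            - name: AGENTFIELD_SERVER
              value: http://agentfield-control-plane:8080
            - name: AGENT_CALLBACK_URL       # must resolve to the Service below
              value: http://support-agent:8001
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef: {name: agent-secrets, key: openai-api-key}
          resources:
            requests: {memory: 512Mi, cpu: 500m}
            limits: {memory: 2Gi, cpu: 2000m}
---
apiVersion: v1
kind: Service
metadata:
  name: support-agent      # matches the host in AGENT_CALLBACK_URL
  namespace: agentfield
spec:
  selector:
    app: support-agent
  ports:
    - port: 8001
      targetPort: 8001
```

The Service name matters here: the control plane reaches the agent through the callback URL, so the host in `AGENT_CALLBACK_URL` has to match the Service rather than an individual pod.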

Deploy multiple agent types:

kubectl apply -f k8s-manifests/agent-support.yaml    # High traffic → 5 replicas
kubectl apply -f k8s-manifests/agent-analytics.yaml  # CPU-intensive → 3 replicas
kubectl apply -f k8s-manifests/agent-email.yaml      # Low traffic → 1 replica

Each agent type scales independently based on its workload.


Ingress Configuration

Expose the control plane publicly with TLS:

kubectl apply -f k8s-manifests/ingress.yaml

Key configuration:

  • Host: agentfield.yourdomain.com
  • TLS: cert-manager with Let's Encrypt
  • Backend: agentfield-control-plane:8080
  • Ingress class: nginx (or traefik, etc.)
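
An Ingress matching the configuration above could look like the sketch below. The issuer name `letsencrypt-prod` and the TLS secret name `agentfield-tls` are hypothetical; the annotation must reference a cert-manager ClusterIssuer that exists in your cluster.

```yaml
# Sketch only; issuer and TLS secret names are assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: agentfield
  namespace: agentfield
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # hypothetical issuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - agentfield.yourdomain.com
      secretName: agentfield-tls     # cert-manager writes the certificate here
  rules:
    - host: agentfield.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: agentfield-control-plane
                port:
                  number: 8080
```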

Set up cert-manager for TLS certificates:

# Install cert-manager (one-time setup)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml

# Configure Let's Encrypt issuer (see GitHub manifests)
kubectl apply -f k8s-manifests/cert-issuer.yaml
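
For orientation, a Let's Encrypt issuer typically looks something like the sketch below. This is not the repository's `cert-issuer.yaml`: the issuer name, contact email, and account key secret name are placeholders, and the HTTP-01 solver assumes the nginx ingress class.

```yaml
# Sketch only; names and email are placeholders.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod             # hypothetical name, referenced from Ingress annotations
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com      # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
```

While testing, pointing `server` at the Let's Encrypt staging endpoint (`https://acme-staging-v02.api.letsencrypt.org/directory`) avoids hitting production rate limits.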

Health Checks

All manifests include health checks via the /health endpoint:

  • Liveness: Restarts containers that fail the check (probed every 10-30s)
  • Readiness: Removes unready pods from Service endpoints so they receive no traffic (probed every 5-10s)

Both control plane and agents expose /health with connection status.
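
As an illustration, the probe settings for an agent container listening on port 8001 might be written as follows. The specific periods are picked from the ranges above, and the delay and failure thresholds are assumptions to tune per workload.

```yaml
# Fragment of a container spec (under spec.template.spec.containers[]).
livenessProbe:
  httpGet:
    path: /health
    port: 8001
  initialDelaySeconds: 15   # give the process time to start before the first check
  periodSeconds: 20
  failureThreshold: 3       # restart after three consecutive failures
readinessProbe:
  httpGet:
    path: /health
    port: 8001
  periodSeconds: 5
  failureThreshold: 2       # stop routing traffic after two consecutive failures
```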


Monitoring

The control plane exposes Prometheus metrics at /metrics:

  • Execution counts and latency
  • Queue depth
  • Active agents
  • HTTP traffic
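
If Prometheus runs with a plain configuration file rather than an operator, a scrape job along these lines would pick up the control plane pods. This is a sketch: the job name is arbitrary, and it assumes the `agentfield` namespace and a Service named `agentfield-control-plane`.

```yaml
# Fragment of prometheus.yml; assumes Prometheus has RBAC access to list endpoints.
scrape_configs:
  - job_name: agentfield-control-plane
    metrics_path: /metrics
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
            - agentfield
    relabel_configs:
      # Keep only endpoints that belong to the control plane Service.
      - source_labels: [__meta_kubernetes_service_name]
        action: keep
        regex: agentfield-control-plane
```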

For complete monitoring setup (Prometheus, Grafana, logging), see the Monitoring Guide.


Troubleshooting

Pod Crashes on Startup

Symptom: CrashLoopBackOff

Check logs:

kubectl logs -l app=agentfield-control-plane --tail=100

Common causes:

  • Database connection failed (check AGENTFIELD_POSTGRES_URL)
  • Secrets not found (verify kubectl get secrets)
  • Resource limits too low (increase memory/CPU)

Agents Can't Connect to Control Plane

Symptom: Agent logs show "connection refused"

Verify service:

kubectl get svc agentfield-control-plane

# Test from another pod
kubectl run curl --image=curlimages/curl -it --rm -- curl http://agentfield-control-plane:8080/health

Solution:

  • Ensure control plane service exists
  • Verify agent AGENTFIELD_SERVER environment variable
  • Check network policies (if using Calico, Cilium, etc.)
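
If a default-deny policy is in place, an allow rule similar to this sketch would let agents and the ingress controller reach the control plane on port 8080. The `app` label and the `ingress-nginx` namespace name are assumptions; adjust them to your cluster.

```yaml
# Sketch only; pod label and ingress controller namespace are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-control-plane-ingress
  namespace: agentfield
spec:
  podSelector:
    matchLabels:
      app: agentfield-control-plane
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}            # any pod in the agentfield namespace (agents)
        - namespaceSelector:         # pods in the ingress controller's namespace
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
```

Because the control plane also calls agents back through `AGENT_CALLBACK_URL`, a locked-down cluster needs a matching rule that admits control plane traffic to the agent pods on their own port.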

Database Connection Pool Exhausted

Symptom: "too many connections" errors

Check current connections:

kubectl exec -it postgres-0 -- psql -U agentfield -c "SELECT count(*) FROM pg_stat_activity;"

Solutions:

  1. Reduce control plane replicas
  2. Configure AGENTFIELD_POSTGRES_MAX_CONNS environment variable
  3. Use PgBouncer as connection pooler
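
To size the connection limit, multiply the per-pod cap by the maximum replica count and keep the product below the database's `max_connections` (100 by default in PostgreSQL). The fragment below assumes `AGENTFIELD_POSTGRES_MAX_CONNS` is a per-pod pool cap, which the source does not confirm, and uses the HPA maximum of 10 replicas.

```yaml
# Fragment of the control plane container spec.
env:
  - name: AGENTFIELD_POSTGRES_MAX_CONNS
    value: "8"   # 10 replicas x 8 = 80 connections, leaving headroom under the default limit of 100
```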

Production Best Practices

For complete production hardening (RBAC, network policies, security contexts, resource limits), see the Production Hardening Guide.

Quick checklist:

  • Use managed PostgreSQL (not StatefulSet)
  • Enable HPA for auto-scaling
  • Configure resource limits
  • Use TLS for all external traffic
  • Store credentials in Kubernetes Secrets (not as plain-text values in manifests or images)
  • Enforce Pod Security Standards via Pod Security Admission (PodSecurityPolicy was removed in Kubernetes 1.25)
  • Set up monitoring and alerting


On Kubernetes, Agentfield runs at enterprise scale as a stateless control plane with independently scaled agents, built on standard Deployments, Services, and Ingress resources.