Running Kubernetes in production is more than just deploying pods and hoping for the best. After managing clusters across GCP, AWS, and on-premises environments, I've compiled the practices that consistently separate reliable, cost-efficient clusters from fragile, expensive ones.
1. Always Set Resource Requests and Limits
Without resource requests and limits, pods can starve each other and the scheduler cannot make intelligent placement decisions.
```yaml
resources:
  requests:
    memory: "128Mi"
    cpu: "250m"
  limits:
    memory: "256Mi"
    cpu: "500m"
```
Why it matters: Requests tell the scheduler how much capacity to reserve. Limits prevent a runaway process from consuming node resources and causing cascading failures.
2. Use Namespaces to Isolate Environments
Never run dev, staging, and production workloads in the same namespace. Use separate namespaces — or ideally separate clusters — with NetworkPolicy objects to enforce isolation.
```shell
kubectl create namespace production
kubectl create namespace staging
```
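As a starting point for isolation, a default-deny ingress policy blocks all incoming pod traffic in a namespace until you explicitly allow it (the name and namespace below are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}        # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress            # no ingress rules defined, so all ingress is denied
```

From there, add narrower NetworkPolicy objects that allow only the traffic each workload actually needs.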
Apply ResourceQuota per namespace to cap total resource consumption and prevent one team from starving another.
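A minimal ResourceQuota sketch, with illustrative values you should size to your own teams:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "10"       # total CPU that pods in this namespace may request
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"               # cap on pod count in the namespace
```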
3. Implement Liveness and Readiness Probes
Kubernetes cannot know whether your application is healthy without probes. Without them, traffic gets routed to broken pods and stuck deployments never fail fast.
```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```
4. Use Pod Disruption Budgets
During node upgrades or voluntary evictions, Kubernetes can terminate multiple pods simultaneously. A PodDisruptionBudget ensures a minimum number of replicas stay available.
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
```
5. Secure Your Cluster with RBAC
Principle of least privilege applies to Kubernetes too. Create ServiceAccounts with only the permissions each workload needs, and never use the default service account with broad permissions.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
```
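A Role does nothing on its own: it must be bound to a subject. A sketch of binding it to a dedicated ServiceAccount (the `api-sa` name and `production` namespace here are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
subjects:
  - kind: ServiceAccount
    name: api-sa             # a dedicated ServiceAccount, not "default"
    namespace: production
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Then set `serviceAccountName: api-sa` in the pod spec so the workload runs with exactly these permissions and nothing more.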
6. Store Secrets Properly
Never commit Kubernetes Secrets encoded in base64 to version control — base64 is not encryption. Use:
- Sealed Secrets (Bitnami) for GitOps workflows
- External Secrets Operator with AWS Secrets Manager or GCP Secret Manager
- HashiCorp Vault with the Vault Agent Injector for dynamic secrets
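With the External Secrets Operator, for instance, a manifest like the following syncs a value from your cloud secret manager into a regular Kubernetes Secret. This is a sketch assuming you have already installed the operator and configured a SecretStore; all names and keys below are illustrative:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h        # how often to re-sync from the backing store
  secretStoreRef:
    name: aws-secrets-manager  # a SecretStore you have configured separately
    kind: SecretStore
  target:
    name: db-credentials       # the Kubernetes Secret the operator creates
  data:
    - secretKey: password
      remoteRef:
        key: prod/db/password  # the entry in AWS Secrets Manager
```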
7. Configure Horizontal Pod Autoscaling
Let Kubernetes scale your workloads automatically based on CPU, memory, or custom metrics.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
8. Use Multi-Zone Node Pools
Single-zone node pools are a single point of failure. Spread nodes across availability zones and use topologySpreadConstraints to ensure pod distribution follows.
```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: api
```
Closing Thoughts
Kubernetes is powerful but unforgiving. The teams that operate it reliably aren't using secret tools — they're disciplined about the fundamentals: resource management, health checks, security boundaries, and automation. Start with these eight practices and you'll avoid the most common production incidents.
Have a Kubernetes challenge you're dealing with? Book a free consultation and let's talk through it.