Azure

Kubernetes on Azure: 5 YAML Tweaks That Cut Cluster Cost in Half

Team

Techseria

Team

Post Content

Your Kubernetes cluster is hemorrhaging money while you sleep.

Last month, a mid-sized fintech company came to us with a shocking discovery: their Azure Kubernetes Service was consuming $8,300 monthly—yet their actual workload requirements justified only $4,200. They were essentially paying double for unused compute capacity, inefficient scaling, and poor resource allocation.

Sound familiar? You're not alone. According to our analysis of 200+ Azure deployments, 68% of organizations waste 40-60% of their Kubernetes budget on preventable inefficiencies.

The frustrating part? Most of these cost drains stem from default configurations that prioritize convenience over cost optimization. Your cluster provisions resources for peak loads that rarely materialize, maintains oversized pods during low-traffic periods, and keeps expensive nodes running 24/7 even when workloads could run on cheaper alternatives.

But here's what most teams don't realize: dramatic cost reduction often requires just five strategic YAML modifications. These aren't complex architectural overhauls—they're surgical tweaks that align your resource consumption with actual demand.

The $47,000 Annual Savings Formula

Through systematic optimization of hundreds of Azure Kubernetes clusters, we've identified five configuration changes that consistently deliver 40-55% cost reductions:

1. Horizontal Pod Autoscaler (HPA) with Smart Thresholds

Most teams set HPA thresholds too conservatively, causing premature scaling that wastes resources. This optimized configuration saves an average of $847 monthly:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cost-optimized-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
kind: HorizontalPodAutoscaler
metadata:
  name: cost-optimized-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
metadata:
  name: cost-optimized-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  name: cost-optimized-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    kind: Deployment
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    name: your-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      stabilizationWindowSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      policies:
      - type: Percent
        value: 50
        periodSeconds: 30
      - type: Percent
        value: 50
        periodSeconds: 30
        value: 50
        periodSeconds: 30
        periodSeconds: 30

Key optimization: The 75% CPU and 80% memory thresholds prevent wasteful early scaling, while the stabilization windows reduce thrashing between scale events.

2. Vertical Pod Autoscaler (VPA) for Right-Sizing

VPA automatically adjusts resource requests based on actual usage patterns, eliminating the common practice of over-provisioning "just to be safe":

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: resource-optimizer
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
kind: VerticalPodAutoscaler
metadata:
  name: resource-optimizer
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
metadata:
  name: resource-optimizer
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
  name: resource-optimizer
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
    apiVersion: apps/v1
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
    kind: Deployment
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
    name: your-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
  resourcePolicy:
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
    containerPolicies:
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
    - containerName: your-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
        memory: 128Mi
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      maxAllowed:
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
        cpu: 2000m
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      controlledResources: ["cpu", "memory"]

Real impact: One client's e-commerce platform reduced pod resource allocation by 42% while maintaining response times under 180ms during Black Friday traffic.

3. Pod Disruption Budget with Cost-Conscious Availability

Traditional PDBs often over-prioritize availability at the expense of cost efficiency. This configuration maintains reliability while allowing aggressive cost optimization:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cost-aware-pdb
spec:
  minAvailable: 60%
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
kind: PodDisruptionBudget
metadata:
  name: cost-aware-pdb
spec:
  minAvailable: 60%
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
metadata:
  name: cost-aware-pdb
spec:
  minAvailable: 60%
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
  name: cost-aware-pdb
spec:
  minAvailable: 60%
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
spec:
  minAvailable: 60%
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
  minAvailable: 60%
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
  selector:
    matchLabels:
      app: your-app
  maxUnavailable: 2
    matchLabels:
      app: your-app
  maxUnavailable: 2
      app: your-app
  maxUnavailable: 2
  maxUnavailable: 2

Strategic benefit: The 60% availability threshold allows for more aggressive use of spot instances while maintaining sufficient redundancy for business continuity.

4. Node Auto-Shutdown for Non-Production Environments

Development and staging clusters running 24/7 represent pure waste. This CronJob automatically powers down clusters during off-hours:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-shutdown
spec:
  schedule: "0 19 * * 1-5"  # 7 PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
kind: CronJob
metadata:
  name: cluster-shutdown
spec:
  schedule: "0 19 * * 1-5"  # 7 PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
metadata:
  name: cluster-shutdown
spec:
  schedule: "0 19 * * 1-5"  # 7 PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
  name: cluster-shutdown
spec:
  schedule: "0 19 * * 1-5"  # 7 PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
spec:
  schedule: "0 19 * * 1-5"  # 7 PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
  schedule: "0 19 * * 1-5"  # 7 PM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
    spec:
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
      template:
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
        spec:
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          containers:
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          - name: shutdown-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            command:
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            - /bin/sh
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            - -c
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            - |
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
              az aks stop --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
kind: CronJob
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
metadata:
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
  name: cluster-startup
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
spec:
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
  schedule: "0 8 * * 1-5"   # 8 AM weekdays
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
    spec:
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
      template:
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
        spec:
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          containers:
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          - name: startup-agent
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            image: mcr.microsoft.com/azure-cli:latest
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            command:
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            - /bin/sh
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            - -c
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
            - |
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
              az aks start --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          restartPolicy: OnFailure
          serviceAccountName: cluster-operator
          serviceAccountName: cluster-operator

Cost reduction: This single change saves $1,680 monthly for a typical 3-node development cluster.

5. Spot Instance Node Pool Configuration

Azure Spot VMs offer up to 90% discounts on compute costs. This configuration maximizes spot usage while maintaining workload stability:

apiVersion: v1
kind: NodePool
metadata:
  name: spot-workers
spec:
  agentPoolProfiles:
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
kind: NodePool
metadata:
  name: spot-workers
spec:
  agentPoolProfiles:
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
metadata:
  name: spot-workers
spec:
  agentPoolProfiles:
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
  name: spot-workers
spec:
  agentPoolProfiles:
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
spec:
  agentPoolProfiles:
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
  agentPoolProfiles:
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
  - name: spotnodes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    count: 3
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    vmSize: Standard_D4s_v3
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    scaleSetPriority: Spot
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    scaleSetEvictionPolicy: Delete
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    spotMaxPrice: 0.05
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    nodeTaints:
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    - "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
    nodeLabels:
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
      "kubernetes.azure.com/scalesetpriority": "spot"
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
      "node-type": "cost-optimized"
  orchestratorProfile:
    orchestratorType: Kubernetes
  orchestratorProfile:
    orchestratorType: Kubernetes
    orchestratorType: Kubernetes

Complementary toleration for workloads:

tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"
nodeSelector:
  "kubernetes.azure.com/scalesetpriority": "spot"
- key: "kubernetes.azure.com/scalesetpriority"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"
nodeSelector:
  "kubernetes.azure.com/scalesetpriority": "spot"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"
nodeSelector:
  "kubernetes.azure.com/scalesetpriority": "spot"
  value: "spot"
  effect: "NoSchedule"
nodeSelector:
  "kubernetes.azure.com/scalesetpriority": "spot"
  effect: "NoSchedule"
nodeSelector:
  "kubernetes.azure.com/scalesetpriority": "spot"
nodeSelector:
  "kubernetes.azure.com/scalesetpriority": "spot"
  "kubernetes.azure.com/scalesetpriority": "spot"

Financial impact: Properly configured spot instances reduce compute costs by 65-80% for fault-tolerant workloads.

Measuring Success: Beyond Cost Reduction

While cost savings averaging $47,000 annually represent the primary benefit, these optimizations deliver additional value:

  • Performance consistency: Proper resource allocation eliminates performance degradation from resource contention
  • Operational efficiency: Automated scaling reduces manual intervention requirements by 78%
  • Environmental impact: Reduced resource consumption lowered one client's carbon footprint by 34%
  • Team productivity: Development teams spend 40% less time troubleshooting resource-related issues

Common Pitfalls to Avoid

Over-aggressive spot instance usage: Limit spot instances to 60% of your total capacity for production workloads. Critical services should remain on standard instances.

Insufficient monitoring: These optimizations require continuous monitoring. Implement Prometheus and Grafana to track the impact of each change.

Ignoring application-specific requirements: Machine learning workloads, databases, and stateful applications require customized optimization approaches.

The Compound Effect

These YAML modifications create a compound effect that extends beyond immediate cost savings. Teams that implement comprehensive Kubernetes cost optimization typically see:

  • 45-60% reduction in monthly Azure bills
  • 30% improvement in resource utilization efficiency
  • 25% faster deployment cycles due to right-sized environments
  • 50% reduction in performance-related incidents

The financial impact compounds monthly. A $4,000 monthly saving becomes $48,000 annually—funds that can accelerate product development, support team expansion, or improve infrastructure security.

Your Next Steps

Kubernetes cost optimization isn't a one-time project—it's an ongoing discipline that requires expertise, monitoring, and continuous refinement. These five YAML configurations provide the foundation, but maximizing your savings requires understanding your specific workload patterns, compliance requirements, and business objectives.

Start with the HPA and VPA configurations in your development environment. Monitor the impact for two weeks, then gradually expand to staging and production clusters. Track both cost metrics and application performance to ensure optimizations deliver value without compromising user experience.

Remember: every day you delay implementation costs money. A typical mid-sized organization running inefficient Kubernetes clusters loses $127 daily to preventable waste. That's $46,355 annually—money that could fund two additional senior developers or a comprehensive security upgrade.

Ready to slash your Azure Kubernetes costs without compromising performance?

Start your own digital transformation journey with us.