Cluster Autoscaler

The Cluster Autoscaler is a tool that automatically adjusts the size of a Kubernetes cluster so that all pods have a place to run and no unneeded nodes remain.

When pods are unschedulable because there are not enough resources, the Cluster Autoscaler scales up the cluster. When nodes are underutilized, the Cluster Autoscaler scales the cluster down.

Cluster API supports the Cluster Autoscaler. See the Cluster Autoscaler on Cluster API for more information.

Getting started with the Cluster Autoscaler on Kamaji

Kamaji supports the Cluster Autoscaler through Cluster API. There are several ways to run the Cluster Autoscaler with Cluster API. In this guide, we leverage the unique features of Kamaji to run the Cluster Autoscaler as part of the Hosted Control Plane.

In other words, the Cluster Autoscaler runs as a pod in the Kamaji Management Cluster, alongside the Tenant Control Plane pods, and connects directly to the API server of the workload cluster. This approach hides sensitive data from the tenant. It works by mounting the kubeconfig of the tenant cluster into the Cluster Autoscaler pod.
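As a rough sketch of what this looks like at the pod level, the kubeconfig Secret that Cluster API creates for the workload cluster is mounted as a volume and passed to the autoscaler via its --kubeconfig flag. The container name, mount path, and flag values below are illustrative; the Helm chart shown later wires this up for you.

# Illustrative pod spec fragment: the <cluster-name>-kubeconfig Secret created by
# Cluster API is mounted into the autoscaler container (names and paths are placeholders).
spec:
  containers:
  - name: cluster-autoscaler
    args:
    - --cloud-provider=clusterapi
    - --kubeconfig=/etc/kubernetes/tenant/value   # kubeconfig of the Tenant cluster
    volumeMounts:
    - name: tenant-kubeconfig
      mountPath: /etc/kubernetes/tenant
      readOnly: true
  volumes:
  - name: tenant-kubeconfig
    secret:
      secretName: ${CLUSTER_NAME}-kubeconfig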

Create the workload cluster

Create a workload cluster using the Kamaji Control Plane Provider and the Infrastructure Provider of your choice. The following example creates a workload cluster using the vSphere Infrastructure Provider.

The template file capi-kamaji-vsphere-autoscaler-template.yaml provides a full example of a cluster with the autoscaler enabled. You can generate the cluster manifest using clusterctl.

Before doing so, list all the variables in the template file:

cat capi-kamaji-vsphere-autoscaler-template.yaml | clusterctl generate yaml --list-variables

Fill them with the desired values and generate the manifest:

clusterctl generate yaml \
    --from capi-kamaji-vsphere-autoscaler-template.yaml \
    > capi-kamaji-vsphere-cluster.yaml

Apply the generated manifest to create the workload cluster:

kubectl apply -f capi-kamaji-vsphere-cluster.yaml
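To check that provisioning is progressing, you can inspect the cluster from the Management Cluster with clusterctl, where CLUSTER_NAME is the name you supplied to the template:

clusterctl describe cluster ${CLUSTER_NAME}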

Install the Cluster Autoscaler

Install the Cluster Autoscaler via Helm in the Management Cluster, in the same namespace where the workload cluster is deployed.

Options for installing the Cluster Autoscaler

The Cluster Autoscaler works on a single cluster, meaning every cluster must have its own Cluster Autoscaler instance. This can be addressed by leveraging Project Sveltos automations to deploy a Cluster Autoscaler instance for each Kamaji Cluster API instance.

helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm upgrade --install ${CLUSTER_NAME}-autoscaler autoscaler/cluster-autoscaler \
    --set cloudProvider=clusterapi \
    --set autoDiscovery.namespace=default \
    --set "autoDiscovery.labels[0].autoscaling=enabled" \
    --set clusterAPIKubeconfigSecret=${CLUSTER_NAME}-kubeconfig \
    --set clusterAPIMode=kubeconfig-incluster

The autoDiscovery.labels values are used to dynamically select clusters to autoscale.

These labels must be set on the workload cluster, specifically in the Cluster and MachineDeployment resources.

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: sample
    # Cluster Autoscaler labels
    autoscaling: enabled
  name: sample

# other fields omitted for brevity
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  annotations:
    # Cluster Autoscaler annotations
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "0"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "6"
    capacity.cluster-autoscaler.kubernetes.io/cpu: "2"  # YMMV
    capacity.cluster-autoscaler.kubernetes.io/memory: 4Gi  # YMMV
    capacity.cluster-autoscaler.kubernetes.io/maxPods: "110"  # YMMV
  labels:
    cluster.x-k8s.io/cluster-name: sample
    # Cluster Autoscaler labels
    autoscaling: enabled
  name: sample-md-0

# other fields omitted for brevity
---
# other Cluster API resources omitted for brevity

Verify the Cluster Autoscaler

To verify that the Cluster Autoscaler is working as expected, deploy a workload in the Tenant cluster with specific CPU requirements to simulate resource demand.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hello-node
  name: hello-node
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-node
  template:
    metadata:
      labels:
        app: hello-node
    spec:
      containers:
      - image: quay.io/google-containers/pause-amd64:3.0
        imagePullPolicy: IfNotPresent
        name: pause-amd64
        resources:
          limits:
            cpu: 500m

Apply the workload to the Tenant cluster and simulate a load spike by increasing the number of replicas. The Cluster Autoscaler should scale up the cluster to accommodate the workload. Cooldown times must be configured correctly on a per-cluster basis.
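For example, scale the hello-node Deployment from the Tenant cluster; the replica count below is arbitrary and should exceed what the current nodes can accommodate:

kubectl scale deployment hello-node --replicas=30

From the Management Cluster you can then watch the MachineDeployment replicas grow:

kubectl get machinedeployments -n default -w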

Possible Resource Wastage

With the Cluster Autoscaler, new machines may be created very quickly, which can lead to over-provisioning and potentially wasted resources. The official Cluster Autoscaler documentation should be consulted to configure appropriate values based on your infrastructure and provisioning times.
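As a rough starting point, the relevant knobs can be passed to the Helm release through extraArgs; the flags below are standard Cluster Autoscaler options, but the values are illustrative and must be tuned to your infrastructure and provisioning times:

extraArgs:
  # How long to wait for a new node to register before treating provisioning as failed.
  max-node-provision-time: 20m
  # How long to wait after a scale-up before evaluating scale-down again.
  scale-down-delay-after-add: 15m
  # How long a node must be unneeded before it becomes eligible for removal.
  scale-down-unneeded-time: 15m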

ProvisioningRequest support

The ProvisioningRequest introduces a Kubernetes-native way for Cluster Autoscaler to request new capacity without talking directly to cloud provider APIs. Instead of embedding provider-specific logic, the autoscaler simply describes the capacity it needs, and an external provisioner decides how to create the required nodes. This makes scaling portable across clouds, on-prem platforms, and custom provisioning systems, while greatly reducing complexity inside the autoscaler.

Once the cluster has been provisioned, install the ProvisioningRequest custom resource definition in the Tenant cluster.

kubectl kamaji kubeconfig get capi-quickstart-kubevirt > /tmp/capi-quickstart-kubevirt
KUBECONFIG=/tmp/capi-quickstart-kubevirt kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/refs/tags/cluster-autoscaler-1.34.1/cluster-autoscaler/apis/config/crd/autoscaling.x-k8s.io_provisioningrequests.yaml
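You can confirm that the CRD is in place in the Tenant cluster:

KUBECONFIG=/tmp/capi-quickstart-kubevirt kubectl get crd provisioningrequests.autoscaling.x-k8s.io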

Proceed with the installation of the Cluster Autoscaler, enabling some additional parameters; adjust the values below to your environment.

cloudProvider: clusterapi
autoDiscovery:
  namespace: default
  labels:
  - autoscaling.x-k8s.io: enabled

clusterAPIKubeconfigSecret: capi-quickstart-kubeconfig
clusterAPIMode: kubeconfig-incluster

extraArgs:
  enable-provisioning-requests: true
  kube-api-content-type: "application/json"
  cloud-config: /etc/kubernetes/management/kubeconfig

extraVolumeSecrets:
  # Mount the management kubeconfig to talk with the management cluster:
  # the in-cluster (REST) configuration doesn't work here
  management-kubeconfig:
    name: management-kubeconfig
    mountPath: /etc/kubernetes/management
    items:
    - key: kubeconfig
      path: kubeconfig
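These values can then be applied when installing the chart; the values file name and release name below are examples, and the namespace is the one where the workload cluster is deployed:

helm upgrade --install capi-quickstart-autoscaler autoscaler/cluster-autoscaler \
    --namespace default \
    -f values.yaml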

The Cluster Autoscaler should now be up and running, able to connect to both the Management and Tenant cluster API Servers: follow the official example from the Cluster Autoscaler repository to try out the ProvisioningRequest feature.
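As a hedged illustration of what such a request can look like in the Tenant cluster, the sketch below pairs a PodTemplate with a ProvisioningRequest using the built-in check-capacity class. The resource names and pod count are arbitrary, and the API version should match the CRD installed above:

apiVersion: v1
kind: PodTemplate
metadata:
  name: hello-node-template
  namespace: default
template:
  spec:
    containers:
    - name: pause
      image: quay.io/google-containers/pause-amd64:3.0
      resources:
        requests:
          cpu: 500m
---
apiVersion: autoscaling.x-k8s.io/v1
kind: ProvisioningRequest
metadata:
  name: hello-node-capacity
  namespace: default
spec:
  provisioningClassName: check-capacity.autoscaling.x-k8s.io
  podSets:
  - podTemplateRef:
      name: hello-node-template
    count: 4

Apply it with the Tenant kubeconfig and check the conditions reported on the object, for example with kubectl describe provisioningrequest hello-node-capacity.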