Resource-Based Control Plane Autoscaling
Resource-based control plane autoscaling enables automatic sizing of HostedClusters based on actual Kube API server resource usage rather than worker node count. This feature uses Vertical Pod Autoscaler (VPA) recommendations to determine the optimal cluster size class for a HostedCluster.
Platform Support
Important: This feature is only available for HostedClusters using the request serving isolation architecture on AWS. The feature requires the dedicated-request-serving-components topology annotation to be set on the HostedCluster.
Prerequisites
Before enabling resource-based control plane autoscaling, ensure the following prerequisites are met:
VPA Operator
The Vertical Pod Autoscaler operator must be installed on the management cluster via Operator Lifecycle Manager (OLM). The VPA operator only supports the OwnNamespace install mode, so it must be installed in its own namespace.
Step 1: Create the VPA Namespace
Create the namespace for the VPA operator:
kubectl create namespace openshift-vertical-pod-autoscaler
Step 2: Create the OperatorGroup
Create an OperatorGroup that targets only the VPA namespace:
kubectl apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: vpa-operator-group
namespace: openshift-vertical-pod-autoscaler
spec:
targetNamespaces:
- openshift-vertical-pod-autoscaler
EOF
Step 3: Create the Subscription
Create a Subscription to install the VPA operator from the redhat-operators catalog source:
kubectl apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: vertical-pod-autoscaler
namespace: openshift-vertical-pod-autoscaler
spec:
channel: stable
name: vertical-pod-autoscaler
source: redhat-operators
sourceNamespace: openshift-marketplace
installPlanApproval: Automatic
EOF
Step 4: Verify Installation
Wait for the ClusterServiceVersion (CSV) to reach the Succeeded phase:
kubectl wait --for=condition=phase=Succeeded \
csv -n openshift-vertical-pod-autoscaler \
-l operators.coreos.com/vertical-pod-autoscaler.openshift-vertical-pod-autoscaler \
--timeout=15m
Verify that the VPA CRD is available:
kubectl get crd verticalpodautoscalers.autoscaling.k8s.io
Verify that the VPA recommender deployment is available:
kubectl wait --for=condition=available \
deployment/vpa-recommender-default \
-n openshift-vertical-pod-autoscaler \
--timeout=15m
VPA Controller Instance
After the VPA operator is installed, a VerticalPodAutoscalerController instance named default is automatically created in the openshift-vertical-pod-autoscaler namespace. You must configure this controller instance for resource-based autoscaling.
Example VerticalPodAutoscalerController configuration:
apiVersion: autoscaling.openshift.io/v1
kind: VerticalPodAutoscalerController
metadata:
name: default
namespace: openshift-vertical-pod-autoscaler
spec:
recommendationOnly: true
deploymentOverrides:
recommender:
container:
args:
- --memory-aggregation-interval=1h
- --memory-aggregation-interval-count=12
- --memory-histogram-decay-half-life=1h
For a complete list of VPA recommender configuration flags, refer to the VPA recommender documentation.
ClusterSizingConfiguration
A ClusterSizingConfiguration resource named cluster must exist in the management cluster with:
- Optional: Size configurations that include
capacityspecifications with memory values. Ifcapacityis not specified for a size, the system will introspect the memory and CPU capacity from existing MachineSets in theopenshift-machine-apinamespace. The system looks for MachineSets with thehypershift.openshift.io/cluster-sizelabel matching the size name and reads themachine.openshift.io/memoryMbandmachine.openshift.io/vCPUannotations from those MachineSets. The first MachineSet found for each size label is used as the authoritative source for that size's capacity. - Optional:
resourceBasedAutoscaling.kubeAPIServerMemoryFractionconfiguration
Enabling Resource-Based Autoscaling
To enable resource-based control plane autoscaling for a HostedCluster, add the following annotation:
apiVersion: hypershift.openshift.io/v1beta1
kind: HostedCluster
metadata:
name: example-cluster
namespace: clusters
annotations:
hypershift.openshift.io/topology: dedicated-request-serving-components
hypershift.openshift.io/resource-based-cp-auto-scaling: "true"
The feature requires both annotations:
- hypershift.openshift.io/topology: dedicated-request-serving-components - Enables request serving isolation architecture
- hypershift.openshift.io/resource-based-cp-auto-scaling: "true" - Enables resource-based autoscaling
Annotations
The following annotations are used by the resource-based control plane autoscaling feature:
Input Annotations
-
hypershift.openshift.io/resource-based-cp-auto-scaling: Set to"true"to enable resource-based autoscaling for the HostedCluster. This annotation must be used in conjunction with thededicated-request-serving-componentstopology. -
hypershift.openshift.io/topology: Must be set todedicated-request-serving-componentsfor the feature to work.
Output Annotations
hypershift.openshift.io/recommended-cluster-size: This annotation is automatically set by the autoscaler controller with the recommended cluster size class based on VPA recommendations. The value corresponds to a size name from theClusterSizingConfiguration.
ClusterSizingConfiguration
The ClusterSizingConfiguration resource controls how cluster sizes are determined. For resource-based autoscaling, configure the following:
Size Capacity Configuration
Each size configuration in spec.sizes can include a capacity field with memory and cpu specifications:
apiVersion: scheduling.hypershift.openshift.io/v1alpha1
kind: ClusterSizingConfiguration
metadata:
name: cluster
spec:
sizes:
- name: small
criteria:
from: 0
to: 10
capacity:
memory: 32Gi
cpu: "8"
- name: medium
criteria:
from: 11
to: 50
capacity:
memory: 64Gi
cpu: "16"
- name: large
criteria:
from: 51
capacity:
memory: 128Gi
cpu: "32"
Resource-Based Autoscaling Configuration
The resourceBasedAutoscaling section allows you to configure how memory recommendations are interpreted:
spec:
resourceBasedAutoscaling:
kubeAPIServerMemoryFraction: "0.65"
kubeAPIServerMemoryFraction: A value between 0 and 1 that determines what fraction of a machine's total memory is available for the Kube API server pod. This fraction is used to determine whether a VPA memory recommendation fits within a particular cluster size. If not specified, a default fraction of 0.65 is used.
The autoscaler selects the smallest cluster size for which:
machine_memory * kubeAPIServerMemoryFraction >= VPA_recommended_memory
How It Works
-
When resource-based autoscaling is enabled, the HyperShift operator creates a
VerticalPodAutoscalerresource targeting thekube-apiserverdeployment in the control plane namespace. -
The VPA monitors the kube-apiserver container's resource usage and generates memory recommendations.
-
The autoscaler controller reads the VPA recommendations and determines the appropriate cluster size class based on:
- The VPA's memory recommendation for the kube-apiserver container
- The machine memory capacity for each size class (from
ClusterSizingConfiguration) -
The configured
kubeAPIServerMemoryFraction -
The controller sets the
hypershift.openshift.io/recommended-cluster-sizeannotation on the HostedCluster with the recommended size. -
The hosted cluster sizing controller uses this recommendation (along with other factors) to determine the actual cluster size applied to the HostedCluster.
Monitoring
You can monitor the autoscaling behavior by checking:
- The
hypershift.openshift.io/recommended-cluster-sizeannotation on the HostedCluster - The VPA resource status in the control plane namespace:
kubectl get vpa kube-apiserver -n <control-plane-namespace> - The VPA recommendation conditions and container recommendations
Troubleshooting
If the recommended cluster size is not being set:
- Verify that both required annotations are present on the HostedCluster
- Check that the VPA operator is installed and the VPA controller instance exists
- Verify that the VPA resource exists in the control plane namespace and has valid recommendations
- Ensure the
ClusterSizingConfigurationhas size configurations with capacity specifications - Check controller logs for errors related to size cache updates or VPA reconciliation