Kubernetes
Cluster Autoscaler
The Cluster Autoscaler is an infrastructure-level tool that increases or decreases the size of a Kubernetes cluster based on the presence of pending pods and on node utilization.
On AWS, the Cluster Autoscaler works through EC2 Auto Scaling groups.
It adjusts your worker node groups, scaling out when pods need more resources than the cluster has and scaling in when nodes are underutilized.
Deploy the Kubernetes cluster autoscaler
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
Create an IAM Policy for the cluster autoscaler
- In the IAM console, click Policies in the sidebar
- Click Create policy
- Switch to the JSON tab
- Paste the following policy document, which includes permissions for managing Auto Scaling groups and describing EC2 launch templates:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeTags",
"autoscaling:SetDesiredCapacity",
"autoscaling:TerminateInstanceInAutoScalingGroup",
"autoscaling:UpdateAutoScalingGroup"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeLaunchTemplateVersions"
],
"Resource": "*"
}
]
}
Tip: This policy grants permissions across all resources ("Resource": "*"). For better security, you can restrict the permissions to specific resources by specifying the ARNs of your Auto Scaling groups.
- Click Next: Tags (you can add tags if desired)
- Click Next: Review
- Name the policy, e.g., EKSClusterAutoscalerPolicy
- Add a description, e.g., Policy for EKS Cluster Autoscaler
- Click Create policy
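As a follow-up to the tip about restricting "Resource", here is a minimal Python sketch that generates a scoped variant of the policy. It assumes that only the three mutating actions are scoped to Auto Scaling group ARNs, because the Describe* actions do not support resource-level permissions and have to stay on "*". The group name my-node-group-asg is hypothetical; the region and account ID are the ones used in the eksctl example in this guide.

```python
import json

# Read-only Describe* calls do not support resource-level permissions,
# so they stay on "*". Only the mutating actions are scoped to ASG ARNs.
READ_ACTIONS = [
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:DescribeAutoScalingInstances",
    "autoscaling:DescribeLaunchConfigurations",
    "autoscaling:DescribeTags",
    "ec2:DescribeLaunchTemplateVersions",
]
WRITE_ACTIONS = [
    "autoscaling:SetDesiredCapacity",
    "autoscaling:TerminateInstanceInAutoScalingGroup",
    "autoscaling:UpdateAutoScalingGroup",
]


def asg_arn(region, account_id, asg_name):
    """Build an ASG ARN; the "*" stands in for the AWS-assigned group UUID."""
    return (
        f"arn:aws:autoscaling:{region}:{account_id}:"
        f"autoScalingGroup:*:autoScalingGroupName/{asg_name}"
    )


def scoped_policy(region, account_id, asg_names):
    """Return the policy document with write actions limited to the given ASGs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": READ_ACTIONS, "Resource": "*"},
            {
                "Effect": "Allow",
                "Action": WRITE_ACTIONS,
                "Resource": [asg_arn(region, account_id, n) for n in asg_names],
            },
        ],
    }


if __name__ == "__main__":
    # "my-node-group-asg" is a placeholder; use your real ASG names.
    policy = scoped_policy("us-east-2", "111122223333", ["my-node-group-asg"])
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the JSON tab in place of the broader policy. If your node groups are recreated often, scoping by name this way may need updating each time.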
Associate the IAM Role with a Kubernetes Service Account
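To allow the Cluster Autoscaler deployment to use an IAM role, create a Kubernetes service account annotated with the role's ARN. The eksctl command below creates the IAM role, attaches the policy to it, and creates (or overrides) the annotated service account in one step. Replace the cluster name, account ID, and region with your own values.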
eksctl create iamserviceaccount --name cluster-autoscaler --namespace kube-system --cluster my-cluster --role-name EKSClusterAutoscalerRole \
--attach-policy-arn arn:aws:iam::111122223333:policy/EKSClusterAutoscalerPolicy --region us-east-2 --override-existing-serviceaccounts --approve
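To confirm that the annotation was applied, you can inspect the service account. The sketch below is one possible check, written in Python: it looks for the eks.amazonaws.com/role-arn annotation in a ServiceAccount object. The sample JSON is an abbreviated, hand-written stand-in for the output of kubectl -n kube-system get serviceaccount cluster-autoscaler -o json, not captured output.

```python
import json

# Annotation key that IAM roles for service accounts uses on EKS.
ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"


def role_arn_of(service_account):
    """Return the IAM role ARN annotated on a ServiceAccount dict, or None."""
    annotations = service_account.get("metadata", {}).get("annotations") or {}
    return annotations.get(ROLE_ANNOTATION)


# Hand-written sample shaped like `kubectl get serviceaccount ... -o json`.
SAMPLE = """
{
  "apiVersion": "v1",
  "kind": "ServiceAccount",
  "metadata": {
    "name": "cluster-autoscaler",
    "namespace": "kube-system",
    "annotations": {
      "eks.amazonaws.com/role-arn": "arn:aws:iam::111122223333:role/EKSClusterAutoscalerRole"
    }
  }
}
"""

if __name__ == "__main__":
    arn = role_arn_of(json.loads(SAMPLE))
    if arn is None:
        print("No IAM role annotation found on the service account")
    else:
        print(f"Service account is bound to {arn}")
```

If the annotation is missing, the autoscaler pod falls back to whatever permissions the node's instance role has, which typically shows up later as AccessDenied errors in its logs.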
Configure the Cluster Autoscaler Deployment
Next, edit the Cluster Autoscaler deployment to explicitly specify the AWS cloud provider, your cluster name, and the scaling flags.
kubectl -n kube-system edit deployment.apps/cluster-autoscaler
In the spec.template.spec.containers.command section, ensure the following flags are set:
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.3
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws
            - --cluster-name=<your-cluster-name>
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<your-cluster-name>
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
            - --ignore-daemonsets-utilization
            - --scan-interval=10s
            - --scale-down-delay-after-add=10m
            - --scale-down-unneeded-time=10m
            - --scale-down-utilization-threshold=0.5
            - --stderrthreshold=info
            - --v=4
          # ... rest of the configuration ...
After making the changes, save the file and exit the editor. This will update the Cluster Autoscaler deployment with the new configuration.
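To get a feel for what --scale-down-utilization-threshold=0.5 and --scale-down-unneeded-time=10m mean together, here is a deliberately simplified Python model. It assumes a node's utilization is the larger of its CPU and memory request ratios (requests divided by allocatable), and that a node becomes a removal candidate once it has stayed below the threshold for the unneeded time. The Node class and its fields are hypothetical, and the real autoscaler applies further checks that this sketch leaves out, such as whether the node's pods can be rescheduled elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical snapshot of one worker node (not a Kubernetes API type)."""
    name: str
    cpu_allocatable_m: int    # allocatable CPU in millicores
    mem_allocatable_mi: int   # allocatable memory in MiB
    cpu_requested_m: int      # sum of pod CPU requests in millicores
    mem_requested_mi: int     # sum of pod memory requests in MiB
    unneeded_minutes: float   # how long the node has been below the threshold


def utilization(node):
    """Larger of the CPU and memory request ratios for the node."""
    return max(
        node.cpu_requested_m / node.cpu_allocatable_m,
        node.mem_requested_mi / node.mem_allocatable_mi,
    )


def scale_down_candidates(nodes, threshold=0.5, unneeded_minutes=10):
    """Names of nodes under the threshold for at least the unneeded time."""
    return [
        n.name
        for n in nodes
        if utilization(n) < threshold and n.unneeded_minutes >= unneeded_minutes
    ]


if __name__ == "__main__":
    nodes = [
        Node("node-a", 2000, 8192, 400, 1024, 12),  # 20% used, idle for 12 min
        Node("node-b", 2000, 8192, 400, 6144, 30),  # memory at 75%, kept
        Node("node-c", 2000, 8192, 200, 512, 3),    # idle, but only for 3 min
    ]
    print(scale_down_candidates(nodes))
```

In this model only node-a qualifies: node-b is too busy on memory, and node-c has not been underutilized for long enough yet.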
Verify the Cluster Autoscaler Configuration
Now that the Cluster Autoscaler is updated to use the service account and configured correctly, verify that it's running and has the necessary permissions.
Ensure the Cluster Autoscaler deployment is running:
kubectl -n kube-system get deployment cluster-autoscaler
Monitor the logs to ensure there are no errors related to permissions or configurations:
kubectl -n kube-system logs deployment.apps/cluster-autoscaler
Look for messages indicating successful startup, such as:
I1025 10:45:01.081123 1 auto_scaling_groups.go:154] Registering ASG <ASG_NAME_1>
I1025 10:45:01.081150 1 auto_scaling_groups.go:154] Registering ASG <ASG_NAME_2>
Also, check for any errors related to IAM permissions, such as AccessDenied errors. If you see such errors, revisit the IAM policy and role configurations.
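For longer logs, the two checks above can be automated. The following Python sketch scans log text for "Registering ASG" lines and for lines mentioning AccessDenied. The sample log is illustrative only: the ASG names eks-ng-1 and eks-ng-2 are made up, and the error line is a placeholder rather than real autoscaler output.

```python
import re

# Matches the ASG name in lines like "... Registering ASG <name>".
ASG_PATTERN = re.compile(r"Registering ASG (\S+)")


def summarize_logs(log_text):
    """Return (registered ASG names, lines mentioning AccessDenied)."""
    asgs, denied = [], []
    for line in log_text.splitlines():
        match = ASG_PATTERN.search(line)
        if match:
            asgs.append(match.group(1))
        if "AccessDenied" in line:
            denied.append(line.strip())
    return asgs, denied


# Illustrative input; the last line is a made-up placeholder error.
SAMPLE_LOG = """\
I1025 10:45:01.081123 1 auto_scaling_groups.go:154] Registering ASG eks-ng-1
I1025 10:45:01.081150 1 auto_scaling_groups.go:154] Registering ASG eks-ng-2
E1025 10:45:02.000000 1 example.go:1] AccessDenied (placeholder error line)
"""

if __name__ == "__main__":
    asgs, denied = summarize_logs(SAMPLE_LOG)
    print("Registered ASGs:", ", ".join(asgs) or "none")
    print("AccessDenied lines:", len(denied))
```

To use it on a real cluster, save the output of the kubectl logs command to a file and pass the file's contents to summarize_logs. An empty ASG list usually points at missing auto-discovery tags on the Auto Scaling groups rather than at IAM.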
Test the Autoscaling Functionality
To confirm that the Cluster Autoscaler works as expected, deploy a test workload such as a NitroEnclaveDeployment and observe whether the autoscaler scales the node group in response.
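If you do not have a suitable workload at hand, a generic stand-in can trigger a scale-out just as well. The Python sketch below generates a Deployment manifest whose pods request more CPU and memory than a small node group is likely to have free, so some pods stay Pending until new nodes join. This is an assumption-laden substitute for the test deployment mentioned above, not a reproduction of it: the name autoscaler-scale-test, the replica count, and the request sizes are all hypothetical and should be adapted to your instance types.

```python
import json


def scale_test_deployment(name="autoscaler-scale-test", replicas=10,
                          cpu="500m", memory="512Mi"):
    """Build a Deployment of idle pause containers with sizeable requests."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "pause",
                            # The pause image does nothing; only its
                            # resource requests matter for this test.
                            "image": "registry.k8s.io/pause:3.9",
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory}
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(scale_test_deployment(), indent=2))
```

One way to use it is to pipe the output into kubectl apply -f -, then watch kubectl get pods and kubectl get nodes until the pending pods are scheduled on new nodes. Deleting the deployment afterwards should let the autoscaler remove the extra nodes once the scale-down delay has passed.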