Kubernetes
Cluster Autoscaler
Cluster autoscaling is an infrastructure-level tool that increases or decreases the size of a Kubernetes cluster based on pending pods and node utilization.
On AWS, the Cluster Autoscaler works through EC2 Auto Scaling groups.
It adjusts your worker node groups, scaling out when pods cannot be scheduled for lack of resources and scaling in when nodes are underutilized.
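The scale-out trigger can be observed directly by listing pods that are still Pending. The helper below is a minimal sketch, assuming kubectl is already configured for the cluster; the function name pending_pods is illustrative. Pending also covers pods that are still pulling images, so treat the output as a rough signal only.

```shell
# Hypothetical helper: list pods that have not started running yet.
# Pods that stay Pending because no node has room are what prompt a scale-out.
pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# Usage: pending_pods
```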
Deploy the Kubernetes cluster autoscaler
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
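At the time of writing, the example manifest carries a `<YOUR CLUSTER NAME>` placeholder in its auto-discovery tag, so applying it straight from the URL leaves that placeholder in place until the deployment is edited. One alternative, sketched here, is to download the file, substitute the name, and apply the result. The helper name render_manifest and the local file names are assumptions.

```shell
# Replace the manifest's cluster-name placeholder (reads stdin, writes stdout).
# $1 is the cluster name; it must not contain '|' or '&', which are special to this sed call.
render_manifest() {
  sed "s|<YOUR CLUSTER NAME>|$1|g"
}

# Usage (file names are illustrative):
#   curl -fsSL -o cluster-autoscaler-autodiscover.yaml <manifest URL from the command above>
#   render_manifest my-cluster < cluster-autoscaler-autodiscover.yaml > cluster-autoscaler.yaml
#   kubectl apply -f cluster-autoscaler.yaml
```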
Create an IAM Policy for the cluster autoscaler
- Access the AWS IAM Console
- Create a new policy
  - Click on Policies in the sidebar
  - Click Create policy
- Define permissions
  - Switch to the JSON tab
  - Paste the following policy document, which includes permissions for managing Auto Scaling Groups and EC2 instances
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "autoscaling:UpdateAutoScalingGroup"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*"
    }
  ]
}
Tip: This policy grants permissions across all resources ("Resource": "*"). If you want to restrict permissions to specific resources for better security, you can specify the ARN of your Auto Scaling Groups.
- Review and create policy
  - Click Next: Tags (you can add tags if desired)
  - Click Next: Review
  - Name the policy, e.g., EKSClusterAutoscalerPolicy
  - Add a description, e.g., Policy for EKS Cluster Autoscaler
  - Click Create policy
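The console steps above can also be done from the AWS CLI. This is a minimal sketch, assuming the policy JSON has been saved to a local file and the AWS CLI is configured with permission to create IAM policies; the function name and the file name cluster-autoscaler-policy.json are illustrative.

```shell
# Create the policy from a local JSON file instead of the console.
# $1: policy name, $2: path to the saved policy document.
# Prints the ARN of the new policy, which the eksctl step needs later.
create_autoscaler_policy() {
  aws iam create-policy \
    --policy-name "$1" \
    --policy-document "file://$2" \
    --description "Policy for EKS Cluster Autoscaler" \
    --query 'Policy.Arn' --output text
}

# Usage: create_autoscaler_policy EKSClusterAutoscalerPolicy cluster-autoscaler-policy.json
```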
Associate the IAM Role with a Kubernetes Service Account
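To allow the Cluster Autoscaler deployment to use the policy, create an IAM role with the policy attached and a Kubernetes service account annotated with the role's ARN. The eksctl command below does both; replace the cluster name, account ID, and region with your own values.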
eksctl create iamserviceaccount --name cluster-autoscaler --namespace kube-system --cluster my-cluster --role-name EKSClusterAutoscalerRole \
--attach-policy-arn arn:aws:iam::111122223333:policy/EKSClusterAutoscalerPolicy --region us-east-2 --override-existing-serviceaccounts --approve
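IAM roles for service accounts depend on an IAM OIDC provider being associated with the cluster, so the command above fails if none exists. The sketch below shows one way to set that up beforehand and to confirm the annotation afterwards. Both function names are illustrative, and the cluster name and region are whatever you passed to eksctl.

```shell
# Associate an IAM OIDC provider with the cluster (run before creating the service account).
# $1: cluster name, $2: region.
ensure_oidc_provider() {
  eksctl utils associate-iam-oidc-provider --cluster "$1" --region "$2" --approve
}

# Print the role ARN annotation on the service account (empty output if it is missing).
service_account_role_arn() {
  kubectl -n kube-system get serviceaccount cluster-autoscaler \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Usage:
#   ensure_oidc_provider my-cluster us-east-2
#   service_account_role_arn
```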
Configure the Cluster Autoscaler Deployment
Let's modify the Cluster Autoscaler deployment to explicitly specify the AWS cloud provider.
kubectl -n kube-system edit deployment.apps/cluster-autoscaler
In the spec.template.spec.containers.command section, ensure the following flags are set:
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler
          # k8s.gcr.io is frozen; images are published on registry.k8s.io.
          # Pick the release whose minor version matches your cluster's Kubernetes version.
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.3
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws
            - --cluster-name=<your-cluster-name>
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<your-cluster-name>
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
            - --ignore-daemonsets-utilization
            - --scan-interval=10s
            - --scale-down-delay-after-add=10m
            - --scale-down-unneeded-time=10m
            - --scale-down-utilization-threshold=0.5
            - --stderrthreshold=info
            - --v=4
          # ... rest of the configuration ...
After making the changes, save the file and exit the editor. This will update the Cluster Autoscaler deployment with the new configuration.
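The --node-group-auto-discovery flag only finds Auto Scaling groups that carry both tag keys it names. EKS managed node groups typically have these tags already; for self-managed groups, a sketch like the following could add them. The function name is illustrative, and the tag values shown are a common convention rather than a requirement, since discovery matches on the keys.

```shell
# Tag one Auto Scaling group so auto-discovery can find it.
# $1: Auto Scaling group name, $2: cluster name.
tag_asg_for_autodiscovery() {
  aws autoscaling create-or-update-tags --tags \
    "ResourceId=$1,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
    "ResourceId=$1,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/$2,Value=owned,PropagateAtLaunch=true"
}

# Usage: tag_asg_for_autodiscovery <asg-name> <your-cluster-name>
```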
Verify the Cluster Autoscaler Configuration
Now that the Cluster Autoscaler uses the service account and is configured correctly, verify that it is running and has the necessary permissions.
- Check the Deployment Status
Ensure the Cluster Autoscaler deployment is running:
kubectl -n kube-system get deployment cluster-autoscaler
- Check the Cluster Autoscaler Logs
Monitor the logs to ensure there are no errors related to permissions or configurations:
kubectl -n kube-system logs deployment.apps/cluster-autoscaler
Look for messages indicating successful startup, such as:
I1025 10:45:01.081123 1 auto_scaling_groups.go:154] Registering ASG <ASG_NAME_1>
I1025 10:45:01.081150 1 auto_scaling_groups.go:154] Registering ASG <ASG_NAME_2>
Also, check for any errors related to IAM permissions, such as AccessDenied errors. If you see such errors, revisit the IAM policy and role configurations.
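The log check can be reduced to two counts: how many Auto Scaling groups were registered and how many AccessDenied lines appeared. The filter below is a rough sketch that matches on those two strings only; the function name summarize_ca_logs is illustrative.

```shell
# Summarize autoscaler logs read from stdin:
# count "Registering ASG" lines and lines mentioning AccessDenied.
summarize_ca_logs() {
  awk '
    /Registering ASG/ { registered++ }
    /AccessDenied/    { denied++ }
    END { printf "registered=%d denied=%d\n", registered, denied }
  '
}

# Usage: kubectl -n kube-system logs deployment.apps/cluster-autoscaler | summarize_ca_logs
```

A non-zero denied count points back to the IAM policy or the service account annotation.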
Test the Autoscaling Functionality
To confirm that the Cluster Autoscaler works, deploy a NitroEnclaveDeployment that requests more resources than the current nodes can provide, then observe whether the autoscaler scales the node group accordingly.
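If a NitroEnclaveDeployment is not at hand, a generic stand-in workload exercises the same scale-out path. The sketch below swaps in a plain Deployment of pause containers with CPU requests. The name autoscaler-test, the pause image tag, and the 500m request are all assumptions to adjust for your node size.

```shell
# Create a throwaway deployment whose CPU requests should exceed current spare capacity.
# $1: replica count.
start_scale_test() {
  kubectl create deployment autoscaler-test \
    --image=registry.k8s.io/pause:3.9 --replicas="$1" &&
  kubectl set resources deployment autoscaler-test --requests=cpu=500m
}

# Remove the test workload so the autoscaler can scale back in.
stop_scale_test() {
  kubectl delete deployment autoscaler-test
}

# Usage:
#   start_scale_test 20
#   kubectl get nodes --watch
#   stop_scale_test
```

With the flags shown earlier, scale-in after cleanup waits for the configured scale-down-unneeded-time, so nodes are not removed immediately.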