
Setting up a Horizontal Pod Autoscaler for a Kubernetes cluster

In a Docker-based microservice setup, the application does not automatically scale with the number of users accessing it during business hours.

To resolve this, the team at CloudifyOps suggested implementing a Horizontal Pod Autoscaler (HPA) in the Kubernetes environment. With this approach, we can scale the application pods based on CPU and memory utilization. It also reduces the need to run extra pods (more replicas) during non-business hours.

Introduction:

Kubernetes autoscaling:

Kubernetes offers three scalability tools: the Horizontal Pod Autoscaler (HPA), the Vertical Pod Autoscaler (VPA) and the Cluster Autoscaler. HPA and VPA scale at the application layer, while the Cluster Autoscaler scales the underlying nodes.

Horizontal pod autoscaling:

When a spike or drop in consumption occurs, Kubernetes can automatically decrease or increase the number of pods that serve the workload.
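The scaling decision itself follows a simple formula documented for the HPA: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). A quick sketch of the arithmetic, with illustrative values:

```shell
# Illustrative HPA arithmetic: 3 replicas averaging 90% CPU
# against a 50% target scale to ceil(3 * 90 / 50) = 6 replicas.
current_replicas=3
current_cpu=90   # observed average utilization, percent
target_cpu=50    # targetCPUUtilizationPercentage
desired=$(awk -v c="$current_replicas" -v u="$current_cpu" -v t="$target_cpu" \
  'BEGIN { d = c * u / t; r = int(d); if (r < d) r += 1; print r }')
echo "$desired"
```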

Vertical pod autoscaling:

Deciding how much compute resource to dedicate to a particular workload is challenging. With the right configuration, vertical pod autoscaling adjusts a workload's CPU and memory requests to match actual usage, helping you get the most out of the allocated resources.
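As an illustrative sketch only (the VPA controller must be installed separately, and the names here are hypothetical), a VerticalPodAutoscaler manifest looks like this:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa          # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical target deployment
  updatePolicy:
    updateMode: "Auto"      # let the VPA apply its recommendations
```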

Requirement:

  • We need one Kubernetes cluster configuration ready to deploy.

Steps to follow:

Installing the metrics-server: The HPA makes its scaling decisions based on per-pod resource metrics retrieved from the metrics API (metrics.k8s.io).

Create the cluster without passing the --yes argument. This only generates the cluster configuration. Now, we need to make the changes below to the kubelet configuration so that the metrics server can authenticate.

The metrics server will be useful in creating the HPA.

For a cluster created with kOps, follow these steps:

  • kops edit cluster <cluster-name>

Add the configuration below to your cluster spec under the kubelet section:

kubelet:
  anonymousAuth: false
  authorizationMode: Webhook
  authenticationTokenWebhook: true


After making the changes, we should update the cluster. With the command below, the cluster will be created with the required configuration.

  • kops update cluster --name bittergourd.xyz --yes --admin

If you are changing the configuration after deploying the cluster, you need to run the rolling-update, which causes the master to terminate and redeploy. Later, new nodes will be deployed and the old nodes will be terminated.

To avoid this recreation, we are following the above steps while creating the Kubernetes cluster with kops (Kubernetes operations).

Now we need to install the metrics server.
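The metrics server is typically installed from the official manifest published by the kubernetes-sigs project; at the time of writing, the command is:

```shell
# Installs the latest metrics-server release into kube-system
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

This requires a running cluster and a kubeconfig pointing at it.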

Below is the output of the metrics server creation step.


To confirm the metrics-server installation, list the pods in the kube-system namespace:

  • kubectl get pods -n kube-system

You will find the metrics-server pod in the pods list.


We can find the memory and CPU utilization of pods and nodes using the commands below:

  • kubectl top node <node-name>
  • kubectl top pod -n <namespace> <pod-name>
  • kubectl top nodes
  • kubectl top pod -n <namespace>

The output shows the current CPU and memory usage of each node or pod.


Resource Requests and limits:

A container is allowed to use more than its requested resources when spare capacity is available on the node. However, a container can never use more than its resource limit.

For example, if you set a memory request of 256 MiB for a container, the scheduler guarantees that amount, and the container may use more RAM if the node has memory to spare.

If the memory limit is set at 4 GiB, the kubelet and the container runtime enforce it: a process that tries to consume more than the permitted amount of memory is terminated.
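Using the numbers above, a container spec with a 256 MiB request and a 4 GiB limit would look like this (the CPU values are purely illustrative):

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"    # illustrative
  limits:
    memory: "4Gi"
    cpu: "500m"    # illustrative
```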


Configuring HPA:

It is important that resource requests and limits are specified in the container resources section of the pod spec.

First, we will start a deployment running the php-apache example image and expose it as a service using the following command:

  • kubectl apply -f https://k8s.io/examples/application/php-apache.yaml

The above command creates a new deployment and a service. Once the deployment is ready, we deploy the HPA.

Run the following command:

echo 'apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50' | kubectl apply -f -
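The same HPA can also be created imperatively; this documented one-liner is equivalent to the manifest above:

```shell
# Creates an HPA targeting 50% CPU with 1-10 replicas
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
```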

The HPA will be deployed, and kubectl get hpa will list it.


As there is no load applied on the deployment, the targets show 0%/50%. To test the HPA, we shall apply load on the deployment.

  • kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Run the above command to apply some load on the deployment.


The deployment is scaled up: in our test, when the load pushed CPU utilization above 50%, the deployment scaled up to 7 pods.


The default scale-down stabilization window is 300 seconds. This scale-down time can be customized to suit different requirements.
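With the autoscaling/v2 API, the scale-down window can be set per HPA through the behavior field. A sketch of the equivalent HPA with a shorter window:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   # default is 300
```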


Note: If you use AWS EKS, the metrics server is not installed by default and needs to be deployed to the cluster before the HPA can function.

In the above exercise, we applied load on the CPU. Similarly, the HPA can be configured to scale on memory usage as well.
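For memory-based scaling, the autoscaling/v2 API accepts a Resource metric with name memory; the fragment below (the target value is illustrative) would go under spec in an autoscaling/v2 HPA:

```yaml
metrics:
- type: Resource
  resource:
    name: memory
    target:
      type: Utilization
      averageUtilization: 60   # illustrative target
```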

To learn more about these cutting-edge technologies and industry best practices, follow our LinkedIn page. To explore our services, visit our website.
