In a Docker-based microservice setup, the application does not automatically scale with the number of users accessing it during business hours.
To resolve this, the team at CloudifyOps suggested implementing a Horizontal Pod Autoscaler (HPA) in the Kubernetes environment. With this approach, we can scale the application pods based on CPU and memory utilization, and avoid running extra replicas during non-business hours.
Kubernetes offers three autoscaling tools: the Horizontal Pod Autoscaler (HPA), the Vertical Pod Autoscaler (VPA), and the Cluster Autoscaler. HPA and VPA scale and monitor workloads at the application layer, while the Cluster Autoscaler adjusts the number of nodes.
When a spike or drop in consumption occurs, Kubernetes can automatically decrease or increase the number of pods that serve the workload.
Deciding how much compute resources to dedicate to a particular workload is challenging. With the right configuration, Kubernetes can help you get the most out of the allocated resources.
Installing the metrics-server: The HPA makes its scaling decisions based on per-pod resource metrics retrieved from the metrics API (metrics.k8s.io).
Create the cluster without passing the --yes argument; this only generates the cluster configuration. We then need to make the changes below, which the metrics server requires.
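The original command did not survive the page extraction; a typical kops invocation for this step (cluster name, state store, and zone are placeholders) looks like:

```shell
# Generate the cluster configuration only; without --yes, nothing is provisioned yet.
kops create cluster \
  --name=my-cluster.example.com \
  --state=s3://my-kops-state-store \
  --zones=us-east-1a \
  --node-count=2
```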
The metrics server will be useful in creating the HPA.
For a cluster created with kops, follow these steps:
Add the below configuration to your cluster spec under kubelet:
kubelet:
  anonymousAuth: false
  authorizationMode: Webhook
  authenticationTokenWebhook: true
After making the changes, we need to update the cluster so that it is created with the required configuration.
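The command shown in the original post was an image and is missing here; with kops, applying the configuration typically looks like this (cluster name is a placeholder):

```shell
# --yes makes kops actually apply the change instead of previewing it.
kops update cluster --name=my-cluster.example.com --yes
```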
If you change this configuration after deploying the cluster, you need to run a rolling update, which terminates and redeploys the masters; new nodes are then deployed and the old nodes are terminated.
To avoid this recreation, we apply these settings while creating the Kubernetes cluster with kops (Kubernetes Operations).
Now we need to install the metrics server.
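The install command is missing from the extracted text; the usual way to deploy the metrics server is to apply the official release manifest from the kubernetes-sigs project:

```shell
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```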
To confirm the metrics server installation, list the pods in the kube-system namespace; you will find the metrics-server pod in the list.
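The verification command did not survive extraction; listing the kube-system pods is one way to check:

```shell
# The metrics-server pod runs in the kube-system namespace.
kubectl get pods -n kube-system | grep metrics-server
```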
We can view the memory and CPU utilization of pods and nodes using the kubectl top command.
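The commands referenced above are missing from the extracted text; they are presumably the standard kubectl top subcommands:

```shell
# Show resource usage of nodes and pods (requires a working metrics server).
kubectl top nodes
kubectl top pods
```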
Resource Requests and Limits:
When you specify a resource request for a container, the scheduler uses it to decide where to place the pod. A container may use more than its request when spare capacity is available on the node, but it can never exceed its resource limit.
For example, if you set a memory request of 256 MiB, the container can use more RAM than that when the node has memory to spare. But if its memory limit is set at 4 GiB, the kubelet enforces that limit: the container runtime stops a process that tries to consume more than the permitted amount of memory.
It is important to specify resource requests and limits in each container's resources section.
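The image referenced in the original is missing; a container resources section matching the figures mentioned above (pod and container names are illustrative) would look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "256Mi"   # the scheduler places the pod based on this
        cpu: "250m"
      limits:
        memory: "4Gi"     # the kubelet/runtime enforce this ceiling
        cpu: "500m"
```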
First, we will start a deployment running the application image and expose it as a service.
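The command itself is missing from the extracted post; assuming the standard Kubernetes HPA walkthrough image (an assumption, since the original did not name it), the deployment and service can be created like this:

```shell
# Create the deployment, give it a CPU request/limit, and expose it on port 80.
kubectl create deployment php-apache --image=registry.k8s.io/hpa-example
kubectl set resources deployment php-apache --requests=cpu=200m --limits=cpu=500m
kubectl expose deployment php-apache --port=80
```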
This creates a new deployment and a service. After the deployment completes, we need to deploy the HPA. Run the following command:
echo 'apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50' | kubectl apply -f -
The HPA will be deployed; you can verify it with kubectl get hpa.
As there is no load applied on the deployment, the targets show 0%/50%. To test the HPA, we shall apply load on the deployment.
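The load-generation command is missing from the extracted text; the standard approach from the Kubernetes HPA walkthrough runs a busybox loop against the service:

```shell
# Temporary pod that requests the php-apache service in a tight loop; --rm deletes it on exit.
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never \
  -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
```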
Running the load generator increases CPU utilization on the deployment. Watching the HPA, once utilization climbs above the 50% target, the deployment scales up; in our run, it scaled up to 7 pods.
The default scale-down stabilization window is 300 seconds; this can be customized to suit different requirements.
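With the newer autoscaling/v2 API (the manifest above uses autoscaling/v1), the scale-down window can be set per HPA via the behavior field; a sketch:

```yaml
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   # default is 300 seconds
```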
Note: If you use AWS EKS, the metrics server is not installed by default and must be deployed before the HPA can work.
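The command referenced here is missing from the extracted text; on EKS the metrics server is deployed the same way as above, by applying the official release manifest:

```shell
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```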
In the above exercise, we applied load on the CPU. The HPA can likewise be configured to scale on memory usage.
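Memory-based scaling requires the autoscaling/v2 API; a sketch of an HPA targeting average memory utilization (the 60% target is illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-memory
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60
```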
To learn more about these cutting-edge technologies and real-world, industry-applied best practices, follow our LinkedIn page. To explore our services, visit our website.
Copyright 2024 CloudifyOps. All Rights Reserved