Tips to rightsize your K8s workloads

With most cloud providers offering Kubernetes as a managed service, adopting it to deploy your applications has become very easy. However, Kubernetes has many constructs to keep track of, including containers, pods, schedulers, and Pod Disruption Budgets (PDBs).

With so many moving parts to your application, along with the abstraction on top of the nodes, developers may sometimes forget to check the resource utilization of the underlying nodes. Without proper visibility into that utilization, you can end up paying for capacity you do not use.

This blog post looks at the two most basic resources you can monitor to avoid paying for capacity that is not being utilized properly: CPU and memory.

How do CPU and memory impact scheduling in Kubernetes?

You can define CPU and memory requirements in terms of requests and limits when configuring Kubernetes containers. Requests tell the scheduler how much CPU and memory a pod needs, which determines where the pod can be scheduled, while limits make sure that a container does not use more than what has been defined.
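As a rough illustration, requests and limits are written as quantity strings such as `250m` (250 millicores, a quarter of a core) or `256Mi` (256 mebibytes). The sketch below mirrors a container's resource settings as a Python dictionary and converts those strings to plain numbers. The helper names `parse_cpu`, `parse_memory`, and `requests_within_limits` are made up for this example and only handle the common suffixes; they are not part of any Kubernetes client library.

```python
# Hypothetical helpers: convert Kubernetes-style quantity strings to numbers.
# Only the most common suffixes are handled in this sketch.

_MEMORY_UNITS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,   # binary suffixes
    "k": 1000, "M": 1000**2, "G": 1000**3,      # decimal suffixes
}

def parse_cpu(quantity: str) -> float:
    """Return CPU cores; '500m' means 500 millicores, i.e. 0.5 core."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

def parse_memory(quantity: str) -> int:
    """Return the quantity in bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * factor)
    return int(quantity)

def requests_within_limits(res: dict) -> bool:
    """Check that each request does not exceed its corresponding limit."""
    return (
        parse_cpu(res["requests"]["cpu"]) <= parse_cpu(res["limits"]["cpu"])
        and parse_memory(res["requests"]["memory"]) <= parse_memory(res["limits"]["memory"])
    )

# Example values chosen for illustration only.
resources = {
    "requests": {"cpu": "250m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}
```

Here `parse_cpu(resources["requests"]["cpu"])` evaluates to `0.25`, and `requests_within_limits(resources)` is `True` because both requests sit below their limits.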

To understand how CPU and memory resources are utilized, let’s explore Kubernetes scheduling.

Kubernetes scheduler

The Kubernetes scheduler picks the right node on which to run a workload. Kubernetes users or developers are generally not aware of where their workload ends up running. This is because the nodes are abstracted away, so users just see a pool of resources in terms of CPU and memory. Behind the scenes, the scheduler does all the calculations around where a particular pod should land and assigns the node to the workload.

How does a scheduler make its decision?

The scheduler makes use of CPU and memory requests to decide where it should run the workload. It tallies all the memory and CPU that the containers in a pod request, filters out the nodes that do not have that much unreserved capacity, then ranks the remaining nodes and assigns the pod to the best-scoring one.
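The resource-fit part of that decision can be sketched in a few lines. This is a simplified model, not the scheduler's actual code: the `Node` class, `pod_request`, and `feasible_nodes` are names invented for the example, and the real scheduler applies many more filters and then scores the surviving nodes.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_cpu: float    # cores not yet claimed by other pods' requests
    free_memory: int   # MiB not yet claimed by other pods' requests

def pod_request(containers):
    """Sum the requests of all containers in a pod.

    `containers` is a list of (cpu_cores, memory_mib) tuples.
    """
    cpu = sum(c[0] for c in containers)
    memory = sum(c[1] for c in containers)
    return cpu, memory

def feasible_nodes(nodes, containers):
    """Return the names of nodes with enough unreserved CPU and memory."""
    cpu, memory = pod_request(containers)
    return [n.name for n in nodes if n.free_cpu >= cpu and n.free_memory >= memory]

# Illustrative cluster: the pod below requests 0.75 core and 768 MiB in total.
nodes = [Node("node-a", 0.5, 1024), Node("node-b", 2.0, 4096), Node("node-c", 4.0, 512)]
containers = [(0.5, 256), (0.25, 512)]
print(feasible_nodes(nodes, containers))
```

In this example only `node-b` fits: `node-a` is short on CPU and `node-c` is short on memory, even though each has plenty of the other resource.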

What role do limits play?

Let’s assume you have requested a certain amount of resources in Kubernetes, and based on that, Kubernetes assigns a node. What will happen if the container starts using extra resources? It can actually impact the neighboring containers. This is known as a noisy neighbor and is a common problem. To avoid such cases, you can set a limit for the maximum resources any container can use.
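To make the effect of a limit concrete: a container that tries to use more CPU than its limit is throttled, while a container that exceeds its memory limit is terminated (reported as `OOMKilled`). The toy function below models only that behavior; `enforce_limits` is a hypothetical name, and the real enforcement happens in the kernel and container runtime rather than in code like this.

```python
def enforce_limits(cpu_demand, memory_demand, cpu_limit, memory_limit):
    """Toy model of limit enforcement for a single container.

    Returns (cpu_granted, status). CPU is capped at the limit (throttling);
    exceeding the memory limit terminates the container.
    """
    if memory_demand > memory_limit:
        return 0.0, "OOMKilled"
    return min(cpu_demand, cpu_limit), "Running"

# A container wanting 1.5 cores with a 1-core limit keeps running, throttled.
print(enforce_limits(cpu_demand=1.5, memory_demand=200, cpu_limit=1.0, memory_limit=256))

# A container wanting 300 MiB with a 256 MiB limit is killed.
print(enforce_limits(cpu_demand=0.5, memory_demand=300, cpu_limit=1.0, memory_limit=256))
```

Either way, the neighboring containers keep the capacity they were promised, which is the point of setting limits.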

How to rightsize Kubernetes workloads and nodes

One of the ways to start with optimization is to examine historical data. This is also an impactful way to identify the resource utilization of your pods and nodes.

Monitor historical data and metrics

Every application has an access pattern that it follows. For example, ride-hailing apps will see a surge on weekday mornings and evenings and much less traffic on Sundays. Similarly, food-delivery apps will be accessed more during conventional meal times. Once you have identified such patterns, you can monitor usage data during those peaks and decide how to optimize your application based on that.

For example, say you have allocated four cores to your application, but even at peak usage it only utilizes one core. In that case, you can safely scale it down to two cores and still keep headroom above the observed peak.
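The same reasoning can be written as a small calculation: take the peak from your historical samples and add headroom on top. The function name `recommend_cores` and the default of 100% headroom are assumptions made for this sketch; pick a margin that matches how spiky your traffic is.

```python
import math

def recommend_cores(usage_samples, headroom=1.0):
    """Recommend whole cores: observed peak plus a fractional headroom.

    headroom=1.0 means 100% extra on top of the peak (i.e. double it).
    """
    peak = max(usage_samples)
    return math.ceil(peak * (1 + headroom))

# Hypothetical peak-hour CPU usage samples, in cores.
samples = [0.3, 0.6, 1.0, 0.8]
allocated = 4
recommended = recommend_cores(samples)
print(f"allocated={allocated}, recommended={recommended}")
```

With a peak of one core and 100% headroom, the recommendation is two cores, half of the original four-core allocation.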

Understand your workload and new service requirements

Different types of workloads need different types of resources. You have to determine whether your workload is dependent on CPU or memory and then allocate resources accordingly.

Let’s say you have two sets of nodes: one that is memory-optimized and another with more CPU cores. Any services that use more CPU should be scheduled on the CPU-optimized nodes, while those requiring high memory should use the memory-optimized nodes. You can distribute your workload between the two sets with the help of affinity, taints, and tolerations.

Messages such as “OOM” (out of memory) in the application logs can indicate that you need to increase memory. Frequent health check failures, on the other hand, can suggest high CPU usage.

Estimate new service requirements

When onboarding a new service, it is important to perform basic tests to estimate proper usage; otherwise, the improper allocation of resources can impact performance.

If you see that your application is CPU- or memory-intensive, you have solved half of your problem. After this, you can estimate the minimum and maximum these services will utilize and easily set the request and limit parameters, solving the problem of underutilized resources.
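One possible way to turn load-test measurements into request and limit values is sketched below. It assumes, purely for illustration, that the request is set to the median observed usage and the limit to the observed maximum plus a 20% margin; `suggest_resources` is a hypothetical helper, and other percentiles or margins may suit your service better.

```python
import math
import statistics

def suggest_resources(cpu_millicores, memory_mib, limit_margin_percent=20):
    """Derive request/limit strings from usage samples.

    Assumption for this sketch: request = median usage,
    limit = max usage plus `limit_margin_percent`.
    """
    def limit(samples):
        return math.ceil(max(samples) * (100 + limit_margin_percent) / 100)

    return {
        "requests": {
            "cpu": f"{round(statistics.median(cpu_millicores))}m",
            "memory": f"{round(statistics.median(memory_mib))}Mi",
        },
        "limits": {
            "cpu": f"{limit(cpu_millicores)}m",
            "memory": f"{limit(memory_mib)}Mi",
        },
    }

# Hypothetical samples collected during a basic load test.
cpu_samples = [200, 250, 300, 400, 500]      # millicores
memory_samples = [300, 320, 350, 400, 410]   # MiB
print(suggest_resources(cpu_samples, memory_samples))
```

For these samples the sketch suggests requests of `300m` and `350Mi` with limits of `600m` and `492Mi`.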

Set up proper autoscaling

Autoscaling enables dynamic rightsizing. For example, if you are not using the system at night, you can scale it down and then scale it back up in the morning.

You can also configure your application to scale with traffic. Pod autoscalers, such as the Horizontal Pod Autoscaler, and node autoscalers, such as Cluster Autoscaler or Karpenter from AWS, allow you to do this. Pod autoscalers add or remove replicas based on metrics such as CPU and memory usage, while node autoscalers add or remove nodes to fit the pods that need to run. Together, they enable you to manage unexpected traffic and optimize resource consumption. You may also opt for time-based autoscaling for scenarios where you know you need to scale your application down at given times or on specific days.
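For metric-based pod autoscaling, the Horizontal Pod Autoscaler documentation describes the core calculation as the current replica count multiplied by the ratio of the current metric value to the target value, rounded up. The sketch below reproduces just that formula with a min/max clamp; it leaves out details such as the tolerance band and stabilization windows, and the function name is made up for the example.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified replica calculation in the style of the HPA formula:
    ceil(current_replicas * current_metric / target_metric), then clamped.
    """
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# 4 replicas averaging 90% CPU against a 60% target: scale up.
print(desired_replicas(4, current_metric=90, target_metric=60))

# 4 replicas averaging 30% CPU against a 60% target: scale down.
print(desired_replicas(4, current_metric=30, target_metric=60))
```

The first call yields 6 replicas and the second yields 2, which shows how the same rule handles both traffic surges and quiet periods.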

Scale down your non-production workloads

You can save on costs by running a minimal version of your application in your staging and development environments. These environments do not serve a lot of traffic, so you can assign them one core or even less.

Optimize your applications

Another important aspect to understand is that you cannot save resources if your application is abusing them due to bad code or design. Therefore, make sure to look at all the applications eating up resources and check whether the amount they are consuming is justified.

How Site24x7 helps in rightsizing Kubernetes workloads

Site24x7 can continuously monitor your workloads for utilization and performance. It allows you to see the resources being used by your application at peak times and make informed decisions. You can also set up alerts for resource utilization to act fast when needed.

For example, you can receive notifications if a certain number of nodes are experiencing CPU utilization of less than 10% and then scale your cluster down to save on costs.
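The logic behind such an alert can be sketched generically as follows. This is not Site24x7's API or alert configuration; the function name, the 10% threshold default, and the node count of three are illustrative assumptions only.

```python
def should_scale_down(node_cpu_percent, idle_threshold=10.0, min_idle_nodes=3):
    """Return (alert, idle_nodes).

    alert is True when at least `min_idle_nodes` nodes report CPU
    utilization below `idle_threshold` percent.
    """
    idle = [name for name, pct in node_cpu_percent.items() if pct < idle_threshold]
    return len(idle) >= min_idle_nodes, idle

# Hypothetical per-node CPU utilization, in percent.
utilization = {"n1": 5, "n2": 8, "n3": 45, "n4": 9.5}
alert, idle_nodes = should_scale_down(utilization)
print(alert, idle_nodes)
```

Here three of the four nodes are below 10%, so the check fires and lists `n1`, `n2`, and `n4` as candidates for removal.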

Below are a few of the important metrics you can track with Site24x7:

  • CPU and memory utilization of application pods: This is the total usage for all containers running in the pod; any container using much less or more than what has been requested can be a potential target for optimization.
  • CPU and memory utilization of nodes: Keeping a check on the CPU and memory utilization of your nodes is critical. If most workloads land on a few nodes while the rest sit nearly empty, you can remove the extra nodes and save costs.
  • Allocation status of Kubernetes nodes: If you have multiple nodes with less than 40% to 50% of their resources allocated, you can very easily optimize spend by scaling down your cluster.
  • CPU and memory requests and limits: Site24x7 captures all pod requests and limits for both CPU and memory. It then presents these in the dashboard for you to take corrective action in the case of misconfiguration.
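Allocation status in the list above can be read as the share of a node's allocatable capacity that pod requests have already claimed. A minimal sketch of that calculation, assuming you have the requested and allocatable CPU per node in millicores (the names and the 40% threshold are illustrative, not taken from any monitoring tool):

```python
def allocation_percent(requested, allocatable):
    """Share of a node's allocatable capacity claimed by pod requests."""
    return 100 * requested / allocatable

def underallocated(nodes, threshold=40.0):
    """Return names of nodes whose allocation is below `threshold` percent.

    `nodes` maps node name -> (requested_millicores, allocatable_millicores).
    """
    return [
        name for name, (requested, allocatable) in nodes.items()
        if allocation_percent(requested, allocatable) < threshold
    ]

# Hypothetical cluster with 4-core nodes.
cluster = {"n1": (1000, 4000), "n2": (3000, 4000), "n3": (1500, 4000)}
print(underallocated(cluster))
```

In this example `n1` (25%) and `n3` (37.5%) fall under the threshold, so their pods could likely be consolidated onto fewer nodes.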


In this era of cloud computing, implementing a tool such as Kubernetes can be fairly straightforward, but you may encounter issues at scale. If you do not optimize your clusters for cost and performance, you may end up increasing the cost of your infrastructure simply to keep your applications running. Organizations must adopt smart autoscaling and understand their application requirements from the start.

As the saying goes, you cannot optimize anything you cannot measure. Continuously monitoring memory and CPU utilization is the key to rightsizing your Kubernetes workloads for optimal performance.

Sign up for a demo of Site24x7 today to learn how we can help you keep an eye on component health and performance across your entire Kubernetes infrastructure, all from a single intuitive console.

