Kubernetes resource requests, resource limits, and LimitRanges
Resource requests
In Kubernetes, resource requests specify the amount of resources, such as CPU and memory, that a container needs in order to function. When you specify resource requests for the containers in a Pod, the kube-scheduler uses this information to decide which node to place the Pod on.
Another way to think of resource requests is that they represent the minimum amount of resources a container needs to run. If a container does not get its requested resources, it may not run properly, or it may be evicted by Kubernetes under resource pressure.
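As a quick illustration, requests are declared per container under resources.requests in the Pod spec. This is a minimal sketch with a hypothetical Pod name; a complete example that adds limits follows later in this article.
apiVersion: v1
kind: Pod
metadata:
  name: requests-demo          # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx:latest
    resources:
      requests:
        cpu: "100m"            # scheduler places the Pod only on a node with 100m CPU not yet requested by other Pods
        memory: "128Mi"        # and with at least 128 MiB of unreserved memory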
Resource limits
Resource limits set the maximum CPU and memory a container may consume on a node. This acts as a safeguard that prevents one container from consuming excessive resources and degrading the performance of other containers on that node.
Kubernetes enforces these limits by throttling containers that exceed their CPU limit and by terminating (OOM-killing) containers that exceed their memory limit. Setting resource requests and limits provides Kubernetes with the information necessary for intelligent scheduling decisions and effective resource management. This contributes to efficient resource utilization while keeping performance predictable.
In a nutshell, kube-scheduler uses resource requests to decide which node to place the Pod on. If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a container to use more resources than its request for that resource specifies. However, a container is not allowed to use more than its resource limit.
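You can see what the scheduler compares these requests against by inspecting a node: the Allocatable section lists the node's schedulable capacity, and the Allocated resources section shows how much of it is already requested by running Pods. Replace the placeholder with one of the names returned by kubectl get nodes.
kubectl get nodes
kubectl describe node <node-name>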
Limit ranges (LimitRanges)
Limit ranges are used to constrain the resource allocations (limits and requests) that you can specify for each applicable object kind (such as Pod or PersistentVolumeClaim) in a namespace (a virtual cluster of Pods in a physical cluster). In the Kubernetes documentation, you will often find the spelling "LimitRange," which is the correct resource kind.
A LimitRange provides constraints that can:
- Enforce minimum and maximum compute resource usage per Pod or container in a namespace.
- Enforce minimum and maximum storage requests per PersistentVolumeClaim in a namespace.
- Enforce a ratio between request and limit for a resource in a namespace (this and the storage constraint are illustrated in the sketch after this list).
- Set default requests/limits for compute resources in a namespace and automatically inject them into containers at runtime.
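The compute-focused example later in this article covers the first and last constraints. As a hedged sketch of the storage and ratio constraints, a LimitRange along these lines could look as follows (the name and values are hypothetical):
apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "storage-and-ratio-limits"
spec:
  limits:
  - type: "Container"
    maxLimitRequestRatio:
      cpu: "4"                 # a container's CPU limit may be at most 4x its CPU request
  - type: "PersistentVolumeClaim"
    min:
      storage: "1Gi"           # reject PVCs requesting less than 1 GiB
    max:
      storage: "10Gi"          # reject PVCs requesting more than 10 GiB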
If you attempt to create or update an object (Pod or PersistentVolumeClaim) that violates a LimitRange constraint, your request to the API server will fail with an HTTP status code of 403 Forbidden and a message explaining the constraint that has been violated.
Setting resource requests, limits, and limit ranges
Kubernetes measures CPU resources not in percentages but in millicores, also written millicpu; the two terms are used interchangeably. Memory, on the other hand, is measured in bytes.
A millicpu is one one-thousandth (1/1000) of a CPU core. In Kubernetes, CPU is measured in CPU units, where 1 CPU = 1 vCPU/core (for cloud providers) = 1 hyperthread (for bare-metal processors). Therefore, 1 millicpu equals 0.001 CPU, or 1/1000th of a CPU.
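To make these units concrete, the following resources stanza shows a few equivalent notations (the values are arbitrary and for illustration only):
resources:
  requests:
    cpu: "250m"        # 250 millicpu = 0.25 CPU core
    memory: "128Mi"    # 128 MiB = 128 x 1024 x 1024 = 134,217,728 bytes
  limits:
    cpu: "1"           # 1 full CPU core = 1000m
    memory: "1Gi"      # 1 GiB = 1,073,741,824 bytes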
Here's an example of how resource requests and limits are set in a Kubernetes manifest (YAML). Below is the YAML configuration to deploy the Nginx container in the my-app Pod.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: nginx-container
    image: nginx:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
      limits:
        memory: "512Mi"
        cpu: "200m"
In this example, the container requests a minimum of 100 millicpu and 256 MiB of memory. The container is limited to using a maximum of 200 millicpu and 512 MiB of memory.
Create the my-app Pod by applying the configuration using the command below:
kubectl apply -f app-deployment.yaml
You can then list the Pods to verify that your Pod was created:
kubectl get pods
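The output should look similar to the following (status and age will vary):
NAME     READY   STATUS    RESTARTS   AGE
my-app   1/1     Running   0          15s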
With the help of the describe command, you can verify that the resource limit and request were set correctly.
kubectl describe pod my-app
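Alternatively, if you only want the resources stanza rather than the full description, a JSONPath query extracts it directly (a sketch that assumes the nginx container is the first container in the Pod):
kubectl get pod my-app -o jsonpath='{.spec.containers[0].resources}'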
Kubernetes uses these settings to schedule Pods appropriately and to ensure that containers operate within their resource boundaries. To optimize performance in real-world conditions, these settings must be tailored to your application requirements and regularly monitored to achieve an ideal balance between resource utilization and stability.
The YAML file below defines a LimitRange.
apiVersion: "v1" kind: "LimitRange" metadata: name: "resource-limits" spec: limits: - type: "Pod" max: cpu: "500m" memory: "750Mi" min: cpu: "10m" memory: "5Mi" - type: "Container" max: cpu: "500m" memory: "750Mi" min: cpu: "10m" memory: "5Mi" default: cpu: "100m" memory: "100Mi"
This YAML constrains each Pod and each container in the namespace to between 10m and 500m of CPU and between 5Mi and 750Mi of memory. It also defines default limits (100m CPU, 100Mi memory) for containers: if a container doesn't specify its own limits, these defaults are injected automatically, and because an unspecified request defaults to the limit, they effectively set the requests as well.
To apply LimitRange to your namespace, run this command:
kubectl apply -f limit-range.yaml
Using the describe command, you can check the LimitRange details.
kubectl describe limitrange resource-limits
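To see the defaults in action, you could create a Pod without a resources section in that namespace and read back what was injected (a sketch; the Pod name is hypothetical). Because the LimitRange above defines only default limits, the requests default to the same values:
kubectl run defaults-demo --image=nginx:latest
kubectl get pod defaults-demo -o jsonpath='{.spec.containers[0].resources}'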
Best practices
Monitor your Pods' resource utilization continuously using Kubernetes monitoring tools or third-party solutions. Adjust the LimitRange definitions and your Pod specifications based on real-world performance and needs.
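For a quick point-in-time check, kubectl top reports actual usage that you can compare against the configured requests and limits (this requires the metrics-server add-on in your cluster):
kubectl top pod my-app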
Setting values too high results in unnecessary spending and waste, while setting them too low causes poor performance or crashes. A safe approach is to start with somewhat more generous allocations than you expect to need, monitor actual usage, and adjust downward as necessary.
When a container's CPU usage exceeds its limit, the container is throttled, which can hinder performance. When its memory usage exceeds its limit, processes inside the container are killed to free up memory (an OOM kill), which can prevent your application from functioning properly.
If your workload experiences short CPU spikes and peak performance is not critical, setting its CPU limit slightly below what was observed during those spikes is sufficient; the container is merely throttled for the duration of the spike. Memory limits, by contrast, should accommodate all spikes: otherwise, your workload may be OOM-killed mid-operation, leaving operations unfinished and user requests incomplete.
Setting memory limits that match memory requests is strongly advised. If limits exceed requests, the node's memory can become overcommitted, risking OOM kills of individual containers or even destabilizing the node itself.
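As a side effect, when requests equal limits for both CPU and memory in every container, Kubernetes assigns the Pod the Guaranteed QoS class, making it the last candidate for eviction under node pressure. You can check the assigned class with the query below (the my-app Pod above would report Burstable, since its requests and limits differ):
kubectl get pod my-app -o jsonpath='{.status.qosClass}'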
For CPU, a good rule of thumb is to begin new workloads with generous resource settings, then closely monitor metrics and tighten requests and limits to ensure cost efficiency.
Remember that the exact values for requests and limits and the LimitRange definitions should be tailored to your application's requirements and the cluster's available resources. Regular monitoring and adjustments are vital to maintaining optimal resource utilization.
By following the steps mentioned above, you will ensure that resource requests and limits are set appropriately for your Pods and limit ranges are in place to prevent resource overconsumption. This approach contributes to a more stable and efficient Kubernetes cluster.
Conclusion
In this article, you learned the basics of configuring Kubernetes resource requests, resource limits, and limit ranges. Note that I discussed only the most common resource types. In addition to CPU and memory, Kubernetes allows you to specify other resources, such as GPU, ephemeral-storage (temporary storage), and hugepages (blocks of memory much larger than the default page size).