A Kubernetes cluster usually contains multiple worker nodes to run workloads. Taints and tolerations work hand-in-hand, allowing admins to control the placement of Pods on the nodes. The kube-scheduler is a control-plane component that decides which Pod goes to which node in a cluster. The taints on the nodes and the tolerations on the Pods are taken into account when the kube-scheduler assigns the Pods to nodes.
A taint comprises a key, a value, and a taint effect.
The key is a user-defined string that can be used to categorize or identify a taint. For example, if you wanted to taint nodes that are only for use with a specific application, you might choose a key such as app.
The value is also a user-defined string and provides more specific information about the taint. Continuing with the above example, if your application name is myapp, the value for your taint might be myapp, so the key-value pair for the taint becomes app=myapp, which can be understood as "this node is for myapp use only." Like keys, values are also case-sensitive.
The taint effect can be one of the following:
- NoSchedule: The Pods without a matching toleration will not be scheduled on the node.
- PreferNoSchedule: The kube-scheduler will try to avoid scheduling Pods without a matching toleration on the node but may still do so if necessary.
- NoExecute: The existing Pods on the node without a matching toleration for that taint will be evicted (terminated).
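To make the syntax concrete, a taint is applied with kubectl taint nodes &lt;node&gt; key=value:effect. As a quick sketch using the app=myapp example above (the node name worker-node1 is hypothetical), the same key-value pair can be combined with any of the three effects:

```
# Hypothetical node name; choose the effect that matches your intent
kubectl taint nodes worker-node1 app=myapp:NoSchedule        # block new Pods without a matching toleration
kubectl taint nodes worker-node1 app=myapp:PreferNoSchedule  # soft preference; scheduling may still occur
kubectl taint nodes worker-node1 app=myapp:NoExecute         # additionally evict running Pods
```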
Tolerations in Kubernetes are applied to Pods, and they allow (but do not require) the Pods to be scheduled onto nodes with matching taints. They can also allow Pods to remain running on nodes, even when a taint is added to the node.
Tolerations have components corresponding to taints: key, value, and effect. Unlike taints, however, tolerations also have an operator, which can be either Equal or Exists. If the operator is Equal, the toleration matches a taint if both the key and the value are the same. If the operator is Exists, the toleration matches any taint with the same key, regardless of its value.
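To make the two operators concrete, here is a sketch of both toleration forms for the app=myapp taint from the earlier example (the names are illustrative):

```
tolerations:
# Equal: key and value must both match the taint
- key: app
  operator: Equal
  value: myapp
  effect: NoSchedule
# Exists: matches any taint with this key, regardless of its value
- key: app
  operator: Exists
  effect: NoSchedule
```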
Toleration effects are identical to taint effects: NoSchedule, PreferNoSchedule, and NoExecute.
For a toleration to have any impact, its effect must also match the taint's effect. If a Pod has a toleration whose key and value match a taint on a node but whose effect does not, the Pod is treated as if it does not tolerate the taint.
For example, if a Pod has a toleration with the effect NoSchedule, but the taint on the node has the effect NoExecute, the Pod does not tolerate the taint and will be evicted from the node (assuming it was running there already) or will not be scheduled on it.
Configuring taints on nodes
I am running a three-node cluster, where kube-srv1 is the master node running the control plane and kube-srv2 and kube-srv3 are the worker nodes, as you can see in the screenshot below.
To learn how to taint a node, let's look at an example. Suppose you want to reserve a particular worker node (kube-srv3, in our case) for running development Pods only. Perhaps it has older hardware or low memory, so we want to avoid running production Pods on the kube-srv3 node by setting a taint on it. To taint a node, use the kubectl taint nodes &lt;node-name&gt; key=value:effect command, as shown below:
kubectl taint nodes kube-srv3.testlab.local env=dev:NoSchedule
As soon as you run this command, you will see a message indicating that the node is tainted. Here, we used env as the key, dev as the value, and NoSchedule as the taint effect. The taint effect indicates what happens to a Pod when it does not tolerate the taint. Since we used the NoSchedule taint effect, it means the new Pods will not be scheduled on the node, but the existing Pods (if any) will continue to run on the node. If we use the NoExecute taint effect instead, the running Pods will be evicted from the node if they do not tolerate the taint.
To see the taints configured on a node, run the kubectl describe node command, as shown below:
kubectl describe node kube-srv3.testlab.local
If you are curious why your Pods are never placed on the master node by default, run the kubectl describe node command and take a look at the taints set on it:
kubectl describe node kube-srv1.testlab.local | grep Taints
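On a kubeadm-based cluster, the output typically looks like the line below; the exact taint key depends on the Kubernetes version (recent releases use node-role.kubernetes.io/control-plane, while older ones used node-role.kubernetes.io/master):

```
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
```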
Notice that the master node is by default tainted with a NoSchedule effect. That is why the kube-scheduler does not schedule the user-defined Pods on the master node. Depending on how your Kubernetes cluster is set up, you might not see this taint, particularly if you used tools like minikube or something similar to run a single-node cluster. However, in a production-ready Kubernetes cluster, the master node is tainted to prevent workloads from running on it. This ensures that the master node is used solely for managing the Kubernetes cluster and that the worker nodes are used for running workloads. If you want to schedule Pods on the master node anyway, you can configure a matching toleration on Pods or remove the default taint from the master node, but doing so is not recommended.
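If you do decide to open up the master node in a lab environment, the default taint can be removed with the minus-sign syntax. This is a sketch only; adjust the taint key to whatever kubectl describe node reported for your version, and keep in mind that this is not recommended in production:

```
# Not recommended in production; the taint key may differ by Kubernetes version
kubectl taint nodes kube-srv1.testlab.local node-role.kubernetes.io/control-plane:NoSchedule-
```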
Configure tolerations on Pods
You can't directly add a toleration to a Pod using a kubectl command like the kubectl taint command. Instead, you define tolerations in the configuration file of a Pod or Deployment. To configure tolerations on a Pod, create a Pod configuration file, as shown below:
apiVersion: v1
kind: Pod
metadata:
  name: webapp-pod
  labels:
    app: webapp
spec:
  containers:
  - name: nginx
    image: nginx
  tolerations:
  - key: env
    value: dev
    operator: Equal
    effect: NoSchedule
As shown above, we added the tolerations under the spec section. Note that the tolerations field is an array, so you can define multiple tolerations. You need to carefully add the same settings as those of the taint. In our case, the taint was configured with key: env, value: dev, and effect: NoSchedule. The operator: Equal requires the toleration's key and value to exactly match the taint's (i.e., env=dev). If you only need to check for the existence of a taint on a node, you can omit the value field and use operator: Exists instead. If you're working with a Deployment, the tolerations need to be added under the template.spec section.
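For a Deployment, the same toleration would sit under template.spec, as sketched below (the Deployment name and replica count are illustrative):

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: nginx
        image: nginx
      tolerations:           # tolerations live under template.spec, not spec
      - key: env
        operator: Equal
        value: dev
        effect: NoSchedule
```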
Now, let's create a Pod using the configuration file we created above.
kubectl apply -f webapp-pod.yaml
With the next command, you can view your pods:
kubectl get pods -o wide
You can see that the new Pod is now running on the kube-srv3 node. This is because I have set the other worker node (kube-srv2) with a different taint to avoid running the development Pods on it. It's important to use taints and tolerations carefully and ensure that your cluster has enough nodes for scheduling the required workloads. If all the nodes in your cluster are tainted, and you forget to add tolerations in the Pod configuration, your Pods will be stuck in the Pending state. In that case, it is worth checking the events section by running the kubectl describe pod command.
This will tell you that Pod scheduling failed because of an untolerated taint.
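Here is a sketch of what to look for; the exact wording of the event message varies by Kubernetes version:

```
kubectl describe pod webapp-pod
# The Events section will contain a FailedScheduling message along the lines of:
#   0/3 nodes are available: 3 node(s) had untolerated taint ...
```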
To remove the taint from the node, you need to use the same command that you used to set the taint, followed by a minus sign (-), as shown below:
kubectl taint nodes kube-srv3.testlab.local env=dev:NoSchedule-
I hope this post helps you understand how taints and tolerations work together to influence Pod placement on nodes. Remember, however, that they do not guarantee placement of a Pod on a particular node. There might be situations when you want your Pods placed on one specific node (perhaps due to hardware or software requirements), and by default, nothing stops your Pods from being placed on any untainted node. In such cases, you can use node selectors and node affinity to ensure that your Pods land on the desired node. Sometimes, you might also need to combine taints and tolerations with node affinity to get the desired results in your environment.
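As a closing sketch, a toleration combined with a nodeSelector pins a Pod to a labeled node, while the taint keeps untolerated Pods away. The label disktype=ssd below is illustrative; you would first label the node with kubectl label nodes:

```
spec:
  nodeSelector:
    disktype: ssd          # only schedule on nodes carrying this label
  tolerations:             # still required if the target node is tainted
  - key: env
    operator: Equal
    value: dev
    effect: NoSchedule
```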