Why persistent volumes?
By their nature, containers are temporary and meant to be disposable. A container image makes it quick to spin up new applications and new versions, but this creates an issue for critical data. If the container is ephemeral, how do you persist the data?
A database container is a classic example. Without a persistent volume, all data would be lost every time the container restarts, which would be unacceptable for critical applications. With persistent volumes, the data persists across container restarts and updates, making containers a viable option for hosting databases and other data-driven applications.
You can see persistent volume configuration settings in the respective Helm charts (packages of preconfigured Kubernetes resources). For example, the chart for a MySQL pod typically exposes its data persistence settings in a YAML configuration file.
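As a rough illustration of what such persistence settings can look like, here is a hypothetical excerpt from a chart's values file. The key names are assumptions modeled on a layout that some community MySQL charts use. They differ from chart to chart, so check the values.yaml of the chart you actually install.

```yaml
# Hypothetical values.yaml excerpt; key names vary between charts.
primary:
  persistence:
    enabled: true        # request a persistent volume for the data directory
    storageClass: ""     # empty string here is assumed to mean "use the cluster default"
    accessModes:
      - ReadWriteOnce
    size: 8Gi            # size requested by the generated PersistentVolumeClaim
```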
Creating and using persistent volumes
Now that we know what persistent volumes are and why they're essential, let's see how to create and use them in a Kubernetes cluster.
There are two primary ways to provide a persistent volume in Kubernetes:
- Static provisioning: An administrator creates the storage resources and defines the persistent volumes manually.
- Dynamic provisioning: Kubernetes automatically provisions the storage and creates a persistent volume resource, based on a StorageClass, when a claim requests it.
Storage classes allow administrators to describe the different classes of storage based on factors such as quality-of-service levels, backup policies, and other custom policies. Marking one storage class as the default ensures that dynamic provisioning also works for claims that do not name a class.
With a default storage class, applications that create a PersistentVolumeClaim can have a persistent volume provisioned and bound automatically.
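A minimal sketch of a default storage class follows. The class name and the provisioner are hypothetical placeholders, since the real provisioner depends on the storage driver installed in your cluster. The annotation is the one Kubernetes uses to mark a class as the default.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-example                 # hypothetical class name
  annotations:
    # Marks this class as the cluster default for claims that name no class
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.example.com             # hypothetical; use your driver's provisioner name
```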
How do you claim a persistent volume?
Persistent volumes are used by first creating a PersistentVolumeClaim. A PersistentVolumeClaim is a request for storage that specifies storage requirements, such as size and access modes. Once created, Kubernetes will try to bind the PersistentVolumeClaim to an available persistent volume that matches the requirements.
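A claim along these lines might look like the following sketch. The claim name mysql-data is a hypothetical example, and the size and access mode are illustrative values.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data            # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce           # mounted read-write by a single node
  resources:
    requests:
      storage: 10Gi           # minimum capacity the bound volume must offer
  # storageClassName is omitted, so the cluster's default class (if any) is used
```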
To see the persistent volume claims, you can use the following command:
kubectl get pvc
Using a persistent volume claim in a pod
After a PersistentVolumeClaim is bound to a persistent volume, it can be used by a pod. To do so, include the PersistentVolumeClaim name in the volumes section of your pod's YAML definition, and use the volumeMounts field to specify the mount path inside the container. The claim itself needs a persistent volume to bind to; below is a PersistentVolume YAML config that persists data in a local hostPath folder.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv01
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /data/pv01/
To list the persistent volumes in the cluster, along with their status and the claims they are bound to, you can use the following command:
kubectl get pv
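On the pod side, referencing a claim by name and mounting it could look like the sketch below. The pod name, the claim name mysql-data, and the placeholder password are assumptions for illustration. In practice, the password would come from a Secret rather than a literal value.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-example               # hypothetical pod name
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: change-me          # placeholder only; use a Secret in real deployments
      volumeMounts:
        - name: data                # must match the volume name below
          mountPath: /var/lib/mysql # where the container sees the persistent data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mysql-data       # hypothetical; name of an existing claim in the same namespace
```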
Managing the lifecycle of persistent volumes
Kubernetes offers various options for managing the lifecycle of persistent volumes, from creation to deletion.
Persistent volume reclaim policy
When a PersistentVolumeClaim is no longer needed, you can delete it. Once the claim is deleted, the persistent volume it was bound to enters the released phase, and Kubernetes will not bind it to another claim. However, this behavior is configurable. You can set a persistentVolumeReclaimPolicy in the persistent volume definition. There are three possible policies:
- Retain: The persistent volume remains in the released phase, and an administrator must manually reclaim it. This is the default for manually created persistent volumes.
- Recycle: This setting is now deprecated. With this setting, the persistent volume's data is scrubbed before the volume is made available again.
- Delete: The persistent volume is automatically deleted, and the underlying physical storage is released. This is the default for dynamically provisioned volumes, which inherit the policy from their StorageClass.
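As a sketch of where the policy is set, the hostPath volume from earlier could declare it explicitly like this. Choosing Retain here is only an example.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv01
spec:
  persistentVolumeReclaimPolicy: Retain   # keep the volume and its data after the claim is deleted
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /data/pv01/
```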
Expanding a persistent volume
A PersistentVolumeClaim can be edited to request a larger size. However, its StorageClass must have the allowVolumeExpansion field set to true.
Resizing a PersistentVolumeClaim
To resize a PersistentVolumeClaim, edit the claim and raise its storage request; shrinking is not supported. The pod definition does not change, because it refers to the claim only by name. Not all volume types can be resized, and some require restarting the pods that use the claim before the extra space becomes available, which may mean downtime for the application.
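The two pieces involved in an expansion can be sketched as follows. The class name, provisioner, and claim name are hypothetical, and whether expansion actually succeeds depends on the storage driver behind the class.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-example          # hypothetical class name
provisioner: csi.example.com        # hypothetical; use your driver's provisioner name
allowVolumeExpansion: true          # permits claims of this class to grow
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data                  # hypothetical claim name
spec:
  storageClassName: expandable-example
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi                 # raised from an assumed original 10Gi to request expansion
```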
Backing up and restoring persistent volumes
Kubernetes does not include a native solution for backing up persistent volume data. You can use a third-party tool instead, and many of them are free and open source.
Backing up your persistent volumes is crucial for disaster recovery. You can use tools like Velero, a popular open-source solution, to create backups of your persistent volumes and PersistentVolumeClaims. With Velero, you can restore your data to a new cluster or to the same cluster after a disaster.
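If Velero is installed in the cluster, a recurring backup could be declared with its Schedule resource, roughly as sketched below. The schedule name, the databases namespace, the cron expression, and the retention period are all assumptions. Volume snapshots additionally require a snapshot-capable storage setup configured in Velero.

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-db-backup        # hypothetical schedule name
  namespace: velero            # assumes Velero runs in its usual namespace
spec:
  schedule: "0 2 * * *"        # cron syntax: every day at 02:00
  template:
    includedNamespaces:
      - databases              # hypothetical namespace holding the stateful workloads
    snapshotVolumes: true      # also snapshot the persistent volumes
    ttl: 720h0m0s              # keep each backup for 30 days
```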
Best practices for working with persistent volumes
Note the following best practices around persistent volumes in Kubernetes:
- Use StorageClasses: Storage classes remove the heavy lifting from provisioning PVs. With the dynamic provisioning that storage classes provide, admins can set a default storage class and avoid manually configuring persistent storage for each pod that requires it.
- Choose the proper access modes: Persistent volumes and PersistentVolumeClaims support several access modes, such as ReadWriteOnce, ReadOnlyMany, and ReadWriteMany. The access mode you choose affects the security and performance of your persistent storage.
- Monitor persistent storage: Monitor your persistent storage to ensure it is used correctly and efficiently. Open-source tools such as Prometheus can help you track storage usage, and the Kubernetes dashboard or Grafana dashboards give you visibility into persistent volumes and PersistentVolumeClaims.
- Back up your data: Persistent volumes hold critical data like any other storage and need to be protected by a backup solution. Velero is one open-source option for automating persistent volume backups.
Wrapping up
Persistent volumes in Kubernetes are integral to hosting business-critical applications in self-hosted Kubernetes clusters. They allow ephemeral containers to persist critical data, so your data isn't deleted along with the pod when it is restarted or updated. In addition, storage classes and dynamic provisioning allow admins to automate the process of providing persistent volumes to stateful applications.