Persistent volumes allow administrators to configure persistent data locations for stateful applications. They allow containerized applications to store data beyond the lifecycle of individual containers or pods. This makes it possible to retain data even after a pod restart or update. With Kubernetes, persistent volumes are storage locations provisioned by an administrator or dynamically created using StorageClasses.

Why persistent volumes?

By their nature, containers are temporary and meant to be disposable. A container image makes it quick to spin up new applications and new versions, but any data written inside the container is lost when the container is replaced. This creates an issue for critical data. If the container is ephemeral, how do you persist the data?

A database container is a classic example. Without a persistent volume, all data would be lost every time the container restarts, which would be unacceptable for critical applications. Instead, the data can persist across container restarts and updates by using persistent volumes, making containers a viable option for hosting databases and other data-driven applications.

You can see persistent volume configuration settings in many Helm charts (packages of preconfigured Kubernetes resources). For example, below is a view of the data persistence settings in the YAML configuration file for a MySQL pod.

Helm chart for MySQL showing data persistence configuration settings

Creating and using persistent volumes

Now that we know what persistent volumes are and why they're essential, let's see how to create and use them in a Kubernetes cluster.

There are two primary ways to provide a persistent volume in Kubernetes:

  • Static provisioning: An administrator creates the underlying storage and manually defines the persistent volumes that point to it.
  • Dynamic provisioning: Kubernetes automatically provisions the storage and creates the persistent volume on demand, based on the StorageClass that the claim requests.

Storage classes allow administrators to describe the different classes of storage they offer, based on factors such as quality-of-service levels, backup policies, and other custom policies. An administrator can also mark one storage class as the default, so that claims that do not name a class are still provisioned dynamically.

With a default storage class, applications that request a persistent volume claim can automatically be bound to a persistent volume.
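
As a rough illustration of the idea above, a default StorageClass could be declared along these lines. This is a minimal sketch, not a definitive configuration: the class name and the provisioner value are placeholders, and the right provisioner depends on the storage driver installed in your cluster.

```yaml
# Sketch of a default StorageClass. The name and provisioner are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-example                 # hypothetical name
  annotations:
    # Marks this class as the cluster default for claims that name no class
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.example.com             # placeholder; use your storage driver's name
reclaimPolicy: Delete                    # inherited by volumes this class creates
allowVolumeExpansion: true               # lets bound claims be resized later
volumeBindingMode: WaitForFirstConsumer  # provision only when a pod uses the claim
```

With a class like this in place, a claim that omits storageClassName would be served by this class.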

How do you claim a persistent volume?

Persistent volumes are used by first creating a PersistentVolumeClaim. A PersistentVolumeClaim is a request for storage that specifies storage requirements, such as size and access modes. Once created, Kubernetes will try to bind the PersistentVolumeClaim to an available persistent volume that matches the requirements.
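
A claim of this kind might look like the following sketch. The name pvc01 and the 5Gi request are assumptions made for illustration. The empty storageClassName asks Kubernetes to bind only to a statically provisioned volume that has no class; leaving the field out would use the cluster's default storage class instead.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc01                # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce          # read-write mount by a single node
  resources:
    requests:
      storage: 5Gi           # minimum capacity the bound volume must offer
  storageClassName: ""       # empty string: bind only to volumes with no class
```

Note that a claim binds to a whole persistent volume, so a 5Gi claim bound to a 10Gi volume reserves the entire volume.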

To see the persistent volume claims, you can use the following command:

kubectl get pvc

Persistent volume claims configured

Using a persistent volume claim in a pod

After a PersistentVolumeClaim is bound to a persistent volume, it can be used by a pod. To do so, include the PersistentVolumeClaim name in the volumes section of your pod's YAML definition, and use the volumeMounts field to specify the mount path inside the container. The claim itself needs a persistent volume to bind to. Below is a PersistentVolume YAML config that persists data in a local hostPath folder, an approach suited to single-node test clusters.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv01
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /data/pv01/
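
The pod side of this setup can be sketched as follows. It assumes a bound claim named pvc01 already exists; that claim name, the pod name, the volume name, and the choice of the nginx image with its web root as the mount path are all illustrative assumptions rather than anything prescribed by the article.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pv-demo                            # hypothetical pod name
spec:
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - name: data                       # must match a volume name below
          mountPath: /usr/share/nginx/html # where the volume appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc01                   # hypothetical claim, assumed to exist
```

The pod refers to the claim only by name, so the same definition keeps working if the claim is later bound to different storage.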

To list the persistent volumes in the cluster, along with their status and the claims they are bound to, you can use the following command:

kubectl get pv

Persistent volumes created from persistent volume claims

Managing the lifecycle of persistent volumes

Kubernetes offers various options for managing the lifecycle of persistent volumes, from creation to deletion.

Persistent volume reclaim policy

When a PersistentVolumeClaim is no longer needed, you can delete it. Once the claim is deleted, the persistent volume it was bound to enters the Released phase, and Kubernetes will not bind it to a new claim while the previous claimant's data remains on it. What happens to a released volume is configurable: you can set a persistentVolumeReclaimPolicy in the persistent volume definition. There are three possible policies:

  • Retain: The persistent volume and its data remain in the Released phase until an administrator reclaims the volume manually. This is the default for manually created persistent volumes.
  • Recycle: This policy is deprecated. The persistent volume's data is scrubbed before the volume is made available again.
  • Delete: The persistent volume is automatically deleted, and the underlying physical storage is released. Dynamically provisioned volumes inherit the reclaim policy of their StorageClass, which defaults to Delete.

Viewing the configured reclaim policy
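
To make the reclaim policy concrete, here is a sketch of the article's hostPath volume with the policy set explicitly. Only the persistentVolumeReclaimPolicy line is new; Retain is chosen purely as an example value.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv01
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  # Keep the volume and its data after the bound claim is deleted
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/pv01/
```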

Expanding a persistent volume

A PersistentVolumeClaim can be edited to request a larger size. However, the claim's StorageClass must have the allowVolumeExpansion field set to true, and the underlying volume type must support expansion.

Resizing a PersistentVolumeClaim

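Resizing works by editing the PersistentVolumeClaim and raising the requested storage size; shrinking a claim is not supported. The pod definition does not need to change, because pods reference the claim by name rather than by size. Not all volume types can be resized, and some only complete the filesystem resize after the pods using the claim are restarted, which may mean downtime for those applications.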
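
A resize request can be sketched as the same claim re-applied with a larger size. The claim name, class name, and sizes below are hypothetical, and the sketch assumes the named storage class has allowVolumeExpansion set to true.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                       # hypothetical existing claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: expandable-example   # hypothetical class that allows expansion
  resources:
    requests:
      storage: 20Gi                      # raised from a smaller value, such as 10Gi
```

Only the requested storage value changes; the other fields stay as they were when the claim was created.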

Backing up and restoring persistent volumes

Kubernetes does not include a native solution for backing up your persistent volume data. You can use a third-party tool instead, and many free and open-source options exist.

Backing up your persistent volumes is crucial for disaster recovery. You can use tools like Velero, a popular open-source solution, to create backups of your persistent volumes and PersistentVolumeClaims. With Velero, you can restore your data to the same cluster or to a new cluster after a disaster.
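
As a hedged sketch of what such a backup can look like, Velero accepts backup requests as Kubernetes resources. The example below assumes Velero is already installed in the velero namespace with a backup storage location and volume snapshot support configured. The backup name and the databases namespace are hypothetical, and the exact fields available should be checked against the documentation for your Velero version.

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: mysql-backup-example   # hypothetical backup name
  namespace: velero            # namespace where Velero is assumed to run
spec:
  includedNamespaces:
    - databases                # hypothetical namespace holding the stateful app
  snapshotVolumes: true        # also snapshot the persistent volumes in scope
  ttl: 720h0m0s                # keep the backup for 30 days
```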

Best practices for working with persistent volumes

Note the following best practices around persistent volumes in Kubernetes:

  • Use StorageClasses: Storage classes remove the heavy lifting from provisioning persistent volumes. With the dynamic provisioning that storage classes provide, admins can set a default storage class and no longer need to configure persistent storage manually for each pod that requires it.
  • Choose the proper access modes: Persistent volumes and PersistentVolumeClaims support several access modes, such as ReadWriteOnce, ReadOnlyMany, and ReadWriteMany. The access mode determines whether one node or many nodes can mount the volume and whether they can write to it, so it affects both the security and the performance of your persistent storage.
  • Monitor persistent storage: Monitor your persistent storage to ensure it is used correctly and efficiently. Open-source tools such as Prometheus can help you track storage usage, and the Kubernetes dashboard or Grafana dashboards give you visibility into persistent volumes and PersistentVolumeClaims.
  • Back up your data: Persistent volumes are like any other critical data and need to be protected with a backup solution. Velero is an example of an open-source solution that can automate persistent volume backups.
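
The monitoring practice above can be sketched as a Prometheus alerting rule. This assumes Prometheus is scraping the kubelet, which exposes the kubelet_volume_stats metrics used here. The group name, alert name, 10% threshold, and 10-minute window are illustrative choices, not recommendations.

```yaml
groups:
  - name: persistent-volume-example        # hypothetical rule group name
    rules:
      - alert: PersistentVolumeAlmostFull  # hypothetical alert name
        # Fraction of free space on the volume behind each claim
        expr: kubelet_volume_stats_available_bytes / kubelet_volume_stats_capacity_bytes < 0.10
        for: 10m                           # fire only if the condition persists
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} has less than 10% free space"
```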

Wrapping up

Persistent volumes in Kubernetes are integral to hosting business-critical applications in self-hosted Kubernetes clusters. They allow ephemeral containers to persist critical data, so your data isn't deleted along with the pod when it is recreated or updated. In addition, storage classes and dynamic provisioning let admins automate the process of providing persistent volumes to stateful applications.

© 4sysops 2006 - 2023
