So, how do we handle persistent data in our k8s cluster?

Consider the case where you need data in a pod to persist for some reason. What do you do? You create a volume. There are a few levels to choose from;

  • Container storage — Data lives only as long as the container is alive
  • Pod Volume — Usable as long as the pod stays on the same node
  • External Storage Volume — Same as above, but backed by external storage so multiple pods from different nodes can use it
  • Persistent Volume — Usable from any pod on any node, with no hard dependencies

If you are testing something and you can control which node your pod is going to run on, a quick pod volume is the way to go.

In your pod .yml, the yaml below links the /opt path in the container to the /data path on our node. So even if the pod dies, the data within it will be available for reuse, as long as the pod starts with the same volume details and on the same node.
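A minimal sketch of such a pod spec using a hostPath volume (the pod and container names, and the nginx image, are illustrative; the /opt and /data paths are the ones from the text):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-test-pod          # illustrative name
spec:
  containers:
    - name: app
      image: nginx               # any image would do here
      volumeMounts:
        - name: data-volume
          mountPath: /opt        # path inside the container
  volumes:
    - name: data-volume
      hostPath:
        path: /data              # path on the node
        type: DirectoryOrCreate  # create the node directory if missing
```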

Let's check the pod to see if the mount is done right.

Inside the pod, at the mounted folder:
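One way to check, assuming the illustrative pod name volume-test-pod from above:

```shell
# Describe the pod and look at the Mounts / Volumes sections
kubectl describe pod volume-test-pod

# List the mounted folder from inside the pod
kubectl exec -it volume-test-pod -- ls /opt
```

Anything written under /opt in the container should now appear under /data on the node, and vice versa.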

For Windows, I tried this out with minikube, but you have to mount a folder first to use as the volume path. Run the commands below to check the current mounts first.

If you don’t see the Users path mounted, mount it using the command below while in the host cmd. It will mount the Users path of the host to /Users in minikube.
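Something along these lines (the host path is an assumption for a typical Windows setup; note that `minikube mount` keeps running in the foreground while the mount is active):

```shell
# Mount the host's Users folder to /Users inside the minikube VM
minikube mount C:/Users:/Users
```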

Then you can use your directory like so;
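For example, by pointing the hostPath somewhere under the mounted tree (the exact subdirectory here is hypothetical):

```yaml
  volumes:
    - name: data-volume
      hostPath:
        path: /Users/me/k8s-data   # hypothetical folder under the mounted /Users
```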

So, coming back to the main topic: a pod volume is fine if you have a single node or know for certain which node your pod is going to run on, but what if you have multiple nodes? With the above setup you are going to have volumes on all nodes, possibly with different data.

External storage volumes solve this problem. For this we can deploy some form of NFS server;
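As one possibility, a bare-bones NFS export on a Debian/Ubuntu host (the exported path and export options are assumptions; any NFS server will do):

```shell
# Install the NFS server and export a directory
sudo apt-get install -y nfs-kernel-server
sudo mkdir -p /srv/nfs
echo "/srv/nfs *(rw,sync,no_subtree_check)" | sudo tee -a /etc/exports
sudo exportfs -ra
```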

After that, we use it within our pod as below;
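A sketch of a pod mounting the NFS export directly (the server IP and exported path are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-test-pod           # illustrative name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: nfs-volume
          mountPath: /opt      # path inside the container
  volumes:
    - name: nfs-volume
      nfs:
        server: 10.0.0.5       # hypothetical NFS server IP
        path: /srv/nfs         # hypothetical exported path
```

Notice how the pod spec now carries the server IP and the physical path, which is exactly the problem described next.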

So ok, we finally have a durable storage solution, but;

  • We have to know the NFS server details
  • We would be creating a hard dependency on an IP and a physical path
  • We have to remember which physical path is mounted where. Mount mistakes are inevitable

We’re getting there. To solve these new problems we create Persistent Volumes and match them via requests in the form of Persistent Volume Claims. Instead of giving physical locations and mounting directly into the pods themselves, we keep whatever data we want in Persistent Volumes that match the pods’ claims. In an ideal environment, Persistent Volumes would be managed by the sysadmins and would probably have a yaml like the one below.
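A sketch of such a Persistent Volume, named persistent-volume as in the matching example later (the capacity, access mode, server IP, and path are assumptions):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: persistent-volume
  labels:
    size: small                      # optional label to steer matching
spec:
  capacity:
    storage: 1Gi                     # must cover whatever claims request
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # keep data after the claim is deleted
  nfs:
    server: 10.0.0.5                 # hypothetical NFS server IP
    path: /srv/nfs                   # hypothetical exported path
```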

Now, sysadmins can control the presented NFS server, the access and storage modes, and the resources they want to allocate. Also, since claims and volumes have a 1–1 relationship, sysadmins might want to add labels to make sure larger volumes don’t get claimed by small claims. In a case where no volume becomes available, the claim simply stays pending until a matching volume appears.

On the app side we first create a claim,
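A sketch of the claim, requesting the 600Mi used in this example (the claim name is illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-claim        # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 600Mi     # the space we ask for
```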

And then use the claim in our pods like so;
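Something like the following, assuming a claim named app-claim; note the pod now references only the claim, with no NFS details anywhere:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod                  # illustrative name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: persistent-storage
          mountPath: /opt        # path inside the container
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: app-claim     # hypothetical claim name
```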

In the case above, our pod requests 600Mi of space to use as persistent storage and looks for available volumes. The claim matches the volume named persistent-volume and is bound to it. Henceforth, our pod keeps its data there.

What happens when NFS (or whatever storage you use) is down? There are multiple HA solutions for this specific case, and it must be kept in mind when planning a production system.

The question that remains for another time:

  • What happens when a claim is deleted? Will the pod data be recoverable? The volume doesn’t get deleted if the ReclaimPolicy is set to Retain, but how do we attach a new claim back to the old volume?

Just a software everything fighting battles against mostly myself, and gaining small victories lately.