“If you don’t understand recursion then read this again”
This is the fourth part of the five-part series — “From Sandbox to K8S: Deploying a Streamlit based object detection application using Minikube.”
With all the background from the last few articles, we will finally deploy our Streamlit based object detection application in this section. As you know by now, K8S is a container orchestration platform, so we need to package our application into a container before attempting to deploy it. To containerize our application, we can use the following Dockerfile:
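The original article embedded the Dockerfile here. Since only the text survives, below is a minimal sketch of what such a Dockerfile could look like; the base image, the `app.py` entry point and the `requirements.txt` file name are assumptions, and the actual file in the project repository may differ.

```dockerfile
# Sketch of a Dockerfile for the Streamlit object detection app.
# NOTE: base image, app.py and requirements.txt are assumptions.
FROM python:3.7-slim

WORKDIR /app

# Install Python dependencies first so this layer is cached across builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application source code (but NOT the model weights;
# those live in a PersistentVolume, as explained below)
COPY . .

# Streamlit will listen on this port
EXPOSE 7500

ENTRYPOINT ["streamlit", "run", "app.py"]
```

The key design point carried over from the article is that no model weights are copied into the image, which keeps it small and fast to pull.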
To build the image, you need to clone the project repository, which contains the above Dockerfile, and build it from the project’s root directory using the following command:
docker build -t parthasarathysubburaj/k8s-object-detection:v1 .
Please make sure that you replace ‘parthasarathysubburaj’ with the Docker Hub registry name that you created in the last section. Once built, you can push the image to the registry using the command:
docker push parthasarathysubburaj/k8s-object-detection:v1
If you look at the project repository and the Dockerfile, you will notice that the model weights for the SSD and YOLO models (the files that consume the most space) are not copied into the Docker image. This keeps the image light, so it takes up less storage and can be deployed faster. So wouldn’t it be better if we kept the model weights separate from the Docker image and somehow linked them into the container at runtime? That’s where PersistentVolume (PV) objects in K8S come in handy 😅
Managing storage is a crucial part of deploying any application. In Kubernetes, PersistentVolumes (PV) provide an interface for users and cluster administrators to manage storage by abstracting away the details of how storage is provided and consumed in the cluster.
A PersistentVolume is a piece of storage in the cluster that is provisioned by the cluster administrator; it can be treated just like any other resource type such as a Pod, Namespace or Secret. The key point to note here is that the lifecycle of a PV is independent of any individual Pod that uses it: even if the Pod using the PV crashes and is deleted, the volume stays alive and all its contents are preserved, hence it is Persistent.
Any user in the cluster can get access to this volume by making a claim via a PersistentVolumeClaim (PVC) object. A user can request a specific amount of storage to be attached to the Pod being spun up, and it is allocated provided the request is valid and the storage is available in the cluster.
To deploy our application, let’s first create a PVC in the cluster for storing the model weights, and later link it to our container at runtime. Again, a PVC can be created using a YAML file like the one below:
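The manifest itself was embedded in the original article; a sketch consistent with the description that follows (1 GB of storage, read-write access for multiple nodes, and the `partha` namespace used by the later kubectl commands) could look like this. The storage class is left to the cluster default, which on Minikube provisions a hostPath-backed volume.

```yaml
# pvc.yaml — sketch reconstructed from the article's description
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: object-detection-model-weights
  namespace: partha
spec:
  accessModes:
    - ReadWriteMany          # read-write access from multiple nodes
  resources:
    requests:
      storage: 1Gi           # 1 GB for the SSD and YOLO weights
```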
followed by executing:
kubectl create -f pvc.yaml
This creates a PVC named object-detection-model-weights with 1 GB of storage and read-write access for multiple nodes, as specified in the accessModes field of the YAML file.
Now, let's spin up a Pod, attach the PVC to it, and copy our model weights into the PVC. We can reuse the Pod definition file from the last article with a small modification to mount the PVC. The modified YAML file looks like this:
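Again, the embedded manifest did not survive; a sketch assembled from the names used in the kubectl commands below (`ubuntu-base`, `ubuntu-base-container`, the `/workspace` mount path and the `partha` namespace) could look like the following. The `sleep` command is an assumption, added so the container stays alive long enough for us to exec into it.

```yaml
# pod-pvc.yaml — sketch; names taken from the commands later in the article
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-base
  namespace: partha
spec:
  containers:
    - name: ubuntu-base-container
      image: ubuntu:18.04
      # Keep the container running so we can exec in and copy files
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: model-weights
          mountPath: /workspace        # PVC appears here inside the Pod
  volumes:
    - name: model-weights
      persistentVolumeClaim:
        claimName: object-detection-model-weights
```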
And for spinning up the pod, we need to execute:
kubectl create -f pod-pvc.yaml
This creates a Pod named ‘ubuntu-base’ with the base image ‘ubuntu:18.04’ and mounts the PVC ‘object-detection-model-weights’ that we created in the last step, making it available at the directory /workspace within our Pod.
Now let’s download the model weights that we will be using in our deployment. Model weights can be downloaded from here:
Once downloaded, we can move these weights into the PVC attached to our running Pod ‘ubuntu-base’ by following the steps below:
Step 1: Create the directory structure inside the PVC for us to copy the weights into; this can be done by executing:
kubectl exec -it pod/ubuntu-base -n partha -- mkdir -p /workspace/model-weights/ssd
which creates the directory /workspace/model-weights/ssd in our mounted PVC; now let’s create a place for storing the YOLO weights:
kubectl exec -it pod/ubuntu-base -n partha -- mkdir -p /workspace/model-weights/yolo
Step 2: Copy the model weights from your local machine into the PVC by executing:
kubectl cp <path to SSD weights> ubuntu-base:/workspace/model-weights/ssd -c ubuntu-base-container -n partha
kubectl cp <path to YOLO weights> ubuntu-base:/workspace/model-weights/yolo -c ubuntu-base-container -n partha
Once copied, we no longer need the Pod, and it can be deleted by executing:
kubectl delete pod/ubuntu-base -n partha
And this concludes our section on Persistent Volumes.
I guess most developers are big fans of the Twelve-Factor App. The third factor deals with how one should handle an application’s configuration, and it clearly highlights the importance of separating configuration information from the application’s source code. In K8S, ConfigMaps help us achieve exactly this.
ConfigMaps are objects used for storing non-confidential data as key-value pairs, which can be referenced inside a Pod as environment variables, command-line arguments, or configuration files in a volume.
In our deployment, we will use a configMap to store the paths to the model weights used by the application, along with a few Streamlit settings.
A configMap can be created by using the following YAML:
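The embedded manifest is missing here as well; a sketch matching the description below (six entries, two model paths and four Streamlit settings) could look like this. The object name and every key name are illustrative assumptions; the actual repository may use different keys, so adjust them to whatever the application reads.

```yaml
# configmap.yaml — sketch; object name and key names are assumptions
apiVersion: v1
kind: ConfigMap
metadata:
  name: object-detection-config
  namespace: partha
data:
  # Paths to the model weights inside the mounted PVC
  SSD_MODEL_PATH: /workspace/model-weights/ssd
  YOLO_MODEL_PATH: /workspace/model-weights/yolo
  # Streamlit settings (values must be strings in a ConfigMap)
  STREAMLIT_SERVER_PORT: "7500"
  STREAMLIT_SERVER_HEADLESS: "true"
  STREAMLIT_SERVER_ENABLE_CORS: "false"
  STREAMLIT_BROWSER_GATHER_USAGE_STATS: "false"
```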
And for creation, we need to execute:
kubectl create -f configmap.yaml
The above YAML file creates a configMap object with six entries: the first two correspond to the model paths (make sure they point to the paths in your PVC) and the next four correspond to Streamlit settings. We will expose these configurations as environment variables in our Pod for it to consume.
Notice that all the information related to the model weights lives in the PVC and the configMap object. So if we have to change the model weights, say to try out a checkpoint trained with a different set of hyperparameters, all we need to do is modify the PVC and the configMap without touching the Docker image, which lets us deploy new models faster.
By now, we are aware that Pods are the smallest deployable units of computing that one can create and manage in Kubernetes; we have created a couple of Pods in this exercise as well. Though working with Pods is relatively easy, they have their drawbacks. To begin with, Pods cannot scale up or down automatically on demand. For example, imagine you run an e-commerce website: during the festival season the traffic on your website will increase, and you might want more Pods supporting your application to give a better user experience; once the festival season is over, traffic will return to normal and you might want to scale back down. Unfortunately, Pods are not capable of doing this by themselves. They also support neither rolling updates nor rollbacks, which would let us update applications with zero downtime. As a result, bare Pods are not very common in production. And that's where a Deployment object helps us.
A Deployment object creates ReplicaSets, whose primary responsibility is to maintain a stable set of replica Pods running at any given time. Deployments can be seen as a higher-level resource that helps us deploy and update applications declaratively.
In our case, we will use a Deployment object to deploy the Streamlit application. Just like any other object we have created so far, we can use a YAML file to create it.
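The Deployment manifest was embedded in the original post; a sketch consistent with the discussion that follows (one replica, the Streamlit command, port 7500, the configMap exposed as environment variables, and the PVC mounted at /workspace) could look like this. The Deployment name, label, container name and configMap name are illustrative assumptions.

```yaml
# deployment.yaml — sketch; names and labels are assumptions
apiVersion: apps/v1
kind: Deployment
metadata:
  name: object-detection-deployment
  namespace: partha
spec:
  replicas: 1                       # just one replica for the demo
  selector:
    matchLabels:
      app: object-detection
  template:
    metadata:
      labels:
        app: object-detection
    spec:
      containers:
        - name: object-detection-container
          image: parthasarathysubburaj/k8s-object-detection:v1
          command: ["streamlit", "run", "app.py"]
          ports:
            - containerPort: 7500   # Streamlit listens here
          # Expose every key in the configMap as an environment variable
          envFrom:
            - configMapRef:
                name: object-detection-config
          volumeMounts:
            - name: model-weights
              mountPath: /workspace
      volumes:
        - name: model-weights
          persistentVolumeClaim:
            claimName: object-detection-model-weights
```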
And executing the below command will create the deployment:
kubectl create -f deployment.yaml
Though the file looks long, the template is more or less similar to the Pod creation YAML file; the difference is that here we specify the number of replica Pods to be created in the replicas field under the spec section (we are specifying just one replica for the demo). Also, note how the configMap we created is referenced and used inside the Deployment. Since we need to run the Streamlit application, we modify the command field under the container specification accordingly. Our Streamlit app runs on port 7500 as per our command, so we declare it in the container specification under the ports field as shown above; this will enable us to connect to our application via a Service object.
To break the monotony and give you a change of pace, let’s try to learn what Service objects are from a video lecture 😃
In our case, the below YAML file can be used to create the Service object:
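The Service manifest was also embedded in the original; a sketch consistent with the setup above (a NodePort Service forwarding to the Streamlit port 7500, selecting the Deployment's Pods) could look like the following. The Service name, the selector label and the nodePort value are illustrative assumptions.

```yaml
# service.yaml — sketch; name, selector and nodePort are assumptions
apiVersion: v1
kind: Service
metadata:
  name: object-detection-service
  namespace: partha
spec:
  type: NodePort                 # expose the app outside the cluster
  selector:
    app: object-detection        # must match the Deployment's Pod labels
  ports:
    - port: 7500                 # Service port inside the cluster
      targetPort: 7500           # container port Streamlit listens on
      nodePort: 30001            # illustrative; must be in 30000-32767
```

With a NodePort Service, the application becomes reachable at the Minikube node's IP address on the chosen node port.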
And now you can open up your browser and visit the URL. The URL suffix and the port numbers are as per our configMap and Service definition files. You can check your Minikube’s IP address by executing minikube ip in the terminal.
And if you list all the objects in your namespace, you get to see the following:
We can clearly see that the Deployment object has spun up a ReplicaSet, which in turn has created the Pod that runs our application.
And here is a cool demo of our application that’s powered by Minikube 😅
In the last section of this tutorial series, we will look at some more advanced concepts like Ingress, liveness and readiness probes, and performing rolling updates and rollbacks.