Run production-ready lightweight Kubernetes using K3s in a Q Blocks instance
K3s is a production-ready, lightweight Kubernetes distribution that allows easy and scalable container orchestration. Read more on the official K3s GitHub repo.
Once the prerequisites are fulfilled, we can proceed with the K3s setup.
Steps to bring up K3s cluster inside Q Blocks GPU instance:
Make sure nvidia-smi is running inside the container
Install Docker
sudo apt-get update
sudo apt-get install docker.io
Install nvidia-container-toolkit
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/libnvidia-container.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
Set nvidia runtime as default container runtime:
By default, K3s uses its embedded containerd runtime. But for GPUs to work, the default runtime must be nvidia, so we configure the nvidia runtime as follows in the Docker daemon file:
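A minimal `/etc/docker/daemon.json` that makes the nvidia runtime the default could look like this; the `path` value assumes the toolkit package installed `nvidia-container-runtime` to its default location:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

After saving the file, restart Docker with `sudo systemctl restart docker` so the new default runtime takes effect.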
Validate that GPUs are detected by the K3s cluster node (nodes are not namespaced, so no namespace flag is needed):
sudo k3s kubectl describe nodes | grep -i nvidia
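If no nvidia.com/gpu resource shows up, the NVIDIA device plugin DaemonSet may not be deployed yet. One way to deploy it is sketched below; the version tag v0.14.1 is an assumption, so check the NVIDIA/k8s-device-plugin releases page for the current release:

```shell
# Deploy the NVIDIA device plugin DaemonSet so the kubelet can advertise
# nvidia.com/gpu as a schedulable resource (version tag is an assumption).
sudo k3s kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
```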
If the GPU is recognised and the DaemonSet is not throwing an error, it's time to do a test run and make sure a pod can access the GPU. Make sure to run this container only on a node with a GPU.
Make sure the Docker image used for testing has the same or a lower CUDA version than the one supported by the NVIDIA driver in the instance.
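A minimal test pod sketch that requests one GPU is shown below. The image name `k8s.gcr.io/cuda-vector-add:v0.1` is a commonly used CUDA vector-add sample and is an assumption here; substitute any CUDA image that matches your driver version:

```yaml
# Minimal test pod that requests a single GPU and runs a CUDA
# vector-add sample (image name is an assumption, not Q Blocks specific).
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1
```

Save it as cuda-vector-add.yaml, apply it with `sudo k3s kubectl apply -f cuda-vector-add.yaml`, and read the pod output with `sudo k3s kubectl logs cuda-vector-add`.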
Please wait for 5-10 seconds for the pod to load and run. If it ran successfully, it would display a log like this:
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
This confirms the K3s cluster was able to detect the GPU and that pods can run code on GPUs inside the Kubernetes cluster.
If you face any difficulty setting up K3s, please reach out to us at support@qblocks.cloud.