Rancher v2 and a story about re-deploying nodes

I have tried using Rancher v2 with Kubernetes a couple of times now. Every time I ended up with errors like:

  • network plugin is not ready: cni config uninitialized
  • failed to start containers: kubelet
  • transport: authentication handshake failed: remote error: tls: bad certificate
  • crypto/rsa: verification error

All these error messages can mean a lot of different things, but in my case it was because I re-deployed nodes without restarting/cleaning them.

Apparently Rancher does not clean up everything after you delete the node from the web interface, or if it does.. the clean-up sometimes fails. So you end up digging your own grave while trying to deploy your new node over and over again..

To be honest, I thought destroying all containers related to Rancher and Kubernetes should be enough, maybe combined with a docker volume prune once in a while..
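
In case you want to start with that part, here is a minimal sketch of it. It assumes every Rancher/Kubernetes container carries "rancher", "kube" or "k8s" somewhere in its name, which may not match your setup, so double-check the list before force-removing anything:

# force-remove everything that looks Rancher/Kubernetes related
# (the name filter is an assumption -- verify with `docker ps -a` first)
docker ps -a --format '{{.Names}}' \
    | grep -E 'rancher|kube|k8s' \
    | xargs -r docker rm -f

# reclaim volumes no container references anymore
docker volume prune -f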

But for some reason Kubernetes leaves you with active mount points.

So before you cry yourself to sleep like I did.. try manually cleaning your host by searching for unwanted mounts and leftovers:

#!/bin/bash

# Rancher and Kubernetes related mount points
for mount in $( \
    mount \
        | grep tmpfs \
        | grep '/var/lib/kubelet' \
        | awk '{ print $3 }') /var/lib/kubelet /var/lib/rancher; do
    umount "$mount"
done

# configuration files and state that will be re-generated on a new node
rm -rf /etc/ceph \
       /etc/cni \
       /etc/kubernetes \
       /opt/cni \
       /opt/rke \
       /run/secrets/kubernetes.io \
       /run/calico \
       /run/flannel \
       /var/lib/calico \
       /var/lib/etcd \
       /var/lib/cni \
       /var/lib/kubelet \
       /var/lib/rancher/rke/log \
       /var/log/containers \
       /var/log/pods \
       /var/run/calico

# kubeadm will fail if the dir does not exist
mkdir -p /etc/cni/net.d
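
Afterwards, a quick sanity check can't hurt. This is just my own habit, not part of the script above, and the grep pattern assumes the usual kubelet/calico/flannel mount names:

# both commands should come back empty if the clean-up worked
mount | grep -E 'kubelet|calico|flannel'
ls /var/lib/kubelet /var/lib/cni 2>/dev/null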

Make sure you have deleted all related containers first! I hope this will help you like it helped me.. cause Kubernetes is awesome <3