Resolving a race between the kubelet and the host
May 29, 2018 · rancher · kubernetes · iscsi
Pods backed by iSCSI storage fail to mount their volumes after the host reboots. The error in
kubectl describe looks something like this:
```
MountVolume.WaitForAttach failed for volume "mqtt-data" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to attach disk: Error: iscsiadm: Could not login to [iface: default, target: iqn.2006-04.us.monach:nas.mqtt-data, portal: 10.68.0.11,3260]
iscsiadm: initiator reported error (12 - iSCSI driver not found. Please make sure it is loaded, and retry the operation)
iscsiadm: Could not log into all portals
Logging in to [iface: default, target: iqn.2006-04.us.monach:nas.mqtt-data, portal: 10.68.0.11,3260] (multiple) (exit status 12)
```
Note the iSCSI driver not found error. Strange, right? The driver is there:
lsmod shows the iSCSI modules loaded into the kernel. To resolve this, I have to log into the host and stop/disable the
iscsid service on the host itself. This allows the kubelet container to load and execute the drivers directly.
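On a systemd host, that workaround is roughly the following (the unit name iscsid matches my hosts; it can differ by distribution):

```shell
# Stop the host's iSCSI daemon and keep it from starting at boot,
# so the kubelet container can run iscsiadm itself.
sudo systemctl stop iscsid
sudo systemctl disable iscsid
```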
The part that I haven't yet figured out is why the service keeps reactivating. I disable it with
systemctl disable iscsid, and the next time the host boots… it's running again.
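One way to chase this down is to ask systemd, after a reboot, what state the unit is in and what else on the system references iSCSI:

```shell
# Was the unit re-enabled, and is anything else pulling it in?
systemctl is-enabled iscsid
systemctl status iscsid
systemctl list-unit-files | grep -i iscsi
```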
Update 2018/06/14: This is resolved, and the answer is… well… easy. Ubuntu automatically installs and enables
open-iscsi, which I noticed during an installation on a different system. That led me to realize that the service I need to disable is not
iscsid, but in fact
open-iscsi. Disabling that service prevented the restart, and the next time I brought the RKE nodes back up, Kubernetes correctly mounted the iSCSI targets for the pods without my intervention.
Update 2018/06/27: I'm rolling out a new Rancher cluster, and this problem reappeared. It now seems that I have to disable both the
iscsid and open-iscsi services for this to work.
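So the host prep for new nodes now looks roughly like this (unit names taken from the systems above; one or the other may not exist on a given distro, hence the error suppression):

```shell
# Disable both iSCSI-related host services so the kubelet
# container can manage iSCSI sessions itself.
for svc in iscsid open-iscsi; do
  sudo systemctl stop "$svc" 2>/dev/null || true
  sudo systemctl disable "$svc" 2>/dev/null || true
done
```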