Multus is the way to go for Rook Ceph networking
Have you ever wanted to add more than one network interface to your Pods? Need a safer way to connect a legacy application to multiple network VLANs in Kubernetes? Let’s see how we can achieve this for a Rook Ceph cluster.
Normally in Kubernetes, a Pod only has a single interface to communicate on the cluster network. This is, and should be, enough for most applications, but for production Rook Ceph deployments it isn't good enough. Ceph recommends that you preferably use two different networks: a public network, on which clients talk to all the Ceph cluster components, and a second, cluster network. The cluster network, as the name implies, carries certain internal Ceph traffic, to be exact the Ceph OSD data replication traffic.
A long time ago, when kube-proxy was still using iptables (the default at the time) to route traffic to Service IPs, the hostNetwork option was a common way to increase network "throughput". hostNetwork: true does that by exposing the node's network stack to the Pod's containers. At first glance, this might sound great, but it comes with some drawbacks in the security department.
I can still remember jokingly running shutdown in a hostNetwork Pod, and, let's just say, I was thankful I was somehow able to power the server back on through the good old remote management interface (IPMI). So beware of hostNetwork mode: it can weaken the isolation of containers by a lot.
You will end up with more traffic on the cluster network than on the public network for a simple reason: a client only needs to "write" its data once to an OSD, but that OSD then needs to talk to some number of other OSDs to fulfill the replication requirement of the storage pool.
A simplified diagram of this flow of data from a client:
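As a rough plain-text sketch of that flow (assuming a pool with a replication factor of 3):

```
client ──(public network)──> primary OSD
                               │
                               ├──(cluster network)──> replica OSD 2
                               └──(cluster network)──> replica OSD 3
```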
Sidenote: That's one reason why Ceph can appear slower in direct comparisons with other storage projects: a "data write" operation is only acknowledged after it has been fully replicated.
So where does Multus come into play here?
Multus allows you to attach one or more (specific) network interfaces to your Pods, streamlining the whole ordeal of configuring a Pod's network interfaces.
There are still going to be some security implications when you, e.g., attach a node's network interface to a Pod, but at least it is made transparent through Multus' Custom Resource Definitions.
A security team can simply restrict access to these "network definitions" using Kubernetes RBAC. This, in combination with a policy agent (e.g., Open Policy Agent (OPA)), can help enforce certain "network access policies". The same goes for monitoring/auditing: you can simply keep an eye on which network definitions are used by which Pods.
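As a minimal sketch of such an RBAC restriction (the Role name and namespace here are hypothetical), a read-only Role could look like this, with write verbs reserved for a separate admin Role:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: network-definition-viewer  # hypothetical name
  namespace: rook-ceph             # assumed namespace
rules:
  - apiGroups: ["k8s.cni.cncf.io"]
    resources: ["network-attachment-definitions"]
    verbs: ["get", "list", "watch"]  # read-only; no create/update/delete
```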
Let's assume we have a Kubernetes node with two physically connected network interfaces. Let's stick to the "good old" interface naming schema, eth0 and eth1, to keep it simple. 😉
eth0 is used as the "default" interface of the node; we will be using eth0 for the Ceph public network (client traffic) and eth1 for Ceph's OSD replication traffic. For simplicity, we'll assume both networks have a DHCP server running.
To get started, we need to create two NetworkAttachmentDefinitions:
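A minimal sketch of what the two definitions could look like, assuming the eth0/eth1 interfaces from above; the names ceph-public-network and ceph-cluster-network are hypothetical:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ceph-public-network   # hypothetical name
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "host-device",
      "device": "eth0",
      "ipam": { "type": "dhcp" }
    }
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ceph-cluster-network  # hypothetical name
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "host-device",
      "device": "eth1",
      "ipam": { "type": "dhcp" }
    }
```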
A short explanation of what the .spec.config here means:
- "type": "host-device" uses the host-device CNI plugin to "Move an already-existing device into a container."
- The dhcp IPAM setting tells the CNI to get an IP from a DHCP server.
You can run kubectl get network-attachment-definitions to confirm that both NetworkAttachmentDefinitions have been created.
Warning: For existing clusters, you currently can't easily switch from, e.g., the "container network" to hostNetwork mode/Multus.
(Documentation: Ceph Cluster CRD - Multus Configuration - Rook Ceph v1.11)
This will tell the Rook Ceph operator to "attach" the Multus network annotations to the Ceph components; there is no need to add anything else to the individual Ceph Pods.
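As a sketch, the relevant part of the CephCluster spec could look like this; the selector values are the hypothetical NetworkAttachmentDefinition names from above, assumed to exist in the same namespace:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  network:
    provider: multus
    selectors:
      # hypothetical NetworkAttachmentDefinition names
      public: ceph-public-network
      cluster: ceph-cluster-network
```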
To summarize, we can use Multus to easily and explicitly have a Rook Ceph cluster use two different networks for performance reasons.
Looking back at the time I implemented hostNetwork mode in Rook, it is still the simplest way to "skip the container network" to gain more performance (depending on the CNI encapsulation, etc., used by your Kubernetes cluster network) or to expose a service to other clusters/servers which can't simply be load-balanced/proxied.
We are looking into improving the existing documentation and examples to make it easier for people to use Multus, instead of hostNetwork mode, with their Rook Ceph clusters.
If you want to get a more in-depth look at what Multus can do, be sure to check out this great post by devopstales here: Use Multus CNI in Kubernetes - devopstales. To look at what other plugins and config options the CNI project and plugins have, check out CNI Documentation - Plugins Overview.
Thanks for reading!