We put Ceph on Nitro. It’s faster, but how much faster?

Alexander Trost · June 13, 2022
tl;dr: We put Ceph on ⚡AWS Nitro⚡ (on Kubernetes) to see how fast it would go. The untuned setup runs ~44% faster, in fact. Skip to the results to find out more.

Ceph is at the cutting edge of open source distributed storage, and AWS has been at the cutting edge of data center operations for a long time.

What’s Ceph?

Ceph is a fully F/OSS, robust, highly available, and highly durable distributed storage system.

If you’re running it on Kubernetes, you’re going to want to use Rook (and keep your operations load light with Koor).

What’s Nitro?

AWS Nitro is AWS’s system for hypervisor-controlled multi-tenancy on custom hardware: networking, storage, and security functions are offloaded to dedicated Nitro cards, leaving only a lightweight, “quiescent” hypervisor on the host.

Put simply, Nitro lets AWS-provided virtual machines deliver a nearer-to-bare-metal performance experience, and on bare metal instances it gets out of the way as much as possible so the hardware can run at full speed.

AWS has great talks on Nitro from 2017 onward (the system has been in development since 2013):

AWS re:Invent 2017: C5 Instances and the Evolution of Amazon EC2 Virtualization (CMP332)

AWS re:Invent 2018: Powering Next-Gen EC2 Instances: Deep Dive into the Nitro System (CMP303-R1)

AWS re:Invent 2019: Powering next-gen Amazon EC2: Deep dive into the Nitro system (CMP303-R2)

The question we want to answer

Well, how do we know how much faster Nitro is? We’ll have to get a realistic storage setup running on AWS and test a Nitro setup versus one not running on AWS’s next-gen hypervisor architecture.

First, the results

We’ve posed the burning question, and we won’t keep you waiting for the answers.

Graph

Here’s the TPS we observed across the runs:

[Graph: Transactions per second (TPS) results from the runs]

Tables

In tabular form:

Run # | Stock TPS | Nitro TPS
------|-----------|----------
1     | 369.90    | 545.16
2     | 383.34    | 535.89

How much did Nitro add?

As this setup has not been tuned (we’re not using provisioned IOPS, and neither Rook nor Postgres has been tuned), the baseline transactions per second is quite low.

That said, the important bit here is that “simply” switching the instance type (m4.xlarge to the Nitro-based m5.xlarge) yielded a ~44% increase in performance (as measured by TPS)!

With Rook managing storage and Nitro boosting performance, we’ve got an easy-to-use, production-grade storage setup deployed and running a real database workload, modest TPS and all.
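
The only meaningful difference between the two runs is that instance type, and it lives in Pulumi stack configuration (the ec2-instance-type value read via config.require in the Pulumi snippet further down). Purely as a sketch, two stack files along these lines would do it; the stack names, the experiments-ceph-on-nitro project prefix, and the placeholder values are illustrative, not necessarily the repo’s exact layout:

# Pulumi.stock.yaml (baseline run on Xen-based m4 instances)
config:
  experiments-ceph-on-nitro:ec2-instance-type: m4.xlarge
  experiments-ceph-on-nitro:ec2-node0-az: us-east-1a
  experiments-ceph-on-nitro:ec2-ssh-key-name: ceph-bench-admin

# Pulumi.nitro.yaml (identical, except for the Nitro-based m5 instance type)
config:
  experiments-ceph-on-nitro:ec2-instance-type: m5.xlarge
  experiments-ceph-on-nitro:ec2-node0-az: us-east-1a
  experiments-ceph-on-nitro:ec2-ssh-key-name: ceph-bench-admin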

The setup

Interested in digging into how we set up the clusters and got those numbers? Continue reading. Alternatively, you can dive in and run the code yourself.

Get the code

We’ve built a fully infrastructure-as-code repository that makes it easy to replicate our results!

Try the experiment yourself at opencoreventures/experiments-ceph-on-nitro

Hardware provisioning with Pulumi

Pulumi is a powerful solution for provisioning cloud resources as code. You define infrastructure in a real programming language, and you can use custom resources to create abstractions (ex. an ObjectStorage resource which works across AWS, GCP, and Azure).

Here’s a snippet from our code to provision our SSH keys and an instance:

import * as fs from "fs";
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

///////////////////////////
// Setup & Configuration //
///////////////////////////

const config = new pulumi.Config();

const environment = process.env.ENVIRONMENT;
if (!environment) { throw new Error("ENV variable [ENVIRONMENT] missing"); }

const sshKeyAbsolutePath = process.env.SSH_PUB_KEY_ABS_PATH;
if (!sshKeyAbsolutePath) { throw new Error("ENV variable [SSH_PUB_KEY_ABS_PATH] missing"); }

const ec2InstanceType = config.require("ec2-instance-type");
const ec2Node0AZ = config.require("ec2-node0-az");
const ec2SSHKeyName = config.require("ec2-ssh-key-name");

// ... snip ...

const adminSSHKey = new aws.ec2.KeyPair(
  "admin-ssh-key",
  {
    keyName: ec2SSHKeyName,
    publicKey: fs.readFileSync(sshKeyAbsolutePath).toString(),
  },
);

export const adminSSHKeyKeyName = adminSSHKey.keyName;

const node0 = new aws.ec2.Instance(
  "node-0",
  {
    ami: ami.then(ami => ami.id),
    instanceType: ec2InstanceType,
    availabilityZone: ec2Node0AZ,
    keyName: adminSSHKeyKeyName,
    tags: {
      NodeId: "0",
      Environment: environment,
    },
  },
);

This code does a bunch of things that are hard to do with other solutions:

  • Makes use of ENV variables seamlessly
  • Exposes the full power of the Node.js ecosystem
  • Reads in files from the local filesystem easily

An alternative to Pulumi is Crossplane, which manages your compute, storage, and other resources from inside your existing Kubernetes cluster.

Kubernetes for workload orchestration, provided by k0s

Since we’re using Ceph, we’re building a cluster of storage nodes, and managing machines as a cluster is pretty easy these days with Kubernetes.

Running Kubernetes is made even easier by the k0s project (created by the folks over at Mirantis), so we’ll be using that to build our cluster (over an alternative like kubeadm).

Here’s how easy it is to start a Kubernetes cluster with k0s:

k0sctl.yaml
---
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  k0s:
    version: 1.23.5+k0s.0
    config:
      spec:
        api:
          externalAddress: ${CTRL_0_PUBLIC_IP}
          address: ${CTRL_0_PRIVATE_IP}
          sans:
            - ${CTRL_0_PUBLIC_IP}
          port: 6443
          k0sApiPort: 9443
          extraArgs:
            # see: https://github.com/kubernetes/kubernetes/issues/74302
            http2-max-streams-per-connection: "1000"

        storage:
          type: etcd

        network:
          podCIDR: 10.244.0.0/16
          serviceCIDR: 10.96.0.0/12
          provider: calico
          calico:
            mode: vxlan
            vxlanPort: 4789
            vxlanVNI: 4096
            mtu: 1450
            wireguard: true
            flexVolumeDriverPath: /usr/libexec/k0s/kubelet-plugins/volume/exec/nodeagent~uds
            withWindowsNodes: false
            overlay: Always

        podSecurityPolicy:
          defaultPolicy: 00-k0s-privileged

        installConfig:
          users:
            etcdUser: etcd
            kineUser: kube-apiserver
            konnectivityUser: konnectivity-server
            kubeAPIserverUser: kube-apiserver
            kubeSchedulerUser: kube-scheduler

        konnectivity:
          agentPort: 8132
          adminPort: 8133

        images:
          default_pull_policy: IfNotPresent

  hosts:
    ##############
    # Controller #
    ##############
    - role: controller
      ssh:
        address: ${CTRL_0_PUBLIC_IP} # envsubst
        user: ubuntu
        port: 22
        keyPath: ~/.ssh/id_rsa

    ################
    # Worker Nodes #
    ################
    - role: worker
      ssh:
        address: ${WORKER_0_PUBLIC_IP} # envsubst
        user: ubuntu
        keyPath: ~/.ssh/id_rsa

    - role: worker
      ssh:
        address: ${WORKER_1_PUBLIC_IP} # envsubst
        user: ubuntu
        keyPath: ~/.ssh/id_rsa

    - role: worker
      ssh:
        address: ${WORKER_2_PUBLIC_IP} # envsubst
        user: ubuntu
        keyPath: ~/.ssh/id_rsa

Of course, this is a template, which hasn’t been interpolated yet, but it’s that easy!

To make this run, we interpolate the template with envsubst and hand the result to k0sctl, so the Makefile targets look a little like this:

K0SCTL ?= k0sctl
ENVSUBST ?= envsubst

K0SCTL_YAML_PATH ?= path/to/your/k0sctl.yaml

## Generate the k0sctl YAML file
generate: generated-folder
    @echo -e "=> Generating k0sctl.yaml based on template @ [$(K0SCTL_YAML_TEMPLATE_PATH)]"
    @export CTRL_0_PUBLIC_IP=$(CTRL_0_PUBLIC_IP) \
        && export CTRL_0_PRIVATE_IP=$(CTRL_0_PRIVATE_IP) \
        && export WORKER_0_PUBLIC_IP=$(WORKER_0_PUBLIC_IP) \
        && export WORKER_1_PUBLIC_IP=$(WORKER_1_PUBLIC_IP) \
        && export WORKER_2_PUBLIC_IP=$(WORKER_2_PUBLIC_IP) \
        && cat $(K0SCTL_YAML_TEMPLATE_PATH) | $(ENVSUBST) > $(K0SCTL_YAML_PATH)

## Install k0s
deploy-k8s: generate
    @echo -e "=> Running k0sctl..."
    $(K0SCTL) apply -c $(K0SCTL_YAML_PATH)

Rook for Ceph cluster management

Rook makes running Ceph clusters a breeze on Kubernetes.

Installing Rook on Kubernetes is as easy as pie:

rook:
    @echo "=> Installing CRDs for rook..."
    $(KUBECTL) apply -f crds.yaml
    @echo "=> Installing common resources for rook..."
    $(KUBECTL) apply -f common.yaml
    @echo "=> Installing rook operator..."
    $(KUBECTL) apply -f operator.yaml
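
Note that the target above installs the Rook operator and its CRDs; the Ceph cluster itself, a block pool, and a StorageClass for workloads to claim volumes from are declared separately as custom resources. The repo has its own manifests, so purely as a hedged sketch (the Ceph image tag, pool name, and replica count here are illustrative and may differ from what the repo uses), they look roughly like this:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.7   # illustrative Ceph release
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
  storage:
    useAllNodes: true     # consume every worker node...
    useAllDevices: true   # ...and every unused device Rook discovers
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3               # one replica per worker node
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com   # Rook's RBD CSI driver (operator in rook-ceph namespace)
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete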

Testing Methodology

The methodology is pretty simple – as stated in the README, we’re going to:

  1. Provision compute resources on AWS
  2. Set up a k8s cluster on those machines
  3. Install Rook
  4. Run some workload simulations (ex. pgbench; see the sketch just below)

Then, we’re going to do the same thing again, but the second time will be ⚡supercharged by AWS Nitro⚡.
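
As for the workload itself: pgbench runs against a Postgres instance whose data lives on a Rook-provisioned volume. Purely as a sketch of that shape (the StorageClass name matches the sketch above; the Postgres service name, credentials, scale factor, and run length are illustrative rather than the repo’s exact values), a PVC plus a one-shot benchmark Job could look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data      # mounted by the Postgres deployment (not shown)
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: rook-ceph-block   # Ceph RBD via the Rook CSI driver
  resources:
    requests:
      storage: 20Gi
---
apiVersion: batch/v1
kind: Job
metadata:
  name: pgbench
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: pgbench
          image: postgres:13           # the stock postgres image includes the pgbench client
          env:
            - name: PGPASSWORD
              value: example-password  # illustrative; use a Secret in practice
          command: ["/bin/sh", "-c"]
          args:
            - |
              # initialize the benchmark tables, then run clients for 5 minutes
              pgbench -h postgres -U postgres -i -s 50 postgres
              pgbench -h postgres -U postgres -c 10 -j 2 -T 300 postgres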

Huge thanks to Alexander, Founding Engineer @ Koor, for flexing his expertise and helping resolve Rook cluster setup issues!

Wrapup

Well, clearly AWS’s Nitro system is very impressive at increasing the throughput of I/O-bound workloads! It’s not as dramatic as a change from HDD to SSD or SSD to NVMe, but it’s certainly a huge step up, with not much more than an instance type change (and a few code changes).