Secure (TLS) gRPC services with VKE
NOTE
cert-manager is a better solution to what follows.
I’ve a need to deploy a Vultr Kubernetes Engine (VKE) cluster on a daily basis (create and delete it within a few hours) and to expose a gRPC service securely (TLS).
I have an existing solution, Automatic Certs w/ Golang gRPC service on Compute Engine, that combines a gRPC health-checking service with an ACME service, and I decided to reuse it.
In order for it to work, we need:
- Kubernetes cluster
- Deployment
- Storage
- Load-balancer
- DNS
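The cluster creation (and deletion) itself is out of scope here but, for reference, the kubeconfig used as ${CONFIG} in everything that follows can be grabbed from Vultr's API. A sketch (VULTR_API_KEY and VKE_ID are placeholders; the field names vke_clusters and kube_config are as I recall them from the Vultr API v2 docs):
# List VKE clusters to find the cluster's ID
curl \
--silent \
--header "Authorization: Bearer ${VULTR_API_KEY}" \
https://api.vultr.com/v2/kubernetes/clusters \
| jq -r '.vke_clusters[]|.id+" "+.label'

# Grab (and decode) the chosen cluster's kubeconfig
curl \
--silent \
--header "Authorization: Bearer ${VULTR_API_KEY}" \
https://api.vultr.com/v2/kubernetes/clusters/${VKE_ID}/config \
| jq -r '.kube_config' \
| base64 --decode > ${CONFIG}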
Deployment
The simplest part of the solution is to craft a Kubernetes Deployment:
apiVersion: v1
kind: List
items:
  - kind: Deployment
    apiVersion: apps/v1
    metadata:
      labels:
        app: autocert
        type: server
      name: autocert
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: autocert
          type: server
      template:
        metadata:
          labels:
            app: autocert
            type: server
        spec:
          imagePullSecrets:
            - name: ghcr
          containers:
            - name: autocert
              image: {IMAGE}
              command:
                - /autocert
              args:
                - --host={HOST}
                - --port=50051
                - --path=/certs
              ports:
                - name: http
                  containerPort: 80
                - name: grpc
                  containerPort: 50051
NOTE When I remember to do so, I prefer to use List Resources as a slightly more elegant way to combine multiple Kubernetes Resources in a single file. More often, this is done using ---, YAML's document start marker.
NOTE I’ve retained the imagePullSecrets in the above as I’m accessing IMAGE from a private registry.
NOTE {IMAGE} and {HOST} are both replaced during deployment. The ACME service needs to know the intended host name in order to procure the X509 cert.
Why use List with only one item? Because more items will be added…
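Deploying is then a matter of substituting {IMAGE} and {HOST} and applying the result. A minimal sketch (autocert.yaml is a hypothetical name for the file above; any templating approach works):
# Replace the {IMAGE} and {HOST} placeholders, then apply from stdin
sed \
--expression="s|{IMAGE}|${IMAGE}|g" \
--expression="s|{HOST}|${HOST}|g" \
autocert.yaml \
| kubectl apply \
--filename=- \
--namespace=healthcheck \
--kubeconfig=${CONFIG}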
Storage
autocert stores X509 certificates provided by Let’s Encrypt and I want to be able to reuse these across my daily re-creations of the cluster. I could have used an emptyDir and been careful to ensure that I captured the files before they were lost, but I decided instead to use Vultr’s Block Storage, particularly after reading that Vultr provides a Container Storage Interface (CSI) driver, vultr-csi, that’s installed automatically on VKE clusters.
It took me a few tries to get this working but, in hindsight, it’s straightforward.
Vultr suggests querying the regions endpoint to determine what types of block storage are available in the region that you’re using:
curl \
--silent \
https://api.vultr.com/v2/regions \
| jq -r '.regions[]|select(.id=="sea")'
yields:
{
  "id": "sea",
  "city": "Seattle",
  "country": "US",
  "continent": "North America",
  "options": [
    "ddos_protection",
    "block_storage_storage_opt",
    "load_balancers",
    "kubernetes"
  ]
}
block_storage_storage_opt means that HDD block storage is available but not NVMe (no block_storage_high_perf).
I queried the StorageClasses in the cluster:
kubectl get storageclasses \
--output=name \
--namespace=healthcheck \
--kubeconfig=${CONFIG}
vultr-block-storage
vultr-block-storage-hdd (default)
vultr-block-storage-hdd-retain
vultr-block-storage-retain
NOTE It’s unclear whether vultr-block-storage and vultr-block-storage-hdd are synonyms but I used vultr-block-storage-hdd.
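One way to compare them would be to describe both classes and eyeball the provisioner and parameters:
# StorageClasses are cluster-scoped, so no namespace is needed
kubectl describe storageclasses \
vultr-block-storage \
vultr-block-storage-hdd \
--kubeconfig=${CONFIG}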
So I wrote:
  - kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: autocert
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      storageClassName: vultr-block-storage-hdd
NOTE Because Kubernetes is a declarative system, even after applying the Deployment, adding the PersistentVolumeClaim config and re-applying works just fine. The PersistentVolumeClaim will be created and, because the Deployment is unchanged, it won’t be changed.
And:
kubectl describe pvc/autocert \
--namespace=healthcheck \
--kubeconfig=${CONFIG}
yields:
Name:          autocert
Namespace:     healthcheck
StorageClass:  vultr-block-storage-hdd
Status:        Bound
Volume:        pvc-4271bc344c674607
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: block.csi.vultr.com
               volume.kubernetes.io/storage-provisioner: block.csi.vultr.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      50Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       access-pvc
               autocert-5cc6b768c6-vr8tq
Events:        <none>
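Assuming the Deployment mounts the autocert PersistentVolumeClaim at /certs (to match the container’s --path=/certs flag), it’s easy to check that the certificates survive a cluster re-creation by listing that directory in the running Pod:
# List the persisted certificates (kubectl picks a Pod from the Deployment)
kubectl exec deployment/autocert \
--namespace=healthcheck \
--kubeconfig=${CONFIG} \
-- ls -l /certs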
Load-balancer
Depending on the Kubernetes platform, developers have the option to create a Service of type: LoadBalancer to get a TCP load-balancer (LB) or an Ingress to get an HTTP(S) LB.
gRPC uses HTTP/2 and this is not supported by Vultr’s LB service. So, I’m using a TCP LB and this is described using Vultr-specific annotations:
  - kind: Service
    apiVersion: v1
    metadata:
      annotations:
        service.beta.kubernetes.io/vultr-loadbalancer-protocol: "tcp"
      labels:
        app: autocert
        type: server
      name: autocert
    spec:
      type: LoadBalancer
      selector:
        app: autocert
        type: server
      ports:
        - name: http
          port: 80
          targetPort: 80
        - name: grpc
          port: 443
          targetPort: 50051
NOTE The LB exposes two services (ports 80 and 443), i.e. it has two forwarding rules. Port 80 exposes the ACME service: when an X509 cert is needed, the ACME service provides a handler on port 80 that handles the interaction with the Let’s Encrypt service. The gRPC service is published on 443 and maps to the container port 50051.
NOTE It appears to not be possible to nudge a Vultr LB onto a reserved IP address. This would be useful to facilitate the next step.
DNS
I should (!) use a programmable DNS service but I don’t have one. If my DNS were programmable, I could grab the IP address of the Vultr LB and update a DNS (A)lias record to point to the LB’s endpoint.
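The LB’s IP address is easy to grab from the Service’s status once Vultr has provisioned it, e.g.:
# Print the external IP assigned to the Vultr load-balancer
kubectl get service/autocert \
--namespace=healthcheck \
--kubeconfig=${CONFIG} \
--output=jsonpath="{.status.loadBalancer.ingress[0].ip}"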
The lack of automation doesn’t appear to be problematic for the ACME service though. After I manually updated the DNS record, I gRPCurl’ed the service and it worked:
grpcurl ${HOST}:443 grpc.health.v1.Health/Check
NOTE Don’t forget to always use the :{PORT} when specifying gRPC endpoints as there’s no assumption of default ports as there is with browsers.
Yields:
{
  "status": "SERVING"
}
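To double-check that the certificate being served is the one issued by Let’s Encrypt, something like openssl s_client works too:
# Show the issuer and validity period of the served certificate
openssl s_client \
-connect ${HOST}:443 \
-servername ${HOST} \
</dev/null 2>/dev/null \
| openssl x509 -noout -issuer -dates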