Prometheus Exporters for fly.io and Vultr
I’ve been on a roll building utilities this week. I developed a Service Health dashboard for my “thing”, a Prometheus Exporter for Fly.io and, today, a Prometheus Exporter for Vultr. This is motivated by the fear that I will forget a deployed Cloud resource and incur a horrible bill.
I’ve now written several Prometheus Exporters for cloud platforms:
- Prometheus Exporter for GCP
- Prometheus Exporter for Linode
- Prometheus Exporter for Fly.io
- Prometheus Exporter for Vultr
Each of them monitors resource deployments and produces resource count metrics that can be scraped by Prometheus and alerted with Alertmanager. I have Alertmanager configured to send notifications to Pushover. Last week I wrote an integration between Google Cloud Monitoring to send notifications to Pushover too.
So, I’m confident that I can track my resources across these disparate platforms. Next step, I guess, is to monitor the monitoring tools. It’s turtles all the way down!
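As an aside, the Alertmanager-to-Pushover hookup mentioned above needs nothing more than a receiver block; a minimal sketch (the key values are placeholders):

```yaml
route:
  receiver: pushover

receivers:
- name: pushover
  pushover_configs:
  - user_key: <pushover-user-key>
    token: <pushover-app-token>
```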
Prometheus Exporter for Fly.io
This one may be entirely redundant, but I was unable to determine how to configure Alertmanager to consume Fly’s Prometheus metrics. If I understand correctly, Fly publishes a (virtual) Prometheus API which you can access by (your) organization slug: https://api.fly.io/prometheus/$(your-org). The documentation suggests consuming this endpoint using Grafana, which is great, but I prefer to use Prometheus and Alertmanager for metrics and Grafana for dashboards.
I tried to configure a Prometheus server to scrape Fly’s Prometheus API as a /federate target and as a /remote-read endpoint, but neither appeared to work.
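For the record, my /federate attempt looked roughly like the following scrape config. This is a sketch, not a working recipe (it didn’t work for me); the metrics path and the authorization scheme are assumptions, and my-org stands in for your org slug:

```yaml
scrape_configs:
- job_name: fly-federate
  scheme: https
  # Assumed path: Fly's per-org Prometheus API plus /federate
  metrics_path: /prometheus/my-org/federate
  honor_labels: true
  params:
    'match[]':
    - '{__name__=~".+"}'
  authorization:
    credentials: <fly-api-token>
  static_configs:
  - targets:
    - api.fly.io
```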
What I mostly want to track is apps deployed to Fly to ensure I’m not incurring too many $$$.
Fly’s Golang CLI flyctl includes an API implementation. It’s not well documented, but it’s trivial to enumerate the apps within an org (slug):
```go
// Construct a Fly API client using flyctl's api package
name := fmt.Sprintf("%s_%s", c.System.Namespace, c.System.Subsystem)
client := api.NewClient(
	c.Token,
	name,
	c.System.Version,
	terminal.DefaultLogger,
)

ctx := context.Background()
role := ""
apps, err := client.GetApps(ctx, &role)
if err != nil {
	log.Fatal(err)
}

log.Printf("Retrieved %d apps", len(apps))
for _, app := range apps {
	log.Printf("Name: %s/%s (%s) [%s]",
		app.Organization.Slug,
		app.Name,
		app.ID,
		strconv.FormatBool(app.Deployed),
	)
}
```
For this type of data, I’m following the Prometheus pattern of having an up metric (alive):
```go
// MustNewConstMetric takes a *prometheus.Desc (not a bare name) as its
// first argument; the descriptor here is illustrative, named to match
// the alert rule's fly_exporter_apps. An up-style metric is a Gauge.
ch <- prometheus.MustNewConstMetric(
	prometheus.NewDesc(
		"fly_exporter_apps",
		"1 if the app is reported by the Fly API",
		[]string{"id", "name", "slug", "status", "deployed"},
		nil,
	),
	prometheus.GaugeValue,
	1.0,
	app.ID, app.Name, app.Organization.Slug, app.Status, strconv.FormatBool(app.Deployed),
)
```
This metric, aggregated without any labels, represents the total number of apps deployed in the org.
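Since every series carries the value 1, summing away the labels yields the org-wide app count. Assuming the metric is exported as fly_exporter_apps, the aggregate can be queried (or alerted on) with:

```promql
# Total apps across the org, regardless of labels
sum(fly_exporter_apps)

# Apps per organization slug (hypothetical label name)
sum by (slug) (fly_exporter_apps)
```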
I can alert on this:
```yaml
- name: fly_exporter
  rules:
  - alert: fly_exporter_apps_running
    expr: fly_exporter_apps{} > 0
    for: 6h
    labels:
      severity: page
    annotations:
      summary: fly.io apps deployed
```
If there are more than zero apps deployed for a period of 6 hours, notify me!
Prometheus Exporter for Vultr
I’m a sucker for managed Kubernetes offerings. My “thing” uses Google Kubernetes Engine, the trailblazing service (as is expected), but I’ve also used DigitalOcean Kubernetes (DOKS) and Linode Kubernetes Engine (LKE), and yesterday I decided to give Vultr Kubernetes Engine (VKE) a whirl too.
I was compelled by the General Availability of VKE, but I was committed once I saw that there’s a Golang-based CLI too. Not only because Golang CLIs aren’t Node.JS CLIs (and I dislike running Node.JS) but because, as with Fly, a Golang CLI generally (!) entails a Golang API SDK too.
One thing I dislike (!) about Vultr’s CLI is that there’s no evident way to format command output as anything other than text. And, while my awk is improving, I find it curious that a Golang CLI would retrieve (probably JSON but perhaps protobuf) data from its API service, unmarshal it into a Golang struct type and then not provide json.Marshal and yaml.Marshal on the way out. I digress. Vote here: Issue #72.
The SDK is very easy to use and well-documented:
```go
ctx := context.Background()
config := &oauth2.Config{}
token := &oauth2.Token{
	AccessToken: apiKey,
}
ts := config.TokenSource(ctx, token)
client := govultr.NewClient(oauth2.NewClient(ctx, ts))

// Optional changes
_ = client.SetBaseURL("https://api.vultr.com")
client.SetUserAgent(name)
client.SetRateLimit(500)
```
NOTE apiKey is obtained from Vultr’s dashboard (CLI?). Curiously, it appears there’s only one API key per account, which is limiting.
Once you’ve an API client, you can, for example, enumerate Kubernetes clusters, node pools and nodes:
```go
ctx := context.Background()
options := &govultr.ListOptions{}
clusters, _, err := client.Kubernetes.ListClusters(ctx, options)
if err != nil {
	log.Fatal(err)
}

// Log each cluster (and its node pools) concurrently
var wg sync.WaitGroup
for _, cluster := range clusters {
	wg.Add(1)
	go func(cluster govultr.Cluster) {
		defer wg.Done()
		log.Printf("Cluster: %s (%s) [%s]",
			cluster.Label,
			cluster.ID,
			cluster.Status,
		)
		for _, nodepool := range cluster.NodePools {
			log.Printf("Node Pool (%d): %s (%s) [%s]",
				len(nodepool.Nodes),
				nodepool.Label,
				nodepool.ID,
				nodepool.Status,
			)
		}
	}(cluster)
}
wg.Wait()
```
Vultr provides various types of resources that would likely be useful to other Vultr developers (Instances, DNS, Objects, Firewalls, Networks and Load Balancers) but, I’m currently only using Kubernetes Engine and so I’ve only implemented metrics for Kubernetes.
```go
// MustNewConstMetric takes a *prometheus.Desc (not a bare name) as its
// first argument; the descriptors here are illustrative, named to match
// the alert rule below. An up-style metric is a Gauge.
ch <- prometheus.MustNewConstMetric(
	prometheus.NewDesc(
		"vultr_exporter_kubernetes_cluster_up",
		"1 if the cluster's status is active",
		[]string{"label", "region", "version", "status"},
		nil,
	),
	prometheus.GaugeValue,
	func(status string) (result float64) {
		if status == "active" {
			result = 1.0
		}
		return result
	}(cluster.Status),
	cluster.Label,
	cluster.Region,
	cluster.Version,
	cluster.Status,
)
ch <- prometheus.MustNewConstMetric(
	prometheus.NewDesc(
		"vultr_exporter_kubernetes_node_pools",
		"Number of node pools in the cluster",
		[]string{"label", "region", "version", "status"},
		nil,
	),
	prometheus.GaugeValue,
	float64(len(cluster.NodePools)),
	cluster.Label,
	cluster.Region,
	cluster.Version,
	cluster.Status,
)
ch <- prometheus.MustNewConstMetric(
	prometheus.NewDesc(
		"vultr_exporter_kubernetes_nodes",
		"Number of nodes in the node pool",
		[]string{"label", "plan", "status", "tag"},
		nil,
	),
	prometheus.GaugeValue,
	float64(nodepool.NodeQuantity),
	nodepool.Label,
	nodepool.Plan,
	nodepool.Status,
	nodepool.Tag,
)
```
The cluster_up metric, aggregated without labels, provides the total number of Kubernetes clusters deployed.
I can alert on this:
```yaml
- name: vultr_exporter
  rules:
  - alert: vultr_exporter_kubernetes_cluster_up
    expr: vultr_exporter_kubernetes_cluster_up{} > 0
    for: 6h
    labels:
      severity: page
    annotations:
      summary: Vultr Kubernetes Engine cluster running
```
If there are more than zero Kubernetes clusters running for 6h, notify me!
That’s all!