Migrating Prometheus Exporters to Kubernetes
I have built Prometheus Exporters for multiple cloud platforms to track resources deployed across clouds:
- Prometheus Exporter for Azure
- Prometheus Exporter for crt.sh
- Prometheus Exporter for Fly.io
- Prometheus Exporter for GoatCounter
- Prometheus Exporter for Google Analytics
- Prometheus Exporter for Google Cloud
- Prometheus Exporter for Koyeb
- Prometheus Exporter for Linode
- Prometheus Exporter for PorkBun
- Prometheus Exporter for updown.io
- Prometheus Exporter for Vultr
Additionally, I’ve written two status service exporters:
These exporters are all derived from an exemplar DigitalOcean Exporter written by metalmatze for which I maintain a fork.
The Exporters share common flags (--endpoint=0.0.0.0:{PORT} and --path=/metrics) but necessarily have unique configuration (combinations of environment variables and flags) for their secrets.
I’ve been running these Exporters locally under podman as a Pod, primarily because a Pod provides an aggregation of the containers and simplifies port publishing. With the migration to Kubernetes, however, each Exporter is configured as a Deployment, a Service and an associated VerticalPodAutoscaler, the latter to monitor resource requirements.
My cluster includes the excellent kube-prometheus which includes Prometheus Operator. To use this, each Deployment|Service has associated ServiceMonitor and PrometheusRule resources (see CRDs).
There’s also an AlertmanagerConfig to configure Alertmanager.
I use Helm (among other things, to manage the Tailscale Kubernetes Operator), kustomize (because it’s part of Kubebuilder, used for Ackal’s Operator) and, occasionally, jq and yq for ad hoc templating, but my preference is Jsonnet and I used Jsonnet for this project, specifically the Go implementation (go-jsonnet).
Unwilling to sacrifice or revise the existing Podman script, I decided to reuse the existing solution’s environment variables and rules.yml, but was unsuccessful in reusing alertmanager.yml (see below).
The original repo comprised:
.
├── .env
├── alertmanager.yml
├── podman.sh
├── prometheus.yml
└── rules.yml
As a result, I have the following additional (!) files and will explain most of these below:
.
├── alertmanager.json
├── alertmanager.jsonnet
├── alertmanager.sh
├── kubernetes.jsonnet
├── kubernetes.sh
├── rules.json
├── rules.jsonnet
└── rules.sh
The primary Jsonnet file is kubernetes.jsonnet and it creates the Deployment, Service, ServiceMonitor and VerticalPodAutoscaler resources.
Because there are multiple (similar) Exporters, the Jsonnet file is essentially a loop over a list of exporters that is mapped to a Kubernetes List type that includes a Deployment, Service, ServiceMonitor and VerticalPodAutoscaler for each Exporter:
local exporters = {
  // Must use [name] here to replace with the variable value
  [name]: {
    // Exporter
    "image": "...",
    "args": [],
    "env": [],
    ...
  },
};
// Output
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
  ] + std.map(
    function(name) deployment(...),
    std.objectFields(exporters),
  ) + std.map(
    function(name) service(...),
    std.objectFields(exporters),
  ) + std.map(
    function(name) serviceMonitor(...),
    std.objectFields(exporters),
  ) + std.map(
    function(name) vpa(...),
    std.objectFields(exporters),
  )
}
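For illustration, a single entry in the exporters object might look something like the following sketch; the image reference, port number and environment variable name are placeholders rather than the actual configuration:
// A sketch only: one exporter's entry (image, port and env names are placeholders)
{
  "linode": {
    "image": "...",
    "port": 9388,
    "args": [
      "--endpoint=0.0.0.0:9388",
      "--path=/metrics",
    ],
    "env": [
      {
        // Hypothetical secret surfaced from .env via --ext-str (see kubernetes.sh below)
        "name": "LINODE_TOKEN",
        "value": std.extVar("LINODE_TOKEN"),
      },
    ],
    "volumes": [],
    "volumeMounts": [],
  },
}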
Each Kubernetes resource (e.g. Deployment) has a Jsonnet function that generates the resource’s config:
local deployment(name, exporter, ...) = {
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {
    "name": name,
    "labels": { ... },
  },
  "spec": {
    "selector": {
      "matchLabels": { ... },
    },
    "template": {
      "metadata": {
        "labels": { ... },
      },
      "spec": {
        "serviceAccount": serviceaccount,
        "containers": [
          {
            // Exporter
          }
        ]
      }
    }
  }
};
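The service(), serviceMonitor() and vpa() functions follow the same pattern. As a sketch only (not the actual implementation; the label and port names are assumptions), serviceMonitor() might look like:
// A sketch of a serviceMonitor() helper
// Assumes the corresponding Service carries an "app" label and names its metrics port "metrics"
local serviceMonitor(name) = {
  "apiVersion": "monitoring.coreos.com/v1",
  "kind": "ServiceMonitor",
  "metadata": {
    "name": name,
    "labels": {
      "app": name,
    },
  },
  "spec": {
    "selector": {
      "matchLabels": {
        "app": name,
      },
    },
    "endpoints": [
      {
        "port": "metrics",
        "path": "/metrics",
        "interval": "30s",
      },
    ],
  },
};

serviceMonitor("linode")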
This is mostly straightforward. The only complexity is that the crtsh-exporter uses a --hosts flag that lists the set of fully-qualified host names to be queried. The data is represented by an object where the key is the domain name and the value is the list of hosts:
local hosts = {
  "example.com": [
    "foo",
    "bar",
    "baz",
  ],
  "example.org": [
    "foo",
    "bar",
    "baz",
  ],
};
And this needs to be presented as e.g. --hosts=foo.example.com,bar.example.com,baz.example.com and a crtsh-exporter must be generated for each domain name:
local crtsh_exporters() = {
  local fqdn(host, domain) = host + "." + domain,
  local fqdns(hosts, domain) = std.map(
    function(host) fqdn(host, domain),
    hosts,
  ),
  // Must use [...] to evaluate the statement value
  ["crtsh-" + std.strReplace(domain, ".", "-")]: {
    local port = 8080,
    "image": "...",
    "args": [
      "--endpoint=0.0.0.0:" + port,
      "--hosts=" + std.join(",", fqdns(hosts[domain], domain)),
      // "--path=/metrics",
    ],
    "env": [],
    "port": port,
    "volumes": [],
    "volumeMounts": [],
  },
  // Iterate over the domains' keys
  for domain in std.objectFields(hosts)
};
crtsh_exporters()
This yields:
{
  "crtsh-example-com": {
    "args": [
      "--endpoint=0.0.0.0:8080",
      "--hosts=foo.example.com,bar.example.com,baz.example.com"
    ],
    "env": [ ],
    "image": "...",
    "port": 8080,
    "volumeMounts": [ ],
    "volumes": [ ]
  },
  "crtsh-example-org": {
    "args": [
      "--endpoint=0.0.0.0:8080",
      "--hosts=foo.example.org,bar.example.org,baz.example.org"
    ],
    "env": [ ],
    "image": "...",
    "port": 8080,
    "volumeMounts": [ ],
    "volumes": [ ]
  }
}
This corresponds to the revised (!) exporters variable and so must be merged into the static values:
local exporters = {
  ...
} + crtsh_exporters();
Each of the Jsonnet files is invoked by a similarly named script (e.g. kubernetes.sh), which simply sources the environment and runs jsonnet:
#!/usr/bin/env bash
source .env
jsonnet \
--ext-str "VARIABLE=${VARIABLE}" \
kubernetes.jsonnet
NOTE Where VARIABLE recurs for each value in .env that needs to be surfaced in the Jsonnet file.
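On the Jsonnet side, each such value is read with std.extVar; for example (the variable name is illustrative):
// Reads a value passed as e.g. --ext-str "LINODE_TOKEN=${LINODE_TOKEN}" (name illustrative)
local token = std.extVar("LINODE_TOKEN");

// e.g. used as a container environment variable entry
{
  "name": "LINODE_TOKEN",
  "value": token,
}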
It would be simpler to:
jsonnet \
--ext-str "VARIABLE=${VARIABLE}" \
kubernetes.jsonnet \
| kubectl apply --filename=- --namespace=${NAMESPACE}
But I like having the flexibility to test kubernetes.sh:
./kubernetes.sh \
| jq -r '.items[]|select(.kind=="Deployment" and .metadata.name=="foo")'
And generally:
./kubernetes.sh \
| jq -r '.' \
| kubectl apply \
--filename=- \
--namespace=${NAMESPACE}
NOTE The seemingly redundant | jq -r '.' | stage helps when editing the pipeline during testing.
I realized that while Prometheus Alerting Rules match kube-prometheus PrometheusRule, Prometheus Alertmanager config differs (!?) from AlertmanagerConfig, e.g.
NOTE docs.crd.dev is excellent though it would benefit from better searching.
alertmanager.yml:
route:
  group_by: ["..."]
  receiver: gmail
  routes:
  - receiver: pushover
    match:
      severity: page
receivers:
- name: gmail
  email_configs:
  - to: my@gmail.com
    from: my@gmail.com
    smarthost: smtp.gmail.com:587
    auth_username: my@gmail.com
    auth_identity: my@gmail.com
    auth_password: "..."
- name: pushover
  pushover_configs:
  - user_key: "..."
    token: "..."
is equivalent to:
{
  "route": {
    "group_by": [
      "..."
    ],
    "receiver": "gmail",
    "routes": [
      {
        "receiver": "pushover",
        "match": {
          "severity": "page"
        }
      }
    ]
  },
  "receivers": [
    {
      "name": "gmail",
      "email_configs": [
        {
          "to": "my@gmail.com",
          "from": "my@gmail.com",
          "smarthost": "smtp.gmail.com:587",
          "auth_username": "my@gmail.com",
          "auth_identity": "my@gmail.com",
          "auth_password": "..."
        }
      ]
    },
    {
      "name": "pushover",
      "pushover_configs": [
        {
          "user_key": "...",
          "token": "..."
        }
      ]
    }
  ]
}
Becomes:
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
    {
      "apiVersion": "v1",
      "kind": "Secret",
      "metadata": {
        "name": "gmail"
      },
      "type": "Opaque",
      "data": {
        "authPassword": authPassword
      }
    },
    {
      "apiVersion": "v1",
      "kind": "Secret",
      "metadata": {
        "name": "pushover"
      },
      "type": "Opaque",
      "data": {
        "token": token,
        "userKey": userkey
      }
    },
    {
      "apiVersion": "monitoring.coreos.com/v1alpha1",
      "kind": "AlertmanagerConfig",
      "metadata": {
        "name": "clouds-monitoring"
      },
      "spec": {
        "receivers": [
          {
            "emailConfigs": [
              {
                "authIdentity": "my@gmail.com",
                "authPassword": {
                  "key": "authPassword",
                  "name": "gmail",
                  "optional": false
                },
                "authUsername": "my@gmail.com",
                "from": "my@gmail.com",
                "smarthost": "smtp.gmail.com:587",
                "to": "my@gmail.com"
              }
            ],
            "name": "gmail"
          },
          {
            "name": "pushover",
            "pushoverConfigs": [
              {
                "token": {
                  "key": "token",
                  "name": "pushover",
                  "optional": false
                },
                "userKey": {
                  "key": "userKey",
                  "name": "pushover",
                  "optional": false
                }
              }
            ]
          }
        ],
        "route": {
          "groupBy": [
            "..."
          ],
          "receiver": "gmail",
          "routes": [
            {
              "match": {
                "severity": "page"
              },
              "receiver": "pushover"
            }
          ]
        }
      }
    }
  ]
}
NOTE AlertmanagerConfig:
- uses camelCase (whereas alertmanager.yml is snake_case)
- appropriately (!) refactors config secret values (e.g. authPassword) as Kubernetes Secrets
Let’s dwell on that second point: alertmanager.yml has the gmail receiver’s auth_password in plaintext:
auth_password: "..."
But AlertmanagerConfig replaces the secret value with a reference to a Kubernetes Secret (see below), in this example called gmail (for obvious reasons); the Secret has a key called authPassword holding the base64-encoded value:
authPassword: {
  "key": "authPassword",
  "name": "gmail",
  "optional": false
}
So, we have to do two things for this transformation:
- Create Secrets for each receiver
- Reference the Secret names in a new receiver object
local receivers = {
  "gmail": {
    "authPassword": std.extVar("GMAIL_AUTH_PASSWORD"),
  },
  "pushover": {
    "userKey": std.extVar("PUSHOVER_USER_KEY"),
    "token": std.extVar("PUSHOVER_TOKEN"),
  },
};

// Generate Kubernetes Secret
local secret(name, receiver) = {
  // Secret
};

// Generate AlertmanagerConfig receiver config
local configs(name) = {
  // Must use [key] here to replace with the variable value
  [key]: {
    "key": key,
    "name": name,
    "optional": false,
  },
  for key in std.objectFields(receivers[name])
};
// Output
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
  ] + std.map(
    function(name) secret(...),
    std.objectFields(receivers),
  ) + [
    {
      "apiVersion": "monitoring.coreos.com/v1alpha1",
      "kind": "AlertmanagerConfig",
      "metadata": {
        "name": name,
      },
      "spec": {
        "route": { ... },
        "receivers": [
          {
            "name": "gmail",
            "emailConfigs": [
              {
                // Static values
                "to": email,
                ...
              } + configs("gmail"),
            ],
          },
          {
            "name": "pushover",
            "pushoverConfigs": [
              {
                // No static values
              } + configs("pushover"),
            ],
          },
        ],
      },
    },
  ],
}
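The secret() function is elided above; a minimal sketch, assuming the values read with std.extVar are plaintext and only need base64-encoding, might be:
// A sketch (not the actual implementation) of the elided secret() helper
// Kubernetes Secret data values must be base64-encoded, hence std.base64
local secret(name, receiver) = {
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": name,
  },
  "type": "Opaque",
  "data": {
    [key]: std.base64(receiver[key])
    for key in std.objectFields(receiver)
  },
};

secret("pushover", { "userKey": "...", "token": "..." })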
I mentioned previously that rules.yml can be mapped mostly directly. Because we’re using Jsonnet, we first need to convert rules.yml into JSON:
podman run \
--interactive \
--rm \
--volume=${PWD}:/workdir \
docker.io/mikefarah/yq \
--output-format=json \
rules.yml \
> rules.json
And then, from within the Jsonnet file, we can import it:
local rules = import "rules.json";

local prometheusrule(name, rules) = {
  "apiVersion": "monitoring.coreos.com/v1",
  "kind": "PrometheusRule",
  "metadata": {
    "name": name,
    "labels": {},
  },
  "spec": {
    "groups": [
      {
        "name": name,
        "rules": rules,
      },
    ],
  },
};
// Output
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
  ] + std.map(
    function(group) prometheusrule(
      // Fix up group names, replacing underscores with hyphens
      std.strReplace(group.name, "_", "-"),
      group.rules,
    ),
    rules.groups,
  )
}
NOTE The only change (because I want to continue to treat rules.yml as the source of truth) is that, in my case, I need to rename the rule group names, replacing underscores with hyphens.
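As an aside, more recent go-jsonnet releases include std.parseYaml which, combined with importstr, could perhaps avoid the separate yq conversion step; a sketch, assuming a Jsonnet version that provides it:
// A sketch only: parse rules.yml directly (assumes std.parseYaml is available)
local rules = std.parseYaml(importstr "rules.yml");

// The rest of the file is unchanged: iterate over rules.groups as before
rules.groups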
Once the system is deployed and alerts begin firing, it’s good to have some sanity checks:
.
├── kubernetes.test.sh
├── prometheus.test.sh
└── vpa.check.sh
kubernetes.test.sh verifies that the correct number of Deployments, Services etc. are created. This is quite a useful pattern:
WANT="..." # Some number
GOT=$(\
  kubectl get deployments \
  --namespace=${NAMESPACE} \
  --output=name \
  | wc --lines)
if [ "${GOT}" != "${WANT}" ]; then
  printf "got: %s; want: %s\n" "$GOT" "$WANT"
fi
prometheus.test.sh verifies Prometheus targets, rules etc. using the Prometheus API:
ENDPOINT="http://localhost:9090" # Or ...
FILTER=".data.activeTargets|length"
WANT="..." # Some number
GOT=$(\
  curl \
  --silent \
  --get \
  --header "Accept: application/json" \
  ${ENDPOINT}/api/v1/targets \
  | jq -r "${FILTER}")
if [ "${GOT}" != "${WANT}" ]; then
  printf "got: %s; want: %s\n" "$GOT" "$WANT"
fi
That’s all!