Migrating Prometheus Exporters to Kubernetes
I have built Prometheus Exporters for multiple cloud platforms to track resources deployed across clouds:
- Prometheus Exporter for Azure
- Prometheus Exporter for crt.sh
- Prometheus Exporter for Fly.io
- Prometheus Exporter for GoatCounter
- Prometheus Exporter for Google Analytics
- Prometheus Exporter for Google Cloud
- Prometheus Exporter for Koyeb
- Prometheus Exporter for Linode
- Prometheus Exporter for PorkBun
- Prometheus Exporter for updown.io
- Prometheus Exporter for Vultr
Additionally, I’ve written two status service exporters.
These exporters are all derived from an exemplar DigitalOcean Exporter written by metalmatze, for which I maintain a fork.
The Exporters share common flags (--endpoint=0.0.0.0:{PORT} --path=/metrics) but each necessarily has its own configuration (a combination of environment variables and flags) for secrets.
I’ve been running these Exporters locally under podman as a Pod, primarily because a Pod provides an aggregation of the containers and simplifies port publishing. However, with the migration to Kubernetes, each Exporter is configured as a Deployment, Service and an associated VerticalPodAutoscaler; the latter monitors resource requirements.
My cluster includes the excellent kube-prometheus, which includes the Prometheus Operator. To use this, each Deployment|Service pair has associated ServiceMonitor and PrometheusRule resources (see CRDs). There’s also an AlertmanagerConfig to configure Alertmanager.
I use Helm (among other things, to manage the Tailscale Kubernetes Operator), kustomize (because it’s part of Kubebuilder, which I use for Ackal’s Operator), and occasionally jq and yq for ad hoc templating, but my preference is Jsonnet, which I used for this project, specifically the Go implementation of Jsonnet.
Unwilling to sacrifice or revise the existing Podman script, I decided to reuse the existing solution’s environment variables and rules.yml, but I was unsuccessful in reusing alertmanager.yml (see below).
The original repo comprised:
.
├── .env
├── alertmanager.yml
├── podman.sh
├── prometheus.yml
└── rules.yml
As a result, I have the following additional (!) files and will explain most of these below:
.
├── alertmanager.json
├── alertmanager.jsonnet
├── alertmanager.sh
├── kubernetes.jsonnet
├── kubernetes.sh
├── rules.json
├── rules.jsonnet
└── rules.sh
The primary Jsonnet file is kubernetes.jsonnet, which creates the Deployment, Service, ServiceMonitor and VerticalPodAutoscaler resources.
Because there are multiple (similar) Exporters, the Jsonnet file is essentially a loop over a set of exporters that is mapped to a Kubernetes List type containing a Deployment, Service, ServiceMonitor and VerticalPodAutoscaler for each Exporter:
local exporters = {
  // Must use [name] here to replace with the variable value
  [name]: {
    // Exporter
    "image": "...",
    "args": [],
    "env": [],
    ...
  }
};
// Output
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
  ] + std.map(
    function(name) deployment(...),
    std.objectFields(exporters),
  ) + std.map(
    function(name) service(...),
    std.objectFields(exporters),
  ) + std.map(
    function(name) serviceMonitor(...),
    std.objectFields(exporters),
  ) + std.map(
    function(name) vpa(...),
    std.objectFields(exporters),
  )
}
Where each Kubernetes Resource (e.g. Deployment) has a Jsonnet function that generates the Resource config:
local deployment(name, exporter, ...) = {
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {
    "name": name,
    "labels": { ... },
  },
  "spec": {
    "selector": {
      "matchLabels": { ... },
    },
    "template": {
      "metadata": {
        "labels": { ... },
      },
      "spec": {
        "serviceAccount": serviceaccount,
        "containers": [
          {
            // Exporter
          }
        ]
      }
    }
  }
}
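The service, serviceMonitor and vpa functions follow the same shape. As a rough sketch (not the project’s actual code; the app label, the metrics port name and the 30s scrape interval are assumptions), a serviceMonitor and a recommendation-only vpa might look like:
// Sketch only: a ServiceMonitor telling the Prometheus Operator to scrape the
// Exporter's Service on a port assumed to be named "metrics" at /metrics.
local serviceMonitor(name, exporter) = {
  "apiVersion": "monitoring.coreos.com/v1",
  "kind": "ServiceMonitor",
  "metadata": {
    "name": name,
    "labels": { "app": name },
  },
  "spec": {
    "selector": {
      "matchLabels": { "app": name },
    },
    "endpoints": [
      {
        "port": "metrics",
        "path": "/metrics",
        "interval": "30s",
      },
    ],
  },
};

// Sketch only: a VerticalPodAutoscaler in recommendation-only mode
// (updateMode "Off") so it monitors resource requirements without evicting Pods.
local vpa(name, exporter) = {
  "apiVersion": "autoscaling.k8s.io/v1",
  "kind": "VerticalPodAutoscaler",
  "metadata": {
    "name": name,
  },
  "spec": {
    "targetRef": {
      "apiVersion": "apps/v1",
      "kind": "Deployment",
      "name": name,
    },
    "updatePolicy": {
      "updateMode": "Off",
    },
  },
};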
This is mostly straightforward. The only complexity is that the crtsh-exporter uses a --hosts flag that lists the set of fully-qualified host names to be queried. The data is represented by an object where the key is the domain name and the value is the list of hosts:
local hosts = {
  "example.com": [
    "foo",
    "bar",
    "baz",
  ],
  "example.org": [
    "foo",
    "bar",
    "baz",
  ],
};
And this needs to be presented as e.g. --hosts=foo.example.com,bar.example.com,baz.example.com, and a crtsh-exporter must be generated for each domain name:
local crtsh_exporters() = {
  local fqdn(host, domain) = host + "." + domain,
  local fqdns(hosts, domain) = std.map(
    function(host) fqdn(host, domain),
    hosts,
  ),
  // Must use [...] to evaluate the statement value
  ["crtsh-" + std.strReplace(domain, ".", "-")]: {
    local port = 8080,
    "image": "...",
    "args": [
      "--endpoint=0.0.0.0:" + port,
      "--hosts=" + std.join(",", fqdns(hosts[domain], domain)),
      // "--path=/metrics",
    ],
    "env": [],
    "port": port,
    "volumes": [],
    "volumeMounts": [],
  },
  // Iterate over the domains' keys
  for domain in std.objectFields(hosts)
};
crtsh_exporters()
This yields:
{
  "crtsh-example-com": {
    "args": [
      "--endpoint=0.0.0.0:8080",
      "--hosts=foo.example.com,bar.example.com,baz.example.com"
    ],
    "env": [ ],
    "image": "...",
    "port": 8080,
    "volumeMounts": [ ],
    "volumes": [ ]
  },
  "crtsh-example-org": {
    "args": [
      "--endpoint=0.0.0.0:8080",
      "--hosts=foo.example.org,bar.example.org,baz.example.org"
    ],
    "env": [ ],
    "image": "...",
    "port": 8080,
    "volumeMounts": [ ],
    "volumes": [ ]
  }
}
And this corresponds to the revised (!) exporters variable, so it must be added to the static values:
local exporters = {
...
} + crtsh_exporters();
Each of the Jsonnet files is invoked using a similarly named script (e.g. kubernetes.sh), which is simply:
#!/usr/bin/env bash
source .env
jsonnet \
--ext-str "VARIABLE=${VARIABLE}" \
kubernetes.jsonnet
NOTE Where VARIABLE recurs for each value in .env that needs to be surfaced in the Jsonnet file.
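Inside kubernetes.jsonnet, each surfaced value is then read with std.extVar. A minimal sketch (LINODE_TOKEN is an illustrative name, not necessarily one of the real variables), wiring a value into an Exporter’s env entries:
// Sketch only: read a value passed via --ext-str and surface it to the
// container as an environment variable. "LINODE_TOKEN" is illustrative.
local linodeToken = std.extVar("LINODE_TOKEN");

{
  "env": [
    {
      "name": "LINODE_TOKEN",
      "value": linodeToken,
    },
  ],
}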
It would be simpler to:
jsonnet \
--ext-str "VARIABLE=${VARIABLE}" \
kubernetes.jsonnet \
| kubectl apply --filename=- --namespace=${NAMESPACE}
But I like having the flexibility to test kubernetes.sh:
./kubernetes.sh \
| jq -r '.items[]|select(.kind=="Deployment" and .metadata.name=="foo")'
And generally:
./kubernetes.sh \
| jq -r '.' \
| kubectl apply \
--filename=- \
--namespace=${NAMESPACE}
NOTE The | jq -r '.' | stage is seemingly redundant, but it helps to rewind edits during testing.
I realized that while Prometheus Alerting Rules match kube-prometheus PrometheusRule, Prometheus Alertmanager config differs (!?) from AlertmanagerConfig, e.g.
NOTE docs.crd.dev is excellent though it would benefit from better searching.
alertmanager.yml:
route:
  group_by: ["..."]
  receiver: gmail
  routes:
  - receiver: pushover
    match:
      severity: page
receivers:
- name: gmail
  email_configs:
  - to: my@gmail.com
    from: my@gmail.com
    smarthost: smtp.gmail.com:587
    auth_username: my@gmail.com
    auth_identity: my@gmail.com
    auth_password: "..."
- name: pushover
  pushover_configs:
  - user_key: "..."
    token: "..."
is equivalent to:
{
  "route": {
    "group_by": [
      "..."
    ],
    "receiver": "gmail",
    "routes": [
      {
        "receiver": "pushover",
        "match": {
          "severity": "page"
        }
      }
    ]
  },
  "receivers": [
    {
      "name": "gmail",
      "email_configs": [
        {
          "to": "my@gmail.com",
          "from": "my@gmail.com",
          "smarthost": "smtp.gmail.com:587",
          "auth_username": "my@gmail.com",
          "auth_identity": "my@gmail.com",
          "auth_password": "..."
        }
      ]
    },
    {
      "name": "pushover",
      "pushover_configs": [
        {
          "user_key": "...",
          "token": "..."
        }
      ]
    }
  ]
}
Becomes:
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
    {
      "apiVersion": "v1",
      "kind": "Secret",
      "metadata": {
        "name": "gmail"
      },
      "type": "Opaque",
      "data": {
        "authPassword": authPassword,
      },
    },
    {
      "apiVersion": "v1",
      "kind": "Secret",
      "metadata": {
        "name": "pushover"
      },
      "type": "Opaque",
      "data": {
        "token": token,
        "userKey": userkey
      },
    },
    {
      "apiVersion": "monitoring.coreos.com/v1alpha1",
      "kind": "AlertmanagerConfig",
      "metadata": {
        "name": "clouds-monitoring"
      },
      "spec": {
        "receivers": [
          {
            "emailConfigs": [
              {
                "authIdentity": "my@gmail.com",
                "authPassword": {
                  "key": "authPassword",
                  "name": "gmail",
                  "optional": false
                },
                "authUsername": "my@gmail.com",
                "from": "my@gmail.com",
                "smarthost": "smtp.gmail.com:587",
                "to": "my@gmail.com"
              }
            ],
            "name": "gmail"
          },
          {
            "name": "pushover",
            "pushoverConfigs": [
              {
                "token": {
                  "key": "token",
                  "name": "pushover",
                  "optional": false
                },
                "userKey": {
                  "key": "userKey",
                  "name": "pushover",
                  "optional": false
                }
              }
            ]
          }
        ],
        "route": {
          "groupBy": [
            "..."
          ],
          "receiver": "gmail",
          "routes": [
            {
              "match": {
                "severity": "page"
              },
              "receiver": "pushover"
            }
          ]
        }
      }
    }
  ],
}
NOTE AlertmanagerConfig:
- uses camelCase (whereas alertmanager.yml is snake_case)
- appropriately (!) refactors Config secret values (e.g. authPassword) as Kubernetes Secrets
Let’s dwell on that second point: alertmanager.yml has the gmail receiver’s auth_password in plaintext:
auth_password: "..."
But AlertmanagerConfig replaces the plaintext value with a reference to a Kubernetes Secret (see below), in this example called gmail (for obvious reasons), where the Secret has a key called authPassword holding the base64-encoded value:
"authPassword": {
  "key": "authPassword",
  "name": "gmail",
  "optional": false
}
So, we have to do 2 things for this transformation:
- Create Secrets for each receiver;
- Reference the Secret names in a new receiver object:
local receivers = {
  "gmail": {
    "authPassword": std.extVar("GMAIL_AUTH_PASSWORD"),
  },
  "pushover": {
    "userKey": std.extVar("PUSHOVER_USER_KEY"),
    "token": std.extVar("PUSHOVER_TOKEN"),
  },
};

// Generate Kubernetes Secret
local secret(name, receiver) = {
  // Secret
};

// Generate AlertmanagerConfig receiver config
local configs(name) = {
  // Must use [key] here to replace with the variable value
  [key]: {
    "key": key,
    "name": name,
    "optional": false,
  },
  for key in std.objectFields(receivers[name])
};

// Output
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
  ] + std.map(
    function(name) secret(...),
    std.objectFields(receivers),
  ) + [
    {
      "apiVersion": "monitoring.coreos.com/v1alpha1",
      "kind": "AlertmanagerConfig",
      "metadata": {
        "name": name,
      },
      "spec": {
        "route": { ... },
        "receivers": [
          {
            "name": "gmail",
            "emailConfigs": [
              {
                // Static values
                "to": email,
                ...
              } + configs("gmail"),
            ],
          },
          {
            "name": "pushover",
            "pushoverConfigs": [
              {
                // No Static values
              } + configs("pushover"),
            ],
          },
        ],
      },
    },
  ],
}
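The secret function is elided above. A minimal sketch of what it could look like, assuming each receiver value arrives as a plaintext string that must be base64-encoded into the Secret’s data field:
// Sketch only: build an Opaque Secret per receiver, base64-encoding each value
// because Kubernetes Secret "data" values must be base64-encoded.
local secret(name, receiver) = {
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": name,
  },
  "type": "Opaque",
  "data": {
    [key]: std.base64(receiver[key])
    for key in std.objectFields(receiver)
  },
};
This pairs with configs(name), which emits the matching {key, name, optional} selector for each key in the receiver.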
I mentioned previously that rules.yml can be mapped mostly directly. Because we’re using Jsonnet, we first need to convert rules.yml into JSON:
podman run \
--interactive \
--rm \
--volume=${PWD}:/workdir \
docker.io/mikefarah/yq \
--output-format=json \
rules.yml \
> rules.json
And then, from within the Jsonnet file, we can import it:
local rules = import "rules.json";

local prometheusrule(name, rules) = {
  "apiVersion": "monitoring.coreos.com/v1",
  "kind": "PrometheusRule",
  "metadata": {
    "name": name,
    "labels": {},
  },
  "spec": {
    "groups": [
      {
        "name": name,
        "rules": rules,
      },
    ],
  },
};

// Output
{
  "apiVersion": "v1",
  "kind": "List",
  "items": [
  ] + std.map(
    function(group) prometheusrule(
      // Fix up group names, replacing underscores with hyphens
      std.strReplace(group.name, "_", "-"),
      group.rules,
    ),
    rules.groups,
  )
}
NOTE The only change (because I want to continue to treat rules.yml as a source of truth) is that, in my case, I need to rename rule group names, replacing underscores with hyphens.
Once the system is deployed and alerts begin firing, it’s good to have some sanity checks:
.
├── kubernetes.test.sh
├── prometheus.test.sh
└── vpa.check.sh
kubernetes.test.sh verifies that the correct number of Deployments, Services etc. are created. This is quite a useful pattern:
WANT="..." # Some number
GOT=$(\
kubectl get deployments \
--namespace=${NAMESPACE} \
--output=name \
| wc --lines)
if [ "${GOT}" != "${WANT}" ]; then
printf "got: %s; want: %s\n" "$GOT" "$WANT"
fi
prometheus.test.sh verifies Prometheus targets, rules etc. using the Prometheus API:
ENDPOINT="http://localhost:9090" # Or ...
FILTER=".data.activeTargets|length"
WANT="..." # Some number
GOT=$(\
curl \
--silent \
--get \
--header "Accept: application/json" \
${ENDPOINT}/api/v1/targets \
| jq -r "${FILTER}")
if [ "${GOT}" != "${WANT}" ]; then
printf "got: %s; want: %s\n" "$GOT" "$WANT"
fi
That’s all!