Istio Upgrades: Prometheus SDS
How to handle the migration to Istio SDS in your Prometheus instances.
If, like me, you run bespoke instances of Prometheus rather than the one which comes bundled with Istio, you've likely got some configuration that looks like this:
- job_name: 'kubernetes-pods-istio-secure'
  honor_labels: true
  scheme: https
  tls_config:
    ca_file: /etc/certs/root-cert.pem
    cert_file: /etc/certs/cert-chain.pem
    key_file: /etc/certs/key.pem
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  ...
If you do, then I'm sorry to tell you that it's going to stop working when you upgrade to Istio 1.6. And it'll fail subtly.
History
Before SDS became the default way of distributing the mTLS certificates to your workloads, citadel was responsible for creating secrets in your workload's namespace named istio.default (where default was the service account name for your workload).
The typical pattern to enable Prometheus to scrape mTLS-protected endpoints was then to volume mount those certificates in:
volumes:
- name: "istio-certs"
  secret:
    defaultMode: 420
    secretName: "istio.default"
volumeMounts:
- mountPath: /etc/certs
  name: istio-certs
  readOnly: true
However, those secrets are now redundant and are no longer created by istiod. They're not deleted as part of the Istio upgrade process though, so this will eventually manifest as mTLS failures because Prometheus is using certificates that aren't being renewed any more. Lovely hey :)
So let's look at how we can get updated certs into Prometheus.
(A) Solution
There may be better ways to do this, and I'm more than happy for someone to comment with a better solution - but this is how I got it working. It was a faff.
I decided to run an istio-proxy on the prometheus workload, and use a shared volume between istio-proxy and prometheus to share the certs that istio-proxy would get via SDS. This was easier said than done.
Adding an istio-proxy to Prometheus
The first thing we need to do is ensure that the prometheus StatefulSet runs an istio-proxy. I did this by adding the following annotations:
sidecar.istio.io/inject: "true"
traffic.sidecar.istio.io/includeInboundPorts: ""
traffic.sidecar.istio.io/includeOutboundIPRanges: ""
Together, these cause a sidecar to be injected without configuring any iptables interception, so none of Prometheus's traffic is routed through the proxy.
Configuring istio-proxy to write Certificates to disk
We then need to configure istio-proxy to write the certificates to disk (by default it won't) and, most importantly, to write them to a volume which can be shared with the main prometheus application. These are the annotations I used to do that:
proxy.istio.io/config: |
  proxyMetadata:
    OUTPUT_CERTS: /etc/istio-output-certs
sidecar.istio.io/userVolume: '[{"name": "istio-certs", "emptyDir": {"medium":"Memory"}}]'
sidecar.istio.io/userVolumeMount: '[{"name": "istio-certs", "mountPath": "/etc/istio-output-certs"}]'
The key part here is OUTPUT_CERTS, which tells istio-proxy to write the certificates received via SDS to that directory.
Note: It's very important that you don't use /etc/certs for this path; see https://github.com/istio/istio/issues/28050
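For reference, here is a rough sketch of how the annotations from the last two sections sit together. They need to be on the pod template rather than on the StatefulSet's own metadata, since that's where the sidecar injector looks. The StatefulSet name is a placeholder I've assumed for illustration, and this is a fragment rather than a complete manifest; only the annotations, namespace and label come from this post.

```yaml
# Fragment only: metadata.name is an assumed placeholder and the rest
# of the StatefulSet spec (selector, serviceName, containers) is omitted.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus            # assumed name
  namespace: app-metrics
spec:
  template:
    metadata:
      labels:
        app: prometheus
      annotations:
        # Inject the sidecar, but don't intercept any inbound or outbound traffic.
        sidecar.istio.io/inject: "true"
        traffic.sidecar.istio.io/includeInboundPorts: ""
        traffic.sidecar.istio.io/includeOutboundIPRanges: ""
        # Have istio-proxy write its SDS certificates to a shared in-memory volume.
        proxy.istio.io/config: |
          proxyMetadata:
            OUTPUT_CERTS: /etc/istio-output-certs
        sidecar.istio.io/userVolume: '[{"name": "istio-certs", "emptyDir": {"medium":"Memory"}}]'
        sidecar.istio.io/userVolumeMount: '[{"name": "istio-certs", "mountPath": "/etc/istio-output-certs"}]'
```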
Reducing the istio-proxy footprint
I also added a Sidecar resource to reduce the memory footprint of the proxy. As this proxy isn't going to be used for any communication, it doesn't need any cluster configuration from istiod other than a single mTLS host (see this issue), for which I'm using istiod itself.
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: prometheus
  namespace: app-metrics
spec:
  egress:
  - hosts:
    - istio-system/istiod.istio-system.svc.cluster.local
  workloadSelector:
    labels:
      app: prometheus
Ensuring no mTLS is used when talking to Prometheus
I added a PeerAuthentication policy to ensure that any of my apps that talk directly to prometheus didn't attempt to do so over mTLS:
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: prometheus
  namespace: app-metrics
spec:
  mtls:
    mode: DISABLE
  selector:
    matchLabels:
      app: prometheus
Using the istio-proxy certificates in Prometheus
At this point, we have a sidecar which is writing the certificates to disk. I then added a volumeMount on the prometheus StatefulSet for the Istio certificates, which references the volume added by the sidecar injector:
volumeMounts:
- mountPath: /etc/prom-certs/
  name: istio-certs
I also updated the prometheus.yaml file to reference the new folder:
tls_config:
  ca_file: /etc/prom-certs/root-cert.pem
  cert_file: /etc/prom-certs/cert-chain.pem
  key_file: /etc/prom-certs/key.pem
  insecure_skip_verify: true
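Put together with the job from the start of this post, the updated scrape config would look roughly like the sketch below. It's just the original job with the three certificate paths swapped; the relabel rules were elided earlier, so they're left as a comment here rather than guessed at.

```yaml
- job_name: 'kubernetes-pods-istio-secure'
  honor_labels: true
  scheme: https
  tls_config:
    # Certificates written by istio-proxy (OUTPUT_CERTS) and mounted
    # into the Prometheus container at /etc/prom-certs/.
    ca_file: /etc/prom-certs/root-cert.pem
    cert_file: /etc/prom-certs/cert-chain.pem
    key_file: /etc/prom-certs/key.pem
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  # relabel_configs: keep whatever rules your existing job already has
```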
Conclusion
That should get your mTLS scrapes working again. I find it a bit annoying that I've had to make modifications to the injector config in order to get this working.