Monitoring Kubernetes with Prometheus
Kubernetes is directly instrumented with the Prometheus client library. Monitoring Kubernetes with Prometheus makes perfect sense as Prometheus can leverage data from the various Kubernetes components straight out of the box.
Prometheus is an open-source, cloud-native monitoring project. Targets are discovered via service discovery or static configuration, and Prometheus uses PromQL, a flexible query language that fully leverages its multi-dimensional data model.
We can also instrument our own applications with the Go, Java, Scala, Python or Ruby client libraries, or via one of the many unofficial third-party libraries. You can use an exporter where it’s not feasible to instrument a given system.
We can then use Grafana to visually display the data collected by Prometheus. The Prometheus ecosystem consists of several components:
- The Prometheus server which scrapes and stores time series data
- Client libraries for instrumenting application code
- A push gateway for supporting short-lived jobs
- Exporters for exporting data from various open-source projects
- An Alertmanager for all of your alerting needs
We are going to focus on the Prometheus Server, mainly on the installation and configuration. We’ll look at how to scrape pod metrics via annotations and run through various examples to get Prometheus monitoring Kubernetes and various workloads running within the cluster.
If you’re a Helm user, you can install Prometheus via the official Helm chart as shown below. Alternatively, you could leverage the Prometheus Operator, which allows simplified configuration as well as use of the ServiceMonitor and AlertManager CRDs.
The below Helm commands will add the required repositories and install Prometheus.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add kube-state-metrics https://kubernetes.github.io/kube-state-metrics
helm repo update
helm install prometheus prometheus-community/prometheus
The official Prometheus Helm chart doesn’t request a storage class when creating the persistent volume claim, so let’s create some persistent volumes for Prometheus to use.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-alertmanager
spec:
  capacity:
    storage: 2Gi
  volumeMode: Filesystem
  hostPath:
    path: /mnt/prometheus-alertmanager
  accessModes:
    - ReadWriteOnce
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-server
spec:
  capacity:
    storage: 8Gi
  volumeMode: Filesystem
  hostPath:
    path: /mnt/prometheus-server
  accessModes:
    - ReadWriteOnce
Depending on the use case and scale of your Prometheus cluster, hostPath volumes may not be sufficient; they are, however, perfect for testing and validating that Prometheus is up and running correctly.
If you hit the following error after installation:
"Error opening query log file" file=/data/queries.active
It will be due to the volume permissions. You can either change the security context values when installing Prometheus so they match the permissions on your volume, or change the volume permissions to match Prometheus.
The volumes we created above are hostPath volumes; to change the permissions, simply log onto the host that contains the volume and update the directory owner as below:
chown -R 65534:65534 /mnt/prometheus-server/
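If you prefer the security context route instead, you can override the chart’s security context so the containers run as a user that matches your volume ownership (65534 is the “nobody” user the Prometheus image runs as). Below is a sketch of a values.yml fragment; the exact key names can vary between chart versions, so verify them against `helm show values` before relying on them:

```yaml
# values.yml fragment -- run the server and Alertmanager as uid/gid 65534
# ("nobody"), matching the ownership set on the hostPath volumes above.
# Key names are assumptions; confirm with `helm show values` for your chart version.
server:
  securityContext:
    runAsUser: 65534
    runAsGroup: 65534
    fsGroup: 65534
alertmanager:
  securityContext:
    runAsUser: 65534
    runAsGroup: 65534
    fsGroup: 65534
```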
To get all of the configuration options, you can run:
helm show values prometheus-community/prometheus > values.yml
You can then customise the various aspects of the configuration and install with the -f option.
helm install prometheus prometheus-community/prometheus -f values.yml
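As an illustration of the kind of overrides you might make, the fragment below adjusts metric retention and the persistent volume sizes to match the volumes created earlier. The key names follow the prometheus-community chart at the time of writing, but treat them as assumptions and check them against the output of `helm show values`:

```yaml
# values.yml fragment -- example overrides (verify key names with
# `helm show values prometheus-community/prometheus`).
server:
  retention: 30d          # how long to keep metrics
  persistentVolume:
    size: 8Gi             # matches the prometheus-server PV created earlier
alertmanager:
  persistentVolume:
    size: 2Gi             # matches the prometheus-alertmanager PV
```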
Certain values are populated in the prometheus-server ConfigMap (for instance, the Alertmanager configuration and alerting rules) and then mounted into the required pods.
By default, the above Helm chart will automatically install kube-state-metrics. To disable this dependency during installation, set kubeStateMetrics.enabled to false:
helm install prometheus prometheus-community/prometheus --set kubeStateMetrics.enabled=false
You could also achieve this by setting kubeStateMetrics.enabled to false in values.yml.
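For reference, the equivalent values.yml fragment looks like the following (note that newer versions of the chart may use a different key for the kube-state-metrics dependency, so check `helm show values` for your version):

```yaml
# values.yml fragment -- disable the kube-state-metrics dependency
kubeStateMetrics:
  enabled: false
```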
Scraping Pod Metrics via Annotations
To scrape pod metrics, you must add annotations to the pods, as in the example below:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: /metrics
    prometheus.io/port: "8080"
The prometheus.io/path and prometheus.io/port annotations are determined by how the pod is serving the metrics.
The values for prometheus.io/scrape and prometheus.io/port need to be enclosed in double quotes, since Kubernetes annotation values must be strings.
Exporters are useful for cases where it’s not feasible to instrument a given system with Prometheus directly. You’ll find that some exporters are officially maintained by the Prometheus GitHub organisation, whereas others are maintained by third parties.
Below is an example of the Redis Exporter running as a sidecar alongside Redis.
---
apiVersion: v1
kind: Namespace
metadata:
  name: redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: redis
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9121"
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:4
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
          ports:
            - containerPort: 6379
        - name: redis-exporter
          image: oliver006/redis_exporter:latest
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
          ports:
            - containerPort: 9121
Once applied, you’ll see the Redis target up along with Redis metrics in Prometheus.
There are two ways to access the Prometheus dashboard: you can expose the Service and connect via your ingress controller, or connect using kubectl port-forward.
To use port forwarding, run kubectl get pods to find the pod name for the Prometheus server, or use your preferred method to get this information.
Once you know the Prometheus pod name you can then run:
kubectl port-forward <pod name> 9090:9090
This will make the Prometheus dashboard accessible on localhost:9090.
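With the port-forward in place, you can run a few PromQL queries in the dashboard’s expression browser to confirm scraping is working. The first two metrics are standard Prometheus metrics; redis_connected_clients assumes the Redis exporter deployment from earlier is running:

```promql
# Targets currently being scraped (1 = up, 0 = down)
up

# How long each scrape took, per target
scrape_duration_seconds

# A metric exposed by the Redis exporter sidecar deployed earlier
redis_connected_clients
```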
Your monitoring strategy will largely depend on your organisational requirements. There are plenty of paid and open-source solutions that achieve what Prometheus (and Grafana) can offer. What’s currently in operational use elsewhere within the business might be a quick and easy win, but it’s not necessarily the best solution for monitoring Kubernetes.