Operator for monitoring the health of your managed clusters
This document explains how the components of Open Cluster Management Observability come together to deliver multicluster fleet observability. It builds on several open source projects: Grafana, Alertmanager, Thanos, the Observatorium Operator and API Gateway, and Prometheus. It also builds on a few Open Cluster Management projects, namely the Cluster Manager (or Registration Operator) and the Klusterlet. The multicluster-observability operator is the root operator and pulls in everything else that is needed.
Component | Git Repo | Description |
---|---|---|
MCO Operator | multicluster-observability-operator | Operator for monitoring. This is the root repo. Installing by following its README uses or references the code from all the other repos listed below. |
Endpoint Operator | endpoint-metrics-operator | Operator that manages setting up observability and data collection at the managed clusters. |
Observatorium Operator | observatorium-operator | Operator to deploy the Observatorium project. Within Open Cluster Management, this currently means metrics using Thanos. Forked from the main observatorium-operator repo. |
Metrics collector | metrics-collector | Scrapes metrics from Prometheus at managed clusters; an allow-list configures which metrics are collected. |
RBAC Proxy | rbac_query_proxy | Helper service that acts as a multicluster metrics RBAC proxy. |
Grafana | grafana | Grafana repo - for dashboarding and metric analytics. Forked from main grafana repo. |
Dashboard Loader | grafana-dashboard-loader | Sidecar proxy to load grafana dashboards from configmaps. |
Management Ingress | management-ingress | NGINX based ingress controller to serve Open Cluster Management services. |
Observatorium API | observatorium | API Gateway which controls reading, writing of the Observability data to the backend infrastructure. Forked from main observatorium API repo. |
Thanos Ecosystem | kube-thanos | Kubernetes specific configuration for deploying Thanos. The observatorium operator leverages this configuration to deploy the backend Thanos components. |
Prerequisite: the open-cluster-management klusterlet is installed. See Klusterlet for more information.
Note: By default, the API conversion webhook relies on the OpenShift service serving certificate feature to manage its certificate. You can replace it with cert-manager if you want to run the multicluster-observability-operator in a plain Kubernetes cluster.
Use the following quick start commands for building and testing the multicluster-observability-operator:
Check out the multicluster-observability-operator repository.
git clone git@github.com:stolostron/multicluster-observability-operator.git
cd multicluster-observability-operator
Build the multicluster-observability-operator image and push it to a public registry, such as quay.io:
make docker-build docker-push IMG=quay.io/<YOUR_USERNAME_IN_QUAY>/multicluster-observability-operator:latest
Create the open-cluster-management-observability namespace if it doesn't exist:
kubectl create ns open-cluster-management-observability
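Deploy the MinIO example, which provides object storage for the observability components:
kubectl -n open-cluster-management-observability apply -k examples/minio
Deploy the multicluster-observability-operator using the image you built:
make deploy IMG=quay.io/<YOUR_USERNAME_IN_QUAY>/multicluster-observability-operator:latest
Apply the sample MultiClusterObservability custom resource:
kubectl apply -f operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml
Verify that all the components are running:
kubectl -n open-cluster-management-observability get pod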
NAME READY STATUS RESTARTS AGE
minio-79c7ff488d-72h65 1/1 Running 0 9m38s
observability-alertmanager-0 3/3 Running 0 7m17s
observability-alertmanager-1 3/3 Running 0 6m36s
observability-alertmanager-2 3/3 Running 0 6m18s
observability-grafana-85fdc8c48d-j67j6 2/2 Running 0 7m17s
observability-grafana-85fdc8c48d-wnltt 2/2 Running 0 7m17s
observability-observatorium-api-69cfff4c95-bpw5s 1/1 Running 0 7m2s
observability-observatorium-api-69cfff4c95-gbh7b 1/1 Running 0 7m2s
observability-observatorium-operator-5df6b7949c-kbpmp 1/1 Running 0 7m17s
observability-rbac-query-proxy-d44df47c4-9ccdn 2/2 Running 0 7m15s
observability-rbac-query-proxy-d44df47c4-rtcgh 2/2 Running 0 6m50s
observability-thanos-compact-0 1/1 Running 0 7m2s
observability-thanos-query-79c4d9488b-bd5sf 1/1 Running 0 7m3s
observability-thanos-query-79c4d9488b-d7wzt 1/1 Running 0 7m3s
observability-thanos-query-frontend-6fdb5d4946-rgblb 1/1 Running 0 7m3s
observability-thanos-query-frontend-6fdb5d4946-shsz2 1/1 Running 0 7m3s
observability-thanos-query-frontend-memcached-0 2/2 Running 0 7m3s
observability-thanos-query-frontend-memcached-1 2/2 Running 0 6m37s
observability-thanos-query-frontend-memcached-2 2/2 Running 0 6m33s
observability-thanos-receive-controller-6b446c5576-hj6xl 1/1 Running 0 7m3s
observability-thanos-receive-default-0 1/1 Running 0 7m2s
observability-thanos-receive-default-1 1/1 Running 0 6m20s
observability-thanos-receive-default-2 1/1 Running 0 5m50s
observability-thanos-rule-0 2/2 Running 0 7m3s
observability-thanos-rule-1 2/2 Running 0 6m27s
observability-thanos-rule-2 2/2 Running 0 5m56s
observability-thanos-store-memcached-0 2/2 Running 0 7m3s
observability-thanos-store-memcached-1 2/2 Running 0 6m37s
observability-thanos-store-memcached-2 2/2 Running 0 6m33s
observability-thanos-store-shard-0-0 1/1 Running 2 7m3s
observability-thanos-store-shard-1-0 1/1 Running 2 7m3s
observability-thanos-store-shard-2-0 1/1 Running 2 7m3s
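Scanning a listing this long by eye is error-prone. The following is a minimal sketch, not part of the repository, of a helper that reads the pod table and prints any pod that is not fully ready. The function name pods_not_ready is hypothetical, and the sketch assumes the default column order (NAME, READY, STATUS) shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pod` table output on stdin and
# prints the name of every pod that is not Running or whose READY count is
# incomplete (for example 2/3). Returns non-zero if any such pod is found.
pods_not_ready() {
  awk '
    NR == 1 && $1 == "NAME" { next }   # skip the header row if present
    NF >= 3 {
      split($2, r, "/")                # READY column, e.g. "2/3"
      if ($3 != "Running" || r[1] != r[2]) {
        print $1
        bad = 1
      }
    }
    END { exit bad + 0 }
  '
}

# Usage (assumes kubectl is configured for the hub cluster):
#   kubectl -n open-cluster-management-observability get pod | pods_not_ready
```

An empty result with exit status 0 means every listed pod is Running with all containers ready. Pods in other legitimate states, such as Completed, would also be reported by this sketch.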
After a successful deployment, run the following command to check whether you have an OpenShift Container Platform (OCP) cluster as a managed cluster:
kubectl get managedcluster --show-labels
If your managed cluster does not have the vendor=OpenShift label, you can add it manually with this command:
kubectl label managedcluster <managed cluster name> vendor=OpenShift
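With more than a few managed clusters, checking labels by hand gets tedious. The sketch below, which is not part of the repository, reads the --show-labels listing and prints a label command for each cluster that lacks vendor=OpenShift. The function name is hypothetical, and the sketch assumes that NAME is the first column and LABELS is the last column of the listing.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get managedcluster --show-labels`
# output on stdin and prints one `kubectl label` command per cluster whose
# LABELS column does not contain vendor=OpenShift.
# Assumption: NAME is the first field and LABELS is the last field.
missing_vendor_label_cmds() {
  awk '
    NR == 1 && $1 == "NAME" { next }   # skip the header row if present
    NF >= 2 {
      n = split($NF, labels, ",")      # labels are comma-separated key=value
      found = 0
      for (i = 1; i <= n; i++)
        if (labels[i] == "vendor=OpenShift") found = 1
      if (!found)
        printf "kubectl label managedcluster %s vendor=OpenShift\n", $1
    }
  '
}

# Usage (assumes kubectl is configured for the hub cluster):
#   kubectl get managedcluster --show-labels | missing_vendor_label_cmds
```

The helper only prints commands so that you can review them before running anything. A cluster that already carries a different vendor value would additionally need kubectl's --overwrite flag.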
Once the label is in place, the metrics-collector pod should be running:
kubectl -n open-cluster-management-addon-observability get pod
endpoint-observability-operator-5c95cb9df9-4cphg 1/1 Running 0 97m
metrics-collector-deployment-6c7c8f9447-brpjj 1/1 Running 0 96m
Expose the Thanos query frontend through a route by running this command:
cat << EOF | kubectl -n open-cluster-management-observability apply -f -
kind: Route
apiVersion: route.openshift.io/v1
metadata:
  name: query-frontend
spec:
  port:
    targetPort: http
  wildcardPolicy: None
  to:
    kind: Service
    name: observability-thanos-query-frontend
EOF
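If you need to expose another service in the same way, the manifest can be generated from parameters. This is a minimal sketch rather than something the repository provides: the function name route_manifest is hypothetical, and its defaults mirror the route name and service used above.

```shell
#!/bin/sh
# Hypothetical helper: prints a Route manifest for the given route name ($1)
# and target service ($2). Both arguments are optional and default to the
# values used in this guide. YAML indentation is significant, so the
# here-document body is kept at column 0.
route_manifest() {
  name=${1:-query-frontend}
  service=${2:-observability-thanos-query-frontend}
  cat <<EOF
kind: Route
apiVersion: route.openshift.io/v1
metadata:
  name: ${name}
spec:
  port:
    targetPort: http
  wildcardPolicy: None
  to:
    kind: Service
    name: ${service}
EOF
}

# Usage (assumes kubectl is configured for the hub cluster):
#   route_manifest | kubectl -n open-cluster-management-observability apply -f -
```

Printing the manifest first also lets you inspect it before piping it to kubectl.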
To access the Thanos query UI in a browser, enter the host returned by oc get route -n open-cluster-management-observability query-frontend. Metrics should be returned when you search for one of the collected metrics. The available metrics are listed here.
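The same check can be done from a terminal, because Thanos serves a Prometheus-compatible HTTP query API. The sketch below only builds the query URL. The function name query_url is hypothetical, the metric name up is a placeholder that may not be in your allow-list, and plain http is assumed because the route above defines no TLS settings.

```shell
#!/bin/sh
# Hypothetical helper: builds an instant-query URL for a Prometheus-compatible
# HTTP API. $1 is the route host, $2 is the PromQL expression.
# No URL encoding is performed, so keep the expression to simple characters
# or let curl encode it (see the usage notes).
query_url() {
  printf 'http://%s/api/v1/query?query=%s\n' "$1" "$2"
}

# Usage (assumes oc and curl are available and the route exists; `up` is a
# placeholder metric name):
#   host=$(oc -n open-cluster-management-observability get route query-frontend -o jsonpath='{.spec.host}')
#   curl -s "$(query_url "$host" up)"
# For expressions with special characters, curl can do the encoding:
#   curl -sG "http://$host/api/v1/query" --data-urlencode 'query=<expression>'
```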
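To uninstall, first delete the MultiClusterObservability custom resource:
kubectl -n open-cluster-management-observability delete -f operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml
Remove the multicluster-observability-operator:
make undeploy
Delete the MinIO example:
kubectl -n open-cluster-management-observability delete -k examples/minio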
Delete the open-cluster-management-observability namespace:
kubectl delete ns open-cluster-management-observability
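The uninstall commands are order-sensitive, so it can help to wrap them in one function with a dry-run mode. This is a sketch under assumptions, not a script shipped with the repository: the names run, uninstall_observability, and DRY_RUN are hypothetical, and unlike running the commands one by one it stops at the first failure.

```shell
#!/bin/sh
# Hypothetical wrapper around the uninstall commands listed above.
NS=open-cluster-management-observability
CR=operators/multiclusterobservability/config/samples/observability_v1beta2_multiclusterobservability.yaml

# Runs the given command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Deletes the CR, the operator, MinIO, and finally the namespace, in that
# order. The && chain stops at the first command that fails.
uninstall_observability() {
  run kubectl -n "$NS" delete -f "$CR" &&
  run make undeploy &&
  run kubectl -n "$NS" delete -k examples/minio &&
  run kubectl delete ns "$NS"
}

# Usage, from the repository root:
#   DRY_RUN=1; uninstall_observability   # print the commands only
#   DRY_RUN=0; uninstall_observability   # actually run them
```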