Enable metrics #405

Merged
merged 7 commits into from
Jul 21, 2021

@amisevsk amisevsk commented Apr 30, 2021

What does this PR do?

Enables serving metrics for both the DevWorkspace controller and webhooks server.

Metrics are secured via kube-rbac-proxy, which is the default setup when kubebuilder bootstraps a project. The proxy image can be configured via the env vars KUBE_RBAC_PROXY_IMAGE and OPENSHIFT_RBAC_PROXY_IMAGE. They are separate so that a different image (openshift4/ose-kube-rbac-proxy) can be used on OpenShift.
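
As a rough illustration of the override described above, the two variables could be exported before deploying. This is a sketch only: the image references are example values that do not come from this PR, and it assumes the deploy tooling reads both variables from the environment at make install time.

```shell
# Sketch: choose the kube-rbac-proxy image per platform before deploying.
# Both image references are illustrative assumptions; substitute images
# that your cluster is able to pull.
export KUBE_RBAC_PROXY_IMAGE="${KUBE_RBAC_PROXY_IMAGE:-gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0}"
export OPENSHIFT_RBAC_PROXY_IMAGE="${OPENSHIFT_RBAC_PROXY_IMAGE:-registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7}"

echo "Kubernetes proxy image: ${KUBE_RBAC_PROXY_IMAGE}"
echo "OpenShift proxy image:  ${OPENSHIFT_RBAC_PROXY_IMAGE}"

# Assumption: the deploy targets pick the variables up from the environment.
# make install
```

The `${VAR:-default}` form keeps any value already present in the environment, so a CI job or a developer shell can still override either image.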

Changes to the deployment:

  • Changed generate-deployment.sh to be a little smarter in how it uses envsubst -- we no longer need bak files :)
  • Repurposed the (useless) auth-proxy rbac files to actually apply to the right objects (previously they were tied to nonexistent service accounts)
  • Changed the name of the metrics service from devworkspace-controller-manager-metrics-service to devworkspace-controller-metrics-service and made the selector actually select the controller's deployment.
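
generate-deployment.sh itself is not shown in this conversation, so the following is only a guess at the general technique behind the first bullet. Passing envsubst an explicit variable list and redirecting to a new file means unrelated $ references in a template survive, and no in-place edit (and therefore no backup file) is needed. The function name and the variable list are hypothetical.

```shell
# render_template TEMPLATE OUTPUT
# Substitutes only the listed variables and writes the result to OUTPUT,
# leaving TEMPLATE untouched. Requires envsubst (GNU gettext).
render_template() {
  template="$1"
  output="$2"
  # The single-quoted list is passed literally: envsubst replaces only the
  # variables named there; every other $REFERENCE is copied through as-is.
  envsubst '${NAMESPACE} ${KUBE_RBAC_PROXY_IMAGE}' < "$template" > "$output"
}
```

With NAMESPACE and KUBE_RBAC_PROXY_IMAGE exported, a call such as `render_template deployment.template.yaml deployment.yaml` (file names made up for the example) would fill in those two values only.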

What issues does this PR fix or reference?

Closes #241

Is it tested? How?

make install and wait for everything to be up and running (you may need to make uninstall to get rid of the old auth-proxy service). Once everything is running:

  1. Grant the devworkspace-controller-metrics-reader cluster role to a service account, e.g.
    NAMESPACE=devworkspace-controller
    kubectl create clusterrolebinding dw-metrics --clusterrole=devworkspace-controller-metrics-reader --serviceaccount=${NAMESPACE}:devworkspace-controller-serviceaccount
  2. Get the token for the serviceaccount and store it:
    NAMESPACE=devworkspace-controller
    TOKEN=$(kubectl get secrets -o=json -n ${NAMESPACE} | jq -r '[.items[] | select (.type == "kubernetes.io/service-account-token" and .metadata.annotations."kubernetes.io/service-account.name" == "devworkspace-controller-serviceaccount")][0].data.token' | base64 --decode)
  3. Expose the metrics services on the local network (on OpenShift, you can oc expose svc devworkspace-controller-metrics-service, etc. with tls.termination: passthrough)
    kubectl port-forward -n ${NAMESPACE} service/devworkspace-controller-manager-service 8443:8443 &
    kubectl port-forward -n ${NAMESPACE} service/devworkspace-webhookserver 9443:9443
  4. Use the token from 2. to check metrics:
    curl -k -H "Authorization: Bearer ${TOKEN}" https://localhost:9443/metrics
    curl -k -H "Authorization: Bearer ${TOKEN}" https://localhost:8443/metrics
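
On Kubernetes 1.24 and later, token Secrets are no longer created automatically for service accounts, so the Secret lookup in step 2 may come back empty. The sketch below shows one possible alternative using kubectl create token (kubectl 1.24 or newer), plus two small checks for steps 1 and 4. The helper names are hypothetical and not part of this PR.

```shell
NAMESPACE="${NAMESPACE:-devworkspace-controller}"
SERVICE_ACCOUNT="devworkspace-controller-serviceaccount"

# Request a short-lived token for the service account (kubectl >= 1.24).
get_sa_token() {
  kubectl create token "$SERVICE_ACCOUNT" -n "$NAMESPACE"
}

# Ask the API server whether the service account may GET /metrics,
# i.e. whether the clusterrolebinding from step 1 took effect.
can_read_metrics() {
  kubectl auth can-i get /metrics \
    --as="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
}

# Print the HTTP status returned by a metrics endpoint, e.g.
#   metrics_status https://localhost:8443/metrics "$TOKEN"
metrics_status() {
  curl -k -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $2" "$1"
}
```

A possible flow is `TOKEN=$(get_sa_token)` followed by `metrics_status https://localhost:8443/metrics "$TOKEN"`. A 200 means the proxy accepted the token; kube-rbac-proxy typically answers 401 for a missing or invalid token and 403 when the token is valid but the role binding is missing.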
    

To ingest the metrics locally with prometheus/grafana:

  1. Create file prometheus.yaml
    cat <<EOF > prometheus.yaml
    global:
      scrape_interval:     1s
      evaluation_interval: 1s
    
    scrape_configs:
      - job_name: 'DevWorkspace'
        scheme: https
        authorization:
          type: Bearer
          credentials: ${TOKEN}
        tls_config:
          insecure_skip_verify: true
        static_configs:
        - targets: ['localhost:8443']
    
      - job_name: 'DevWorkspace webhooks'
        scheme: https
        authorization:
          type: Bearer
          credentials: ${TOKEN}
        tls_config:
          insecure_skip_verify: true
        static_configs:
        - targets: ['localhost:9443']
    EOF
  2. Start a Prometheus Docker container with the config file above
    docker run -d --name prometheus \
      --network=host \
      -p 9090:9090 \
      -v "$(pwd)/prometheus.yaml":/etc/prometheus/prometheus.yaml:z \
      prom/prometheus --config.file=/etc/prometheus/prometheus.yaml --log.level=debug
    note: --network=host is required when the scrape targets are on localhost (Docker then ignores -p 9090:9090); otherwise it can be dropped (e.g. for OpenShift routes)
  3. Start Grafana
    docker run -d --name grafana -p 3000:3000 grafana/grafana
  4. Navigate to localhost:3000, log in as admin/admin, add Prometheus as a data source (http://localhost:9090, Access: Browser), and create a new dashboard
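
Before building a dashboard it can help to confirm that Prometheus is up and actually scraping both jobs. The sketch below uses Prometheus' readiness endpoint and its targets API. The function names and the PROM_URL variable are made up for this example, and list_target_health additionally needs jq.

```shell
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Poll Prometheus' readiness endpoint until it answers 200 or we give up.
# The optional first argument is the number of one-second attempts.
wait_for_prometheus() {
  attempts="${1:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "${PROM_URL}/-/ready" || true)
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# List each scrape job with its health ("up", "down" or "unknown").
list_target_health() {
  curl -s "${PROM_URL}/api/v1/targets" |
    jq -r '.data.activeTargets[] | "\(.labels.job)\t\(.health)"'
}
```

Running `wait_for_prometheus && list_target_health` prints one line per target. If a job reports down, the usual suspects are a stopped port-forward or an expired token in prometheus.yaml.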

@openshift-ci-robot
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amisevsk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member

@sleshchenko sleshchenko left a comment

I haven't tested yet but changes LGTM
The only thing that might keep us from merging it is productization, but I think kube-rbac-proxy is common enough that we can use an existing productized container instead of building one ourselves.

@amisevsk
Collaborator Author

Rebased PR onto main and merged metrics-service.yaml into the service.yaml created to support webhooks. Still need to double-check that everything works though.

@JPinkney
Contributor

@amisevsk @sleshchenko Are we going to target this for the 1.0 release?

@amisevsk
Collaborator Author

Are we going to target this for the 1.0 release?

As of now, no. We need to find an OpenShift-suitable image for kube-rbac-proxy (or build such an image ourselves, which I want to avoid). Something like ose-kube-rbac-proxy may be suitable, but pulling that image requires authentication. Ideally we would use whatever rbac-proxy image is built into OpenShift to avoid maintaining our own copy.

amisevsk added 4 commits July 19, 2021 15:01
Enable setting the image used for the kube-rbac-proxy via env vars in
order to allow using a different repository for the proxy on OpenShift
vs Kubernetes.

Signed-off-by: Angel Misevski <[email protected]>
Use OPENSHIFT_RBAC_PROXY_IMAGE env var to set rbac proxy image used in
OLM deployments in similar way to OpenShift.

Signed-off-by: Angel Misevski <[email protected]>
@amisevsk
Collaborator Author

Retesting this PR now: it seems that the ose-kube-rbac-proxy image can be pulled when running on OpenShift (tested on a regular OpenShift 4.7 cluster and in crc).

Latest changes can be tested using the catalogsource below (and installing DWO as an operator)

cat <<EOF | kubectl apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: custom-devworkspace-operator-catalog
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/amisevsk/devworkspace-operator-index@sha256:cfbe26ee7003a15fed3d6a92f45dd30023c020283cc48df98b2bb201972db547
  publisher: Red Hat
  displayName: DevWorkspace Operator Catalog
EOF
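
After applying the CatalogSource, it may be worth checking that OLM can reach the index image before trying to install the operator. A minimal sketch, assuming the resource name and namespace from the manifest above; the function names are hypothetical.

```shell
CATALOG_NAME="custom-devworkspace-operator-catalog"
CATALOG_NAMESPACE="openshift-marketplace"

# Print the last observed gRPC connection state of the catalog source.
catalog_state() {
  kubectl get catalogsource "$CATALOG_NAME" -n "$CATALOG_NAMESPACE" \
    -o jsonpath='{.status.connectionState.lastObservedState}'
}

# Succeeds once OLM reports the catalog as READY.
catalog_ready() {
  [ "$(catalog_state)" = "READY" ]
}
```

For example, `until catalog_ready; do sleep 5; done` blocks until the catalog is usable. A state that never leaves CONNECTING or TRANSIENT_FAILURE usually points at an image pull problem in the catalog pod.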

@amisevsk
Collaborator Author

Fixed two more minor issues

  1. make run/debug were broken since there are multiple containers now (getting env vars didn't work by default)
  2. The webhook server service was never being updated, so testing this PR required a make uninstall first. Should be fixed now.
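
The actual fix lives in the PR diff, not in this thread. As a hedged illustration of the failure mode in item 1: once a proxy sidecar sits next to the controller, reading env vars from the first container of the deployment is no longer reliable, and selecting the container by name is more robust. The deployment and container names below are assumptions, as is the function name.

```shell
NAMESPACE="${NAMESPACE:-devworkspace-controller}"
# Assumed names; check them with: kubectl get deploy -n "$NAMESPACE"
DEPLOYMENT="devworkspace-controller-manager"
CONTAINER="devworkspace-controller"

# Print NAME=VALUE for every env var of the named container.
# Vars defined through valueFrom have no .value and print an empty value.
controller_env() {
  kubectl get deployment "$DEPLOYMENT" -n "$NAMESPACE" \
    -o jsonpath="{range .spec.template.spec.containers[?(@.name==\"${CONTAINER}\")].env[*]}{.name}={.value}{\"\\n\"}{end}"
}
```

The filter expression `[?(@.name=="...")]` picks the container regardless of its position in the pod spec, which is the property that matters once more than one container is present.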

Member

@sleshchenko sleshchenko left a comment

I'm no Grafana master, but I managed to get a graph and it works just fine.
[Screenshot_20210720_173556]

Good job!

Contributor

@JPinkney JPinkney left a comment

Just tested it out and it works for me 👍

@openshift-ci

openshift-ci bot commented Jul 20, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amisevsk, JPinkney, sleshchenko

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [JPinkney,amisevsk,sleshchenko]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@amisevsk
Collaborator Author

/test v7-devworkspaces-operator-e2e, v7-devworkspace-happy-path

@sleshchenko
Member

/test v7-devworkspaces-operator-e2e, v7-devworkspace-happy-path

@sleshchenko
Member

/test v7-devworkspace-happy-path

@openshift-ci

openshift-ci bot commented Jul 21, 2021

@amisevsk: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/v7-devworkspace-happy-path
Commit: cabd423
Details: link
Rerun command: /test v7-devworkspace-happy-path

@amisevsk
Collaborator Author

I believe happy-path test failures are expected due to #484

@amisevsk amisevsk merged commit 0ee61fd into devfile:main Jul 21, 2021
@amisevsk amisevsk deleted the enable-metrics branch July 21, 2021 17:39
@amisevsk amisevsk mentioned this pull request Jul 21, 2021
3 tasks

Successfully merging this pull request may close these issues.

Implement metrics for operator