Skip to content

koordlet: add libpfm4&perf group#1554

Merged
koordinator-bot[bot] merged 8 commits intokoordinator-sh:mainfrom
bowen-intel:libpfm4
Sep 14, 2023
Merged

koordlet: add libpfm4&perf group#1554
koordinator-bot[bot] merged 8 commits intokoordinator-sh:mainfrom
bowen-intel:libpfm4

Conversation

@bowen-intel
Copy link
Contributor

@bowen-intel bowen-intel commented Aug 17, 2023

Ⅰ. Describe what this PR does

This pull request enhances the precision of CPI calculation through the utilization of the perf group. Additionally, it introduces the integration of libpfm4 as an event parser, thereby expanding the potential to incorporate additional TMA analyses in subsequent developments.

Ⅱ. Does this pull request fix one issue?

fixes #1476 #1470

Ⅲ. Describe how to verify it

Use Prometheus.

Ⅳ. Special notes for reviews

This pr uses unsafe.Pointer to mock syscall.Syscall6 in unit test, so I have to add unsafeptr tag for go vet tool and ignore github.com/koordinator-sh/koordinator/pkg/koordlet/util/perf package in test since it use -race tag.
Any comments is welcome.

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

@eahydra eahydra changed the title feat: add libpfm4&perf group koordlet: add libpfm4&perf group Aug 17, 2023
@bowen-intel bowen-intel force-pushed the libpfm4 branch 3 times, most recently from cb88d52 to 6afb121 Compare August 17, 2023 07:42
@codecov
Copy link

codecov bot commented Aug 17, 2023

Codecov Report

Patch coverage: 55.83% and project coverage change: +0.53% 🎉

Comparison is base (3fcac00) 65.19% compared to head (b24b776) 65.72%.
Report is 53 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1554      +/-   ##
==========================================
+ Coverage   65.19%   65.72%   +0.53%     
==========================================
  Files         352      376      +24     
  Lines       36326    39842    +3516     
==========================================
+ Hits        23681    26186    +2505     
- Misses      10901    11745     +844     
- Partials     1744     1911     +167     
Flag Coverage Δ
unittests 65.72% <55.83%> (+0.53%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
pkg/features/koordlet_features.go 100.00% <ø> (ø)
pkg/koordlet/util/perf/perf_linux.go 46.73% <0.00%> (ø)
...lectors/performance/performance_collector_linux.go 52.83% <21.42%> (-3.17%) ⬇️
pkg/koordlet/util/stat.go 52.85% <45.00%> (-22.15%) ⬇️
pkg/koordlet/util/perf_group/perf_group_linux.go 62.84% <62.84%> (ø)

... and 99 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bowen-intel bowen-intel force-pushed the libpfm4 branch 3 times, most recently from cdd92e7 to 8c74079 Compare August 18, 2023 09:40
Signed-off-by: bowen-intel <bowen.song@intel.com>
@bowen-intel bowen-intel force-pushed the libpfm4 branch 2 times, most recently from af7b27f to c4e3a75 Compare August 24, 2023 07:57
@bowen-intel bowen-intel marked this pull request as draft August 24, 2023 08:08
Signed-off-by: bowen-intel <bowen.song@intel.com>
@bowen-intel
Copy link
Contributor Author

bowen-intel commented Aug 28, 2023

Make a A/B test for two method:

workload

100 nginx

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  namespace: nginx
  labels:
    run: my-nginx
spec:
  replicas: 100
  selector:
    matchLabels:
      run: my-nginx
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      nodeName: inspur-icx-1
      containers:
        - name: my-nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  namespace: nginx
  labels:
    run: my-nginx
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    run: my-nginx

environment

CPU Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
Memory 186Gi
Ubuntu 22.04
containerd github.com/containerd/containerd v1.6.8 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc version 1.1.4
koordinator(based on v1.3.0)
kubernetes v1.27.2

libpfm4

image

perf-util

image

@bowen-intel
Copy link
Contributor Author

Any comments and questions is welcome! 😀 @zwzhang0107 @songtao98

Copy link
Contributor

@songtao98 songtao98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Signed-off-by: bowen-intel <bowen.song@intel.com>
@zwzhang0107
Copy link
Contributor

Make a A/B test for two method:

workload

100 nginx

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  namespace: nginx
  labels:
    run: my-nginx
spec:
  replicas: 100
  selector:
    matchLabels:
      run: my-nginx
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      nodeName: inspur-icx-1
      containers:
        - name: my-nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  namespace: nginx
  labels:
    run: my-nginx
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    run: my-nginx

environment

CPU Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz Memory 186Gi Ubuntu 22.04 containerd github.com/containerd/containerd v1.6.8 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6 runc version 1.1.4 koordinator(based on v1.3.0) kubernetes v1.27.2

libpfm4

image

perf-util

image

this looks great!

it will be better if we have the comparison on accuracy between perf and libpfm4

Signed-off-by: bowen-intel <bowen.song@intel.com>
Signed-off-by: bowen-intel <bowen.song@intel.com>
@bowen-intel
Copy link
Contributor Author

Make a A/B test for two method:

workload

100 nginx

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  namespace: nginx
  labels:
    run: my-nginx
spec:
  replicas: 100
  selector:
    matchLabels:
      run: my-nginx
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      nodeName: inspur-icx-1
      containers:
        - name: my-nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  namespace: nginx
  labels:
    run: my-nginx
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    run: my-nginx

environment

CPU Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz Memory 186Gi Ubuntu 22.04 containerd github.com/containerd/containerd v1.6.8 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6 runc version 1.1.4 koordinator(based on v1.3.0) kubernetes v1.27.2

libpfm4

image

perf-util

image

this looks great!

it will be better if we have the comparison on accuracy between perf and libpfm4

libpfm4

image cycles: 9694773088 instructions: 2817277063

perf

perf stat -e '{cycles,instructions}' -G xxx.scope -a sleep 10
image

Signed-off-by: bowen-intel <bowen.song@intel.com>
@zwzhang0107
Copy link
Contributor

/lgtm

@koordinator-bot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@koordinator-bot koordinator-bot bot merged commit 876e878 into koordinator-sh:main Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[proposal] Use libpfm4 in perf collector

5 participants