koord-scheduler: fix a bug in spec.scheduleTimeoutSeconds nil pointer dereference #985

Merged
koordinator-bot[bot] merged 1 commit into koordinator-sh:main from Syulin7:fix_panic on Feb 6, 2023

Syulin7 commented Feb 5, 2023

Signed-off-by: Syulin7 <735122171@qq.com>

Ⅰ. Describe what this PR does

Fix a nil pointer dereference in koord-scheduler that occurs when a PodGroup does not set spec.scheduleTimeoutSeconds.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

Create a PodGroup without setting spec.scheduleTimeoutSeconds:

apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: gang-example
  namespace: default
spec:
  minMember: 2

koord-scheduler then panics with a nil pointer dereference:

I0129 09:56:43.432556       1 gang_cache.go:69] getGangFromCache create new gang, gang: kubeflow/tfjob-simple
I0129 09:56:43.432563       1 gang_cache.go:202] Create gang by podGroup on add, gangName: kubeflow/tfjob-simple
E0129 09:56:43.432579       1 gang.go:176] podGroup's annotation totalNumber illegal, gangName: kubeflow/tfjob-simple, value:
E0129 09:56:43.432584       1 gang.go:188] podGroup's annotation GangModeAnnotation illegal, gangName: kubeflow/tfjob-simple, value:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1a0d2ec]

goroutine 1 [running]:
github.com/koordinator-sh/koordinator/pkg/scheduler/plugins/coscheduling/core.(*Gang).tryInitByPodGroup(0xc000427680, 0xc00066c600, 0xc000b7d140)
	/home/runner/work/koordinator/koordinator/pkg/scheduler/plugins/coscheduling/core/gang.go:197 +0x5cc
github.com/koordinator-sh/koordinator/pkg/scheduler/plugins/coscheduling/core.(*GangCache).onPodGroupAdd(0xc0005608c0, {0x1f62b80, 0xc00066c600})
	/home/runner/work/koordinator/koordinator/pkg/scheduler/plugins/coscheduling/core/gang_cache.go:205 +0x2e5
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
	/home/runner/go/pkg/mod/k8s.io/client-go@v0.22.6/tools/cache/controller.go:231
github.com/koordinator-sh/koordinator/pkg/scheduler/frameworkext/helper.(*forceSyncEventHandler).addDirectly(0xc000b0b940, {0x1f62b80, 0xc00066c600})
	/home/runner/work/koordinator/koordinator/pkg/scheduler/frameworkext/helper/synced_eventhandler.go:94 +0x4a
github.com/koordinator-sh/koordinator/pkg/scheduler/frameworkext/helper.ForceSyncFromInformer(0x22a81d0, {0x7fcd25b48b30, 0xc000560870}, {0x7fcd25c472a8, 0xc000835180}, {0x22a81d0, 0xc000b5abe8})
	/home/runner/work/koordinator/koordinator/pkg/scheduler/frameworkext/helper/synced_eventhandler.go:120 +0x199
github.com/koordinator-sh/koordinator/pkg/scheduler/plugins/coscheduling/core.NewPodGroupManager({0x2297740, 0xc000bb25f0}, {0x22d4a18, 0xc000560870}, {0x2304f28, 0xc0000ba960}, 0xc000b7d140)
	/home/runner/work/koordinator/koordinator/pkg/scheduler/plugins/coscheduling/core/core.go:112 +0x4a8
github.com/koordinator-sh/koordinator/pkg/scheduler/plugins/coscheduling.New({0x228fb08, 0xc000b7d140}, {0x7fcd25c46168, 0xc0008d96e0})
	/home/runner/work/koordinator/koordinator/pkg/scheduler/plugins/coscheduling/coscheduling.go:83 +0x21e
github.com/koordinator-sh/koordinator/pkg/scheduler/frameworkext.PluginFactoryProxy.func1({0x228fb08, 0xc000b7d140}, {0x23045f8, 0xc0004c8e00})
	/home/runner/work/koordinator/koordinator/pkg/scheduler/frameworkext/framework_extender.go:261 +0xf7
k8s.io/kubernetes/pkg/scheduler/framework/runtime.NewFramework(0xc000b4b560, 0xc000c4d250, {0xc0005b0420, 0xc, 0x40da27})
	/home/runner/go/pkg/mod/k8s.io/kubernetes@v1.22.6/pkg/scheduler/framework/runtime/framework.go:339 +0x98b
k8s.io/kubernetes/pkg/scheduler/profile.newProfile({{0xc000059690, 0xf}, 0xc000899200, {0xc00019d800, 0xc, 0xc}}, 0xa, 0x0, {0xc0008c3688, 0xb, ...})
	/home/runner/go/pkg/mod/k8s.io/kubernetes@v1.22.6/pkg/scheduler/profile/profile.go:41 +0x138
k8s.io/kubernetes/pkg/scheduler/profile.NewMap({0xc000b4b620, 0x1, 0x0}, 0x1ca3520, 0xc0008c3308, {0xc000c4d688, 0xb, 0xb})
	/home/runner/go/pkg/mod/k8s.io/kubernetes@v1.22.6/pkg/scheduler/profile/profile.go:61 +0x1bd
k8s.io/kubernetes/pkg/scheduler.(*Configurator).create(0xc000c4d890)
	/home/runner/go/pkg/mod/k8s.io/kubernetes@v1.22.6/pkg/scheduler/factory.go:151 +0xbf3
k8s.io/kubernetes/pkg/scheduler.New({0x2311910, 0xc000c3cc60}, {0x2304f28, 0xc0000ba960}, 0xc0001d5ac0, 0x65822a, {0xc0008c3b68, 0xb, 0x2})
	/home/runner/go/pkg/mod/k8s.io/kubernetes@v1.22.6/pkg/scheduler/scheduler.go:280 +0x64b
github.com/koordinator-sh/koordinator/cmd/koord-scheduler/app.Setup({0x22b22f0, 0xc00062b1c0}, 0xc00000f158, {0x3469a20, 0x1, 0x1}, {0xc00062a700, 0x8, 0x0})
	/home/runner/work/koordinator/koordinator/cmd/koord-scheduler/app/server.go:357 +0x7a5
github.com/koordinator-sh/koordinator/cmd/koord-scheduler/app.runCommand(0x0, 0x0, {0x3469a20, 0x1, 0x1}, {0xc00062a700, 0x8, 0x8})
	/home/runner/work/koordinator/koordinator/cmd/koord-scheduler/app/server.go:129 +0x149
github.com/koordinator-sh/koordinator/cmd/koord-scheduler/app.NewSchedulerCommand.func1(0xc00079ef00, {0xc00011bce0, 0x6, 0x6})
	/home/runner/work/koordinator/koordinator/cmd/koord-scheduler/app/server.go:84 +0x39
github.com/spf13/cobra.(*Command).execute(0xc00079ef00, {0xc000120010, 0x6, 0x6})
	/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:920 +0x827
github.com/spf13/cobra.(*Command).ExecuteC(0xc00079ef00)
	/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044 +0x3cd
github.com/spf13/cobra.(*Command).Execute(...)
	/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968
main.main()
	/home/runner/work/koordinator/koordinator/cmd/koord-scheduler/main.go:83 +0xf9

Ⅳ. Special notes for reviews

koordinator version: 1.1.1

Ⅴ. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

codecov bot commented Feb 5, 2023

Codecov Report

Base: 67.48% // Head: 67.51% // Increases project coverage by +0.03% 🎉

Coverage data is based on head (78a662d) compared to base (02a98d4).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #985      +/-   ##
==========================================
+ Coverage   67.48%   67.51%   +0.03%     
==========================================
  Files         241      241              
  Lines       27991    27982       -9     
==========================================
+ Hits        18889    18893       +4     
+ Misses       7801     7789      -12     
+ Partials     1301     1300       -1     
Flag         Coverage Δ
unittests    67.51% <100.00%> (+0.03%) ⬆️

Impacted Files                                        Coverage Δ
pkg/scheduler/plugins/coscheduling/core/gang.go       78.23% <100.00%> (+3.08%) ⬆️
pkg/util/httputil/reverseproxy.go                     84.84% <0.00%> (-0.27%) ⬇️
pkg/koordlet/runtimehooks/reconciler/reconciler.go    67.05% <0.00%> (+1.17%) ⬆️

Syulin7 force-pushed the fix_panic branch 2 times, most recently from 024bef1 to e7ca9f5 on February 5, 2023 08:21

Syulin7 commented Feb 6, 2023

@eahydra PTAL, Thanks.

Syulin7 force-pushed the fix_panic branch 2 times, most recently from a743a00 to a94a087 on February 6, 2023 05:53
Syulin7 force-pushed the fix_panic branch 3 times, most recently from 8b46561 to 854a71c on February 6, 2023 08:27

eahydra left a comment

/lgtm
/approve

@koordinator-bot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eahydra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

koordinator-bot[bot] merged commit 65849ae into koordinator-sh:main on Feb 6, 2023
Syulin7 deleted the fix_panic branch on February 6, 2023 09:36
eahydra added this to the v1.2 milestone on Feb 7, 2023