feat: KEP-2437 - PodGroup Creation for Volcano Scheduler #2729
Merged: google-oss-prow merged 77 commits into kubeflow:master from Doris-xm:volcano-podgroup-build on Oct 6, 2025
Changes from 72 commits
Commits
8867e71 feat: api for volcano scheduling plugin (Doris-xm)
0cd20fc feat: init volcano-plugin (Doris-xm)
2e911ef feat: init test file (Doris-xm)
ed0425f feat: register volcano plugin (Doris-xm)
fca9f40 feat: deal with minTaskMember, minMember, NetworkTopo (Doris-xm)
8ab5d71 fix: calculate of minResource (Doris-xm)
bd60987 test: build PodGroup test (Doris-xm)
ec6c5a8 refactor: separate to 2 prs(build&handler) (Doris-xm)
f182b65 test: add test for new&reconcile_builder (Doris-xm)
ffa27f9 fix: typo (Doris-xm)
f8ea7dd Merge branch 'refs/heads/master' into volcano-podgroup-build (Doris-xm)
4fa3b6a fix: trainer/v2 import (Doris-xm)
1a247da fix: networktopo type (Doris-xm)
73bb476 fix: OpenAPI validation errors (Doris-xm)
b92383a fix: remove minTaskMembers (Doris-xm)
fe6ffd0 test: test coverage 100% (Doris-xm)
1c33eba Merge branch 'refs/heads/master' into volcano-podgroup-build (Doris-xm)
7cbad55 feat: update apis (Doris-xm)
cdd9309 feat: replace testify (Doris-xm)
8e68bba fix: registry Volcano CRDs to the scheme (Doris-xm)
79618cb fix: add volcano to scheme (Doris-xm)
8814111 fix: fix networktopo schema (Doris-xm)
114f11b fix: add networktopo spec in trainer (Doris-xm)
c8fa0fd fix: unit test (Doris-xm)
f8d8912 feat: import networkTopo directly (Doris-xm)
c084ac8 fix: make generate (Doris-xm)
f65d7a7 Merge branch 'refs/heads/master' into volcano-podgroup-build (Doris-xm)
aa90695 fix: make generate (Doris-xm)
83c0585 fix: golangci-lint (Doris-xm)
6f24588 fix: golangci-lint (Doris-xm)
d2fd159 feat: add volcano installation in integration test (Doris-xm)
e95a158 Merge branch 'refs/heads/master' into volcano-podgroup-build (Doris-xm)
d582a81 fix: filter volcano api (Doris-xm)
fe19174 Merge branch 'refs/heads/master' into volcano-podgroup-build (Doris-xm)
deafc5d fix: get volcano.podgroup with local version (Doris-xm)
d8bae4a fix: init test env with volcano podgroup installed (Doris-xm)
cf578a2 fix: check plugin in enforcePodgroupPolicy (Doris-xm)
a332de4 fix: group-name label in unit test (Doris-xm)
a4f09e6 fix: ReconcilerBuilders (Doris-xm)
71996b6 feat: add PodGroupHandler (Doris-xm)
99e0462 feat: unit test for handlers (Doris-xm)
583b1d6 fix: group name annotation (Doris-xm)
e60a5a9 Update hack/swagger/main.go (Doris-xm)
ef25760 fix: no need to delete RBAC (Doris-xm)
28bcd41 Update pkg/runtime/framework/plugins/volcano/indexer.go (Doris-xm)
e2b1d89 fix: nil checking for trainjob (Doris-xm)
d037828 Update pkg/runtime/framework/plugins/volcano/volcano.go (Doris-xm)
aaf47a2 Merge remote-tracking branch 'origin/volcano-podgroup-build' into vol… (Doris-xm)
9c1c5e0 fix: make generate (Doris-xm)
8497e6b fix: index conflict (Doris-xm)
67bbedb Update pkg/runtime/framework/plugins/coscheduling/coscheduling.go (Doris-xm)
04aa3e8 fix: update volcano to v1.12.2 (Doris-xm)
e044c4f Merge remote-tracking branch 'origin/volcano-podgroup-build' into vol… (Doris-xm)
167a595 feat: re-use indexer (Doris-xm)
d52581e feat: add validate (Doris-xm)
a8be8ca Merge branch 'refs/heads/master' into volcano-podgroup-build (Doris-xm)
8756dc7 fix: no scheduler when coscheduling is nil (Doris-xm)
7ba807e fix: put group-name in annotations (Doris-xm)
6075a0c feat: validate if priorityClass installed (Doris-xm)
132bb16 feat: propagate annotations to pod (Doris-xm)
7065606 feat: integration test for volcano (Doris-xm)
e6c7646 fix: golangci-lint check (Doris-xm)
2eb8629 feat: use shared indexer (Doris-xm)
4ede544 feat: remove indexer to runtime/ (Doris-xm)
ef69e38 Update hack/swagger/main.go (Doris-xm)
63393c5 Update hack/swagger/main.go (Doris-xm)
2fad831 fix: append owner reference & missing import (Doris-xm)
f0e5a4c fix: rewrite volcano UT (Doris-xm)
9c1eae1 feat: add copyright (Doris-xm)
e185049 fix: sync RBAC to Helm charts (Doris-xm)
497a0b6 fix: refactor UTs (Doris-xm)
4030ce5 fix: test validation separately (Doris-xm)
c886583 Update hack/swagger/main.go (Doris-xm)
451bf6e fix: refactor TestVolcano (Doris-xm)
49bd5b9 Merge remote-tracking branch 'origin/volcano-podgroup-build' into vol… (Doris-xm)
4175fef fix: refactor TestValidate (Doris-xm)
1f05304 Update pkg/runtime/framework/plugins/volcano/volcano_test.go (Doris-xm)
89 changes: 89 additions & 0 deletions
api/python_api/kubeflow_trainer_api/models/scheduling_v1beta1_network_topology_spec.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,89 @@ |
| | | # coding: utf-8 |
| | | |
| | | """ |
| | |     Kubeflow Trainer OpenAPI Spec |
| | | |
| | |     No description provided (generated by Openapi Generator https://github.com/openapitools/openapi-generator) |
| | | |
| | |     The version of the OpenAPI document: unversioned |
| | |     Generated by OpenAPI Generator (https://openapi-generator.tech) |
| | | |
| | |     Do not edit the class manually. |
| | | """ # noqa: E501 |
| | | |
| | | |
| | | from __future__ import annotations |
| | | import pprint |
| | | import re # noqa: F401 |
| | | import json |
| | | |
| | | from pydantic import BaseModel, ConfigDict, Field, StrictInt, StrictStr |
| | | from typing import Any, ClassVar, Dict, List, Optional |
| | | from typing import Optional, Set |
| | | from typing_extensions import Self |
| | | |
| | | class SchedulingV1beta1NetworkTopologySpec(BaseModel): |
| | |     """ |
| | |     SchedulingV1beta1NetworkTopologySpec |
| | |     """ # noqa: E501 |
| | |     highest_tier_allowed: Optional[StrictInt] = Field(default=None, description="HighestTierAllowed specifies the highest tier that a job allowed to cross when scheduling.", alias="highestTierAllowed") |
| | |     mode: Optional[StrictStr] = Field(default=None, description="Mode specifies the mode of the network topology constrain.") |
| | |     __properties: ClassVar[List[str]] = ["highestTierAllowed", "mode"] |
| | | |
| | |     model_config = ConfigDict( |
| | |         populate_by_name=True, |
| | |         validate_assignment=True, |
| | |         protected_namespaces=(), |
| | |     ) |
| | | |
| | | |
| | |     def to_str(self) -> str: |
| | |         """Returns the string representation of the model using alias""" |
| | |         return pprint.pformat(self.model_dump(by_alias=True)) |
| | | |
| | |     def to_json(self) -> str: |
| | |         """Returns the JSON representation of the model using alias""" |
| | |         # TODO: pydantic v2: use .model_dump_json(by_alias=True, exclude_unset=True) instead |
| | |         return json.dumps(self.to_dict()) |
| | | |
| | |     @classmethod |
| | |     def from_json(cls, json_str: str) -> Optional[Self]: |
| | |         """Create an instance of SchedulingV1beta1NetworkTopologySpec from a JSON string""" |
| | |         return cls.from_dict(json.loads(json_str)) |
| | | |
| | |     def to_dict(self) -> Dict[str, Any]: |
| | |         """Return the dictionary representation of the model using alias. |
| | | |
| | |         This has the following differences from calling pydantic's |
| | |         `self.model_dump(by_alias=True)`: |
| | | |
| | |         * `None` is only added to the output dict for nullable fields that |
| | |           were set at model initialization. Other fields with value `None` |
| | |           are ignored. |
| | |         """ |
| | |         excluded_fields: Set[str] = set([ |
| | |         ]) |
| | | |
| | |         _dict = self.model_dump( |
| | |             by_alias=True, |
| | |             exclude=excluded_fields, |
| | |             exclude_none=True, |
| | |         ) |
| | |         return _dict |
| | | |
| | |     @classmethod |
| | |     def from_dict(cls, obj: Optional[Dict[str, Any]]) -> Optional[Self]: |
| | |         """Create an instance of SchedulingV1beta1NetworkTopologySpec from a dict""" |
| | |         if obj is None: |
| | |             return None |
| | | |
| | |         if not isinstance(obj, dict): |
| | |             return cls.model_validate(obj) |
| | | |
| | |         _obj = cls.model_validate({ |
| | |             "highestTierAllowed": obj.get("highestTierAllowed"), |
| | |             "mode": obj.get("mode") |
| | |         }) |
| | |         return _obj |
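
The generated `SchedulingV1beta1NetworkTopologySpec` model reads and writes camelCase keys (`highestTierAllowed`, `mode`) and leaves `None` fields out of `to_dict()`. The following is a minimal sketch of that round-trip contract only. It uses a hypothetical stdlib dataclass, `NetworkTopologySpecSketch`, as a stand-in so it runs without pydantic or the `kubeflow_trainer_api` package; the values `"hard"` and `"soft"` are illustrative assumptions, not taken from this diff.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class NetworkTopologySpecSketch:
    """Hypothetical stand-in for SchedulingV1beta1NetworkTopologySpec (not the generated class)."""

    highest_tier_allowed: Optional[int] = None
    mode: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        # Emit the camelCase aliases and skip fields whose value is None.
        raw = {"highestTierAllowed": self.highest_tier_allowed, "mode": self.mode}
        return {key: value for key, value in raw.items() if value is not None}

    @classmethod
    def from_dict(cls, obj: Optional[Dict[str, Any]]) -> Optional["NetworkTopologySpecSketch"]:
        # Mirror the generated code: None in, None out; only known keys are read.
        if obj is None:
            return None
        return cls(highest_tier_allowed=obj.get("highestTierAllowed"), mode=obj.get("mode"))


spec = NetworkTopologySpecSketch.from_dict({"mode": "hard", "highestTierAllowed": 2})
print(spec.to_dict())
print(NetworkTopologySpecSketch(mode="soft").to_dict())
```

With the real generated class, the same calls (`from_dict`, `to_dict`) would additionally run pydantic validation on assignment, which this sketch does not attempt to reproduce.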
91 changes: 91 additions & 0 deletions
...ython_api/kubeflow_trainer_api/models/trainer_v1alpha1_volcano_pod_group_policy_source.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,91 @@ |
| | | # coding: utf-8 |
| | | |
| | | """ |
| | |     Kubeflow Trainer OpenAPI Spec |
| | | |
| | |     No description provided (generated by Openapi Generator https://github.com/openapitools/openapi-generator) |
| | | |
| | |     The version of the OpenAPI document: unversioned |
| | |     Generated by OpenAPI Generator (https://openapi-generator.tech) |
| | | |
| | |     Do not edit the class manually. |
| | | """ # noqa: E501 |
| | | |
| | | |
| | | from __future__ import annotations |
| | | import pprint |
| | | import re # noqa: F401 |
| | | import json |
| | | |
| | | from pydantic import BaseModel, ConfigDict, Field |
| | | from typing import Any, ClassVar, Dict, List, Optional |
| | | from kubeflow_trainer_api.models.scheduling_v1beta1_network_topology_spec import SchedulingV1beta1NetworkTopologySpec |
| | | from typing import Optional, Set |
| | | from typing_extensions import Self |
| | | |
| | | class TrainerV1alpha1VolcanoPodGroupPolicySource(BaseModel): |
| | |     """ |
| | |     VolcanoPodGroupPolicySource represents configuration for the Volcano gang-scheduler. |
| | |     """ # noqa: E501 |
| | |     network_topology: Optional[SchedulingV1beta1NetworkTopologySpec] = Field(default=None, description="NetworkTopology defines the NetworkTopology config, this field works in conjunction with network topology feature and hyperNode CRD.", alias="networkTopology") |
| | |     __properties: ClassVar[List[str]] = ["networkTopology"] |
| | | |
| | |     model_config = ConfigDict( |
| | |         populate_by_name=True, |
| | |         validate_assignment=True, |
| | |         protected_namespaces=(), |
| | |     ) |
| | | |
| | | |
| | |     def to_str(self) -> str: |
| | |         """Returns the string representation of the model using alias""" |
| | |         return pprint.pformat(self.model_dump(by_alias=True)) |
| | | |
| | |     def to_json(self) -> str: |
| | |         """Returns the JSON representation of the model using alias""" |
| | |         # TODO: pydantic v2: use .model_dump_json(by_alias=True, exclude_unset=True) instead |
| | |         return json.dumps(self.to_dict()) |
| | | |
| | |     @classmethod |
| | |     def from_json(cls, json_str: str) -> Optional[Self]: |
| | |         """Create an instance of TrainerV1alpha1VolcanoPodGroupPolicySource from a JSON string""" |
| | |         return cls.from_dict(json.loads(json_str)) |
| | | |
| | |     def to_dict(self) -> Dict[str, Any]: |
| | |         """Return the dictionary representation of the model using alias. |
| | | |
| | |         This has the following differences from calling pydantic's |
| | |         `self.model_dump(by_alias=True)`: |
| | | |
| | |         * `None` is only added to the output dict for nullable fields that |
| | |           were set at model initialization. Other fields with value `None` |
| | |           are ignored. |
| | |         """ |
| | |         excluded_fields: Set[str] = set([ |
| | |         ]) |
| | | |
| | |         _dict = self.model_dump( |
| | |             by_alias=True, |
| | |             exclude=excluded_fields, |
| | |             exclude_none=True, |
| | |         ) |
| | |         # override the default output from pydantic by calling `to_dict()` of network_topology |
| | |         if self.network_topology: |
| | |             _dict['networkTopology'] = self.network_topology.to_dict() |
| | |         return _dict |
| | | |
| | |     @classmethod |
| | |     def from_dict(cls, obj: Optional[Dict[str, Any]]) -> Optional[Self]: |
| | |         """Create an instance of TrainerV1alpha1VolcanoPodGroupPolicySource from a dict""" |
| | |         if obj is None: |
| | |             return None |
| | | |
| | |         if not isinstance(obj, dict): |
| | |             return cls.model_validate(obj) |
| | | |
| | |         _obj = cls.model_validate({ |
| | |             "networkTopology": SchedulingV1beta1NetworkTopologySpec.from_dict(obj["networkTopology"]) if obj.get("networkTopology") is not None else None |
| | |         }) |
| | |         return _obj |
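
`TrainerV1alpha1VolcanoPodGroupPolicySource` wraps a single optional nested model and delegates to that child's `to_dict()` and `from_dict()` under the `networkTopology` key. The sketch below illustrates only that delegation pattern. `TopologySketch` and `VolcanoPolicySketch` are hypothetical stdlib dataclasses standing in for the generated pydantic classes, and the sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TopologySketch:
    """Hypothetical stand-in for the nested network topology model."""

    highest_tier_allowed: Optional[int] = None
    mode: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        # camelCase keys, None values omitted.
        raw = {"highestTierAllowed": self.highest_tier_allowed, "mode": self.mode}
        return {key: value for key, value in raw.items() if value is not None}

    @classmethod
    def from_dict(cls, obj: Dict[str, Any]) -> "TopologySketch":
        return cls(obj.get("highestTierAllowed"), obj.get("mode"))


@dataclass
class VolcanoPolicySketch:
    """Hypothetical stand-in for TrainerV1alpha1VolcanoPodGroupPolicySource."""

    network_topology: Optional[TopologySketch] = None

    def to_dict(self) -> Dict[str, Any]:
        out: Dict[str, Any] = {}
        # Delegate to the child's to_dict(), as the generated parent does.
        if self.network_topology:
            out["networkTopology"] = self.network_topology.to_dict()
        return out

    @classmethod
    def from_dict(cls, obj: Optional[Dict[str, Any]]) -> Optional["VolcanoPolicySketch"]:
        if obj is None:
            return None
        nested = obj.get("networkTopology")
        # Build the child only when the key is present and not None.
        return cls(TopologySketch.from_dict(nested) if nested is not None else None)


policy = VolcanoPolicySketch.from_dict(
    {"networkTopology": {"mode": "hard", "highestTierAllowed": 1}}
)
print(policy.to_dict())
print(VolcanoPolicySketch().to_dict())
```

An empty policy serializes to an empty dict, which matches the generated class's use of `exclude_none=True` for its only field.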