[Data] Elevate num_cpus/gpus and memory as top-level params in most APIs #56419
Signed-off-by: You-Cheng Lin (Owen) <mses010108@gmail.com>
Force-pushed from 24afeca to a1ad9d6
…ote-resource-args-top-level-args
@richardliaw @gvspraveen @alexeykudinkin PTAL, thanks! 🙏🙏🙏
python/ray/data/dataset.py (outdated)

    num_cpus: The number of CPUs to reserve for each parallel map worker.
    num_gpus: The number of GPUs to reserve for each parallel map worker. For
        example, specify `num_gpus=1` to request 1 GPU for each parallel map
        worker.
    memory: The heap memory in bytes to reserve for each parallel map worker.
hmm, do we actually want to expose this here?
Yeah, users can set these attributes here, so I think it's reasonable to expose them here?
hmm, will let @alexeykudinkin decide, but I think for things like drop_columns and add_columns it's probably not that necessary. But let's wait to hear from others.
Discussed with @alexeykudinkin offline -- to avoid adding unnecessary complexity, let's avoid exposing these as top-level parameters for non-UDF APIs:
- `drop_columns`
- `select_columns`
- `rename_columns`

You can specify a UDF for add_columns, so I think it's okay to keep there.
Just removed all the changes in the non-UDF functions, thanks!
…PIs (ray-project#56419)

## Why are these changes needed?

This PR:
- Add a util method `merge_resources_to_ray_remote_args` to add the resource args `num_cpus`, `num_gpus`, and `memory` to `ray_remote_args`, and a test for it.
- Update `read_api.py` and `dataset.py` to elevate num_cpus/gpus and memory as top-level params.

## Related issue number

Closes ray-project#54708

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [ ] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

---------

Signed-off-by: You-Cheng Lin (Owen) <mses010108@gmail.com>
Signed-off-by: Zhiqiang Ma <zhiqiang.ma@intel.com>
was added in PR ray-project#56588, but got lost again in a later PR ray-project#56419

Signed-off-by: jpatra72 <jyotirmaya72@gmail.com>
## Why are these changes needed?

This PR:
- Add a util method `merge_resources_to_ray_remote_args` to add the resource args `num_cpus`, `num_gpus`, and `memory` to `ray_remote_args`, and a test for it.
- Update `read_api.py` and `dataset.py` to elevate num_cpus/gpus and memory as top-level params.

## Related issue number

Closes #54708
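
To make the merge step described under "Why are these changes needed?" concrete, here is a minimal sketch of what such a helper might look like. It is not the implementation from this PR: the function name `merge_resources_sketch`, its signature, and the rule that a top-level value overrides the same key in `ray_remote_args` are all assumptions made for illustration. The real `merge_resources_to_ray_remote_args` may differ in any of these.

```python
from typing import Any, Dict, Optional


def merge_resources_sketch(
    num_cpus: Optional[float] = None,
    num_gpus: Optional[float] = None,
    memory: Optional[float] = None,
    ray_remote_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for `merge_resources_to_ray_remote_args`.

    Returns a new dict containing everything in `ray_remote_args` plus each
    resource argument that was explicitly set (i.e. is not None).
    Assumption: a top-level value wins over the same key in the dict.
    """
    # Copy so the caller's dict is never mutated.
    merged = dict(ray_remote_args or {})
    resources = (("num_cpus", num_cpus), ("num_gpus", num_gpus), ("memory", memory))
    for key, value in resources:
        if value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    args = merge_resources_sketch(
        num_cpus=2,
        memory=1 << 30,  # 1 GiB, expressed in bytes
        ray_remote_args={"max_retries": 3},
    )
    print(args)  # {'max_retries': 3, 'num_cpus': 2, 'memory': 1073741824}
```

Under these assumptions, a public API can accept `num_cpus`, `num_gpus`, and `memory` directly and still hand a single `ray_remote_args` dict to the layer below, so existing callers that pass `ray_remote_args` keep working.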
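## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [ ] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(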