Commit ae706f2

more typo fixes
1 parent 1d21088 commit ae706f2

1 file changed

+6
-6
lines changed

CodeFlareSDK_Design_Doc.md

+6-6
@@ -27,7 +27,7 @@ In order to achieve this we need the capacity to:
 * Generate valid AppWrapper yaml files based on user provided parameters
 * Get, list, watch, create, update, patch, and delete AppWrapper custom resources on a kubernetes cluster
 * Get, list, watch, create, update, patch, and delete RayCluster custom resources on a kubernetes cluster.
-* Expose a secure route to the Ray Dashboard endpoint.
+* Expose a secure route to the Ray Dashboard endpoint.
 * Define, submit, monitor and cancel Jobs submitted via TorchX. TorchX jobs must support both Ray and MCAD-Kubernetes scheduler backends.
 * Provide means of authenticating to a Kubernetes cluster

@@ -37,17 +37,17 @@ In order to achieve this we need the capacity to:

 In order to create these framework clusters, we will start with a template AppWrapper yaml file with reasonable defaults that will generate a valid RayCluster via MCAD.

-Users can customize their AppWrapper by passing their desired parameters to `ClusterConfig()` and applying that configuration when initializing a `Cluster()` object. When a `Cluster()` is initialized, it will update the AppWrapper template with the user’s specified requirements, and save it to the current working directory.
+Users can customize their AppWrapper by passing their desired parameters to `ClusterConfig()` and applying that configuration when initializing a `Cluster()` object. When a `Cluster()` is initialized, it will update the AppWrapper template with the user’s specified requirements, and save it to the current working directory.

 Our aim is to simplify the process of generating valid AppWrappers for RayClusters, so we will strive to find the appropriate balance between ease of use and exposing all possible AppWrapper parameters. And we will find this balance through user feedback.

 With a valid AppWrapper, we will use the Kubernetes python client to apply the AppWrapper to our Kubernetes cluster via a call to `cluster.up()`

-We will also use the Kubernetes python client to get information about both the RayCluster and AppWrapper custom resources to monitor the status of our Framework Cluster.
+We will also use the Kubernetes python client to get information about both the RayCluster and AppWrapper custom resources to monitor the status of our Framework Cluster via `cluster.status()` and `cluster,details()`.

 The RayCluster deployed on your kubernetes cluster can be interacted with in two ways: Either through an interactive session via `ray.init()` or through the submission of batch jobs.

-Finally we will use the Kubernetes python client to delete the AppWrapper via `Cluster.down()`
+Finally we will use the Kubernetes python client to delete the AppWrapper via `cluster.down()`

 ### Training Jobs:

@@ -57,7 +57,7 @@ Users can define their jobs with `DDPJobDefinition()` providing parameters for t

 Once a job is defined it can be submitted to the Kuberentes cluster to be run via `job.submit()`. If `job.submit()` is left empty the SDK will assume the Kuberentes-MCAD scheduler is being used. If a RayCluster is specified like, `job.submit(cluster)`, then the SDK will assume that the Ray scheduler is being used and submit the job to that RayCluster.

-After the job is submitted, a user can monitor its progress via `job.status()` and `job.logs()` to retrieve the status and logs output by the job. At any point the user can also call `.cancel()` to stop the job.
+After the job is submitted, a user can monitor its progress via `job.status()` and `job.logs()` to retrieve the status and logs output by the job. At any point the user can also call `job.cancel()` to stop the job.

 ### Authentication:

@@ -93,7 +93,7 @@ We will rely on the kubernetes cluster’s default security, where users cannot

 * Unit testing for all SDK functionality
 * Integration testing of SDK interactions with OpenShift and Kubernetes
-* System tests of SDK as part of the entire CodeFlare stack for main scenarios
+* System tests of SDK as part of the entire CodeFlare stack for main scenarios
 * Unit testing, integration testing, and system testing approaches
 * Unit testing will occur with every PR.
 * For system testing we can leverage [current e2e](https://github.com/project-codeflare/codeflare-operator/tree/main/test/e2e) tests from the operator repo.
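
For reviewers who want to see the call order that the second hunk describes (`ClusterConfig()` passed to `Cluster()`, then `cluster.up()`, `cluster.status()`, `cluster.details()` and `cluster.down()`), here is a minimal in-memory sketch. It is not the SDK's implementation: `FakeClusterConfig`, `FakeCluster`, the `name` and `num_workers` fields, and the state strings are all hypothetical stand-ins, and nothing here talks to Kubernetes or writes an AppWrapper file.

```python
from dataclasses import dataclass


@dataclass
class FakeClusterConfig:
    """Hypothetical stand-in for user-supplied parameters (field names are assumptions)."""
    name: str
    num_workers: int = 1


class FakeCluster:
    """Hypothetical stand-in that only tracks lifecycle state in memory."""

    def __init__(self, config: FakeClusterConfig) -> None:
        self.config = config
        self.state = "not created"  # assumed state label, not an SDK value

    def up(self) -> None:
        # The design doc applies the AppWrapper here; the sketch only flips state.
        self.state = "ready"

    def status(self) -> str:
        return self.state

    def details(self) -> dict:
        return {
            "name": self.config.name,
            "num_workers": self.config.num_workers,
            "state": self.state,
        }

    def down(self) -> None:
        # The design doc deletes the AppWrapper here; the sketch only flips state.
        self.state = "deleted"


cluster = FakeCluster(FakeClusterConfig(name="demo", num_workers=2))
cluster.up()
print(cluster.status())
print(cluster.details())
cluster.down()
print(cluster.status())
```

The lowercase `cluster` variable is the instance, which is why the diff changes `Cluster.down()` to `cluster.down()`: the methods are called on the object created from the configuration, not on the class.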
