Added guided demos to SDK repo #97

Merged 2 commits on May 5, 2023
205 changes: 205 additions & 0 deletions demo-notebooks/guided-demos/0_basic_ray.ipynb
@@ -0,0 +1,205 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "8d4a42f6",
"metadata": {},
"source": [
"In this first notebook, we will go through the basics of using the SDK to:\n",
" - Spin up a Ray cluster with our desired resources\n",
" - View the status and specs of our Ray cluster\n",
" - Take down the Ray cluster when finished"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b55bc3ea-4ce3-49bf-bb1f-e209de8ca47a",
"metadata": {},
"outputs": [],
"source": [
"# Import pieces from codeflare-sdk\n",
"from codeflare_sdk.cluster.cluster import Cluster, ClusterConfiguration\n",
"from codeflare_sdk.cluster.auth import TokenAuthentication"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "614daa0c",
"metadata": {},
"outputs": [],
"source": [
"# Create authentication object for oc user permissions\n",
"auth = TokenAuthentication(\n",
"    token=\"XXXXX\",\n",
"    server=\"XXXXX\",\n",
"    skip_tls=False\n",
")\n",
"auth.login()"
]
},
{
"cell_type": "markdown",
"id": "bc27f84c",
"metadata": {},
"source": [
"Here, we want to define our cluster by specifying the resources we require for our batch workload. Below, we define our cluster object (which generates a corresponding AppWrapper)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f4bc870-091f-4e11-9642-cba145710159",
"metadata": {},
"outputs": [],
"source": [
"# Create and configure our cluster object (and AppWrapper)\n",
"cluster = Cluster(ClusterConfiguration(\n",
"    name='raytest',\n",
"    namespace='default',\n",
"    min_worker=2,\n",
"    max_worker=2,\n",
"    min_cpus=1,\n",
"    max_cpus=1,\n",
"    min_memory=4,\n",
"    max_memory=4,\n",
"    gpu=0,\n",
"    instascale=False\n",
"))"
]
},
{
"cell_type": "markdown",
"id": "12eef53c",
"metadata": {},
"source": [
"Next, we bring our cluster up by calling the `up()` function below, which submits our cluster's AppWrapper YAML to the MCAD queue and begins the process of obtaining our resource cluster."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0884bbc-c224-4ca0-98a0-02dfa09c2200",
"metadata": {},
"outputs": [],
"source": [
"# Bring up the cluster\n",
"cluster.up()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "657ebdfb",
"metadata": {},
"source": [
"Now, we want to check on the status of our resource cluster, and wait until it is finally ready for use."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c1b4311-2e61-44c9-8225-87c2db11363d",
"metadata": {},
"outputs": [],
"source": [
"cluster.status()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a99d5aff",
"metadata": {},
"outputs": [],
"source": [
"# Wait until the cluster is ready for use\n",
"cluster.wait_ready()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df71c1ed",
"metadata": {},
"outputs": [],
"source": [
"cluster.status()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b3a55fe4",
"metadata": {},
"source": [
"Let's quickly verify that the specs of the cluster are as expected."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7fd45bc5-03c0-4ae5-9ec5-dd1c30f1a084",
"metadata": {},
"outputs": [],
"source": [
"cluster.details()"
]
},
{
"cell_type": "markdown",
"id": "5af8cd32",
"metadata": {},
"source": [
"Finally, we bring our resource cluster down and release/terminate the associated resources, bringing everything back to the way it was before our cluster was brought up."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f36db0f-31f6-4373-9503-dc3c1c4c3f57",
"metadata": {},
"outputs": [],
"source": [
"cluster.down()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0d41b90e",
"metadata": {},
"outputs": [],
"source": [
"auth.logout()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"vscode": {
"interpreter": {
"hash": "f9f85f796d01129d0dd105a088854619f454435301f6ffec2fea96ecbd9be4ac"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
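
The lifecycle walked through in this notebook (log in, `up()`, `wait_ready()`, `details()`, `down()`, log out) can be sketched as one helper that still tears the cluster down if a step fails partway. This is only a minimal sketch and not part of the SDK: `run_with_cluster` is a hypothetical name, and the `_Recorder` stand-in exists purely so the block runs without an OpenShift cluster. It assumes nothing beyond the method names the notebook already calls.

```python
from typing import Any, Callable


def run_with_cluster(cluster: Any, auth: Any, work: Callable[[Any], Any]) -> Any:
    """Hypothetical helper: run work(cluster) between up() and down().

    Only the method names used in the notebook are assumed to exist:
    auth.login(), cluster.up(), cluster.wait_ready(), cluster.down(),
    auth.logout().
    """
    auth.login()
    try:
        cluster.up()
        try:
            cluster.wait_ready()
            return work(cluster)
        finally:
            # Release the resources even if wait_ready() or work() raised.
            cluster.down()
    finally:
        # Log out even if up() itself failed.
        auth.logout()


class _Recorder:
    """Stand-in that records method calls; NOT a codeflare-sdk class."""

    def __init__(self, log: list, prefix: str) -> None:
        self._log = log
        self._prefix = prefix

    def __getattr__(self, name: str) -> Callable[..., None]:
        # Any unknown attribute becomes a method that logs its own name.
        def method(*args: Any, **kwargs: Any) -> None:
            self._log.append(f"{self._prefix}.{name}")

        return method


calls: list = []
fake_cluster = _Recorder(calls, "cluster")
fake_auth = _Recorder(calls, "auth")

# With real objects this would be the `cluster` and `auth` created above.
run_with_cluster(fake_cluster, fake_auth, lambda c: c.details())
print(calls)
```

With the real `cluster` and `auth` objects from the cells above, the same helper would replace the separate `up()`, `wait_ready()`, `down()`, and `logout()` cells; the notebook keeps them separate so that each step can be inspected on its own.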
178 changes: 178 additions & 0 deletions demo-notebooks/guided-demos/1_basic_instascale.ipynb
@@ -0,0 +1,178 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "9865ee8c",
"metadata": {},
"source": [
"In this second notebook, we will go over the basics of using InstaScale to scale up/down necessary resources that are not currently available on your OpenShift Cluster (in cloud environments)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b55bc3ea-4ce3-49bf-bb1f-e209de8ca47a",
"metadata": {},
"outputs": [],
"source": [
"# Import pieces from codeflare-sdk\n",
"from codeflare_sdk.cluster.cluster import Cluster, ClusterConfiguration\n",
"from codeflare_sdk.cluster.auth import TokenAuthentication"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "614daa0c",
"metadata": {},
"outputs": [],
"source": [
"# Create authentication object for oc user permissions\n",
"auth = TokenAuthentication(\n",
"    token=\"XXXXX\",\n",
"    server=\"XXXXX\",\n",
"    skip_tls=False\n",
")\n",
"auth.login()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "bc27f84c",
"metadata": {},
"source": [
"This time, we are working in a cloud environment, and our OpenShift cluster does not have the resources needed for our desired workloads. We will use InstaScale to dynamically scale up guaranteed resources based on our request (and automatically scale them back down when we are finished working):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f4bc870-091f-4e11-9642-cba145710159",
"metadata": {},
"outputs": [],
"source": [
"# Create and configure our cluster object (and AppWrapper)\n",
"cluster = Cluster(ClusterConfiguration(\n",
"    name='instascaletest',\n",
"    namespace='default',\n",
"    min_worker=2,\n",
"    max_worker=2,\n",
"    min_cpus=2,\n",
"    max_cpus=2,\n",
"    min_memory=8,\n",
"    max_memory=8,\n",
"    gpu=1,\n",
"    instascale=True, # InstaScale now enabled, will scale OCP cluster to guarantee resource request\n",
"    machine_types=[\"m5.xlarge\", \"g4dn.xlarge\"] # Head, worker AWS machine types desired\n",
"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "12eef53c",
"metadata": {},
"source": [
"As before, we will bring the cluster up, wait for it to be ready, and confirm that the specs are as requested:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0884bbc-c224-4ca0-98a0-02dfa09c2200",
"metadata": {},
"outputs": [],
"source": [
"# Bring up the cluster\n",
"cluster.up()\n",
"cluster.wait_ready()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6abfe904",
"metadata": {},
"source": [
"While the resources are being scaled, we can also go into the console and take a look at the InstaScale logs, as well as the new machines/nodes spinning up.\n",
"\n",
"Once the cluster is ready, we can confirm the specs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7fd45bc5-03c0-4ae5-9ec5-dd1c30f1a084",
"metadata": {},
"outputs": [],
"source": [
"cluster.details()"
]
},
{
"cell_type": "markdown",
"id": "5af8cd32",
"metadata": {},
"source": [
"Finally, we bring our resource cluster down and release/terminate the associated resources, bringing everything back to the way it was before our cluster was brought up."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f36db0f-31f6-4373-9503-dc3c1c4c3f57",
"metadata": {},
"outputs": [],
"source": [
"cluster.down()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c883caea",
"metadata": {},
"source": [
"Once again, we can look at the machines/nodes and see that everything has been successfully scaled down!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0d41b90e",
"metadata": {},
"outputs": [],
"source": [
"auth.logout()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"vscode": {
"interpreter": {
"hash": "f9f85f796d01129d0dd105a088854619f454435301f6ffec2fea96ecbd9be4ac"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
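
Because an InstaScale request provisions real cloud machines, it can be worth sanity-checking the configuration values before handing them to `ClusterConfiguration`. The following is a rough sketch only: `check_cluster_kwargs` is a hypothetical helper, not an SDK function, and the rule that `instascale=True` should come with a non-empty `machine_types` list is an assumption drawn from the comments in the notebook cell, not a documented requirement.

```python
def check_cluster_kwargs(kwargs: dict) -> list:
    """Hypothetical pre-flight check; not part of codeflare-sdk.

    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    # Each min_* value should not exceed its max_* counterpart.
    for field in ("worker", "cpus", "memory"):
        low = kwargs.get(f"min_{field}")
        high = kwargs.get(f"max_{field}")
        if low is not None and high is not None and low > high:
            problems.append(f"min_{field} ({low}) exceeds max_{field} ({high})")
    # Assumption: InstaScale needs machine types to know what to provision.
    if kwargs.get("instascale") and not kwargs.get("machine_types"):
        problems.append("instascale=True but machine_types is empty")
    return problems


# Same values as the notebook cell, held in a plain dict.
config_kwargs = dict(
    name="instascaletest",
    namespace="default",
    min_worker=2,
    max_worker=2,
    min_cpus=2,
    max_cpus=2,
    min_memory=8,
    max_memory=8,
    gpu=1,
    instascale=True,
    machine_types=["m5.xlarge", "g4dn.xlarge"],
)

problems = check_cluster_kwargs(config_kwargs)
print(problems or "configuration looks consistent")
```

If `problems` comes back empty, the dict can be unpacked into the same call the notebook uses, `Cluster(ClusterConfiguration(**config_kwargs))`.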