@@ -123,193 +123,12 @@ environment variables in your deployments:
 | `AWS_SECRET_ACCESS_KEY` | The secret access key from the runner keypair |
 | `AWS_ENDPOINT_URL_S3` | `https://fly.storage.tigris.dev` |
 | `AWS_REGION` | `auto` |
-| `MODEL_BUCKET_NAME` | `model-storage` (replace with your own bucket name) |
 | `MODEL_PATH` | `black-forest-labs/FLUX.1-schnell` |
+| `MODEL_BUCKET_NAME` | `model-storage` (replace with your own bucket name) |
 | `PUBLIC_BUCKET_NAME` | `generated-images` (replace with your own bucket name) |
 
-### fly.io
-
-First, create a new app for this to live in:
-
-```text
-fly apps create your-app-name-here
-```
-
-Then create a GPU machine with an a100-80gb GPU in it:
-
-```text
-fly machine run \
-  -a your-app-name-here \
-  --name fluxschnell \
-  -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> \
-  -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> \
-  -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev \
-  -e AWS_REGION=auto \
-  -e MODEL_BUCKET_NAME=model-storage \
-  -e PUBLIC_BUCKET_NAME=generated-images \
-  -e MODEL_PATH=black-forest-labs/FLUX.1-schnell \
-  --vm-gpu-kind a100-80gb \
-  -r sjc \
-  your-docker-username/flux-tigris:latest \
-  -- python -m cog.server.http --host ::
-```
-
-This will print a machine IP like this:
-
-```text
-Machine started, you can connect via the following private ip
-fdaa:0:641b:a7b:165:347b:d972:2
-```
-
-Then proxy to the machine:
-
-```text
-fly proxy -a your-app-name-here \
-  5001:5000 \
-  fdaa:0:641b:a7b:165:347b:d972:2
-```
-
-Then you need to wait a few minutes while the machine sets itself up. It's done
-when it prints this line in the logs:
-
-```text
-{"logger": "cog.server.probes", "timestamp": "2024-10-22T17:36:06.651457Z", "severity": "INFO", "message": "Not running in Kubernetes: disabling probe helpers."}
-```
-
-Do a test generation with this curl command:
-
-```text
-curl "http://localhost:5001/predictions/$(uuidgen)" \
-  -X PUT \
-  -H "Content-Type: application/json" \
-  --data-binary '{
-    "input": {
-      "prompt": "The word '\''success'\'' in front of the Space Needle, anime depiction, best quality",
-      "aspect_ratio": "16:9",
-      "guidance_scale": 3.5,
-      "num_inference_steps": 50,
-      "max_sequence_length": 512,
-      "output_format": "png",
-      "num_outputs": 1
-    }
-  }'
-```
-
-If all goes well, you should get an image like this:
-
-![The word 'success' in front of the Space Needle](./success.webp)
-
-You can destroy the machine with this command:
-
-```text
-fly machine destroy --force -a your-app-name-here fluxschnell
-```
-
-### Skypilot
-
-TODO(Xe): all of this
-
-### Vast.ai
-
-Create an account on [Vast.ai](https://vast.ai) and load it with credit if you
-don't have one already. You will need at least $5 of credit to complete this
-blueprint.
-
-In the virtual environment you used to optimize your model, install the
-`vastai` CLI tool:
-
-```text
-pip install --upgrade vastai
-```
-
-Follow Vast.ai's instructions on
-[how to load your API key](https://cloud.vast.ai/cli/).
+You can run this on any platform that has the right GPUs, but we
+have tutorials for a few platforms you can try:
 
-Then you need to find an instance. This example requires a GPU with 80 GB of
-VRAM. Use this command to find a suitable host:
-
-```text
-vastai search offers 'verified=true cuda_max_good>=12.1 gpu_ram>=64 num_gpus=1 inet_down>=850' -o 'dph+'
-```
-
-The first column of the output is the instance ID. Use it to assemble your
-launch command, which is made up of the following:
-
-- The Docker image name you pushed to Docker Hub (or another registry)
-- The "environment" string with your exposed ports and environment variables
-- A signal to the runtime that we need 48 GB of disk space to run this app
-- The onstart command telling the runtime to start the cog process
-
-Format all of your environment variables as you would in a `docker run` command,
-e.g.:
-
-```text
-"-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=black-forest-labs/FLUX.1-schnell -e PUBLIC_BUCKET_NAME=generated-images"
-```
-
-Then execute the launch command:
-
-```text
-vastai create instance \
-  <id-from-search> \
-  --image your-docker-username/flux-tigris:latest \
-  --env "-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=black-forest-labs/FLUX.1-schnell -e PUBLIC_BUCKET_NAME=generated-images" \
-  --disk 48 \
-  --onstart-cmd "python -m cog.server.http"
-```
-
-It will report success with a message like this:
-
-```text
-Started. {'success': True, 'new_contract': 13288520}
-```
-
-The `new_contract` field is your instance ID.
-
-Give it a moment to download and start up. If you want to check on it, run this
-command:
-
-```text
-vastai logs <instance-id>
-```
-
-It's done when it prints this line in the logs:
-
-```text
-{"logger": "cog.server.probes", "timestamp": "2024-10-22T17:36:06.651457Z", "severity": "INFO", "message": "Not running in Kubernetes: disabling probe helpers."}
-```
-
-Then fetch the IP address and port for your app with this command:
-
-```text
-vastai show instance <instance-id> --raw | jq -r '"\(.public_ipaddr):\(.ports["5000/tcp"][0].HostPort)"'
-```
-
-Finally, run a test generation with this curl command:
-
-```text
-curl "http://ip:port/predictions/$(uuidgen)" \
-  -X PUT \
-  -H "Content-Type: application/json" \
-  --data-binary '{
-    "input": {
-      "prompt": "The word '\''success'\'' in front of the Space Needle, anime depiction, best quality",
-      "aspect_ratio": "16:9",
-      "guidance_scale": 3.5,
-      "num_inference_steps": 50,
-      "max_sequence_length": 512,
-      "output_format": "png",
-      "num_outputs": 1
-    }
-  }'
-```
-
-If all goes well, you should get an image like this:
-
-![The word 'success' in front of the Space Needle](./success.webp)
-
-You can destroy the machine with this command:
-
-```text
-vastai destroy instance <instance-id>
-```
+- [Fly.io](./fly-io)
+- [Vast.ai](./vast-ai)
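Both walkthroughs above have you tail logs until the cog server prints its startup line. As an alternative, here is a minimal Python sketch that polls the server over HTTP instead; it assumes your cog version exposes the `/health-check` route, and the base URL, timeout, and interval values are illustrative:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url: str, timeout: float = 600.0, interval: float = 5.0) -> bool:
    """Poll the cog server until it answers HTTP, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Assumes the cog HTTP server exposes /health-check; adjust
            # the path if your cog version differs.
            urllib.request.urlopen(f"{base_url}/health-check", timeout=5)
            return True
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
    return False
```

For example, after starting `fly proxy`, `wait_until_ready("http://localhost:5001")` blocks until the machine is serving requests.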
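The two curl test generations above send the same JSON body. If you would rather script them, here is a stdlib-only Python sketch that builds the equivalent PUT request; the base URL is whatever host and port your deployment exposes, and the payload fields mirror the curl examples:

```python
import json
import urllib.request
import uuid


def build_prediction_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build the same PUT request the curl test generations send."""
    payload = {
        "input": {
            "prompt": prompt,
            "aspect_ratio": "16:9",
            "guidance_scale": 3.5,
            "num_inference_steps": 50,
            "max_sequence_length": 512,
            "output_format": "png",
            "num_outputs": 1,
        }
    }
    return urllib.request.Request(
        # Each prediction gets a fresh random ID, like $(uuidgen) in the shell.
        f"{base_url}/predictions/{uuid.uuid4()}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Send it and parse the JSON response, e.g.:
# resp = urllib.request.urlopen(build_prediction_request("http://localhost:5001", "a red fox"))
# result = json.load(resp)
```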