Commit 8013529
docs/blueprints/model-storage: use SDXL instead of Flux
I tried. I really tried. I'm not skilled enough in fighting the diffusers library to make Flux use less than 49 GB of VRAM. I tried everything that was documented to work and found out it was all broken in any number of ways that made me feel like I was being gaslit hard. It's not as flashy to use SDXL here, but I needed to get _something_ working, so I fell back to what I know will work.

Signed-off-by: Xe Iaso <[email protected]>
1 parent a2cd79f commit 8013529

3 files changed: +66 −28 lines changed

docs/blueprints/model-storage/fly-io.md
Lines changed: 23 additions & 10 deletions

@@ -24,22 +24,35 @@ in:
 fly apps create your-app-name-here
 ```
 
-Then create a GPU machine with an a100-80gb GPU in it:
+As a reminder, this example is configured with environment variables. Set the
+following environment variables in your deployments:
+
+| Envvar name             | Value                                                   |
+| ----------------------: | :------------------------------------------------------ |
+| `AWS_ACCESS_KEY_ID`     | The access key ID from the runner keypair               |
+| `AWS_SECRET_ACCESS_KEY` | The secret access key from the runner keypair           |
+| `AWS_ENDPOINT_URL_S3`   | `https://fly.storage.tigris.dev`                        |
+| `AWS_REGION`            | `auto`                                                  |
+| `MODEL_PATH`            | `ByteDance/SDXL-Lightning`                              |
+| `MODEL_BUCKET_NAME`     | `model-storage` (replace with your own bucket name)     |
+| `PUBLIC_BUCKET_NAME`    | `generated-images` (replace with your own bucket name)  |
+
+Then create a GPU machine with an l40s GPU in it in Seattle:
 
 ```text
 fly machine run \
 -a your-app-name-here \
---name fluxschnell \
+--name sdxl-lightning \
 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> \
 -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> \
 -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev \
 -e AWS_REGION=auto \
 -e MODEL_BUCKET_NAME=model-storage \
 -e PUBLIC_BUCKET_NAME=generated-images \
--e MODEL_PATH=black-forest-labs/FLUX.1-schnell \
---vm-gpu-kind a100-80gb \
--r sjc \
-your-docker-username/flux-tigris:latest \
+-e MODEL_PATH=ByteDance/SDXL-Lightning \
+--vm-gpu-kind l40s \
+-r sea \
+your-docker-username/sdxl-tigris:latest \
 -- python -m cog.server.http --host ::
 ```
 
@@ -73,10 +86,10 @@ curl "http://localhost:5001/predictions/$(uuidgen)" \
 -H "Content-Type: application/json" \
 --data-binary '{
 "input": {
-"prompt": "The word 'success' in front of the Space Needle, anime depiction, best quality",
-"aspect_ratio": "16:9",
+"prompt": "The space needle in Seattle, best quality, masterpiece",
+"aspect_ratio": "1:1",
 "guidance_scale": 3.5,
-"num_inference_steps": 50,
+"num_inference_steps": 4,
 "max_sequence_length": 512,
 "output_format": "png",
 "num_outputs": 1

@@ -91,5 +104,5 @@ If all goes well, you should get an image like this:
 You can destroy the machine with this command:
 
 ```text
-fly machine destroy --force -a your-app-name-here fluxschnell
+fly machine destroy --force -a your-app-name-here sdxl-lightning
 ```
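As an aside for readers trying the changed example: the prediction request shown in the hunk above can also be issued from Python. This is a minimal sketch using only the standard library; the `build_prediction_payload` and `predict` helpers are illustrative (not part of the template repo), and it assumes the cog server is reachable on `localhost:5001` as in the proxied example.

```python
import json
import urllib.request
import uuid


def build_prediction_payload(prompt: str) -> dict:
    """Mirror the JSON body from the curl example in the diff above."""
    return {
        "input": {
            "prompt": prompt,
            "aspect_ratio": "1:1",
            "guidance_scale": 3.5,
            "num_inference_steps": 4,  # SDXL Lightning needs only 4 steps
            "max_sequence_length": 512,
            "output_format": "png",
            "num_outputs": 1,
        }
    }


def predict(base_url: str, prompt: str) -> dict:
    """POST a prediction to the cog HTTP server, keyed by a random UUID."""
    body = json.dumps(build_prediction_payload(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/predictions/{uuid.uuid4()}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    result = predict(
        "http://localhost:5001",
        "The space needle in Seattle, best quality, masterpiece",
    )
    print(result)
```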

docs/blueprints/model-storage/index.md
Lines changed: 23 additions & 11 deletions

@@ -24,22 +24,22 @@ enabling your workloads to start quickly no matter where they are in the world.
 ## Getting Started
 
 For this example, we’ll set up
-[Flux.1 [schnell]](https://huggingface.co/black-forest-labs/FLUX.1-schnell) by
-Black Forest Labs for inference with the weights stored in Tigris.
+[SDXL Lightning](https://huggingface.co/ByteDance/SDXL-Lightning) by ByteDance
+for inference with the weights stored in Tigris.
 
 Create two new buckets:
 
-1. One bucket will be for generated flux images, it’ll be called
-`generated-images` in this article
+1. One bucket will be for generated images, it’ll be called `generated-images`
+in this article
 2. One bucket will be for storing models, it’ll be called `model-storage` in
 this article
 
 Both of these buckets should be private.
 
-Download the flux-in-tigris template from GitHub:
+Download the `sdxl-in-tigris` template from GitHub:
 
 ```text
-git clone https://github.com/tigrisdata-community/flux-in-tigris
+git clone https://github.com/tigrisdata-community/sdxl-in-tigris
 ```
 
 Enter the folder in a terminal window.

@@ -73,7 +73,7 @@ pip install -r requirements.txt
 Then run the script to upload a model:
 
 ```text
-python scripts/prepare_model.py black-forest-labs/FLUX.1-schnell model-storage
+python scripts/prepare_model.py ByteDance/SDXL-Lightning model-storage
 ```
 
 This will take a bit to run, depending on your internet connection speed, hard

@@ -99,20 +99,32 @@ variables:
 AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev
 AWS_REGION=auto
 MODEL_BUCKET_NAME=model-storage
-MODEL_PATH=black-forest-labs/FLUX.1-schnell
+MODEL_PATH=ByteDance/SDXL-Lightning
 ```
 
+:::info
+
+Want differently styled images? Try finetunes like
+[Kohaku XL](https://huggingface.co/KBlueLeaf/Kohaku-XL-Zeta)! Pass the Hugging
+Face repo name to the `prepare_model` script like this:
+
+```text
+python scripts/prepare_model.py KBlueLeaf/Kohaku-XL-Zeta model-storage
+```
+
+:::
+
 ## Deploying it
 
 In order to deploy this, you need to build the image with the cog tool. Log into
 a Docker registry and run this command to build and push it:
 
 ```text
-cog push your-docker-username/flux-tigris --use-cuda-base-image false
+cog push your-docker-username/sdxl-tigris --use-cuda-base-image false
 ```
 
 You can now use it with your GPU host of choice as long as it supports Cuda 12.1
-and has at least 80 GB of video memory.
+and has at least 12 GB of video memory.
 
 This example is configured with environment variables. Set the following
 environment variables in your deployments:

@@ -123,7 +135,7 @@ environment variables in your deployments:
 | `AWS_SECRET_ACCESS_KEY` | The secret access key from the runner keypair           |
 | `AWS_ENDPOINT_URL_S3`   | `https://fly.storage.tigris.dev`                        |
 | `AWS_REGION`            | `auto`                                                  |
-| `MODEL_PATH`            | `black-forest-labs/FLUX.1-schnell`                      |
+| `MODEL_PATH`            | `ByteDance/SDXL-Lightning`                              |
 | `MODEL_BUCKET_NAME`     | `model-storage` (replace with your own bucket name)     |
 | `PUBLIC_BUCKET_NAME`    | `generated-images` (replace with your own bucket name)  |

docs/blueprints/model-storage/vast-ai.md
Lines changed: 20 additions & 7 deletions

@@ -27,11 +27,11 @@ pip install --upgrade vastai;
 Follow Vast.ai's instructions on
 [how to load your API key](https://cloud.vast.ai/cli/).
 
-Then you need to find an instance. This example requires a GPU with 80 GB of
+Then you need to find an instance. This example requires a GPU with 12 GB of
 vram. Use this command to find a suitable host:
 
 ```text
-vastai search offers 'verified=true cuda_max_good>=12.1 gpu_ram>=80 num_gpus=1 inet_down>=850' -o 'dph+'
+vastai search offers 'verified=true cuda_max_good>=12.1 gpu_ram>=12 num_gpus=1 inet_down>=850' -o 'dph+'
 ```
 
 The first column is the instance ID for the launch command. You can use this to

@@ -42,11 +42,24 @@ assemble your launch command. It will be made up out of the following:
 - A signal to the runtime that we need 48 GB of disk space to run this app
 - The onstart command telling the runtime to start the cog process
 
+As a reminder, this example is configured with environment variables. Set the
+following environment variables in your deployments:
+
+| Envvar name             | Value                                                   |
+| ----------------------: | :------------------------------------------------------ |
+| `AWS_ACCESS_KEY_ID`     | The access key ID from the runner keypair               |
+| `AWS_SECRET_ACCESS_KEY` | The secret access key from the runner keypair           |
+| `AWS_ENDPOINT_URL_S3`   | `https://fly.storage.tigris.dev`                        |
+| `AWS_REGION`            | `auto`                                                  |
+| `MODEL_PATH`            | `ByteDance/SDXL-Lightning`                              |
+| `MODEL_BUCKET_NAME`     | `model-storage` (replace with your own bucket name)     |
+| `PUBLIC_BUCKET_NAME`    | `generated-images` (replace with your own bucket name)  |
+
 Format all of your environment variables as you would in a `docker run` command.
 EG:
 
 ```text
-"-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=black-forest-labs/FLUX.1-schnell -e PUBLIC_BUCKET_NAME=generated-images"
+"-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=ByteDance/SDXL-Lightning -e PUBLIC_BUCKET_NAME=generated-images"
 ```
 
 Then execute the launch command:

@@ -55,7 +68,7 @@ Then execute the launch command:
 vastai create instance \
 <id-from-search> \
 --image your-docker-username/flux-tigris:latest \
---env "-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=black-forest-labs/FLUX.1-schnell -e PUBLIC_BUCKET_NAME=generated-images" \
+--env "-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=ByteDance/SDXL-Lightning -e PUBLIC_BUCKET_NAME=generated-images" \
 --disk 48 \
 --onstart-cmd "python -m cog.server.http"
 ```

@@ -95,10 +108,10 @@ curl "http://ip:port/predictions/$(uuidgen)" \
 -H "Content-Type: application/json" \
 --data-binary '{
 "input": {
-"prompt": "The word 'success' in front of the Space Needle, anime depiction, best quality",
-"aspect_ratio": "16:9",
+"prompt": "The space needle in Seattle, best quality, masterpiece",
+"aspect_ratio": "1:1",
 "guidance_scale": 3.5,
-"num_inference_steps": 50,
+"num_inference_steps": 4,
 "max_sequence_length": 512,
 "output_format": "png",
 "num_outputs": 1
