
Commit 3b5b8ac

blueprints/model-storage: review feedback fixes
Signed-off-by: Xe Iaso <[email protected]>
1 parent 8013529 commit 3b5b8ac

File tree

3 files changed: +98 −57 lines changed

docs/blueprints/model-storage/fly-io.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -52,7 +52,7 @@ fly machine run \
   -e MODEL_PATH=ByteDance/SDXL-Lightning \
   --vm-gpu-kind l40s \
   -r sea \
-  your-docker-username/sdxl-tigris:latest \
+  ghcr.io/tigrisdata-community/runner/sdxl:latest \
   -- python -m cog.server.http --host ::
 ```
````
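For reference, the full launch command implied by this hunk can be assembled and printed without executing it. This is a dry-run sketch: every flag and value below comes from the diff's context lines, nothing is new.

```shell
# Assemble the post-change `fly machine run` invocation from the hunk's
# context lines and print it. Nothing is executed against Fly.io here.
cmd=(fly machine run
  --vm-gpu-kind l40s
  -r sea
  -e MODEL_PATH=ByteDance/SDXL-Lightning
  ghcr.io/tigrisdata-community/runner/sdxl:latest
  -- python -m cog.server.http --host ::)
printf '%s ' "${cmd[@]}"
echo
```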

docs/blueprints/model-storage/index.md renamed to docs/blueprints/model-storage/index.mdx

Lines changed: 96 additions & 55 deletions
````diff
@@ -6,61 +6,82 @@ state the underlying hardware is in. You don’t know if you have your models
 cached, and in the worst case you need to do a cold start and download your
 model weights from scratch.
 
-This is typically done with the Hugging Face CDN, but sometimes that’s just not
-fast enough, you don’t want to distribute your private model weights over a
-third party, or compliance issues force you to make sure that your model weights
-live and die in a single part of the globe. Remember, while your instances are
-sitting there pulling model weights from the cloud, you're burning GPU spend.
-Time is money.
+A couple of fixable problems arise when running your models on serverless or any
+frequently changing infrastructure:
+
+- Model distribution that's not optimized for latency causes needless GPU idle
+  time as the model weights are downloaded to the machine on cold start. Tigris
+  behaves like a content delivery network by default and is designed for low
+  latency, saving idle time on cold start.
+- Compliance restrictions like data sovereignty and GDPR increase complexity
+  quickly. Tigris makes regional restrictions a one-line configuration; see the
+  [guide](https://www.tigrisdata.com/docs/objects/object_regions/).
+- Reliance on third-party caches for distributing models creates an upstream
+  dependency and leaves your system vulnerable to downtime. Tigris guarantees
+  99.99% availability with
+  [public availability data](https://www.tigrisdata.com/blog/availability-metrics-public/).
 
 ## Usecase
 
 You can put AI model weights into Tigris so that they are cached and fast no
 matter where you’re inferencing from. This allows you to have cold starts be
 faster and you can take advantage of Tigris'
-[globally distributed pull-through caching architecture](/docs/overview/),
-enabling your workloads to start quickly no matter where they are in the world.
-
-## Getting Started
+[globally distributed architecture](/docs/overview/), enabling your workloads to
+start quickly no matter where they are in the world.
 
 For this example, we’ll set up
 [SDXL Lightning](https://huggingface.co/ByteDance/SDXL-Lightning) by ByteDance
-for inference with the weights stored in Tigris.
-
-Create two new buckets:
-
-1. One bucket will be for generated images, it’ll be called `generated-images`
-   in this article
-2. One bucket will be for storing models, it’ll be called `model-storage` in
-   this article
+for inference with the weights stored in Tigris. Here's what you need to do:
 
-Both of these buckets should be private.
+- Prepare and upload the model to Tigris
+- Create a restricted access key for model runners
+- Run inference somewhere
 
 Download the `sdxl-in-tigris` template from GitHub:
 
 ```text
 git clone https://github.com/tigrisdata-community/sdxl-in-tigris
 ```
 
-Enter the folder in a terminal window.
+<details>
+  <summary>Prerequisite tools</summary>
 
-If you have [Homebrew](https://brew.sh) installed on macOS or Linux, run
-`brew bundle` to automatically install all of the dependencies. If you don't,
-here's what you need to install via your package manager of choice:
+In order to run this example locally, you need these tools installed:
 
-- Python 3.11 (the minor version matters, the patch version can and will vary)
+- Python 3.11
 - pipenv
-- [Replicate's cog tool](https://github.com/replicate/cog)
 - The AWS CLI
-- The Hugging Face CLI
-- [jq](https://jqlang.github.io/jq/)
 
-Configure the AWS CLI for use with Tigris:
+Also be sure to configure the AWS CLI for use with Tigris:
 [Configuring the AWS CLI](/docs/sdks/s3/aws-cli/).
 
-If you are on a Mac, Install the
-[Docker Desktop app](https://www.docker.com/products/docker-desktop/). This will
-not work with alternatives such as Podman Desktop.
+To build a custom variant of the image, you need these tools installed:
+
+- Mac/Windows:
+  [Docker Desktop app](https://www.docker.com/products/docker-desktop/),
+  alternatives such as Podman Desktop will not work.
+- Linux: Docker daemon, alternatives such as Podman will not work.
+- [Replicate's cog tool](https://github.com/replicate/cog)
+- [jq](https://jqlang.github.io/jq/)
+
+To install all of the tool dependencies at once, clone the template repo and run
+`brew bundle`.
+
+</details>
+
+Create two new buckets:
+
+1. One bucket will be for generated images, it’ll be called `generated-images`
+   in this article
+2. One bucket will be for storing models, it’ll be called `model-storage` in
+   this article
+
+```text
+aws s3api create-bucket --acl private --bucket generated-images
+aws s3api create-bucket --acl private --bucket model-storage
+```
+
+Both of these buckets should be private.
 
 Then activate the virtual environment with `pipenv shell` and install the
 dependencies for uploading a model:
````
````diff
@@ -70,51 +91,69 @@ pipenv shell --python 3.11
 pip install -r requirements.txt
 ```
 
-Then run the script to upload a model:
+Then run the `prepare_model` script to massage and upload a Stable Diffusion XL
+model or finetune to Tigris:
 
 ```text
 python scripts/prepare_model.py ByteDance/SDXL-Lightning model-storage
 ```
 
-This will take a bit to run, depending on your internet connection speed, hard
-drive speed, and the current phase of the moon.
+:::info
+
+Want differently styled images? Try finetunes like
+[Kohaku XL](https://huggingface.co/KBlueLeaf/Kohaku-XL-Zeta)! Pass the Hugging
+Face repo name to the `prepare_model` script like this:
+
+```text
+python scripts/prepare_model.py KBlueLeaf/Kohaku-XL-Zeta model-storage
+```
 
-While it’s running, head to the Tigris console and create a new access key, give
-it the following permissions:
+:::
 
-- Read-only on your `model-storage` bucket
-- Editor on your `generated-images` bucket
+This will take a bit to run, depending on your internet connection speed, hard
+drive speed, and the current phase of the moon. While it’s running, head to the
+Tigris console and create a new access key. Don't assign any permissions to it.
+
+### Create a restricted access key for model runners
 
 Copy the access key ID and secret access keys into either your notes or a
 password manager, you will not be able to see them again. These credentials will
 be used later to deploy your app in the cloud. This keypair will be referred to
 as the `runner-keypair` in this tutorial.
 
-Once it’s done, you’ll have everything in Tigris and get a list of environment
-variables:
+Open `iam/model-runner.json` in your text editor. Change all references to
+`model-storage` and `generated-images` to the buckets you created earlier.
+
+Then export this variable to make IAM changes in Tigris:
 
 ```text
-AWS_ACCESS_KEY_ID=<key from earlier>
-AWS_SECRET_ACCESS_KEY=<key from earlier>
-AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev
-AWS_REGION=auto
-MODEL_BUCKET_NAME=model-storage
-MODEL_PATH=ByteDance/SDXL-Lightning
+AWS_ENDPOINT_URL_IAM=https://fly.iam.storage.tigris.dev
 ```
 
-:::info
+Create an IAM policy based on the document you edited:
 
-Want differently styled images? Try finetunes like
-[Kohaku XL](https://huggingface.co/KBlueLeaf/Kohaku-XL-Zeta)! Pass the Hugging
-Face repo name to the `prepare_model` script like this:
+```text
+aws iam create-policy --policy-name sdxl-runner --policy-document file://./iam/model-runner.json
+```
+
+Copy down the ARN in the output, it should look something like this:
 
 ```text
-python scripts/prepare_model.py KBlueLeaf/Kohaku-XL-Zeta model-storage
+arn:aws:iam::flyio_hunter2hunter2hunter2:policy/sdxl-runner
 ```
 
-:::
+Attach it to the token you just created:
 
-## Deploying it
+```text
+aws iam attach-user-policy \
+  --policy-arn arn:aws:iam::flyio_hunter2hunter2hunter2:policy/sdxl-runner \
+  --user-name tid_runner_keypair_access_key_id
+```
+
+### Running inference
+
+<details>
+  <summary>Optional: building your own image</summary>
 
 In order to deploy this, you need to build the image with the cog tool. Log into
 a Docker registry and run this command to build and push it:
````
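The IAM policy document the instructions edit, `iam/model-runner.json`, is not shown in this commit. As a hypothetical sketch of what it might grant (the file in the template repo is authoritative; the statement names here are made up), a read-only statement for the model bucket plus a read/write statement for the images bucket could look like:

```shell
# Write a hypothetical model-runner policy document. Bucket names match the
# tutorial's `model-storage` and `generated-images`; Sids are illustrative.
cat > model-runner.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadModelWeights",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::model-storage",
        "arn:aws:s3:::model-storage/*"
      ]
    },
    {
      "Sid": "WriteGeneratedImages",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::generated-images",
        "arn:aws:s3:::generated-images/*"
      ]
    }
  ]
}
EOF
```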
````diff
@@ -123,8 +162,10 @@ a Docker registry and run this command to build and push it:
 cog push your-docker-username/sdxl-tigris --use-cuda-base-image false
 ```
 
-You can now use it with your GPU host of choice as long as it supports Cuda 12.1
-and has at least 12 GB of video memory.
+</details>
+
+You can now use it with your GPU host of choice as long as it supports at least
+CUDA 12.1 and has at least 12 GB of video memory.
 
 This example is configured with environment variables. Set the following
 environment variables in your deployments:
````
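The hunk ends before the variable list itself. Inferred from the launch commands elsewhere in this commit (the Fly.io and Vast.ai examples pass exactly these), the deployment configuration is likely along these lines; the key values are placeholders, not real credentials:

```shell
# Likely runtime configuration for the runner image, inferred from the
# -e flags in this commit's launch commands. Keypair values are placeholders.
export AWS_ACCESS_KEY_ID='<runner-keypair-access-key-id>'
export AWS_SECRET_ACCESS_KEY='<runner-keypair-secret-access-key>'
export AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev
export AWS_REGION=auto
export MODEL_BUCKET_NAME=model-storage
export MODEL_PATH=ByteDance/SDXL-Lightning
export PUBLIC_BUCKET_NAME=generated-images
```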

docs/blueprints/model-storage/vast-ai.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -67,7 +67,7 @@ Then execute the launch command:
 ```text
 vastai create instance \
   <id-from-search> \
-  --image your-docker-username/flux-tigris:latest \
+  --image ghcr.io/tigrisdata-community/runner/sdxl:latest \
   --env "-p 5000:5000 -e AWS_ACCESS_KEY_ID=<runner-keypair-access-key-id> -e AWS_SECRET_ACCESS_KEY=<runner-keypair-secret-access-key> -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev -e AWS_REGION=auto -e MODEL_BUCKET_NAME=model-storage -e MODEL_PATH=ByteDance/SDXL-Lightning -e PUBLIC_BUCKET_NAME=generated-images" \
   --disk 48 \
   --onstart-cmd "python -m cog.server.http"
````
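The single-line `--env` string above is easy to mistype. A sketch of building it from named variables instead, so each setting can be audited before launch (key values remain placeholders, exactly as in the diff):

```shell
# Assemble the vastai --env argument piece by piece and print it; nothing is
# launched here. Keypair values are placeholders from the diff.
AWS_ACCESS_KEY_ID='<runner-keypair-access-key-id>'
AWS_SECRET_ACCESS_KEY='<runner-keypair-secret-access-key>'
ENV_STRING="-p 5000:5000"
ENV_STRING+=" -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}"
ENV_STRING+=" -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}"
ENV_STRING+=" -e AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev"
ENV_STRING+=" -e AWS_REGION=auto"
ENV_STRING+=" -e MODEL_BUCKET_NAME=model-storage"
ENV_STRING+=" -e MODEL_PATH=ByteDance/SDXL-Lightning"
ENV_STRING+=" -e PUBLIC_BUCKET_NAME=generated-images"
echo "$ENV_STRING"
```

The result is passed as `--env "$ENV_STRING"` in the launch command shown above.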
