state the underlying hardware is in. You don’t know if you have your models
cached, and in the worst case you need to do a cold start and download your
model weights from scratch.

A couple of fixable problems arise when running your models on serverless or any
frequently changing infrastructure:

- Model distribution that's not optimized for latency causes needless GPU idle
  time while the model weights are downloaded to the machine on cold start.
  Tigris behaves like a content delivery network by default and is designed for
  low latency, saving idle time on cold start.
- Compliance restrictions like data sovereignty and GDPR increase complexity
  quickly. Tigris makes regional restrictions a one-line configuration; see the
  [object regions guide](https://www.tigrisdata.com/docs/objects/object_regions/).
- Reliance on third party caches for distributing models creates an upstream
  dependency and leaves your system vulnerable to downtime. Tigris guarantees
  99.99% availability and publishes
  [public availability data](https://www.tigrisdata.com/blog/availability-metrics-public/).

## Use case
You can put AI model weights into Tigris so that they are cached and fast no
matter where you’re running inference from. This makes cold starts faster: your
workloads can take advantage of Tigris'
[globally distributed architecture](/docs/overview/) and start quickly no matter
where they are in the world.

For this example, we’ll set up
[SDXL Lightning](https://huggingface.co/ByteDance/SDXL-Lightning) by ByteDance
for inference with the weights stored in Tigris. Here’s what you need to do:

- Prepare and upload the model to Tigris
- Create a restricted access key for model runners
- Run inference somewhere

Download the `sdxl-in-tigris` template from GitHub and enter the folder:

```text
git clone https://github.com/tigrisdata-community/sdxl-in-tigris
cd sdxl-in-tigris
```
<details>
<summary>Prerequisite tools</summary>

In order to run this example locally, you need these tools installed:

- Python 3.11
- pipenv
- The AWS CLI

Also configure the AWS CLI for use with Tigris by following
[Configuring the AWS CLI](/docs/sdks/s3/aws-cli/).

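
If you haven't set that up yet, here's a minimal sketch, assuming AWS CLI v2
(which reads the `AWS_ENDPOINT_URL_S3` environment variable) and an access key
created in the Tigris console:

```text
aws configure # paste your Tigris access key ID and secret access key
export AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev
export AWS_REGION=auto
```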
To build a custom variant of the image, you need these tools installed:

- Mac/Windows: the
  [Docker Desktop app](https://www.docker.com/products/docker-desktop/);
  alternatives such as Podman Desktop will not work.
- Linux: the Docker daemon; alternatives such as Podman will not work.
- [Replicate's cog tool](https://github.com/replicate/cog)
- [jq](https://jqlang.github.io/jq/)

To install all of the tool dependencies at once with
[Homebrew](https://brew.sh), clone the template repo and run `brew bundle`.

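
Either way, you can sanity-check that everything is on your `PATH`:

```text
python3.11 --version
pipenv --version
aws --version
cog --version
jq --version
```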
</details>

Create two new buckets:

1. One bucket will be for generated images; it’ll be called `generated-images`
   in this article.
2. One bucket will be for storing models; it’ll be called `model-storage` in
   this article.

Both of these buckets should be private, so pass `--acl private` when creating
them:

```text
aws s3api create-bucket --bucket generated-images --acl private
aws s3api create-bucket --bucket model-storage --acl private
```

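
You can confirm that both buckets exist by listing your buckets:

```text
aws s3 ls
```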
Then activate the virtual environment with `pipenv shell` and install the
dependencies for uploading a model:

```text
pipenv shell --python 3.11
pip install -r requirements.txt
```

Then run the `prepare_model` script to massage and upload a Stable Diffusion XL
model or finetune to Tigris:

```text
python scripts/prepare_model.py ByteDance/SDXL-Lightning model-storage
```

:::info

Want differently styled images? Try finetunes like
[Kohaku XL](https://huggingface.co/KBlueLeaf/Kohaku-XL-Zeta)! Pass the Hugging
Face repo name to the `prepare_model` script like this:

```text
python scripts/prepare_model.py KBlueLeaf/Kohaku-XL-Zeta model-storage
```

:::

This will take a bit to run, depending on your internet connection speed, hard
drive speed, and the current phase of the moon. While it’s running, head to the
Tigris console and create a new access key. Don’t assign any permissions to it.

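
Once the upload finishes, you can sanity-check that the weights landed in your
bucket (the exact key layout depends on what the `prepare_model` script writes):

```text
aws s3 ls s3://model-storage --recursive --human-readable --summarize
```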
### Create a restricted access key for model runners

Copy the access key ID and secret access key into your notes or a password
manager; you will not be able to see them again. These credentials will be used
later to deploy your app in the cloud. This keypair will be referred to as the
`runner-keypair` in this tutorial.

Open `iam/model-runner.json` in your text editor. Change all references to
`model-storage` and `generated-images` to the names of the buckets you created
earlier.

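
For reference, the edited policy should express "read-only on the model bucket,
read-write on the images bucket". A rough sketch of what that can look like
(the actual document in the repo may differ; the bucket names here are the ones
from this article):

```text
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject"],
      "Resource": [
        "arn:aws:s3:::model-storage",
        "arn:aws:s3:::model-storage/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": [
        "arn:aws:s3:::generated-images",
        "arn:aws:s3:::generated-images/*"
      ]
    }
  ]
}
```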
Then export this variable so that the AWS CLI sends IAM commands to Tigris:

```text
export AWS_ENDPOINT_URL_IAM=https://fly.iam.storage.tigris.dev
```

Create an IAM policy based on the document you edited:

```text
aws iam create-policy --policy-name sdxl-runner --policy-document file://./iam/model-runner.json
```

Copy down the ARN from the output; it should look something like this:

```text
arn:aws:iam::flyio_hunter2hunter2hunter2:policy/sdxl-runner
```

Attach it to the access key you just created:

```text
aws iam attach-user-policy \
  --policy-arn arn:aws:iam::flyio_hunter2hunter2hunter2:policy/sdxl-runner \
  --user-name tid_runner_keypair_access_key_id
```

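
To double-check that the policy took, list the policies attached to the key
(assuming Tigris' IAM endpoint supports this call the way AWS does):

```text
aws iam list-attached-user-policies --user-name tid_runner_keypair_access_key_id
```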
### Running inference

<details>
<summary>Optional: building your own image</summary>

In order to deploy this, you need to build the image with the cog tool. Log into
a Docker registry and run this command to build and push it:

```text
cog push your-docker-username/sdxl-tigris --use-cuda-base-image false
```

</details>

You can now use it with your GPU host of choice as long as it supports at least
CUDA 12.1 and has at least 12 GB of video memory.

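
As a sketch, running the image locally on a CUDA-capable machine could look
like the following, assuming you've put the environment variables listed below
into a file called `runner.env` (a hypothetical name). Cog images expose an
HTTP server on port 5000; the `prompt` input name is an assumption, so check
the template's `predict.py` for the real schema:

```text
docker run -d --gpus all -p 5000:5000 \
  --env-file ./runner.env \
  your-docker-username/sdxl-tigris

curl http://localhost:5000/predictions -X POST \
  -H 'Content-Type: application/json' \
  -d '{"input": {"prompt": "a photo of an astronaut riding a horse"}}'
```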
This example is configured with environment variables. Set the following
environment variables in your deployments: