# Using model weights in Tigris anywhere with Beam

The most common way to deploy AI models in production is by using “serverless”
inference. This means that every time you get a request, you don’t know what
state the underlying hardware is in. You don’t know if you have your models
cached, and in the worst case you need to do a cold start and download your
model weights from scratch.

A couple of fixable problems arise when running your models on serverless or any
frequently changing infrastructure:

- Model distribution that's not optimized for latency causes needless GPU idle
  time as the model weights are downloaded to the machine on cold start. Tigris
  behaves like a content delivery network by default and is designed for low
  latency, saving idle time on cold start.
- Compliance restrictions like data sovereignty and GDPR increase complexity
  quickly. Tigris makes regional restrictions a one-line configuration
  ([guide here](https://www.tigrisdata.com/docs/objects/object_regions/)); see
  the sketch after this list.
- Reliance on third-party caches for distributing models creates an upstream
  dependency and leaves your system vulnerable to downtime. Tigris guarantees
  99.99% availability, with
  [public availability data](https://www.tigrisdata.com/blog/availability-metrics-public/).
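
To make the regional restriction concrete, here's a minimal boto3 sketch of
creating a bucket pinned to one region. The `X-Tigris-Regions` header and the
`fra` region code are assumptions based on the linked guide; check it for the
exact names.

```python
import boto3

# Minimal sketch, assuming the X-Tigris-Regions header from the object
# regions guide; the "fra" region code is illustrative.
s3 = boto3.client("s3", endpoint_url="https://fly.storage.tigris.dev")


def pin_region(request, **kwargs):
    # Inject the region restriction header before the request is signed.
    request.headers["X-Tigris-Regions"] = "fra"


s3.meta.events.register("before-sign.s3.CreateBucket", pin_region)
s3.create_bucket(Bucket="model-storage-eu")
```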

## Beam

Defining HTTP endpoints for AI things is annoyingly complicated. There are a lot
of opinionated frameworks and layers that get in the way of just running the
bit of code you need to get your app working. [Beam](https://www.beam.cloud) is
all about simplifying the experience so that all you need to do to get an
endpoint working is define a single function:

```python
from beam import endpoint, Image


@endpoint(
    name="quickstart",
    cpu=1,
    memory="1Gi",
    image=Image().add_python_packages(["numpy"]),
)
def predict(**inputs):
    # Square the input, defaulting to 256 when no value is provided.
    x = inputs.get("x", 256)
    return {"result": x**2}
```

This lets you unify your code and configuration in the same file, allowing you
to glance at a file and instantly understand what its endpoints are and do. Beam
also [supports GPU compute](https://docs.beam.cloud/v2/environment/gpu), letting
you do scale-to-zero inference seamlessly; a sketch of a GPU-backed endpoint
follows.
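
As a hedged example, a GPU-backed variant of the endpoint above could look like
this; the `gpu` parameter follows Beam's GPU docs, but the GPU type and sizing
here are assumptions:

```python
from beam import endpoint, Image


@endpoint(
    name="gpu-quickstart",
    gpu="A10G",  # assumed GPU type; see Beam's GPU docs for current options
    memory="16Gi",
    image=Image().add_python_packages(["torch"]),
)
def predict(**inputs):
    import torch

    # Report whether the worker actually got a CUDA device.
    return {"cuda_available": torch.cuda.is_available()}
```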

## Use case

You can put AI model weights into Tigris so that they’re cached and fast no
matter where you’re inferencing from. Cold starts get faster, and you can take
advantage of Tigris’
[globally distributed architecture](/docs/overview/), enabling your workloads to
start quickly no matter where they are in the world. A sketch of the cold-start
fetch follows.
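
Here's a sketch of what that cold-start fetch can look like with boto3. The
`fetch_weights` helper and the bucket layout are hypothetical; the template's
actual loader may differ.

```python
import os

import boto3

# Hypothetical helper: download every object under a prefix from a Tigris
# bucket to local disk before loading the model.
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ.get(
        "AWS_ENDPOINT_URL_S3", "https://fly.storage.tigris.dev"
    ),
)


def fetch_weights(bucket: str, prefix: str, dest: str = "/tmp/model") -> str:
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            local = os.path.join(dest, os.path.relpath(obj["Key"], prefix))
            os.makedirs(os.path.dirname(local), exist_ok=True)
            s3.download_file(bucket, obj["Key"], local)
    return dest


# e.g. weights_dir = fetch_weights("model-storage", "ByteDance/SDXL-Lightning")
```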

For this example, we’ll set up
[SDXL Lightning](https://huggingface.co/ByteDance/SDXL-Lightning) by ByteDance
for inference with the weights stored in Tigris.

## Getting Started

Download the `sdxl-in-tigris` template from GitHub:

```text
git clone https://github.com/tigrisdata-community/sdxl-in-tigris
```

<details>
<summary>Prerequisite tools</summary>

In order to run this example locally, you need these tools installed:

- Python 3.11
- pipenv
- The AWS CLI

Also be sure to configure the AWS CLI for use with Tigris:
[Configuring the AWS CLI](/docs/sdks/s3/aws-cli/).

To build a custom variant of the image, you need these tools installed:

- Mac/Windows: the
  [Docker Desktop app](https://www.docker.com/products/docker-desktop/);
  alternatives such as Podman Desktop will not work.
- Linux: the Docker daemon; alternatives such as Podman will not work.
- [Replicate's cog tool](https://github.com/replicate/cog)
- [jq](https://jqlang.github.io/jq/)

To install all of the tool dependencies at once, clone the template repo and run
`brew bundle`.

</details>

Create a new bucket for generated images; it’ll be called `generated-images` in
this article.

```text
aws s3api create-bucket --acl private --bucket generated-images
```

<details>
<summary>Optional: upload your own model</summary>

If you want to upload your own models, create a bucket for this. It'll be called
`model-storage` in this tutorial.

Both of these buckets should be private.

Then activate the virtual environment with `pipenv shell` and install the
dependencies for uploading a model:

```text
pipenv shell --python 3.11
pip install -r requirements.txt
```

Run the `prepare_model` script to massage and upload a Stable Diffusion XL model
or finetune to Tigris:

```text
python scripts/prepare_model.py ByteDance/SDXL-Lightning model-storage
```

:::info

Want differently styled images? Try finetunes like
[Kohaku XL](https://huggingface.co/KBlueLeaf/Kohaku-XL-Zeta)! Pass the Hugging
Face repo name to the `prepare_model` script like this:

```text
python scripts/prepare_model.py KBlueLeaf/Kohaku-XL-Zeta model-storage
```

:::

</details>

## Access keys

Create a new access key in the [Tigris Dashboard](https://console.tigris.dev).
Don't assign any permissions to it.

Copy the access key ID and secret access key into either your notes or a
password manager; you will not be able to see them again. These credentials will
be used later to deploy your app in the cloud. This keypair will be referred to
as the `workload-keypair` in this tutorial.

[Limit the scope of this access key](/docs/blueprints/limited-access-key) to
only the `model-storage-demo` (or a custom bucket if you're uploading your own
models) and `generated-images` buckets.

## Deploying it to Beam

Install the Beam SDK and CLI into your Python environment
[according to their directions](https://docs.beam.cloud/v2/getting-started/installation).
Be sure to run `beam config create` to authenticate with an API key.

This example is configured with environment variables. Set the following
secrets in your deployments:

| Envvar name             | Value                                                               |
| ----------------------: | :------------------------------------------------------------------ |
| `AWS_ACCESS_KEY_ID`     | The access key ID from the workload keypair                         |
| `AWS_SECRET_ACCESS_KEY` | The secret access key from the workload keypair                     |
| `AWS_ENDPOINT_URL_S3`   | `https://fly.storage.tigris.dev`                                    |
| `AWS_REGION`            | `auto`                                                              |
| `MODEL_PATH`            | `ByteDance/SDXL-Lightning`                                          |
| `MODEL_BUCKET_NAME`     | `model-storage-demo` (Optional: replace with your own bucket name)  |
| `PUBLIC_BUCKET_NAME`    | `generated-images` (replace with your own bucket name)              |

You will need to run the `beam secret create` command for each of these:

```text
beam secret create AWS_ENDPOINT_URL_S3 https://fly.storage.tigris.dev
```
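
The rest follow the same pattern; the key values below are placeholders, so
substitute your own workload keypair and bucket names:

```text
beam secret create AWS_ACCESS_KEY_ID your-access-key-id
beam secret create AWS_SECRET_ACCESS_KEY your-secret-access-key
beam secret create AWS_REGION auto
beam secret create MODEL_PATH ByteDance/SDXL-Lightning
beam secret create MODEL_BUCKET_NAME model-storage-demo
beam secret create PUBLIC_BUCKET_NAME generated-images
```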

Then deploy it with `beam deploy`:

```text
beam deploy beamcloud.py:generate
```

You'll get a URL back that you can use to generate images. Do a test generation
with this curl command:

```text
curl "https://url-you-were-given-v1.app.beam.cloud" \
  -X PUT \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer put-your-beam-auth-token-here' \
  --data-binary '{
    "prompt": "The space needle in Seattle, best quality, masterpiece",
    "aspect_ratio": "1:1",
    "guidance_scale": 3.5,
    "num_inference_steps": 4,
    "max_sequence_length": 512,
    "output_format": "png",
    "num_outputs": 1
  }'
```
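
If you'd rather script the test, here's the same request with Python's
`requests` library; the URL and token are the same placeholders as above:

```python
import requests

# Same test generation as the curl command above; URL and token are
# placeholders for the values from your own deployment.
resp = requests.put(
    "https://url-you-were-given-v1.app.beam.cloud",
    headers={"Authorization": "Bearer put-your-beam-auth-token-here"},
    json={
        "prompt": "The space needle in Seattle, best quality, masterpiece",
        "aspect_ratio": "1:1",
        "guidance_scale": 3.5,
        "num_inference_steps": 4,
        "max_sequence_length": 512,
        "output_format": "png",
        "num_outputs": 1,
    },
    timeout=300,
)
resp.raise_for_status()
# Inspect the response; the exact shape depends on the deployed function.
print(resp.headers.get("Content-Type"))
print(resp.content[:200])
```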

If all goes well, you should get an image like this:

![The word 'success' in front of the Space Needle](./success.webp)

Beam will automatically scale the deployment down when it's not in use. You can
fully destroy your deployment with `beam deployment delete`:

```text
beam deployment list # to find the UUID of the deployment
beam deployment delete uuid-of-deployment
```
