Commit fab6e8c

readme improvements (separate deploy/local, add testing info)
1 parent 8eca8fe commit fab6e8c

1 file changed

Lines changed: 65 additions & 69 deletions

File tree

README.md

@@ -22,42 +22,46 @@ assumptions. If anything is unclear, please open an issue.
 
 ## Usage:
 
-1. Clone or fork this repo.
-
-1. Most of the configuration happens via docker build variables. You can
-   see all the options in the [Dockerfile](./Dockerfile), and edit them
-   there directly, or set via docker command line or e.g. Banana's dashboard
-   UI once support for build variables land (any day now).
-
-   If you're only deploying one container, that's all you need! If you
-   intend to deploy multiple containers each with different variables
-   (e.g. a few different models), you can edit the example
-   [`scripts/permutations.yaml`](scripts/permutations.yaml)] file and
-   run [`scripts/permute.sh`](scripts/permute.sh)` to create a number
-   of sub-repos in the `permutations` directory.
-
-   Lastly, there's an option to set `MODEL_ID=ALL`, and *all* models will
-   be downloaded, and switched at request time (great for dev, useless for
-   serverless).
-
-1. **Building**
-
-   1. Set `HF_AUTH_TOKEN` environment var if you haven't set it elsewhere.
-   1. `docker build -t banana-sd --build-arg HF_AUTH_TOKEN=$HF_AUTH_TOKEN .`
-   1. Optionally add `DOCKER_BUILDKIT=1 BUILDKIT_PROGRESS=plain` to
-      start of the line, depending on your preferences. (Recommended if
-      you're using the `root-cache` feature.)
-   1. Note: your first build can take a really long time, depending on
-      your PC & network speed, and *especially when using the `CHECKPOINT_URL`
-      feature*. Great time to grab a coffee or take a walk.
-
-1. **Running**
-
-   1. `docker run -it --gpus all -p 8000:8000 banana-sd python3 server.py`
-   1. Note: the `-it` is optional but makes it alot quicker/easier to stop the
-      container using `Ctrl-C`.
-   1. If you get a `CUDA initialization: CUDA unknown error` after suspend,
-      just stop the container, `rmmod nvidia_uvm`, and restart.
+Firstly, fork and clone this repo.
+
+Most of the configuration happens via docker build variables. You can
+see all the options in the [Dockerfile](./Dockerfile), and edit them
+there directly, or set via docker command line or e.g. Banana's dashboard
+UI once support for build variables lands (any day now).
+
+If you're only deploying one container, that's all you need! If you
+intend to deploy multiple containers each with different variables
+(e.g. a few different models), you can edit the example
+[`scripts/permutations.yaml`](scripts/permutations.yaml) file and
+run [`scripts/permute.sh`](scripts/permute.sh) to create a number
+of sub-repos in the `permutations` directory.
+
+Lastly, there's an option to set `MODEL_ID=ALL`, and *all* models will
+be downloaded, and switched at request time (great for dev, useless for
+serverless).
+
+**Deploying to banana?** That's it! You're done. Commit your changes and push.
+
+## Running locally / development:
+
+**Building**
+
+1. Set `HF_AUTH_TOKEN` environment var if you haven't set it elsewhere.
+1. `docker build -t banana-sd --build-arg HF_AUTH_TOKEN=$HF_AUTH_TOKEN .`
+1. Optionally add `DOCKER_BUILDKIT=1 BUILDKIT_PROGRESS=plain` to the
+   start of the line, depending on your preferences. (Recommended if
+   you're using the `root-cache` feature.)
+1. Note: your first build can take a really long time, depending on
+   your PC & network speed, and *especially when using the `CHECKPOINT_URL`
+   feature*. Great time to grab a coffee or take a walk.
+
+**Running**
+
+1. `docker run -it --gpus all -p 8000:8000 banana-sd python3 server.py`
+1. Note: the `-it` is optional but makes it a lot quicker/easier to stop the
+   container using `Ctrl-C`.
+1. If you get a `CUDA initialization: CUDA unknown error` after suspend,
+   just stop the container, `rmmod nvidia_uvm`, and restart.
 
 ## Sending requests
 
@@ -76,7 +80,7 @@ The container expects an `HTTP POST` request with the following JSON body:
     "callInputs": {
       "MODEL_ID": "runwayml/stable-diffusion-v1-5",
       "PIPELINE": "StableDiffusionPipeline",
-      "SCHEDULER": "LMS",
+      "SCHEDULER": "LMSDiscreteScheduler",
      "safety_checker": true,
    },
  }
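For illustration, a request body like the one in the hunk above could be sent from Python as below. The `prompt` model input, the endpoint path, and the port are assumptions (the port taken from the `docker run` line elsewhere in this README), not part of the diff.

```python
import json
from urllib import request

# Request body shaped like the README's example; modelInputs keys are
# illustrative assumptions, not taken from the diff.
payload = {
    "modelInputs": {"prompt": "Super dog"},
    "callInputs": {
        "MODEL_ID": "runwayml/stable-diffusion-v1-5",
        "PIPELINE": "StableDiffusionPipeline",
        "SCHEDULER": "LMSDiscreteScheduler",
        "safety_checker": True,
    },
}

def call_container(url="http://localhost:8000/"):
    """POST the JSON body to a locally running container (not executed here)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

print(json.dumps(payload, indent=2))
```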
@@ -97,10 +101,30 @@ explicitly name `modelInputs` above, and send a bigger object (with
 
 There are also very basic examples in [test.py](./test.py), which you can view
 and call `python test.py` if the container is already running on port 8000.
+You can also specify a specific test, change some options, and run against a
+deployed banana image:
+
+```bash
+# Run against http://localhost:8000/
+$ python test.py txt2img
+Usage: python3 test.py [--banana] [--xmfe=1/0] [--scheduler=SomeScheduler] [test1] [test2] [etc]
+Running test: txt2img
+Request took 5.2s (init: 9.2s, inference: 5.1s)
+Saved /home/dragon/www/banana/banana-sd-base/tests/output/txt2img.png
+
+# Run against deployed banana image
+$ export BANANA_API_KEY=XXX
+$ BANANA_MODEL_KEY=XXX python3 test.py --banana txt2img
+Running test: txt2img
+Request took 4.3s (init: 6.5s, inference: 2.3s)
+Saved /home/dragon/www/banana/banana-sd-base/tests/output/txt2img.png
+```
 
 The best example of course is https://kiri.art/ and its
 [source code](https://github.com/kiri-art/stable-diffusion-react-nextjs-mui-pwa).
 
+
 ## Troubleshooting
 
 * **403 Client Error: Forbidden for url**
@@ -142,38 +166,10 @@ Set `CALL_URL` and `SIGN_KEY` environment variables to send timing data on `init`
 and `inference` start and end data. You'll need to check the source code of here
 and sd-mui as the format is in flux.
 
-***Original Template README follows***
-
-# 🍌 Banana Serverless
-
-This repo gives a basic framework for serving Stable Diffusion in production using simple HTTP servers.
-
-## Quickstart:
-
-1. Create your own private repo and copy the files from this template repo into it. You'll want a private repo so that your huggingface keys are secure.
-
-2. Install the [Banana Github App](https://github.com/apps/banana-serverless) to your new repo.
-
-3. Login in to the [Banana Dashboard](https://app.banana.dev) and setup your account by saving your payment details and linking your Github.
-
-4. Create huggingface account to get permission to download and run [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion-v1-4) text-to-image model.
-    - Accept terms and conditions for the use of the v1-4 [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion-v1-4)
-
-5. Edit the `dockerfile` in your forked repo with `ENV HF_AUTH_TOKEN=your_auth_token`
-
-6. Push that repo to main.
-
-From then onward, any pushes to the default repo branch (usually "main" or "master") trigger Banana to build and deploy your server, using the Dockerfile.
-Throughout the build we'll sprinkle in some secret sauce to make your server extra snappy 🔥
-
-It'll then be deployed on our Serverless GPU cluster and callable with any of our serverside SDKs:
-
-- [Python](https://github.com/bananaml/banana-python-sdk)
-- [Node JS / Typescript](https://github.com/bananaml/banana-node-sdk)
-- [Go](https://github.com/bananaml/banana-go)
+This info is now logged regardless, and `init()` and `inference()` times are sent
+back via `{ $timings: { init: timeInMs, inference: timeInMs } }`.
 
-You can monitor buildtime and runtime logs by clicking the logs button in the model view on the [Banana Dashboard](https://app.banana.dev)
+## Acknowledgements
 
-<br>
+Originally based on https://github.com/bananaml/serverless-template-stable-diffusion.
 
-## Use Banana for scale.
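A minimal sketch of consuming the `$timings` field described above, assuming the times really are in milliseconds as stated; the response values here are made up for illustration.

```python
# Hypothetical response fragment; only the $timings shape comes from the README.
response = {
    "$timings": {"init": 9200, "inference": 5100},
}

def timings_in_seconds(response):
    """Convert the millisecond init/inference timings to seconds."""
    return {name: ms / 1000 for name, ms in response["$timings"].items()}

print(timings_in_seconds(response))  # {'init': 9.2, 'inference': 5.1}
```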
