Running other pipelines #25

irgolic opened this issue Oct 26, 2022 · 2 comments

Comments

irgolic (Owner) commented Oct 26, 2022

In #23 I tried to implement running an arbitrary HuggingFace pipeline, such as those published in https://github.com/huggingface/diffusers/tree/main/examples/community.

I added a way to set the name of the pipeline to be constructed (referencing, e.g., a community-published pipeline) and the name of the method to be called on it (None refers to __call__; otherwise it is usually one of text2img, img2img, or inpaint).
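That name-then-method resolution could be sketched as follows; resolve_pipeline_method is a hypothetical helper, not code from #23. (With diffusers, the pipeline object itself could come from DiffusionPipeline.from_pretrained(model_id, custom_pipeline=...).)

```python
from typing import Any, Callable, Optional

def resolve_pipeline_method(pipe: Any, method_name: Optional[str]) -> Callable:
    """Return the callable to invoke on a constructed pipeline.

    None refers to the pipeline's __call__; otherwise the named method
    (usually text2img, img2img, or inpaint) is looked up with getattr.
    """
    if method_name is None:
        return pipe.__call__
    method = getattr(pipe, method_name, None)
    if not callable(method):
        raise ValueError(f"pipeline has no callable method {method_name!r}")
    return method
```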

Extra parameters were also allowed. I got to the point where arbitrary arguments could be passed to the pipe(**kwargs) call:

  • via GET /pipeline?query1=a&query2=b, arbitrary query-string values were parsed with ast.literal_eval, which can express primitive types like strings, bytes, and numbers, as well as collections like tuples, lists, and dicts
  • POST /task took an extra_parameters dict, which can express any JSON value (string, number, boolean, list, dict)
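The query-string parsing described above can be sketched like this; parse_query_value is a hypothetical name, and falling back to the raw string for non-literals is an assumption:

```python
import ast
from typing import Any

def parse_query_value(raw: str) -> Any:
    """Parse a query-string value with ast.literal_eval.

    Handles Python literals: strings, bytes, numbers, booleans, None,
    and collections like tuples, lists, and dicts. Anything that is not
    a valid literal (e.g. a bare word) falls back to the raw string.
    """
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw
```

So GET /pipeline?scale=7.5&sizes=[512,768] would yield the float 7.5 and the list [512, 768].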

However, this still left no way to call most of the community pipelines.

  • The CLIP-guided pipeline doesn't take negative_prompt, which is defined in our defaults
  • The Composable pipeline would work, I think; the only difference is that it allows a token1 | token2 syntax
  • The Interpolate pipeline saves the images into files and returns strings, which won't work without extra code to hand those files over to the caller
  • The Speech-to-image pipeline takes an untyped audio argument (probably some sort of blob), which can't easily be represented by the types expressible via ast.literal_eval or JSON
  • The Wildcard pipeline would probably work, by providing wildcard_option_dict along with a prompt in the __clothing__ syntax
  • The Mega pipeline is the one used by default
  • The LPW pipeline is the one we should switch the default to; it does everything Mega does, plus supports the (token:1.3) syntax and longer prompts
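To illustrate the (token:1.3) weight syntax, a toy parser might look like the following. This is not the LPW pipeline's actual implementation, which also handles nesting, escapes, and bare (token) emphasis:

```python
import re
from typing import List, Tuple

# Toy parser for the (token:1.3) attention-weight syntax.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weighted_prompt(prompt: str) -> List[Tuple[str, float]]:
    """Split a prompt into (text, weight) chunks; unweighted text gets 1.0."""
    chunks, pos = [], 0
    for m in _WEIGHTED.finditer(prompt):
        if m.start() > pos:
            chunks.append((prompt[pos:m.start()], 1.0))
        chunks.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        chunks.append((prompt[pos:], 1.0))
    return chunks
```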

Essentially, with #23 you would gain the Composable and Wildcard pipelines. The truth is, I would prefer to allow all of these syntaxes at the same time, but the pipeline definitions aren't composable with one another (a gripe shared by others: huggingface/diffusers#841).

Is there some other way to run arbitrary pipelines via API? What changes would we need to make to get there?

MicahZoltu (Collaborator) commented Oct 26, 2022

My own thoughts on the matter:
In my experience, trying to generalize early often leads to leaky abstractions, which tend to hurt more than help. It may be better to design the API around one very specific pipeline, maybe even a specific model, and then add support for a second pipeline/model as a separate endpoint. Once you have two pipelines, you can look for how much code can be reused between them and do some internal refactoring as needed. Then add a third, refactor, and so on. By the time you get to about five, you will likely have a better idea of where the right places to add abstractions are, and where it is better to keep things separate.

irgolic (Owner) commented Oct 26, 2022

@MicahZoltu Good point. Abstracting pipelines is non-trivial; you might find huggingface/diffusers#551 an interesting read.

I'd like to keep models modular, but for now one pipeline (lpw_stable_diffusion) will be used to implement txt2img, img2img, and inpaint.

I look forward to adding image interpolation, CLIP-guided generation, and __wildcard__ prompt syntax, but the use of those pipelines should be abstracted behind the endpoint interface. What should that interface be?

For example, take interpolation: let's add a new Params subclass called InterpolateParams, create a GET /interpolate endpoint, and make it possible to call POST /task with InterpolateParams.

  • Interpolation could encapsulate the call to walk. If we did that, FinishedEvent would need to be amended to return multiple results.
  • Alternatively, interpolation could be implemented on top of the currently used lpw_stable_diffusion pipeline instead of instantiating a whole separate pipeline, in a way that returns multiple tasks, each with its own result. In that case I would amend POST /task to return list[TaskId].
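The second option could fan one interpolation request out into per-frame tasks, each with its own TaskId. A rough sketch, where all names (InterpolateParams fields, Task) are hypothetical rather than the repository's actual models:

```python
import uuid
from dataclasses import dataclass
from typing import List

@dataclass
class InterpolateParams:
    # hypothetical Params subclass for an interpolation request
    prompt_a: str
    prompt_b: str
    steps: int = 5

@dataclass
class Task:
    task_id: str
    prompt_a: str
    prompt_b: str
    t: float  # interpolation position in [0, 1] for this frame

def fan_out_interpolation(params: InterpolateParams) -> List[Task]:
    """Expand one interpolation request into `steps` independent tasks,
    so POST /task can return list[TaskId] and workers can run in parallel."""
    return [
        Task(
            task_id=str(uuid.uuid4()),
            prompt_a=params.prompt_a,
            prompt_b=params.prompt_b,
            t=i / (params.steps - 1) if params.steps > 1 else 0.0,
        )
        for i in range(params.steps)
    ]
```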

The latter option feels more correct to me, and would allow parallelization across multiple workers, but it would require lifting code similar to the walk method. I was hoping diffusers would make it so that we wouldn't need to perform any tensor computation in this repository, but community pipelines are currently non-composable standalone solutions with wildly different interfaces (e.g., returning a list of filenames saved to disk).
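The tensor computation that would need lifting is essentially spherical interpolation (slerp) between two latents, as used by interpolation walks. A stdlib-only sketch on flattened vectors (the actual pipeline would operate on torch tensors):

```python
import math
from typing import List

def slerp(t: float, v0: List[float], v1: List[float],
          dot_threshold: float = 0.9995) -> List[float]:
    """Spherical linear interpolation between two flattened latent vectors.

    Falls back to plain linear interpolation when the vectors are nearly
    parallel, where the spherical formula is numerically unstable.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    if abs(dot) > dot_threshold:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```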

@irgolic irgolic changed the title Running an arbitrary pipeline Running other pipelines Oct 26, 2022