Running other pipelines #25

irgolic opened this issue Oct 26, 2022 · 2 comments

Comments

irgolic (Owner) commented Oct 26, 2022

In #23 I tried to implement running an arbitrary HuggingFace pipeline, such as those published in https://github.com/huggingface/diffusers/tree/main/examples/community.

I added a way to set the name of the pipeline to be constructed (referencing, e.g., a community-published pipeline) and the name of the method to be called on it (None refers to __call__; otherwise it is usually one of text2img, img2img, or inpaint).
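That name-then-method resolution could be sketched as follows; resolve_pipeline_method is a hypothetical helper, not code from #23. (With diffusers, the pipeline object itself could come from DiffusionPipeline.from_pretrained(model_id, custom_pipeline=...).)

```python
from typing import Any, Callable, Optional

def resolve_pipeline_method(pipe: Any, method_name: Optional[str]) -> Callable:
    """Return the callable to invoke on a constructed pipeline.

    None refers to the pipeline's __call__; otherwise the named method
    (usually text2img, img2img, or inpaint) is looked up with getattr.
    """
    if method_name is None:
        return pipe.__call__
    method = getattr(pipe, method_name, None)
    if not callable(method):
        raise ValueError(f"pipeline has no callable method {method_name!r}")
    return method
```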

Extra parameters were also allowed. I got to the point where arbitrary arguments could be passed to the pipe(**kwargs) call:

  • via GET /pipeline?query1=a&query2=b, arbitrary query-string values were parsed with ast.literal_eval, which can express primitive types like strings, bytes, and numbers, as well as collections like tuples, lists, and dicts
  • POST /task took an extra_parameters dict, which can express any JSON value (string, number, boolean, list, dict)
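The query-string parsing described above can be sketched like this; parse_query_value is a hypothetical name, and falling back to the raw string for non-literals is an assumption:

```python
import ast
from typing import Any

def parse_query_value(raw: str) -> Any:
    """Parse a query-string value with ast.literal_eval.

    Handles Python literals: strings, bytes, numbers, booleans, None,
    and collections like tuples, lists, and dicts. Anything that is not
    a valid literal (e.g. a bare word) falls back to the raw string.
    """
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw
```

So GET /pipeline?scale=7.5&sizes=[512,768] would yield the float 7.5 and the list [512, 768].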

However, this still left no way to call most of the community pipelines.

  • The CLIP-guided pipeline doesn't take negative_prompt, which is defined in our defaults
  • The Composable pipeline would work, I think; the only difference is that it allows a token1 | token2 syntax
  • The Interpolate pipeline saves the images into files and returns strings, which won't work without extra code to hand those files over to the caller
  • The Speech-to-image pipeline takes an untyped audio argument (probably some sort of blob), which can't easily be represented by the types expressible via ast.literal_eval or JSON
  • The Wildcard pipeline would probably work, by providing wildcard_option_dict along with a prompt in the __clothing__ syntax
  • The Mega pipeline is the one used by default
  • The LPW pipeline is the one we should switch the default to; it does everything Mega does, plus supports the (token:1.3) syntax and longer prompts
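To illustrate the (token:1.3) weight syntax, a toy parser might look like the following. This is not the LPW pipeline's actual implementation, which also handles nesting, escapes, and bare (token) emphasis:

```python
import re
from typing import List, Tuple

# Toy parser for the (token:1.3) attention-weight syntax.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weighted_prompt(prompt: str) -> List[Tuple[str, float]]:
    """Split a prompt into (text, weight) chunks; unweighted text gets 1.0."""
    chunks, pos = [], 0
    for m in _WEIGHTED.finditer(prompt):
        if m.start() > pos:
            chunks.append((prompt[pos:m.start()], 1.0))
        chunks.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        chunks.append((prompt[pos:], 1.0))
    return chunks
```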

Essentially, with #23 you would gain the Composable and Wildcard pipelines. The truth is, I would prefer to allow all of these syntaxes at the same time, but the pipeline definitions aren't composable with one another (a gripe shared by others: huggingface/diffusers#841).

Is there some other way to run arbitrary pipelines via API? What changes would we need to make to get there?

MicahZoltu (Collaborator) commented Oct 26, 2022

My own thoughts on the matter:
In my experience, trying to generalize early often leads to leaky abstractions, which tend to hurt more than help. It may be better to design the API around one very specific pipeline, maybe even a specific model, and then add support for a second pipeline/model as a separate endpoint. Once you have two pipelines, you can look for how much code can be reused between them and do some internal refactoring as needed. Then add a third, refactor, and so on. By the time you get to about five, you will likely have a better idea of where the right places to add abstractions are, and where it is better to keep things separate.

irgolic (Owner) commented Oct 26, 2022

@MicahZoltu Good point. Abstracting pipelines is non-trivial; you might find huggingface/diffusers#551 an interesting read.

I'd like to keep models modular, but for now one pipeline (lpw_stable_diffusion) will be used to implement txt2img, img2img, and inpaint.

I look forward to adding image interpolation, CLIP-guided generation, and __wildcard__ prompt syntax, but the use of those pipelines should be abstracted behind the endpoint interface. What should that interface be?

For example, take interpolation: let's add a new Params subclass called InterpolateParams, create a GET /interpolate endpoint, and make it possible to call POST /task with InterpolateParams.

  • Interpolation could encapsulate the call to walk. If we did that, FinishedEvent would need to be amended to return multiple results.
  • Alternatively, interpolation could be implemented on top of the currently used lpw_stable_diffusion pipeline instead of instantiating a whole separate pipeline, in a way that returns multiple tasks, each with its own result. In that case I would amend POST /task to return list[TaskId].
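The second option could fan one interpolation request out into per-frame tasks, each with its own TaskId. A rough sketch, where all names (InterpolateParams fields, Task) are hypothetical rather than the repository's actual models:

```python
import uuid
from dataclasses import dataclass
from typing import List

@dataclass
class InterpolateParams:
    # hypothetical Params subclass for an interpolation request
    prompt_a: str
    prompt_b: str
    steps: int = 5

@dataclass
class Task:
    task_id: str
    prompt_a: str
    prompt_b: str
    t: float  # interpolation position in [0, 1] for this frame

def fan_out_interpolation(params: InterpolateParams) -> List[Task]:
    """Expand one interpolation request into `steps` independent tasks,
    so POST /task can return list[TaskId] and workers can run in parallel."""
    return [
        Task(
            task_id=str(uuid.uuid4()),
            prompt_a=params.prompt_a,
            prompt_b=params.prompt_b,
            t=i / (params.steps - 1) if params.steps > 1 else 0.0,
        )
        for i in range(params.steps)
    ]
```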

The latter option feels more correct to me, and would allow parallelization across multiple workers, but it would require lifting code similar to the walk method. I was hoping diffusers would make it so that we wouldn't need to perform any tensor computation in this repository, but community pipelines are currently non-composable standalone solutions with wildly different interfaces (e.g., returning a list of filenames saved to disk).
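The tensor computation that would need lifting is essentially spherical interpolation (slerp) between two latents, as used by interpolation walks. A stdlib-only sketch on flattened vectors (the actual pipeline would operate on torch tensors):

```python
import math
from typing import List

def slerp(t: float, v0: List[float], v1: List[float],
          dot_threshold: float = 0.9995) -> List[float]:
    """Spherical linear interpolation between two flattened latent vectors.

    Falls back to plain linear interpolation when the vectors are nearly
    parallel, where the spherical formula is numerically unstable.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    if abs(dot) > dot_threshold:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```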

@irgolic irgolic changed the title Running an arbitrary pipeline Running other pipelines Oct 26, 2022