[Fix] Allow prompt and prior_token_ids to be provided simultaneously in GlmImagePipeline #13092
JaredforReal wants to merge 2 commits into huggingface:main
Conversation
Signed-off-by: JaredforReal <w13431838023@gmail.com>
@yiyixuxu PTAL, thanks
Pull request overview
This PR updates the GLM image pipeline input validation to allow supplying `prompt` together with precomputed `prior_token_ids`, enabling users to skip the AR prior generation step while still deriving the glyph `prompt_embeds` from the prompt.
Changes:
- Removed the mutual-exclusion validation between `prompt` and `prior_token_ids` in `GlmImagePipeline.check_inputs()`.
- Relaxed the `prior_token_ids` validation to accept either `prompt` or `prompt_embeds` as the source for glyph embeddings.
- Added tests covering the newly supported `prompt` + `prior_token_ids` combination and validating invalid input combinations.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/diffusers/pipelines/glm_image/pipeline_glm_image.py` | Adjusts `check_inputs()` to permit `prompt` and `prior_token_ids` together and updates validation/error paths. |
| `tests/pipelines/glm_image/test_glm_image.py` | Adds new tests for the supported combo and for rejecting invalid input combinations. |
```python
inputs_prompt_only = {
    "prompt": "A photo of a cat",
    "generator": generator,
    "num_inference_steps": 2,
    "guidance_scale": 1.5,
    "height": height,
    "width": width,
    "max_sequence_length": 16,
    "output_type": "pt",
}
prior_token_ids, _, _ = pipe.generate_prior_tokens(
    prompt="A photo of a cat",
    height=height,
    width=width,
    device=torch.device(device),
    generator=torch.Generator(device=device).manual_seed(0),
```
`inputs_prompt_only` (and its `generator`) is created but never used. This makes the test misleading (it implies a full pipeline run in step 1) and adds dead code; consider removing it, or actually using it to run `pipe(**inputs_prompt_only)` if that is the intended setup.
Suggested change:

```diff
-inputs_prompt_only = {
-    "prompt": "A photo of a cat",
-    "generator": generator,
-    "num_inference_steps": 2,
-    "guidance_scale": 1.5,
-    "height": height,
-    "width": width,
-    "max_sequence_length": 16,
-    "output_type": "pt",
-}
-prior_token_ids, _, _ = pipe.generate_prior_tokens(
-    prompt="A photo of a cat",
-    height=height,
-    width=width,
-    device=torch.device(device),
-    generator=torch.Generator(device=device).manual_seed(0),
+prior_token_ids, _, _ = pipe.generate_prior_tokens(
+    prompt="A photo of a cat",
+    height=height,
+    width=width,
+    device=torch.device(device),
+    generator=generator,
```
```python
if prior_token_ids is not None and prompt_embeds is None and prompt is None:
    raise ValueError("`prompt_embeds` or `prompt` must also be provided with `prior_token_ids`.")
```
The `prior_token_ids` validation at the end of `check_inputs` is effectively unreachable, because an earlier check already raises whenever both `prompt` and `prompt_embeds` are `None`. If the goal is a more specific error when `prior_token_ids` is provided without any prompt inputs, consider folding this condition into the earlier `prompt`/`prompt_embeds` validation (or removing this block to avoid redundant code); a sketch of the folding option appears after the suggestion below.
Suggested change:

```diff
-if prior_token_ids is not None and prompt_embeds is None and prompt is None:
-    raise ValueError("`prompt_embeds` or `prompt` must also be provided with `prior_token_ids`.")
```
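If the more specific error is wanted instead, one possibility is to merge the condition into the earlier check. This is a minimal sketch: the surrounding `check_inputs()` structure and exact messages are assumptions, not the pipeline's actual implementation.

```python
# Sketch: fold the prior_token_ids-specific message into the earlier
# prompt/prompt_embeds validation so the specific branch is actually reachable.
if prompt is None and prompt_embeds is None:
    if prior_token_ids is not None:
        # prior_token_ids were supplied, but there is no source for the glyph
        # embeddings; surface a targeted error instead of the generic one.
        raise ValueError(
            "`prompt` or `prompt_embeds` must also be provided with `prior_token_ids`."
        )
    raise ValueError("Provide either `prompt` or `prompt_embeds`; both cannot be `None`.")
```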
| " only forward one of the two." | ||
| ) | ||
| elif prompt is None and prior_token_ids is None: | ||
| if prompt is None and prior_token_ids is None: |
The error raised when `prompt` is `None` and `prior_token_ids` is `None` can be confusing if a user provided `prompt_embeds`: earlier validation allows `prompt_embeds` in place of `prompt`, but this message implies `prompt` is required. Consider clarifying the message to indicate that `prompt_embeds` alone is not sufficient, because `prompt` is needed to generate `prior_token_ids` when they are not provided.
Suggested change:

```diff
 if prompt is None and prior_token_ids is None:
+    # At this point, `prompt_embeds` is guaranteed to be not None (the earlier check
+    # `elif prompt is None and prompt_embeds is None` would already have raised),
+    # so clarify that `prompt_embeds` alone is not sufficient to derive `prior_token_ids`.
+    if prompt_embeds is not None:
+        raise ValueError(
+            "You have provided `prompt_embeds` without `prior_token_ids`, but a text `prompt` is also "
+            "required so that `prior_token_ids` can be generated. Please provide either a text `prompt` "
+            "so the pipeline can compute `prior_token_ids`, or pass `prior_token_ids` explicitly."
+        )
```
What does this PR do?
Previously, `GlmImagePipeline.check_inputs()` enforced mutual exclusion between `prompt` and `prior_token_ids`, raising a `ValueError` if both were provided. This was unnecessarily restrictive — there is a valid use case where a user wants to supply pre-computed `prior_token_ids` (to skip the expensive AR generation step) while still passing `prompt` so that `prompt_embeds` (glyph embeddings) can be derived from it automatically.

This PR relaxes that constraint:

- Removed the check that raised when both `prompt` and `prior_token_ids` are provided.
- Relaxed the "`prior_token_ids` requires `prompt_embeds`" check — now either `prompt` or `prompt_embeds` satisfies the requirement, since `prompt` will be used to generate `prompt_embeds` via `encode_prompt()` downstream.

Supported input combinations after this change
| Combination | Behavior |
|---|---|
| `prompt` only | AR prior generates `prior_token_ids`; glyph encoder generates `prompt_embeds` |
| `prior_token_ids` + `prompt_embeds` | Both used directly |
| `prompt` + `prior_token_ids` | `prior_token_ids` used directly (AR skipped); `prompt` generates `prompt_embeds` |
| Neither `prompt` nor `prior_token_ids` | `ValueError` |
| `prior_token_ids` alone (no `prompt`/`prompt_embeds`) | `ValueError` |
| `prompt` + `prompt_embeds` | `ValueError` |
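For illustration, a minimal usage sketch of the newly supported combination. The checkpoint name, image size, and device are placeholders, and the top-level `GlmImagePipeline` import and `.images` output attribute are assumptions following diffusers conventions; `generate_prior_tokens()` and the `prior_token_ids` argument come from this PR's diff.

```python
import torch
from diffusers import GlmImagePipeline

# Illustrative setup: "<glm-image-checkpoint>" is a placeholder, not part of this PR.
pipe = GlmImagePipeline.from_pretrained("<glm-image-checkpoint>", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "A photo of a cat"
generator = torch.Generator(device="cuda").manual_seed(0)

# Step 1: precompute the prior tokens once (the expensive AR step).
prior_token_ids, _, _ = pipe.generate_prior_tokens(
    prompt=prompt,
    height=1024,
    width=1024,
    device=torch.device("cuda"),
    generator=generator,
)

# Step 2 (newly allowed by this PR): pass both `prompt` and `prior_token_ids`.
# The AR step is skipped, while `prompt` is still used to derive the glyph
# `prompt_embeds` via `encode_prompt()` downstream.
image = pipe(
    prompt=prompt,
    prior_token_ids=prior_token_ids,
    height=1024,
    width=1024,
    generator=generator,
).images[0]  # assuming the usual diffusers `.images` output
```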
Tests

Added two new tests to `test_glm_image.py`:

- `test_prompt_with_prior_token_ids`: End-to-end test that first generates `prior_token_ids` via `generate_prior_tokens()`, then runs the full pipeline with both `prompt` and `prior_token_ids` provided together.
- `test_check_inputs_rejects_invalid_combinations`: Unit test verifying that the three invalid input combinations listed above still correctly raise `ValueError` (a minimal sketch of this test follows below).

All existing tests continue to pass.
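As a rough illustration of the second test, here is a minimal sketch. The `pipe` fixture, tensor shapes, and the assumption that these invalid combinations surface as `ValueError` through `__call__` are all illustrative; only the three rejected combinations themselves come from this PR.

```python
import pytest
import torch


def test_check_inputs_rejects_invalid_combinations(pipe):
    # `pipe` is assumed to be a small GlmImagePipeline test fixture; the tensor
    # shapes below are illustrative only, not the model's real dimensions.
    prompt_embeds = torch.randn(1, 16, 32)
    prior_token_ids = torch.zeros(1, 16, dtype=torch.long)

    # 1. Neither `prompt` nor `prior_token_ids` is provided.
    with pytest.raises(ValueError):
        pipe(prompt=None, prompt_embeds=None, prior_token_ids=None)

    # 2. `prior_token_ids` alone, with no `prompt` or `prompt_embeds`.
    with pytest.raises(ValueError):
        pipe(prompt=None, prompt_embeds=None, prior_token_ids=prior_token_ids)

    # 3. `prompt` together with `prompt_embeds`.
    with pytest.raises(ValueError):
        pipe(prompt="A photo of a cat", prompt_embeds=prompt_embeds)
```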