Skip to content

pipeline.from_pretrained is broken when the pipeline is partially downloaded #3383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
haowang1013 opened this issue May 10, 2023 · 8 comments
Labels
bug Something isn't working stale Issues that haven't received updates

Comments

@haowang1013
Copy link

haowang1013 commented May 10, 2023

Describe the bug

Hi,

There's a bug in pipeline_utils.py which causes pipeline.from_pretrained to fail if the pipeline was partially downloaded. Specifically the code doesn't handle missing components in the feature_extractor, safety_checker, scheduler and tokenizer folders, at least on Windows platform.

The cause of the bug is this line

allow_patterns += [os.path.join(k, "*") for k in folder_names if k not in model_folder_names]

This turns some folder names into a regexp pattern but on Windows the path joining is done via {parent}\\{child}, which gives a pattern like this

[
'text_encoder/model.safetensors', 
'vae/diffusion_pytorch_model.bin', 
'vae/diffusion_pytorch_model.safetensors', 
'text_encoder/pytorch_model.bin', 
'unet/diffusion_pytorch_model.safetensors', 
'unet/diffusion_pytorch_model.bin', 
'feature_extractor\\*', 
'safety_checker\\*', 
'scheduler\\*', 
'tokenizer\\*'
]

The \\* pattern doesn't play nice with the regexp matching later and causes some files to be incorrectly excluded from the "consider list", after the expected_files = [f for f in expected_files if any(p.match(f) for p in re_allow_pattern)] call, expected_files is only

[
'unet/diffusion_pytorch_model.bin', 
'vae/diffusion_pytorch_model.bin', 
'text_encoder/pytorch_model.bin', 
'model_index.json'
]

And this means if some files are missing in those folders mentioned above, diffusers will not even try to download them and causes loading errors down the road.

To reproduce:

  • Use Windows
  • Call StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base") and wait until the pipeline is loaded
  • Delete the feature_extractor folder from the pipeline cache folder C:\Users\<YOU>\.cache\huggingface\hub\models--stabilityai--stable-diffusion-2-1-base\snapshots\<HASH>
  • Call StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base") again and observe the error

Fix:
I've tried changing the line above to below and it seems to fix the bug for me. This should be safe for non-Windows platforms as well as that's how path joining works for them in the first place.

allow_patterns += [f"{k}/*" for k in folder_names if k not in model_folder_names]

Reproduction

See above

Logs

No response

System Info

diffusers 0.16.1

@haowang1013 haowang1013 added the bug Something isn't working label May 10, 2023
@patrickvonplaten
Copy link
Contributor

Hey @haowang1013,

Thanks for the issue! I see, to my eyes replacing:

os.path.join(k, "*") 

with:

f"{k}/*"

seems to be the right move. @Wauplin is this fine in your experience?

@Wauplin
Copy link
Collaborator

Wauplin commented May 12, 2023

Yes that's exactly how to do it!

os.path.join is useful when filtering paths on the disk (to support both Windows and Unix) so for example in upload_folder. On the contrary, in snapshot_download you are filtering paths of files on the Hub which are always in Unix format (`f"whatever/*").

@patrickvonplaten
Copy link
Contributor

Great thanks! @haowang1013 would you like to open a PR to fix it? :-)

@vimarshc
Copy link
Contributor

I'd love to take this up!

@vimarshc
Copy link
Contributor

vimarshc commented May 16, 2023

Hey @Wauplin, curious about how you think I should add a test specific to Windows. Unable to find any entry in .yml (assuming GH action is encoded here) files that run on Windows.

@Wauplin
Copy link
Collaborator

Wauplin commented May 16, 2023

@patrickvonplaten is there some tests running on Windows in the CI?

FYI @vimarshc in huggingface_hub (on which I'm working the most) we have a CI dedicated to Windows (see config). Python-wise, if you want to check if the test is running on a Windows machine, you can do if os.name == "nt: .... This allow you to test specific output depending on the machine. But I don't think you'll need this as the tests should run on both Unix and Windows in your case.

@patrickvonplaten
Copy link
Contributor

Yeah we should probably test on Windows very soon cc @sayakpaul

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Jun 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stale Issues that haven't received updates
Projects
None yet
Development

No branches or pull requests

4 participants