Fix gptq device_map = "cpu" #1662

Merged: fxmarty merged 4 commits into huggingface:main from SunMarc:fix_gptq_cpu on Feb 6, 2024

Conversation

SunMarc (Member) commented Jan 22, 2024

What does this do?

This PR fixes the case where the user passes device_map = "cpu". In that case, we still need to move each block to GPU 0 ourselves, since no hooks are added.
Fixes 28632
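
To illustrate the case described above, here is a minimal sketch of the check, not the code from this PR. It assumes the device map is a dict like the one stored on the model (for example `{"": "cpu"}` when `device_map = "cpu"` is passed), and that hooks are only attached when the map spans more than one device, as confirmed later in this thread. The helper name `needs_manual_gpu_move` is hypothetical.

```python
from typing import Dict, Union

# A device entry may be a GPU index (0, 1, ...) or a string such as "cpu".
Device = Union[int, str]


def needs_manual_gpu_move(device_map: Dict[str, Device]) -> bool:
    """Return True when each block must be moved to the GPU by hand.

    Hypothetical helper, for illustration only.
    """
    # Collect the distinct devices that appear in the map.
    devices = {str(device) for device in device_map.values()}
    # Assumption: when the whole model sits on the CPU alone, no offload
    # hooks exist, so nothing moves a block to the GPU automatically.
    # The caller would then do something like `block.to(0)` in PyTorch
    # before processing the block.
    return devices == {"cpu"}
```

With this sketch, `needs_manual_gpu_move({"": "cpu"})` is `True`, while `needs_manual_gpu_move({"": 0})` and a mixed map such as `{"a": 0, "b": "cpu"}` both give `False`.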

SunMarc requested a review from fxmarty on January 22, 2024 16:48
fxmarty (Contributor) left a comment

> In that case, we still need to move each block to GPU 0 ourselves, since no hooks are added.

Do you mean that accelerate does not add hooks in case we have a single device in device_map?

What if the user does not have a GPU?

Comment thread on tests/gptq/test_quantization.py (outdated)
cache_block_outputs = True
modules_to_quantize_inside_block = None

device_map_for_quantization = {"": 0}
fxmarty (Contributor) commented on this line.

SunMarc (Member Author) replied:

wow, thanks for the advice.

SunMarc (Member Author) commented Jan 31, 2024

> Do you mean that accelerate does not add hooks in case we have a single device in device_map?

Yes, that's right. We initially added them, but removed them since they were confusing for most users.

> What if the user does not have a GPU?

You need a GPU to quantize the model. Without one, it will trigger an error.
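
As a rough illustration of the point above, a pre-flight check along these lines could fail early when no CUDA device is present. This is a sketch, not the check Optimum performs: the function names `cuda_available` and `ensure_gpu_for_quantization` and the error message are hypothetical, and the only library call assumed is PyTorch's `torch.cuda.is_available()`.

```python
import importlib.util


def cuda_available() -> bool:
    """Report whether PyTorch is installed and can see a CUDA device."""
    # Without PyTorch there is no way to reach a GPU from this sketch.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def ensure_gpu_for_quantization() -> None:
    """Raise early if quantization would be started without a GPU.

    Hypothetical helper, for illustration only.
    """
    if not cuda_available():
        raise RuntimeError("GPTQ quantization needs a CUDA GPU, but none was found.")
```

Calling `ensure_gpu_for_quantization()` before loading the model would surface the problem up front rather than partway through quantization.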

fxmarty merged commit 2c81219 into huggingface:main on Feb 6, 2024
young-developer pushed a commit to young-developer/optimum that referenced this pull request May 10, 2024
* fix gptq cpu device_map

* fix test

* remove default dict

Development

Successfully merging this pull request may close these issues.

Can't quantize gptq model on CPU runtime?

2 participants