add GLORA #2568
Conversation
Thanks for the PR, this looks already quite mature. I especially like that you've gone through the trouble to add tests! I've added some comments on things that I think can still be improved.

I think it would make sense to also add a test case in test_custom_models.py, since this will give good coverage of the existing functionality in PEFT. Feel free to ask if you need assistance in doing so.

As I understand the paper, they use random sampling + evolutionary search to select the best configuration for the different parameters (i.e., how to structure, say, A: LoRA, vector, scalar, none). PEFT aims not to interfere with the training process, so I would suggest not using random sampling of this configuration during init but giving the user the option to set the config once (via GloraConfig, similar to how eval_config works). If the user is keen to try search methods such as evolutionary search, they can still use either GloraModel.set_adapter_config or, this would need to be added, something like glora.Linear.set_adapter_config, so that they can do this for each layer individually (to support the ES case). Maybe there's a better name than set_adapter_config to set it apart from PEFT adapters. WDYT?
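To make the per-layer idea concrete, here is a minimal sketch of a model-level helper that assigns a (possibly different) path configuration to each GLORA layer, e.g. from an evolutionary search loop. The helper name set_layer_eval_config and the reliance on an eval_config attribute are assumptions for illustration, not existing PEFT API.

import torch.nn as nn


def set_layer_eval_config(model: nn.Module, layer_configs: dict[str, str]) -> None:
    # Hypothetical helper: walk the wrapped model and assign a per-layer GLORA
    # path configuration (e.g. "LoRA_4", "vector", "scalar", "none"), as an
    # evolutionary search loop would do between candidate evaluations.
    for name, module in model.named_modules():
        if name in layer_configs:
            if not hasattr(module, "eval_config"):
                raise TypeError(f"Module {name} is not a GLORA layer")
            module.eval_config = layer_configs[name]

An ES loop could then call this helper with a different layer-to-config mapping for each candidate before evaluating it.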
src/peft/tuners/glora/layer.py
Outdated
try:
    rank = int(config.split("_")[1])
except Exception:
    rank = 4
This should simply raise an exception since it indicates a user error. Assuming a default will lead to unexpected behavior.
Suggested change:
- try:
-     rank = int(config.split("_")[1])
- except Exception:
-     rank = 4
+ rank = int(config.split("_")[1])
(leaving this comment for future me to reflect on this one)
IMO rank should be extracted from config["r"]
I may need to go back to the original implementation and check this in more depth
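A sketch of the suggested behavior, assuming the config string follows the "LoRA_<rank>" format from the snippet above; parse_rank is a hypothetical helper name used only for illustration.

def parse_rank(config: str) -> int:
    # Fail loudly on a malformed config instead of silently falling back to rank 4.
    try:
        return int(config.split("_")[1])
    except (IndexError, ValueError) as exc:
        raise ValueError(
            f"Could not parse a rank from GLORA config {config!r}; "
            "expected a value of the form 'LoRA_<rank>'."
        ) from exc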
if self.eval_config is None:
    warnings.warn("eval_config not set for GLora layer, using a random config for merge.")
    path_config = random.choice(self.configs)
else:
    path_config = self.eval_config
I think this should raise an error explaining what to do to fix it instead of simply assuming a random config which might or might not work.
Ideally there's a test in tests/test_glora.py to test this behavior.
I think it would be better if we move eval_config to the GLoraConfig class and set a default value for it there.
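A minimal sketch of what that could look like, assuming GLoraConfig subclasses PeftConfig like other tuner configs; the field names and the "LoRA_4" default are illustrative, not taken from the PR.

from dataclasses import dataclass, field

from peft.config import PeftConfig


@dataclass
class GLoraConfig(PeftConfig):
    # Illustrative sketch: keeping eval_config in the config with a sane default
    # means layers never have to fall back to a random path at merge time.
    r: int = field(default=4, metadata={"help": "GLORA rank"})
    eval_config: str = field(
        default="LoRA_4",
        metadata={"help": "Path configuration used at eval/merge time."},
    )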
def tearDown(self):
    gc.collect()
    torch.cuda.empty_cache()
    gc.collect()
We have a test file explicitly for GPU/accelerator tests - all other tests are assumed to run on CPU, so this is not necessary.
@unittest.skipIf(
    not torch.cuda.is_available()
    or not hasattr(torch.cuda, "is_bf16_supported")
    or not torch.cuda.is_bf16_supported(),
    "BF16 not supported or no CUDA",
)
To also support XPU/Intel:
from accelerate.utils.imports import is_xpu_available
from accelerate.utils.imports import is_bf16_available

[...]

@unittest.skipIf(
    (not torch.cuda.is_available() and not is_xpu_available())
    or not is_bf16_available(),
    "BF16 not supported or no CUDA/XPU",
)
return ["decoded text"] | ||
|
||
|
||
class GLORATester(unittest.TestCase): |
Great that you added tests! We're in the process of migrating existing tests to pytest. It's totally OK to keep these as is but if you want to convert these already to pytest I wouldn't complain :)
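For reference, a generic illustration of the pytest style the suite is moving towards (not the actual GLORA tests): plain classes without unittest.TestCase, fixtures instead of setUp/tearDown, and bare asserts.

import pytest
import torch
from torch import nn


class TestGloraExample:
    # pytest collects this class without unittest.TestCase; fixtures replace
    # setUp/tearDown and plain asserts replace self.assertEqual and friends.
    @pytest.fixture
    def model(self):
        return nn.Linear(8, 8)

    def test_forward_shape(self, model):
        out = model(torch.randn(2, 8))
        assert out.shape == (2, 8)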
except AttributeError:
    return getattr(self.model, name)
In LoRA we catch this case (see #1892); I think this code is susceptible to it as well.
Suggested change:
- except AttributeError:
-     return getattr(self.model, name)
+ except AttributeError:
+     if name == "model":
+         raise
+     return getattr(self.model, name)
if isinstance(target, GLoraLinear):
    if target.eval_config is None:
        raise ValueError(
            f"eval_config not set for GLoraLinear layer {key}. Cannot merge deterministically. Please call model.set_adapter_eval_config(...) before merging."
        )

    target.merge()
    new_module = nn.Linear(target.in_features, target.out_features, bias=(target.bias is not None))
    new_module.weight.data = target.weight.data.clone()  # Get merged weight
    if target.bias is not None:
        new_module.bias.data = target.bias.data.clone()  # Get merged bias

    self._replace_module(parent, target_name, new_module.to(target.weight.device), target)

if isinstance(target, ModulesToSaveWrapper):
    pass
I think it makes sense to implement this similar to how LoRA does it:
if hasattr(target, "unload_and_optionally_merge_module"):
    # if layers have special unloading method, like MultiheadAttention, use that
    unloaded_module = target.unload_and_optionally_merge_module(
        merge=merge, safe_merge=safe_merge, adapter_names=adapter_names
    )
    self._replace_module(parent, target_name, unloaded_module, target)
elif hasattr(target, "base_layer"):
    if merge:
        target.merge(safe_merge=safe_merge, adapter_names=adapter_names)
    self._replace_module(parent, target_name, target.get_base_layer(), target)
Of course this assumes that the target (e.g., glora.Linear) implements unload_and_optionally_merge_module, but I would suggest implementing it anyway since it is a good interface to have.
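A rough sketch of what such an interface on the GLORA layer could look like, assuming the layer keeps the wrapped module in a base_layer attribute and already implements merge(); this mirrors the LoRA pattern quoted above and is not part of the PR yet.

import torch.nn as nn


class Linear(nn.Module):
    # Sketch only: the attribute and method names (base_layer, merge, get_base_layer)
    # follow the common PEFT layer layout and should be checked against the PR.
    def __init__(self, base_layer: nn.Linear):
        super().__init__()
        self.base_layer = base_layer

    def merge(self, safe_merge: bool = False, adapter_names=None) -> None:
        ...  # fold the GLORA deltas into self.base_layer.weight here

    def get_base_layer(self) -> nn.Module:
        return self.base_layer

    def unload_and_optionally_merge_module(
        self, *, merge: bool, safe_merge: bool, adapter_names=None
    ) -> nn.Module:
        # Return the plain nn.Linear, optionally after merging the adapter into it,
        # so the model-level unload loop can simply swap modules.
        if merge:
            self.merge(safe_merge=safe_merge, adapter_names=adapter_names)
        return self.get_base_layer()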
def get_peft_config_as_dict(self, inference: bool = False) -> dict[str, Any]:
    config_dict = {}
    for adapter_name, peft_config_obj in self.peft_config.items():
        config = asdict(peft_config_obj)
        if inference:
            config["inference_mode"] = True
        for k, v in config.items():
            if isinstance(v, Enum):
                config[k] = v.value
        config_dict[adapter_name] = config
    return config_dict
Remove?
I think this should be discussed in another PR, as it's not part of the BaseTuner and is part of each model in the library. IMO we should open a separate PR to move this method to the BaseTuner.
def _find_and_replace(self, adapter_name: str):
    glora_config = self.peft_config[adapter_name]
    is_target_modules_in_base_model = False
    key_list = [key for key, _ in self.model.named_modules()]  # Cache keys

    for key in key_list:
        if not self._check_target_module_exists(glora_config, key):
            continue

        is_target_modules_in_base_model = True
        parent, target, target_name = _get_submodules(self.model, key)

        if isinstance(target, GLoraLinear):
            warnings.warn(
                f"Module {key} is already a GLoraLinear. Skipping replacement for new adapter '{adapter_name}'. Multiple GLORA adapters on the same layer might need explicit support in GLoraLinear."
            )
        elif isinstance(target, nn.Linear):
            new_module = self._create_new_module(glora_config, adapter_name, target)
            self._replace_module(parent, target_name, new_module, target)

    if not is_target_modules_in_base_model:
        raise ValueError(
            f"Target modules {glora_config.target_modules} not found in the base model. "
            f"Please check the target modules and try again."
        )
This functionality is already implemented in BaseTuner.inject_adapter. I think _find_and_replace as well as add_adapter can be removed entirely?
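For reference, the direction this points to is letting BaseTuner.inject_adapter drive the module matching and only implementing the per-module hook on GLoraModel. A hedged sketch follows; the hook signature is copied from other PEFT tuners and is worth double-checking against the current BaseTuner.

import torch.nn as nn


# Sketch of the hook that GLoraModel (as a BaseTuner subclass) would implement;
# the surrounding class definition is omitted.
def _create_and_replace(self, glora_config, adapter_name, target, target_name, parent, current_key):
    if isinstance(target, nn.Linear):
        new_module = self._create_new_module(glora_config, adapter_name, target)
        self._replace_module(parent, target_name, new_module, target)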
thanks a lot for the review!
will ping you once I resolve all the pointers
TRANSFORMERS_MODELS_TO_GLORA_TARGET_MODULES_MAPPING = (
    TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING  # need to check this later
)
I did not test this yet; I was waiting for the workflow to run to see how it goes. This choice so far is based on intuition and nothing more; I will inspect it in more detail in the future.
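If it helps when you get to checking it, a quick sanity check along these lines can confirm that the reused names actually match layers in a given base model. The model name and the {"q_proj", "v_proj"} entry are assumptions taken from the LoRA mapping, used only for illustration.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
target_modules = {"q_proj", "v_proj"}  # assumed LoRA mapping entry for "opt"
present = {name.rsplit(".", 1)[-1] for name, _ in model.named_modules()}
missing = target_modules - present
assert not missing, f"Target modules not found in base model: {missing}"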
Based on main...Arnav0400:peft:main and #1510 as well as #780.