feat(lokr, loha): add 1x1 Conv2d and Conv1d support #2515
base: main
Conversation
Thanks a lot for this great PR, it is much appreciated.
A few small changes are still required, please check the comments. Also, to accept Conv1d, it needs to be added here:

peft/src/peft/tuners/lycoris_utils.py, line 263 in d3ff133:

    if isinstance(target_base_layer, torch.nn.Conv2d):
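(Editor's note: a minimal sketch of what that extension might look like, assuming the surrounding dispatch logic stays unchanged; the branch body is elided and this is not the PR's final code:)

    # widen the isinstance check so Conv1d base layers are matched too
    if isinstance(target_base_layer, (torch.nn.Conv2d, torch.nn.Conv1d)):
        ...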
Then, it's important to add unit tests to ensure that 1x1 Conv2d and Conv1d work as expected. For this, let's add the following test cases:
    (
        "Conv1D LoHa",
        "Conv1d",
        LoHaConfig,
        {"target_modules": ["conv1d"]},
    ),
    ("Conv2d 1x1 LoHa", "Conv2d1x1", LoHaConfig, {"target_modules": ["conv2d"]}),
    ...
    (
        "Conv1D LoKr",
        "Conv1d",
        LoKrConfig,
        {"target_modules": ["conv1d"]},
    ),
    ("Conv2d 1x1 LoKr", "Conv2d1x1", LoKrConfig, {"target_modules": ["conv2d"]}),
We also need to define the ModelConv2D1x1 model, e.g. like this:
class ModelConv2D1x1(nn.Module):
    def __init__(self):
        super().__init__()
        # 1x1 (pointwise) convolution: 5 input channels, 10 output channels, kernel_size=1
        self.conv2d = nn.Conv2d(5, 10, 1)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        # 10 channels * 3 * 3 spatial positions = 90 features after flattening
        self.lin0 = nn.Linear(90, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 5, 3, 3)
        X = self.conv2d(X)
        X = self.relu(X)
        X = self.flat(X)
        X = self.lin0(X)
        X = self.sm(X)
        return X
Make sure to add this snippet here:
if model_id == "Conv2d1x1":
    return ModelConv2D1x1().to(torch_dtype)
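(Editor's note: the "Conv1d" model id used in the test cases would need an analogous test model; a minimal sketch mirroring the class above — the shapes and the ModelConv1D name are assumptions, not prescribed by the reviewer:)

class ModelConv1D(nn.Module):
    def __init__(self):
        super().__init__()
        # kernel_size=1 Conv1d: 1 input channel, 2 output channels
        self.conv1d = nn.Conv1d(1, 2, 1)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        # 2 channels * 10 positions = 20 features after flattening
        self.lin0 = nn.Linear(20, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 1, 10)  # (batch, channels, length)
        X = self.conv1d(X)
        X = self.relu(X)
        X = self.flat(X)
        X = self.lin0(X)
        X = self.sm(X)
        return X

if model_id == "Conv1d":
    return ModelConv1D().to(torch_dtype)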
gentle ping @grewalsk
Add 1x1 Conv2d and Conv1d support to LyCORIS adapters:
- Update lycoris_utils.py to support Conv1d layers
- Add ModelConv2D1x1 and ModelConv1D test classes
- Add test cases for Conv1d and Conv2d 1x1 with LoHa and LoKr
- Update imports in loha/model.py and lokr/model.py
Hi @BenjaminBossan, could you take a quick look? I removed the conflicting ModelConv2DGroups class and cleared the merge conflict markers locally, but they still show up in the PR diff.
Thanks for the update, I'll take over the review for now :)
I have some questions/remarks, but nothing major. The merge conflict markers are not visible to me, so that seems resolved.
…onsistency

Clarify existing optimal handling for 1x1 Conv2d and Conv1d layers; confirmed that use_effective_conv2d is already correctly disabled in the relevant files, addressing reviewer feedback on potential dead code. Standardize test case naming for consistency:
- Updated 'LoHa' to 'LOHA' and 'LoKr' to 'LOKR' (all caps).
- Aligned Conv1D/Conv1d naming with PyTorch conventions for clarity.
Does this address your concerns, or should I add something?
Thank you so much for the quick update. I think this looks good to go :)
Let's see what the CI has to say, but if there are no further issues, we can proceed with merging this.
@grewalsk could you run make style?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
It seems that we have two ModelConv1D classes.
Renamed the second ModelConv1D class (the kernel_size=1 variant for the 1x1 support) to ModelConv1DKernel1 and updated the MockTransformerWrapper mapping to avoid class name conflicts.
Is this good? @githubnemo
You checked in a merge conflict; if you could fix that and run make style once again, I think we're good to go.
.gitignore
Outdated
<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes
Suggested change (drop the conflict markers above and keep the resolved content):

# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
I think it's good now. @githubnemo
@grewalsk It seems that there are some failing tests regarding LoKr/LoHa. Could you investigate? You can reproduce those runs locally by running …
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
not stale
What does this PR do?
This PR enhances the LoKr and LoHa adapter implementations within PEFT by adding proper support for:

- nn.Conv2d with kernel_size=(1,1) (pointwise convolutions)
- nn.Conv1d layers (specifically including kernel_size=1)

This allows LoKr/LoHa adapters to be correctly applied to a wider range of modern architectures that heavily utilize these layer types (e.g., ResNet bottlenecks, MobileNet pointwise convolutions, various Transformer blocks). The implementation aims for optimized handling, inspired by LoRA's 1x1 optimization, while maintaining consistency with existing LyCORIS patterns in PEFT. Parts of the implementation logic, particularly for parameter factorization and layer adaptation, were adapted from the KohakuBlueleaf/LyCORIS library (e.g., lycoris/modules/loha.py), consistent with existing acknowledgements within the PEFT codebase.

This includes:

- New Conv1d adapter layer classes for both LoKr and LoHa, mirroring Conv2d.
- Updated layers_mapping in LoKrModel and LoHaModel to recognize Conv1d.
- Updated create_adapter_parameters methods in LoKr/LoHa to correctly initialize parameters based on Conv1d weight shapes.
- Updated update_layer methods in LoKr/LoHa to handle Conv1d layers, as well as 1x1 Conv2d and kernel_size=1 Conv1d, notably disabling use_effective_conv2d where appropriate for direct matrix handling (see the sketch at the end of this description).
- New flags (is_1x1_conv2d, is_1_conv1d) in get_delta_weight methods as hooks for potential future computation optimizations (without altering current paths).

Before submitting
Fixes #1935
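(Editor's note: to illustrate the "direct matrix handling" mentioned in the description — this is a sketch of the general idea, not the PR's actual code. A Conv2d with kernel_size=(1,1) acts like a Linear layer applied at every spatial position, so its weight can be flattened to a 2-D matrix, updated with an ordinary matrix product, and reshaped back:)

    import torch

    out_c, in_c, rank = 10, 5, 4
    weight = torch.randn(out_c, in_c, 1, 1)                     # 1x1 Conv2d weight
    w_mat = weight.flatten(1)                                   # -> (out_c, in_c), spatial dims folded away
    delta = torch.randn(out_c, rank) @ torch.randn(rank, in_c)  # low-rank-style update, as in LoRA's 1x1 path
    new_weight = (w_mat + delta).reshape(out_c, in_c, 1, 1)     # back to the conv weight shape

The same reasoning applies to kernel_size=1 Conv1d, whose (out_channels, in_channels, 1) weight flattens to a matrix in the same way.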