feat(lokr, loha): add 1x1 Conv2d and Conv1d support #2515

Open · grewalsk wants to merge 10 commits into main

Conversation

grewalsk

What does this PR do?
This PR enhances the LoKr and LoHa adapter implementations within PEFT by adding proper support for:

  1. 1x1 Convolutions (nn.Conv2d with kernel_size=(1,1))
  2. nn.Conv1d layers (specifically including kernel_size=1).

This allows LoKr/LoHa adapters to be correctly applied to a wider range of modern architectures that heavily utilize these layer types (e.g., ResNet bottlenecks, MobileNet pointwise convolutions, various Transformer blocks). The implementation aims for optimized handling, inspired by LoRA's 1x1 optimization, while maintaining consistency with existing LyCORIS patterns in PEFT. Parts of the implementation logic, particularly for parameter factorization and layer adaptation, were adapted from the KohakuBlueleaf/LyCORIS library (e.g., lycoris/modules/loha.py), consistent with existing acknowledgements within the PEFT codebase.
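
For context on that 1x1 optimization: a 1x1 Conv2d weight of shape (out_channels, in_channels, 1, 1) behaves like a plain (out_channels, in_channels) matrix, so the low-rank delta can be computed as a matrix product and reshaped back onto the conv weight. A minimal sketch of the idea (not the actual PEFT code; the helper name is illustrative):

import torch

def delta_weight_1x1(w_a: torch.Tensor, w_b: torch.Tensor) -> torch.Tensor:
    # w_a: (out_channels, r), w_b: (r, in_channels) low-rank factors.
    # For a 1x1 conv the delta is just the matrix product, reshaped to
    # (out_channels, in_channels, 1, 1) so it can be added to conv.weight.
    return (w_a @ w_b).unsqueeze(-1).unsqueeze(-1)

conv = torch.nn.Conv2d(64, 128, kernel_size=1)
w_a, w_b = torch.randn(128, 4), torch.randn(4, 64)
with torch.no_grad():
    conv.weight += delta_weight_1x1(w_a, w_b)  # shapes match: (128, 64, 1, 1)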

This includes the following (a short usage sketch follows this list):

  • New Conv1d adapter layer classes for both LoKr and LoHa, mirroring Conv2d.
  • Updated layers_mapping in LoKrModel and LoHaModel to recognize Conv1d.
  • Enhanced create_adapter_parameters methods in LoKr/LoHa to correctly initialize parameters based on Conv1d weight shapes.
  • Refactored update_layer methods in LoKr/LoHa to:
    • Detect Conv1d layers.
    • Implement specific logic for 1x1 Conv2d and kernel_size=1 Conv1d, notably disabling use_effective_conv2d where appropriate for direct matrix handling.
    • Ensure correct shape calculations for factorization.
  • Added detection flags (is_1x1_conv2d, is_1_conv1d) in get_delta_weight methods as hooks for potential future computation optimizations (without altering current paths).
  • Maintained backward compatibility; changes are additive and do not affect existing functionality for other layer types or kernel sizes.
  • Followed established PEFT/LyCORIS coding patterns for consistency.
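
In practice, applying either adapter to such layers should then look like standard PEFT usage. A short usage sketch, assuming the PR is merged (the module and layer names below are made up for illustration; LoHaConfig works the same way):

import torch.nn as nn
from peft import LoKrConfig, get_peft_model

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.pointwise = nn.Conv2d(32, 64, kernel_size=1)  # 1x1 Conv2d
        self.temporal = nn.Conv1d(64, 64, kernel_size=1)   # kernel_size=1 Conv1d
        self.head = nn.Linear(64, 10)

    def forward(self, x):
        x = self.pointwise(x)              # (N, 64, H, W)
        x = self.temporal(x.flatten(2))    # (N, 64, H*W)
        return self.head(x.mean(dim=-1))   # (N, 10)

config = LoKrConfig(r=8, alpha=8, target_modules=["pointwise", "temporal"])
model = get_peft_model(TinyNet(), config)
model.print_trainable_parameters()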

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guidelines?
  • Did you write any new necessary tests?
  • Did you run the tests?
  • Did you run the pre-commit hooks?
  • Did you check the documentation builds correctly?

Fixes #1935

@BenjaminBossan (Member) left a comment

Thanks a lot for this great PR, it is much appreciated.

There are still a few small changes required; please check the comments. Also, to accept Conv1d, it needs to be added here as well:

if isinstance(target_base_layer, torch.nn.Conv2d):
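
A minimal sketch of that extension, assuming the surrounding dispatch keeps its current structure (the Conv1d branch and the exact kwargs are placeholders, not the final code):

if isinstance(target_base_layer, torch.nn.Conv2d):
    new_module = Conv2d(target, adapter_name, **kwargs)
elif isinstance(target_base_layer, torch.nn.Conv1d):
    # New branch so LoKr/LoHa wrap Conv1d base layers instead of raising an error.
    new_module = Conv1d(target, adapter_name, **kwargs)
elif isinstance(target_base_layer, torch.nn.Linear):
    new_module = Linear(target, adapter_name, **kwargs)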

Then, it's important to add unit tests to ensure that 1x1 conv and Conv1d work as expected. For this, let's add the following test cases:

    (
        "Conv1D LoHa",
        "Conv1d",
        LoHaConfig,
        {"target_modules": ["conv1d"]},
    ),
    ("Conv2d 1x1 LoHa", "Conv2d1x1", LoHaConfig, {"target_modules": ["conv2d"]}),

...

    (
        "Conv1D LoKr",
        "Conv1d",
        LoKrConfig,
        {"target_modules": ["conv1d"]},
    ),
    ("Conv2d 1x1 LoKr", "Conv2d1x1", LoKrConfig, {"target_modules": ["conv2d"]}),

here and here.

We also need to define ModelConv2D1x1, e.g. like this:

class ModelConv2D1x1(nn.Module):
    def __init__(self):
        super().__init__()
        # 1x1 kernel; 5 input channels to match the reshape in forward()
        self.conv2d = nn.Conv2d(5, 10, 1)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        # 10 channels * 3 * 3 spatial positions after the 1x1 conv
        self.lin0 = nn.Linear(90, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 5, 3, 3)
        X = self.conv2d(X)
        X = self.relu(X)
        X = self.flat(X)
        X = self.lin0(X)
        X = self.sm(X)
        return X

Make sure to also add this snippet here:

        if model_id == "Conv2d1x1":
            return ModelConv2D1x1().to(torch_dtype)

@BenjaminBossan (Member)

gentle ping @grewalsk

Kabir Grewal added 3 commits May 10, 2025 20:19

  …support to LyCORIS adapters

  - Update lycoris_utils.py to support Conv1d layers
  - Add ModelConv2D1x1 and ModelConv1D test classes
  - Add test cases for Conv1d and Conv2d 1x1 with LoHa and LoKr
  - Update imports in loha/model.py and lokr/model.py
@grewalsk (Author)

Hi @BenjaminBossan, could you take a quick look?

I removed the conflicting ModelConv2DGroups class and cleared the merge markers locally, but it still shows up in the PR diff.

@githubnemo (Collaborator) left a comment

Thanks for the update; I'll take over the review for now :)

I have some questions/remarks, but nothing major. The merge conflict markers are not visible to me, so that seems resolved.

…onsistency

Clarify existing optimal handling for 1x1 Conv2d and Conv1d layers;
confirmed use_effective_conv2d is already correctly disabled in the relevant files,
addressing reviewer feedback on potential dead code.

Standardize test case naming in test_custom_models.py for consistency:
- Updated 'LoHa' to 'LOHA' and 'LoKr' to 'LOKR' (all caps).
- Aligned Conv1D/Conv1d naming with PyTorch conventions for clarity.
@grewalsk (Author)

Does this address your concerns, or should I add something?

@githubnemo (Collaborator) left a comment

Thank you so much for the quick update. I think this looks good to go :)

Let's see what the CI has to say but if there are no further issues, we can proceed merging this.

@githubnemo dismissed BenjaminBossan’s stale review on May 28, 2025 at 16:17

Took over review

@githubnemo (Collaborator)

@grewalsk could you run make style from the top of the repository and push the changes?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@githubnemo (Collaborator) commented May 28, 2025

It seems that we have two ModelConv1D classes in test_custom_models. Could you fix that? :)
I think there was already a class called ModelConv1D with kernel size 2.

Renamed the second ModelConv1D class (kernel_size=1) to ModelConv1DKernel1 and updated the MockTransformerWrapper mapping to avoid class-name conflicts for the 1x1 support.
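
For reference, a hypothetical sketch of what the renamed test model could look like (shapes are illustrative and may differ from the actual PR code):

class ModelConv1DKernel1(nn.Module):
    def __init__(self):
        super().__init__()
        # kernel_size=1 Conv1d, mirroring the ModelConv2D1x1 test model above
        self.conv1d = nn.Conv1d(5, 10, kernel_size=1)
        self.relu = nn.ReLU()
        self.flat = nn.Flatten()
        # 10 channels * 9 positions after the kernel_size=1 conv
        self.lin0 = nn.Linear(90, 2)
        self.sm = nn.LogSoftmax(dim=-1)
        self.dtype = torch.float

    def forward(self, X):
        X = X.to(self.dtype)
        X = X.reshape(-1, 5, 9)      # (batch, channels, length)
        X = self.conv1d(X)           # kernel_size=1 keeps the length: (batch, 10, 9)
        X = self.flat(self.relu(X))  # (batch, 90)
        return self.sm(self.lin0(X))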
@grewalsk (Author)

Is this good? @githubnemo

@githubnemo (Collaborator) left a comment

You checked in a merge conflict; if you could fix that and run make style once again, I think we're good to go.

.gitignore Outdated
Comment on lines 143 to 149
<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes

Suggested change
<<<<<<< Updated upstream
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/
=======
**/.claude/settings.local.json
>>>>>>> Stashed changes
# method_comparison logs
method_comparison/MetaMathQA/cancelled_results/
method_comparison/MetaMathQA/temporary_results/

@grewalsk (Author) commented Jun 3, 2025

I think it's good now. @githubnemo

@githubnemo (Collaborator)

@grewalsk It seems that there are some failing tests regarding LoKr/LoHa. Could you investigate?

You can reproduce those runs locally by running pytest -k "loha or lokr" tests/.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@githubnemo (Collaborator)

not stale

Successfully merging this pull request may close these issues.

[Call for contributions] help us improve LoKr, LoHa, and other LyCORIS