
Conversation

@mratsim
Contributor

mratsim commented Dec 28, 2025

Draft & support for issue/debugging: do not merge

SUMMARY:
Calibration script for Minimax M2 / Minimax M2.1

TEST PLAN:
Currently cannot quantize, issue TBD

Hello team, I'm trying to quantize the new Minimax M2.1: https://huggingface.co/MiniMaxAI/MiniMax-M2.1. For now I've only added the modeling file, as I'm stuck on an autowrapper issue (issue #2172). Feel free to comment if this looks okay, though.
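
For context, a rough sketch of what the eventual calibration/quantization run could look like once the autowrapper issue is resolved. This is only a sketch of the usual llm-compressor oneshot flow, not this PR's test plan; the quantization scheme, dataset, ignore patterns, and output path are assumptions and may well differ for MiniMax M2.1.

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# Sketch: W4A16 GPTQ quantization with a calibration dataset. The ignore
# list (lm_head plus router/gate modules) is a guess for an MoE model.
recipe = GPTQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=["lm_head", "re:.*gate$"],  # hypothetical pattern for router gates
)

oneshot(
    model="MiniMaxAI/MiniMax-M2.1",
    dataset="open_platypus",            # any calibration dataset works here
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    output_dir="MiniMax-M2.1-W4A16",
)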

@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.

@gemini-code-assist
Contributor

Summary of Changes

Hello @mratsim, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This draft pull request introduces the foundational calibration logic required for quantizing the Minimax M2.1 model, specifically targeting its Mixture-of-Experts (MoE) components. It aims to prepare the codebase for future quantization by adding a dedicated calibration module, although the author notes current challenges with an autowrapper issue preventing full quantization.

Highlights

  • New Calibration Module: Introduces a new CalibrationMiniMaxM2SparseMoeBlock class designed for calibrating Minimax M2 and M2.1 models.
  • All-Expert Calibration Support: The new module enables calibration of all experts within a MiniMaxM2SparseMoeBlock by running each expert on the full input to collect comprehensive statistics (see the sketch after this list).
  • MoE Integration: This calibration module is registered as a MoECalibrationModule, specifically targeting sparse Mixture-of-Experts (MoE) blocks.
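
To make the intent of these highlights concrete, here is a minimal sketch of what an all-experts calibration wrapper of this kind typically looks like. This is not the PR's actual code: the constructor signature, the config field names, and the gate/experts attribute names are assumptions modeled on similar MoE calibration modules.

import torch


class CalibrationMiniMaxM2SparseMoeBlock(torch.nn.Module):
    # Sketch: replaces the original sparse MoE block during calibration so
    # that every expert runs on the full token set (letting observers collect
    # statistics for all experts), while the output is still combined with
    # the router's normalized top-k weights.
    def __init__(self, original: torch.nn.Module, config) -> None:
        super().__init__()
        self.num_experts = config.num_local_experts      # field name assumed
        self.top_k = config.num_experts_per_tok          # field name assumed
        self.gate = original.gate                        # attribute name assumed
        self.experts = original.experts                  # attribute name assumed

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        batch, seq_len, hidden_dim = hidden_states.shape
        hidden_states = hidden_states.view(-1, hidden_dim)

        router_logits = self.gate(hidden_states)
        routing_weights = torch.softmax(router_logits, dim=-1, dtype=torch.float)
        top_k_weights, top_k_index = torch.topk(routing_weights, self.top_k, dim=-1)
        top_k_weights = top_k_weights / top_k_weights.sum(dim=-1, keepdim=True)
        top_k_weights = top_k_weights.to(hidden_states.dtype)

        # (num_experts, top_k, num_tokens) mask of which tokens route where.
        expert_mask = torch.nn.functional.one_hot(
            top_k_index, num_classes=self.num_experts
        ).permute(2, 1, 0)

        final = torch.zeros_like(hidden_states)
        for expert_idx in range(self.num_experts):
            # Calibration twist: run the expert on *all* tokens so its
            # quantization observers see the full input distribution...
            expert_out = self.experts[expert_idx](hidden_states)
            # ...but only the routed tokens contribute to the block output.
            slot_idx, token_idx = torch.where(expert_mask[expert_idx])
            final.index_add_(
                0,
                token_idx,
                expert_out[token_idx] * top_k_weights[token_idx, slot_idx, None],
            )

        return final.view(batch, seq_len, hidden_dim)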

@gemini-code-assist
Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces a calibration module for MiniMaxM2SparseMoeBlock. The implementation looks mostly correct for its purpose of running all experts during calibration. However, I've found a few critical issues that are likely causing it to fail, including undefined type hints in the constructor and a reference to an undefined variable in the forward method. I've also identified a potential issue with tensor manipulation that could cause a crash depending on the model's top_k configuration. My suggestions should help resolve these problems.

top_k_index, num_classes=self.num_experts
).permute(2, 1, 0)

for expert_idx in range(num_experts):

critical

The variable num_experts is not defined within the forward method's scope, which will result in a NameError at runtime. You have self.num_experts available as an instance attribute, which was initialized from the config. You should use self.num_experts here.

Suggested change
for expert_idx in range(num_experts):
for expert_idx in range(self.num_experts):
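
As a quick way to confirm the fixed loop actually exercises every expert during a calibration pass (the helper name, the `experts` attribute, and the `calib_batch` input below are placeholders, not part of the PR):

from collections import Counter

import torch


def count_expert_calls(moe_block: torch.nn.Module, calib_batch: torch.Tensor) -> Counter:
    # Attach a forward hook to each expert and run one forward pass; every
    # expert index should show up if the all-experts calibration path works.
    calls: Counter = Counter()
    handles = [
        expert.register_forward_hook(lambda mod, args, out, i=idx: calls.update([i]))
        for idx, expert in enumerate(moe_block.experts)  # `experts` attribute assumed
    ]
    with torch.no_grad():
        moe_block(calib_batch)
    for handle in handles:
        handle.remove()
    return calls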

Signed-off-by: Mamy Ratsimbazafy <[email protected]>
@mratsim
Contributor Author

mratsim commented Dec 28, 2025

Scratch that; somehow my fork was stuck on November 6, before the new changes to the MoECalibration context, and I had just been hacking happily on top of that.

Actually the only change was @register_moe_calibration("MiniMaxM2SparseMoeBlock") vs @MoECalibrationModule.register("MiniMaxM2SparseMoeBlock")
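
For reference, the change side by side (the decorator strings are as quoted above; the class line and base class are illustrative assumptions):

-@register_moe_calibration("MiniMaxM2SparseMoeBlock")
+@MoECalibrationModule.register("MiniMaxM2SparseMoeBlock")
 class CalibrationMiniMaxM2SparseMoeBlock(MoECalibrationModule):
     ...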

mratsim force-pushed the minimax-m2 branch 3 times, most recently from 012187e to f60daac on December 28, 2025 at 02:00
…ggingface/accelerate/utils/modeling.py find_tied_parameters

Signed-off-by: Mamy Ratsimbazafy <[email protected]>