Skip to content

chore(deps): bump transformers from 5.6.2 to 5.7.0 in /cmd/runtimes/deepspeed#3479

Merged
google-oss-prow[bot] merged 1 commit into
masterfrom
dependabot/pip/cmd/runtimes/deepspeed/transformers-5.7.0
May 5, 2026
Merged

chore(deps): bump transformers from 5.6.2 to 5.7.0 in /cmd/runtimes/deepspeed#3479
google-oss-prow[bot] merged 1 commit into
masterfrom
dependabot/pip/cmd/runtimes/deepspeed/transformers-5.7.0

Conversation

@dependabot
Copy link
Copy Markdown
Contributor

@dependabot dependabot Bot commented on behalf of github May 5, 2026

Bumps transformers from 5.6.2 to 5.7.0.

Release notes

Sourced from transformers's releases.

Release v5.7.0

New Model additions

Laguna

Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid MoE router with auxiliary-loss-free load balancing that uses element-wise sigmoid of gate logits plus learned per-expert bias for router scoring.

Links: Documentation

DEIMv2

DEIMv2 (DETR with Improved Matching v2) is a real-time object detection model that extends DEIM with DINOv3 features and spans eight model sizes from X to Atto for diverse deployment scenarios. It uses a Spatial Tuning Adapter (STA) for larger variants to convert DINOv3's single-scale output into multi-scale features, while ultra-lightweight models employ pruned HGNetv2 backbones. The unified design achieves superior performance-cost trade-offs, with DEIMv2-X reaching 57.8 AP with only 50.3M parameters and DEIMv2-S being the first sub-10M model to exceed 50 AP on COCO.

Links: Documentation | Paper

Attention

Several attention-related bugs were fixed across multiple models, including a cross-attention cache type error in T5Gemma2 for long inputs, incorrect cached forward behavior in Qwen3.5's gated-delta-net linear attention, and a crash in GraniteMoeHybrid when no Mamba layers are present. Attention function dispatch was also updated to align with the latest model implementations.

Tokenizers

There was a bug in AutoTokenizer that caused the wrong tokenizer class to be initialized. This caused regressions in models like DeepSeek R1.

Generation

Continuous batching generation received several fixes and improvements, including correcting KV deduplication and memory estimation for long sequences (16K+), and removing misleading warnings about num_return_sequences and other unsupported features that were incorrectly firing even when functionality worked correctly. Documentation for per-request sampling parameters was also added.

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 5.6.2 to 5.7.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v5.6.2...v5.7.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 5.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot added dependencies Pull requests that update a dependency file python Pull requests that update Python code labels May 5, 2026
Copilot AI review requested due to automatic review settings May 5, 2026 02:39
@dependabot dependabot Bot added dependencies Pull requests that update a dependency file python Pull requests that update Python code labels May 5, 2026
@dependabot dependabot Bot review requested due to automatic review settings May 5, 2026 02:39
@google-oss-prow google-oss-prow Bot requested review from jinchihe and kuizhiqing May 5, 2026 02:39
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

🎉 Welcome to the Kubeflow Trainer! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards.
  • Our team will review your PR soon! cc @kubeflow/kubeflow-trainer-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

Copy link
Copy Markdown
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@google-oss-prow
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow Bot merged commit 694d954 into master May 5, 2026
39 of 44 checks passed
@google-oss-prow google-oss-prow Bot deleted the dependabot/pip/cmd/runtimes/deepspeed/transformers-5.7.0 branch May 5, 2026 10:43
@google-oss-prow google-oss-prow Bot added this to the v2.3 milestone May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved dependencies Pull requests that update a dependency file lgtm python Pull requests that update Python code size/XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant