Changelog

All notable changes to this Helm chart will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

0.36.0 - 2026-05-14

Changed

  • Updated default container tags to May 2026 release (release-260514). Refer to the main Deepgram changelog for additional details.

0.35.1 - 2026-05-01

Changed

  • Removed NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=compute,utility from the Engine deployment template and from all Docker and Podman compose files. The release-260430 Engine image bakes these env vars into the image, so deployment-side injection is unnecessary and may interact unexpectedly with CUDA_VISIBLE_DEVICES. Existing customers who manually added these env vars to their values files or compose overrides can safely remove them.

0.35.0 - 2026-04-30

Added

  • Set NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=compute,utility on Engine pods when GPUs are requested. The same env vars were added to all Docker and Podman compose files. (Reverted in 0.35.1 — see notes there.)
  • Aura-2 Speed and Pronunciation Controls require an updated voice-pack model file. Engine release-260430 adds support for the speed parameter and pronunciation controls on Aura-2 English voices. These features are gated on an updated voice-pack (2025-04-15.4, UUID 0ec06c9b-0aa0-44d0-a001-3ec57d32229e). The chart's sample values files (04-aura-2-setup, 05-voice-agent-aws, 06-aura-2-polyglot-setup) have referenced this UUID since chart 0.34.0, but customers running an older voice-pack on disk will receive 400 Bad Request on speed=* requests. Refresh the voice-pack model file in your models directory to enable these features — contact your Deepgram representative for the download link.
  • Added engine.agentOverrides to support per-engine-type resource overrides when agent.enabled: true. Defaults to gpu: 2 for the agent-text-to-speech engine, which is required for Aura-2 TTS. Other engine types continue to use the global engine.resources values.
    • Breaking change for existing Voice Agent deployments: Upgrading will cause the agent-text-to-speech pod to request 2 GPUs by default. If you are not using Aura-2 and want to keep 1 GPU, add the following to your values:
      engine:
        agentOverrides:
          agent-text-to-speech:
            resources:
              requests:
                gpu: 1
              limits:
                gpu: 1

Changed

  • Updated default container tags to April 2026 release (release-260430). Refer to the main Deepgram changelog for additional details.

Fixed

  • Modified t2c & c2a Aura-2 TTS UUIDs throughout the repo to be consistent with the latest models.
  • Fixed Voice Agent engine deployments where all three engine types (agent-speech-to-text, agent-text-to-speech, agent-end-of-turn) were stuck in ContainerCreating due to missing per-type ConfigMaps. Each engine type now gets its own ConfigMap.
  • Fixed RBAC role for Voice Agent engine pods to grant access to the per-type ConfigMap names created when agent.enabled: true.

0.34.0 - 2026-04-16

Added

  • Added Flux STT configuration variants for English (engine.flux-en.toml) and Multilingual (engine.flux-multi.toml), plus a shared api.flux.toml
    • engine.flux-multi.toml sets model_name = "flux-general-multi" explicitly to load Multilingual Flux instead of English Flux.
  • Added engine.flux.model_name value to select which Flux model to load. Defaults to flux-general-en; set to flux-general-multi for the Multilingual model.
  • Added engine.runtimeClassName value to configure a Kubernetes RuntimeClass on Engine pods
  • Added a 15-minute Time-to-Live (TTL) to the model download pod, configurable via ttlSecondsAfterFinished
    • This helps fix an issue where a lock is held on the PersistentVolume resource when attempting to delete the Deepgram Kubernetes namespace
  • Added engine.resources.gpuResourceName to make the GPU resource name configurable for MIG support
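
A values override combining several of the settings above might look like the following sketch. The key paths are taken from the entries in this release; the values `nvidia` and `nvidia.com/mig-1g.10gb` are illustrative assumptions, so substitute the RuntimeClass and MIG resource name that actually exist in your cluster.

```yaml
engine:
  flux:
    # Load Multilingual Flux instead of the default flux-general-en.
    model_name: flux-general-multi
  # Assumed RuntimeClass name; it must match a RuntimeClass defined in the cluster.
  runtimeClassName: nvidia
  resources:
    # Assumed MIG resource name as advertised by the GPU device plugin.
    gpuResourceName: nvidia.com/mig-1g.10gb
```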

Changed

  • Updated default container tags to April 2026 release (release-260416). Refer to the main Deepgram changelog for additional details.

Fixed

  • Standardized sample values files to fix drift from the reference configuration (missing resource limits, null features: overrides, inconsistent global section, incorrect model list format)

0.33.0 - 2026-04-02

Changed

  • Updated default container tags to April 2026 release (release-260402). Refer to the main Deepgram changelog for additional details.

0.32.0 - 2026-03-19

Added

  • Added billing.server.certificatesPort to expose the /v1/certificates endpoint on the Billing container
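
As a minimal sketch, the new value could be set as shown below. The port number is a hypothetical placeholder, not a documented default.

```yaml
billing:
  server:
    # Hypothetical port; choose one that is free in your deployment.
    certificatesPort: 8443
```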

Changed

  • Updated default container tags to March 2026 release (release-260319). Refer to the main Deepgram changelog for additional details.

0.31.0 - 2026-03-05

Changed

  • Updated default container tags to March 2026 release (release-260305). Refer to the main Deepgram changelog for additional details.
  • Removed the [half_precision] section from engine.toml files
    • Deepgram sets this automatically and does not require customers to manually configure it

0.30.0 - 2026-02-12

Added

  • Added extraEnv support to API, Engine, License Proxy, and Billing containers for passing custom environment variables via values.yaml.
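
A sketch of how a custom variable might be passed to the Engine container follows. It assumes that `extraEnv` sits directly under the component key and accepts Kubernetes-style `name`/`value` entries; check the chart's values.yaml to confirm both. The variable itself is hypothetical.

```yaml
engine:
  extraEnv:
    # Hypothetical variable, shown only to illustrate the entry shape.
    - name: EXAMPLE_SETTING
      value: "example"
```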

Changed

  • Updated default container tags to February 2026 release (release-260212). Refer to the main Deepgram changelog for additional details.

0.29.1 - 2026-02-02

Changed

  • Updated the default value for flux.max_streams to be a placeholder value of 0. This parameter must be set for production workloads based on the GPU type used in the deployment.

0.29.0 - 2026-01-29

Changed

  • Updated default container tags to January 2026 release (release-260129). Refer to the main Deepgram changelog for additional details.

0.28.0 - 2026-01-27

Added

  • Added Flux max_streams setting to Engine configuration

0.27.1 - 2026-01-21

Added

  • Added Billing container support for airgapped deployments

0.27.0 - 2026-01-15

Added

  • Added engine.health.gpuRequired to indicate whether GPU availability is required to be considered healthy. Defaults to false.
  • Added support for Aura-2 Polyglot TTS (Dutch, German, French, Italian, Japanese)
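
To make GPU availability part of the Engine health check, the new flag can be switched on in a values override, for example:

```yaml
engine:
  health:
    # Defaults to false; true means GPU availability is required for the Engine to be considered healthy.
    gpuRequired: true
```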

Changed

  • Updated default container tags to January 2026 release (release-260115). Refer to the main Deepgram changelog for additional details. The changelog describes the details of a TTS breaking change for which we recommend using a blue-green deployment strategy.

0.26.0 - 2025-12-29

Changed

  • Updated default container tags to December 2025 release (release-251229). Refer to the main Deepgram changelog for additional details.

0.25.0 - 2025-12-10

Added

  • Added an optional engine.lifecycle.postStart.command hook to run custom commands after Engine container startup.
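
A minimal sketch of the hook is shown below. It assumes the value takes a list of command arguments in the same form as a Kubernetes `exec` lifecycle handler; the command itself is a placeholder.

```yaml
engine:
  lifecycle:
    postStart:
      # Assumed format: argv-style list, as in a Kubernetes exec handler.
      command:
        - /bin/sh
        - -c
        - echo "engine started"
```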

Changed

  • Updated default container tags to December 2025 release (release-251210). Refer to the main Deepgram changelog for additional details.

0.24.0 - 2025-11-18

Added

  • Added use_v2_language_detection feature flag to support 36-language detection (disabled by default).
  • Added optional EFS StorageClass creation with engine.modelManager.volumes.aws.efs.storageClass.create flag to support restricted environments where StorageClass creation is not permitted.
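
In a restricted environment where the chart is not permitted to create a StorageClass, the flag named above would be turned off, roughly as follows. This sketch assumes a cluster administrator provides a suitable EFS StorageClass separately.

```yaml
engine:
  modelManager:
    volumes:
      aws:
        efs:
          storageClass:
            # Skip StorageClass creation by the chart.
            create: false
```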

Changed

  • Updated default container tags to November 2025 release (release-251118). Refer to the main Deepgram changelog for additional details.
  • Updated /v1/status endpoint to raise four statuses: Initializing, Ready, Healthy, and Critical. See status endpoint documentation for details.

0.23.1 - 2025-11-04

Fixed

  • Quoted Voice Agent LLM model names to fix periods breaking the TOML parser

0.23.0 - 2025-10-29

Changed

  • Updated default container tags to October 2025 release (release-251029). Refer to the main Deepgram changelog for additional details.
  • Updated sample cluster configuration files to use Kubernetes 1.33 (previously 1.30)
  • Updated Helm chart dependencies: cluster-autoscaler 9.52.1 (previously 9.46.3), prometheus-adapter 4.14.2 (previously 4.13.0)

Fixed

  • Fixed API templates to use correct additionalLabels reference
  • Fixed hard-coded labels and selectors

0.22.0 - 2025-10-15

Added

  • Added Google as a third-party provider for the Voice Agent Helm chart
  • Added topologySpreadConstraints, which allows even distribution of pods from the same deployment across availability zones, among other criteria
  • Added redactUsage under api features which enables redaction of usage metadata

Changed

  • Updated default container tags to October 2025 release (release-251015). Refer to the main Deepgram changelog for additional details.
  • Set entity_redaction to true by default, so redaction is automatically enabled if a valid NER model is available

0.21.0 - 2025-09-29

Changed

  • Updated default container tags to September 2025 release (release-250929). Refer to the main Deepgram changelog for additional details.

0.20.0 - 2025-09-17

Added

  • Exposed the ability to add custom TOML sections in api.toml and engine.toml via customToml
  • Added nodeSelector support for all components (API, Engine, License Proxy) to allow scheduling pods on specific nodes.
  • Added configurable service types for API, Engine, and License Proxy services with ClusterIP as the default
  • Added support for service annotations when using LoadBalancer service type
  • Added loadBalancerSourceRanges configuration for LoadBalancer services to restrict access to specific IP CIDR ranges
  • Added externalTrafficPolicy configuration for LoadBalancer services to control traffic routing behavior
  • Updated sample configurations to demonstrate service configuration options including LoadBalancer security settings
  • Added container-level security context support to Helm templates
  • Added support for removing resource limits on Engine pods
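
The service options in this release could be combined along the lines of the sketch below. The `api.service` parent path is an assumption made for illustration, as are the annotation and CIDR range; consult the chart's values.yaml and sample configurations for the actual layout.

```yaml
api:
  # Assumed location of the service settings; verify against values.yaml.
  service:
    type: LoadBalancer
    annotations:
      # Example AWS annotation; annotation keys are cloud-provider specific.
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
    loadBalancerSourceRanges:
      # Documentation-only CIDR range; replace with your allowed networks.
      - 203.0.113.0/24
    externalTrafficPolicy: Local
```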

Changed

  • Changed default service type from NodePort to ClusterIP for all services (API external, Engine metrics, License Proxy status)
  • Updated service templates to support configurable service types and annotations

0.19.0 - 2025-09-12

Added

  • Changed the defaults of .Values.api.features.formatEntityTags and .Values.engine.features.streamingNer to true, so that NER formatting is enabled by default. This formatting is required with Nova-3 models. See our self-hosted NER guide for further details.
  • Updated default container tags to September 2025 release (release-250912). Refer to the main Deepgram changelog for additional details.
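
Deployments that do not serve Nova-3 models and want to opt out of the new defaults could override both flags, as in this sketch:

```yaml
api:
  features:
    # Now defaults to true; false opts out of NER entity tag formatting.
    formatEntityTags: false
engine:
  features:
    # Now defaults to true; false opts out of streaming NER.
    streamingNer: false
```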

0.18.1 - 2025-09-03

Added

  • Defined allowNonpublicEndpoints Voice Agent flag for use with custom LLM endpoints

Fixed

  • Fixed HPA replica conflicts in API and Engine deployments by conditionally removing hardcoded replicas when autoscaling is enabled

0.18.0 - 2025-08-28

Added

  • Added built-in support for Voice Agent.
  • Updated default container tags to August 2025 release (release-250828). Refer to the main Deepgram changelog for additional details.

Fixed

  • Fixed securityContext template references for API and Engine deployments
  • Fixed securityContext documentation comments

0.17.0 - 2025-08-14

Added

  • Updated default container tags to August 2025 release (release-250814). Refer to the main Deepgram changelog for additional details.

0.16.0 - 2025-07-31

Added

  • Updated default container tags to July 2025 release (release-250731). Refer to the main Deepgram changelog for additional details.

0.15.0 - 2025-07-10

Added

  • Apply additional annotations to the template section of Deployment resources.
  • Updated default container tags to July 2025 release (release-250710). Refer to the main Deepgram changelog for additional details.

0.14.0 - 2025-06-26

Added

  • Updated default container tags to June 2025 release (release-250626). Refer to the main Deepgram changelog for additional details.

0.13.0 - 2025-06-10

Added

  • Updated default container tags to June 2025 release. Refer to the main Deepgram changelog for additional details.

0.12.0 - 2025-03-31

Added

  • Updated default container tags to March 2025 release. Refer to the main Deepgram changelog for additional details.

0.11.1 - 2025-03-28

Added

0.11.0 - 2025-03-07

Added

  • Updated default container tags to March 2025 release. Refer to the main Deepgram changelog for additional details.

0.10.0 - 2025-01-30

Added

  • Updated default container tags to January 2025 release. Refer to the main Deepgram changelog for additional details.

0.9.0 - 2024-12-26

Added

  • Updated default container tags to December 2024 release. Refer to the main Deepgram changelog for additional details.

0.8.1 - 2024-12-17

Changed

  • Fixed default ratio metrics in Prometheus Adapter chart values to use 0.0 to 1.0 scale to match autoscaling documentation

0.8.0 - 2024-11-21

Added

  • Updated default container tags to November 2024 release. Refer to the main Deepgram changelog for additional details.

Fixed

  • Added the Engine Deployment tolerations to the Engine's model download Job.

0.7.0 - 2024-10-24

Added

  • Updated default container tags to October 2024 release. Refer to the main Deepgram changelog for additional details.

Changed

  • AWS samples updated to take advantage of new EKS accelerated AMIs, which bundle the required NVIDIA driver and toolkit instead of being installed by the NVIDIA GPU operator

0.6.0 - 2024-09-27

Added

  • Updated default container tags to September 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
    • Adds broader support in Engine container for model auto-loading during runtime.
      • Filesystems that don't support inotify, such as nfs/csi PersistentVolumes in Kubernetes, can now load and unload models during runtime without requiring a Pod restart.
  • Automatic model management on AWS now supports model removal. See the engine.modelManager.models.remove section in the values.yaml file for details.
  • Container orchestrator environment variable added to improve support.

Changed

  • Automatic model downloads on AWS are moved from engine.modelManager.models.links to engine.modelManager.models.add. The old links field is still supported, but migration is recommended.
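
A migrated values file might look roughly like the sketch below. It assumes `add` takes a list of download links and `remove` takes a list of model file names; both entries are placeholders, and the values.yaml comments remain the authority on the exact format.

```yaml
engine:
  modelManager:
    models:
      add:
        # Placeholder URL; use the model links provided by Deepgram.
        - https://example.com/path/to/model.dg
      remove:
        # Placeholder name of a model file to delete from the volume.
        - old-model.dg
```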

Fixed

  • Updated sample files to fix an issue with the sample command that creates the Kubernetes Secret storing Quay credentials
    • Previous command used --from-file with the user's Docker configuration file. Some local secret managers, like Apple Keychain, scrub this file for sensitive information, which would result in an empty secret being created.

0.5.0 - 2024-08-27

Added

  • Updated default container tags to August 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
    • GA support for entity detection for pre-recorded English audio
    • GA support for improved redaction for pre-recorded English audio

Fixed

  • Fixed a misleading comment in the 03-basic-setup-onprem.yaml sample file that wrongly suggested engine.modelManager.volumes.customVolumeClaim.name should be a PersistentVolume instead of a PersistentVolumeClaim

Changed

  • Deepgram's core products are available to host both on-premises and in the cloud. Official resources have been updated to refer to a "self-hosted" product offering, instead of an "onprem" product offering, to align the product name with industry naming standards. The Deepgram Quay image repository names have been updated to reflect this.

0.4.0 - 2024-07-25

Added

  • Introduced entity detection feature flag for API containers (false by default).
  • Updated default container tags to July 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
    • Support for Deepgram's new English/Spanish multilingual code-switching model
    • Beta support for entity detection for pre-recorded English audio
    • Beta support for improved redaction for pre-recorded English audio
    • Beta support for improved entity formatting for streaming English audio

Removed

  • Removed some items nested under api.features and engine.features sections in favor of opinionated defaults.

0.3.0 - 2024-07-18

Added

  • Allow specifying custom annotations for deployments.

0.2.3 - 2024-07-15

Added

  • Sample values.yaml file for on-premises/self-managed Kubernetes clusters.

Fixed

  • Resolves a mismatch between PVC and SC prefix naming convention.
  • Resolves error when specifying custom service account names.

Changed

  • Make imagePullSecrets optional.

0.2.2-beta - 2024-06-27

Added

  • Adds more verbose logging for audio content length.
  • Keeps our software up-to-date.
  • See the changelog associated with this routine monthly release.

0.2.1-beta - 2024-06-24

Added

  • Restart Deepgram containers automatically when underlying ConfigMaps have been modified.

0.2.0-beta - 2024-06-20

Added

  • Support for managing node autoscaling with cluster-autoscaler.
  • Support for pod autoscaling of Deepgram components.
  • Support for keeping the upstream Deepgram License server as a backup even when the License Proxy is deployed. See licenseProxy.keepUpstreamServerAsBackup for details.

Changed

  • Initial installation replica count values moved from scaling.static.{api,engine}.replicas to scaling.replicas.{api,engine}.
  • License Proxy is no longer manually scaled. Instead, scaling can be indirectly controlled via licenseProxy.{enabled,deploySecondReplica}.
  • Renamed the labels for Deepgram dedicated nodes in the sample cluster-config.yaml for AWS and in the nodeAffinity sections of the sample values.yaml files. The key has been renamed from deepgram/nodeType to k8s.deepgram.com/node-type, and the values are no longer prepended with deepgram.
  • AWS EFS model download job hook delete policy changed to before-hook-creation.
  • Concurrency limit moved from API (api.concurrencyLimit.activeRequests) to Engine level (engine.concurrencyLimit.activeRequests).
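
After these moves, a values file might be laid out as in the following sketch. The key paths come from the entries above; every value shown is an illustrative assumption, not a recommended setting.

```yaml
scaling:
  replicas:
    # Previously scaling.static.api.replicas and scaling.static.engine.replicas.
    api: 1
    engine: 1
licenseProxy:
  enabled: true
  # Scaling is controlled indirectly through this flag.
  deploySecondReplica: false
  keepUpstreamServerAsBackup: true
engine:
  concurrencyLimit:
    # Previously api.concurrencyLimit.activeRequests; hypothetical value.
    activeRequests: 10
```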

0.1.1-alpha - 2024-06-03

Added

  • Various documentation improvements

0.1.0-alpha - 2024-05-31

Added

  • Initial implementation of the Helm chart.