All notable changes to this Helm chart will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.36.0 - 2026-05-14
- Updated default container tags to May 2026 release (release-260514). Refer to the main Deepgram changelog for additional details.
0.35.1 - 2026-05-01
- Removed NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=compute,utility from the Engine deployment template and from all Docker and Podman compose files. The release-260430 Engine image bakes these env vars into the image, so deployment-side injection is unnecessary and may interact unexpectedly with CUDA_VISIBLE_DEVICES. Existing customers who manually added these env vars to their values files or compose overrides can safely remove them.
0.35.0 - 2026-04-30
- Set NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=compute,utility on Engine pods when GPUs are requested. The same env vars were added to all Docker and Podman compose files. (Reverted in 0.35.1; see notes there.)
- Aura-2 Speed and Pronunciation Controls require an updated voice-pack model file. Engine release-260430 adds support for the speed parameter and pronunciation controls on Aura-2 English voices. These features are gated on an updated voice-pack (2025-04-15.4, UUID 0ec06c9b-0aa0-44d0-a001-3ec57d32229e). The chart's sample values files (04-aura-2-setup, 05-voice-agent-aws, 06-aura-2-polyglot-setup) have referenced this UUID since chart 0.34.0, but customers running an older voice-pack on disk will receive 400 Bad Request on speed=* requests. Refresh the voice-pack model file in your models directory to enable these features; contact your Deepgram representative for the download link.
- Added engine.agentOverrides to support per-engine-type resource overrides when agent.enabled: true. Defaults to gpu: 2 for the agent-text-to-speech engine, which is required for Aura-2 TTS. Other engine types continue to use the global engine.resources values.
  - Breaking change for existing Voice Agent deployments: Upgrading will cause the agent-text-to-speech pod to request 2 GPUs by default. If you are not using Aura-2 and want to keep 1 GPU, add the following to your values:

      engine:
        agentOverrides:
          agent-text-to-speech:
            resources:
              requests:
                gpu: 1
              limits:
                gpu: 1

- Updated default container tags to April 2026 release (release-260430). Refer to the main Deepgram changelog for additional details.
- Modified t2c & c2a Aura-2 TTS UUIDs throughout repo to be consistent with latest models.
- Fixed Voice Agent engine deployments where all three engine types (agent-speech-to-text, agent-text-to-speech, agent-end-of-turn) were stuck in ContainerCreating due to missing per-type ConfigMaps. Each engine type now gets its own ConfigMap.
- Fixed RBAC role for Voice Agent engine pods to grant access to the per-type ConfigMap names created when agent.enabled: true.
0.34.0 - 2026-04-16
- Added Flux STT configuration variants for English (engine.flux-en.toml) and Multilingual (engine.flux-multi.toml), plus a shared api.flux.toml
  - engine.flux-multi.toml sets model_name = "flux-general-multi" explicitly to load Multilingual Flux instead of English Flux.
- Added engine.flux.model_name value to select which Flux model to load. Defaults to flux-general-en; set to flux-general-multi for the Multilingual model.
- Added engine.runtimeClassName value to configure a Kubernetes RuntimeClass on Engine pods
- Added a 15-minute Time-to-Live (TTL) to the model download pod, configurable via ttlSecondsAfterFinished
  - This helps to fix an issue where a lock is held on the PersistentVolume resource when attempting to delete the Deepgram Kubernetes namespace
- Added engine.resources.gpuResourceName to make the GPU resource name configurable for MIG support
- Updated default container tags to April 2026 release (release-260416). Refer to the main Deepgram changelog for additional details.
- Standardized sample values files to fix drift from the reference configuration (missing resource limits, null features: overrides, inconsistent global section, incorrect model list format)
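As a rough illustration of the new 0.34.0 values, the fragment below sets the three keys whose full paths are named in the entries above. The RuntimeClass name and the MIG resource name are illustrative assumptions, not chart defaults; substitute whatever your cluster actually exposes.

```yaml
engine:
  # Assumed RuntimeClass name; it must already exist in the cluster.
  runtimeClassName: nvidia
  flux:
    # Defaults to flux-general-en; this selects the Multilingual model.
    model_name: flux-general-multi
  resources:
    # Illustrative MIG resource name; the real name depends on how the
    # GPU device plugin advertises MIG slices in your cluster.
    gpuResourceName: nvidia.com/mig-1g.10gb
```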
0.33.0 - 2026-04-02
- Updated default container tags to April 2026 release (release-260402). Refer to the main Deepgram changelog for additional details.
0.32.0 - 2026-03-19
- Added billing.server.certificatesPort to expose the /v1/certificates endpoint on the Billing container
- Updated default container tags to March 2026 release (release-260319). Refer to the main Deepgram changelog for additional details.
0.31.0 - 2026-03-05
- Updated default container tags to March 2026 release (release-260305). Refer to the main Deepgram changelog for additional details.
- Removed the [half_precision] section from engine.toml files
  - Deepgram sets this automatically and does not require customers to manually configure it
0.30.0 - 2026-02-12
- Added extraEnv support to API, Engine, License Proxy, and Billing containers for passing custom environment variables via values.yaml.
- Updated default container tags to February 2026 release (release-260212). Refer to the main Deepgram changelog for additional details.
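A minimal sketch of how extraEnv might be used. The entry above does not spell out the exact key paths or entry shape, so this fragment assumes an extraEnv list under each component that follows the standard Kubernetes container env format (name/value). The variable names and values are hypothetical; check the chart's values.yaml before relying on this layout.

```yaml
# Assumed layout: one extraEnv list per component, using Kubernetes
# name/value env entries. Verify the paths against the chart's values.yaml.
api:
  extraEnv:
    - name: HTTPS_PROXY              # illustrative variable
      value: "http://proxy.internal.example:3128"
engine:
  extraEnv:
    - name: EXAMPLE_SETTING          # hypothetical variable name
      value: "example"
```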
0.29.1 - 2026-02-02
- Updated the default value for flux.max_streams to be a placeholder value of 0. This parameter must be set for production workloads based on the GPU type used in the deployment.
0.29.0 - 2026-01-29
- Updated default container tags to January 2026 release (release-260129). Refer to the main Deepgram changelog for additional details.
0.28.0 - 2026-01-27
- Added Flux max_streams setting to Engine configuration
0.27.1 - 2026-01-21
- Added Billing container support for airgapped deployments
0.27.0 - 2026-01-15
- Added engine.health.gpuRequired to indicate whether GPU availability is required to be considered healthy. Defaults to false.
- Added support for Aura-2 Polyglot TTS (Dutch, German, French, Italian, Japanese)
- Updated default container tags to January 2026 release (release-260115). Refer to the main Deepgram changelog for additional details. The changelog describes the details of a TTS breaking change for which we recommend using a blue-green deployment strategy.
0.26.0 - 2025-12-29
- Updated default container tags to December 2025 release (release-251229). Refer to the main Deepgram changelog for additional details.
0.25.0 - 2025-12-10
- Added an optional engine.lifecycle.postStart.command hook to run custom commands after Engine container startup.
- Updated default container tags to December 2025 release (release-251210). Refer to the main Deepgram changelog for additional details.
0.24.0 - 2025-11-18
- Added use_v2_language_detection feature flag to support 36-language detection (disabled by default).
- Added optional EFS StorageClass creation with engine.modelManager.volumes.aws.efs.storageClass.create flag to support restricted environments where StorageClass creation is not permitted.
- Updated default container tags to November 2025 release (release-251118). Refer to the main Deepgram changelog for additional details.
- Updated /v1/status endpoint to raise four statuses: Initializing, Ready, Healthy, and Critical. See status endpoint documentation for details.
0.23.1 - 2025-11-04
- Quoted Voice Agent LLM model names to fix periods breaking the TOML parser
0.23.0 - 2025-10-29
- Updated default container tags to October 2025 release (release-251029). Refer to the main Deepgram changelog for additional details.
- Updated sample cluster configuration files to use Kubernetes 1.33 (previously 1.30)
- Updated Helm chart dependencies: cluster-autoscaler 9.52.1 (previously 9.46.3), prometheus-adapter 4.14.2 (previously 4.13.0)
- Fixed API templates to use correct additionalLabels reference
- Fixed hard-coded labels and selectors
0.22.0 - 2025-10-15
- Added Google as a 3rd party provider for Voice Agent helm chart
- Added topologySpreadConstraints, which allows even distribution of pods from the same deployment across availability zones, among other criteria
- Added redactUsage under api features, which enables redaction of usage metadata
- Updated default container tags to October 2025 release (release-251015). Refer to the main Deepgram changelog for additional details.
- Set entity_redaction to true by default, so redaction is automatically enabled if a valid NER model is available
0.21.0 - 2025-09-29
- Updated default container tags to September 2025 release (release-250929). Refer to the main Deepgram changelog for additional details.
0.20.0 - 2025-09-17
- Exposed the ability to add custom TOML sections in api.toml and engine.toml via customToml
- Added nodeSelector support for all components (API, Engine, License Proxy) to allow scheduling pods on specific nodes.
- Added configurable service types for API, Engine, and License Proxy services with ClusterIP as the default
- Added support for service annotations when using LoadBalancer service type
- Added loadBalancerSourceRanges configuration for LoadBalancer services to restrict access to specific IP CIDR ranges
- Added externalTrafficPolicy configuration for LoadBalancer services to control traffic routing behavior
- Updated sample configurations to demonstrate service configuration options including LoadBalancer security settings
- Added container-level security context support to Helm templates
- Supported removing resource limits on Engine pods
- Changed default service type from NodePort to ClusterIP for all services (API external, Engine metrics, License Proxy status)
- Updated service templates to support configurable service types and annotations
0.19.0 - 2025-09-12
- Changes the defaults of .Values.api.features.formatEntityTags and .Values.engine.features.streamingNer to true, so that NER formatting is enabled by default. This formatting is required with Nova-3 models. See our self-hosted NER guide for further details.
- Updated default container tags to September 2025 release (release-250912). Refer to the main Deepgram changelog for additional details.
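For deployments that relied on the previous behavior, the new 0.19.0 defaults can presumably be overridden in a values file. The fragment below is a sketch using the two value paths named in the entry above. Since that entry states NER formatting is required with Nova-3 models, this override is only plausible for deployments that do not serve them.

```yaml
# Sketch: opt out of the NER formatting defaults introduced in 0.19.0.
# Not suitable when serving Nova-3 models, which require this formatting.
api:
  features:
    formatEntityTags: false
engine:
  features:
    streamingNer: false
```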
0.18.1 - 2025-09-03
- Defined allowNonpublicEndpoints Voice Agent flag for use with custom LLM endpoints
- Fixed HPA replica conflicts in API and Engine deployments by conditionally removing hardcoded replicas when autoscaling is enabled
0.18.0 - 2025-08-28
- Added built-in support for Voice Agent.
- Updated default container tags to August 2025 release (release-250828). Refer to the main Deepgram changelog for additional details.
- Fixed securityContext template references for API and Engine deployments
- Fixed securityContext documentation comments
0.17.0 - 2025-08-14
- Updated default container tags to August 2025 release (release-250814). Refer to the main Deepgram changelog for additional details.
0.16.0 - 2025-07-31
- Updated default container tags to July 2025 release (release-250731). Refer to the main Deepgram changelog for additional details.
0.15.0 - 2025-07-10
- Apply additional annotations to the template section of Deployment resources.
- Updated default container tags to July 2025 release (release-250710). Refer to the main Deepgram changelog for additional details.
0.14.0 - 2025-06-26
- Updated default container tags to June 2025 release (release-250626). Refer to the main Deepgram changelog for additional details.
0.13.0 - 2025-06-10
- Updated default container tags to June 2025 release. Refer to the main Deepgram changelog for additional details.
0.12.0 - 2025-03-31
- Updated default container tags to March 2025 release. Refer to the main Deepgram changelog for additional details.
0.11.1 - 2025-03-28
- Exposed configuration values to enable named-entity recognition models. See the March 2025 Deepgram Self-Hosted Changelog for more details on features powered by these models.
0.11.0 - 2025-03-07
- Updated default container tags to March 2025 release. Refer to the main Deepgram changelog for additional details.
0.10.0 - 2025-01-30
- Updated default container tags to January 2025 release. Refer to the main Deepgram changelog for additional details.
0.9.0 - 2024-12-26
- Updated default container tags to December 2024 release. Refer to the main Deepgram changelog for additional details.
0.8.1 - 2024-12-17
- Fixed default ratio metrics in Prometheus Adapter chart values to use 0.0 to 1.0 scale to match autoscaling documentation
0.8.0 - 2024-11-21
- Updated default container tags to November 2024 release. Refer to the main Deepgram changelog for additional details.
- Add the Engine Deployment tolerations to the Engine's model download Job.
0.7.0 - 2024-10-24
- Updated default container tags to October 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
  - Adds new streaming websocket TTS! This is a software feature, so no new TTS models are required.
- AWS samples updated to take advantage of new EKS accelerated AMIs, which bundle the required NVIDIA driver and toolkit instead of being installed by the NVIDIA GPU operator
0.6.0 - 2024-09-27
- Updated default container tags to September 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
  - Adds broader support in Engine container for model auto-loading during runtime.
    - Filesystems that don't support inotify, such as nfs/csi PersistentVolumes in Kubernetes, can now load and unload models during runtime without requiring a Pod restart.
- Automatic model management on AWS now supports model removal. See the engine.modelManager.models.remove section in the values.yaml file for details.
- Container orchestrator environment variable added to improve support.
- Automatic model downloads on AWS are moved from engine.modelManager.models.links to engine.modelManager.models.add. The old links field is still supported, but migration is recommended.
- Update sample files to fix an issue with sample command for Kubernetes Secret creation storing Quay credential
  - Previous command used --from-file with the user's Docker configuration file. Some local secret managers, like Apple Keychain, scrub this file for sensitive information, which would result in an empty secret being created.
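The links-to-add migration and the new remove section might look roughly like the fragment below. The key paths come from the entries above, but the list shape and every entry shown are assumptions with placeholder values; the values.yaml comments in the chart remain the authoritative description.

```yaml
engine:
  modelManager:
    models:
      # Replaces the older links field, which is still supported.
      # Assumed to be a list of model download links (placeholder shown).
      add:
        - https://example.com/path/to/model-a.dg
      # Assumed to be a list of models to delete (placeholder shown).
      remove:
        - model-old.dg
```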
0.5.0 - 2024-08-27
- Updated default container tags to August 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
  - GA support for entity detection for pre-recorded English audio
  - GA support for improved redaction for pre-recorded English audio
- Fixed a misleading comment in the 03-basic-setup-onprem.yaml sample file that wrongly suggested engine.modelManager.volumes.customVolumeClaim.name should be a PersistentVolume instead of a PersistentVolumeClaim
- Deepgram's core products are available to host both on-premises and in the cloud. Official resources have been updated to refer to a "self-hosted" product offering, instead of an "onprem" product offering, to align the product name with industry naming standards. The Deepgram Quay image repository names have been updated to reflect this.
0.4.0 - 2024-07-25
- Introduced entity detection feature flag for API containers (false by default).
- Updated default container tags to July 2024 release. Refer to the main Deepgram changelog for additional details. Highlights include:
  - Support for Deepgram's new English/Spanish multilingual code-switching model
  - Beta support for entity detection for pre-recorded English audio
  - Beta support for improved redaction for pre-recorded English audio
  - Beta support for improved entity formatting for streaming English audio
- Removed some items nested under api.features and engine.features sections in favor of opinionated defaults.
0.3.0 - 2024-07-18
- Allow specifying custom annotations for deployments.
0.2.3 - 2024-07-15
- Added a sample values.yaml file for on-premises/self-managed Kubernetes clusters.
- Resolves a mismatch between PVC and SC prefix naming convention.
- Resolves error when specifying custom service account names.
- Make imagePullSecrets optional.
0.2.2-beta - 2024-06-27
- Adds more verbose logging for audio content length.
- Keeps our software up-to-date.
- See the changelog associated with this routine monthly release.
0.2.1-beta - 2024-06-24
- Restart Deepgram containers automatically when underlying ConfigMaps have been modified.
0.2.0-beta - 2024-06-20
- Support for managing node autoscaling with cluster-autoscaler.
- Support for pod autoscaling of Deepgram components.
- Support for keeping the upstream Deepgram License server as a backup even when the License Proxy is deployed. See licenseProxy.keepUpstreamServerAsBackup for details.
- Initial installation replica count values moved from scaling.static.{api,engine}.replicas to scaling.replicas.{api,engine}.
- License Proxy is no longer manually scaled. Instead, scaling can be indirectly controlled via licenseProxy.{enabled,deploySecondReplica}.
- Labels for Deepgram dedicated nodes in the sample cluster-config.yaml for AWS, and the nodeAffinity sections of the sample values.yaml files. The key has been renamed from deepgram/nodeType to k8s.deepgram.com/node-type, and the values are no longer prepended with deepgram.
- AWS EFS model download job hook delete policy changed to before-hook-creation.
- Concurrency limit moved from API (api.concurrencyLimit.activeRequests) to Engine level (engine.concurrencyLimit.activeRequests).
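Taken together, the 0.2.0-beta moves suggest a values layout along these lines. The key paths are the ones named in the entries above; the numbers and booleans are illustrative placeholders, not chart defaults.

```yaml
scaling:
  replicas:
    api: 1        # previously scaling.static.api.replicas
    engine: 1     # previously scaling.static.engine.replicas
engine:
  concurrencyLimit:
    activeRequests: 10   # previously api.concurrencyLimit.activeRequests
licenseProxy:
  enabled: true
  deploySecondReplica: false         # indirect control over License Proxy scaling
  keepUpstreamServerAsBackup: true
```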
0.1.1-alpha - 2024-06-03
- Various documentation improvements
0.1.0-alpha - 2024-05-31
- Initial implementation of the Helm chart.