Release notes 6.3.0

Release date: 21st November 2025

Summary

This page highlights all the changes, new features and bugs addressed within the LumenVox Containers version 6.3.0 release. This release affects Speech products only. It is not available for Voice biometric products; support for these will be made available in upcoming releases.

This release builds upon the 6.2.0 release; see https://lumenvox.capacity.com/article/892684/release-notes-6-2-0

These release notes include changes as part of the following patch releases: asr-6.2.1, asr-6.2.2, grammar-6.2.2, grammar-6.2.3, itn-6.2.1, license-6.2.1, license:6.2.1-beta, neural-tts-6.2.1, mrcp-api-6.2.1, neural-tts-6.2.2, neural-tts-6.2.3, neural-tts-6.2.4, vad-6.2.1, vad-6.2.2, vad-6.2.3, vad-6.2.4-beta, vad-6.2.5-beta

Highlights

Further enhancements to Redis to assist with scaling
- High Redis latency under load mitigated by reducing the number of pull/push notification threads to Redis, improving performance and reducing no-input errors
- Notifications to stream updates within services streamlined
- ASR & VAD listeners modified to reestablish dropped connections
- Improved logging implemented to troubleshoot dropped connections
MRCP service changes made to address inconsistent results when TLS is used under load. NOTE: If running with TLS enabled, interactions on a single MRCP-API instance must be limited to 100-150
New TTS voices for Galician & Valencian & revised voice for Portuguese
SSML support added for Swedish & Dutch TTS voices
Grammar modifications
- Improved the cache hit rate to speed up loading of grammars (resolves cache misses)
- Reduce URL fetch roundtrips during loading
- Implement grammar cache versioning mechanism
TTS tester in the Analysis portal modified to allow for TTS synthesis with plain text or SSML
Prometheus metrics added to track active interaction requests for various products at a session level
- session_active_asr_requests
- session_active_tts_requests
- session_active_cpa_requests
- session_active_amd_requests
- session_active_transcription_requests
- session_active_enhanced_transcription_requests
Enhancements made to CPA and AMD to cater for the new Apple Call Screening
Improvements made for scaling of CPA & AMD interactions
Continued security patch updates
The admin portal UI has been updated with Capacity branding

For TTS, we recommend that text not exceed 4 MB.
For Transcription, we recommend that users not exceed 120 minutes of transcribed audio due to gRPC size limits.
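
The two recommendations above could be enforced on the client side before a request is sent. The following is a minimal sketch, not part of any LumenVox SDK: the function names are hypothetical, and it assumes that "4 MB" means 4 MiB of UTF-8 encoded text, which these notes do not specify.

```python
# Assumption: the 4 MB recommendation is read as 4 MiB of UTF-8 encoded text.
MAX_TTS_TEXT_BYTES = 4 * 1024 * 1024
# Recommended ceiling for transcribed audio, from the note above.
MAX_TRANSCRIPTION_MINUTES = 120


def tts_text_within_limit(text: str) -> bool:
    """Return True if the UTF-8 size of the text is within the recommended limit."""
    return len(text.encode("utf-8")) <= MAX_TTS_TEXT_BYTES


def audio_within_limit(duration_seconds: float) -> bool:
    """Return True if the audio duration is within the recommended 120 minutes."""
    return duration_seconds <= MAX_TRANSCRIPTION_MINUTES * 60
```

A caller might split longer text or audio into several requests when either check returns False.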

What’s new on LumenVox Cloud 6.3.0

New features

ITN updated to return start and end time of redacted words
New TTS voices for Galician & Valencian & revised voice for Portuguese (Bel)
TTS tester in the Analysis portal modified to allow for TTS synthesis with plain text or SSML

Updates

Further enhancements to Redis to assist with scaling
- High Redis latency under load mitigated by reducing the number of pull/push notification threads to Redis, improving performance and reducing no-input errors
- Notifications to stream updates within services streamlined
- ASR & VAD listeners modified to reestablish dropped connections
- Improved logging implemented to troubleshoot dropped connections
MRCP service changes made to address inconsistent results when TLS is used under load. NOTE: If running with TLS enabled, interactions on a single MRCP-API instance must be limited to 100-150
SSML support added for Swedish & Dutch TTS voices
Grammar modifications
- Improved the cache hit rate to speed up loading of grammars (resolves cache misses)
- Reduce URL fetch roundtrips during loading
- Implement grammar cache versioning mechanism
Prometheus metrics added to track active interaction requests for various products at a session level
- session_active_asr_requests
- session_active_tts_requests
- session_active_cpa_requests
- session_active_amd_requests
- session_active_transcription_requests
- session_active_enhanced_transcription_requests
Enhancements made to CPA and AMD to cater for the new Apple Call Screening
Improvements made for scaling of CPA & AMD interactions
Security patches applied to address high & critical vulnerabilities identified by security scans
Updated the Helm chart so that the management-api pod receives the same RabbitMQ TLS settings as the other services by default.
Enhancements made to the new 6.2 enhanced transcription functionality to address semantic integration issues for Apple
Changes made to apply audio score in the absence of the GPU ASR model
Potential blocking scenarios reduced by increasing the audio buffer size and changing some internal goroutines
Additional Prometheus metrics added to track reported no-input errors
- vad_stream_subscribe_duration: Histogram of latencies for Redis stream subscription, which monitors subscription latency. The expected value here is less than a millisecond.
- vad_transcoding_duration: Histogram of latencies for transcoding audio chunks.
- vad_processing_duration: Histogram of latencies for engine processing of audio chunks.
Logging updated to record the deployment ID more often within log files, which should assist with future troubleshooting
General enhancements to license reporting & refactoring of the license cache
The following Go SDK enhancements have been made:
- OAuth & support for session grammar loads added to Go SDK
- Add error detection for finalized calls
- Add support for InteractionBeginProcessing
- Updated non-audio examples to use NO_AUDIO_RESOURCE
Added transcoder to VAD to handle any sample rate
Resolved TTS issue with lexicon not working if SSML is missing the 'xml:id' attribute
Resolved grammar caching issues when switching from 4.X to 5.X ASR model
ITN modified to not fully capitalize email addresses, names and addresses
CPA modified to cater for new Apple call screening comfort music which was triggering a false positive beep detection
AMD issue where a beep was detected with a high confidence score even though no beep tone existed resolved
Vulnerability scan issues on the admin & deployment portal resolved
Various minor updates to the Admin & Deployment portal made e.g. updating help information, fixes to analysis reports, and improvements to user functions in the analysis portal
RabbitMQ TLS warning messages in Management API pods resolved
Resolved Built-in DTMF Boolean grammar not working through MRCP
Issue with IPA phonemes not being applied in empty tags for neural TTS resolved
Issues with incorrect semantic values being returned with enhanced transcription resolved
Empty grammars now report an error and fail to compile
Exception in http handler of grammar service resolved
TTS issue reported referencing pre-recorded audio in SSML resolved
TTS issue with empty audio tag not working resolved
Issue with the Megan voice not working when it is the only neural TTS voice installed resolved
Improvements made to the English model to improve accuracy on spelled out email addresses
ITN & VAD memory leak resolved
Issue with the grammar service crashing when an unreachable grammar URL is specified resolved
Large audio files can now be archived
Reported issues with capitalization in ITN fixed
Fixed session configuration default setting for sessionArchiving which was not displaying
Grammar Counter grammar_active_load_requests becoming negative resolved
The admin portal UI has been updated with Capacity branding
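
The session-level Prometheus metrics listed above can be read from any standard Prometheus text exposition payload. The sketch below is illustrative only: it totals each session_active_* metric across all label sets, and the sample payload, its label names and its values are invented for the example rather than taken from a real deployment.

```python
import re
from collections import defaultdict

# Metric names taken from the list in these release notes.
SESSION_METRICS = {
    "session_active_asr_requests",
    "session_active_tts_requests",
    "session_active_cpa_requests",
    "session_active_amd_requests",
    "session_active_transcription_requests",
    "session_active_enhanced_transcription_requests",
}

# One sample line: name, optional {labels}, value, optional timestamp.
_SAMPLE_LINE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$'
)


def sum_session_metrics(exposition_text: str) -> dict[str, float]:
    """Sum each session_active_* metric across all of its label sets."""
    totals: dict[str, float] = defaultdict(float)
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if match and match.group(1) in SESSION_METRICS:
            totals[match.group(1)] += float(match.group(3))
    return dict(totals)


# Hypothetical payload: the "pod" label and all values are made up.
SAMPLE = """\
# invented sample for illustration
session_active_asr_requests{pod="asr-0"} 3
session_active_asr_requests{pod="asr-1"} 2
session_active_tts_requests 1
some_other_metric 9
"""
```

In practice the payload would come from a service's metrics endpoint or from a Prometheus query; the parsing step stays the same.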

Installation notes

The following helm chart can be used

Helm Chart

Note that for MRCP there is no helm chart but a docker compose file. MRCP will run on its own Docker virtual machine which will integrate into the Kubernetes cluster.

Run the following command to update the Helm charts: helm repo update

Note: if using TTS, we recommend adding the legacyEnabled toggle to the values file. Set it to true to enable legacy TTS, or to false to enable the new neural TTS. The new neural TTS voices must be listed in the values file so that the models can be retrieved from S3:

ttsLanguages:
  - name: "en_us"
    legacyEnabled: false
    voices:
      - name: "jeff"
        version: "4.0.0"
      - name: "megan"
        version: "4.0.0"

Note if installing from 4.7 or below: the Helm charts have changed. If you have custom Helm charts, please take note of all the changes before installing or upgrading, e.g. licensing has moved from common to global, so check where any custom license GUID is set.

If installing for MRCP - note that the conf file settings for MRCP API have been replaced with environment variables e.g. to enable compatibility mode.

If installing ITN, the following changes are required in the helm charts:

itnLanguages:
  - name: "en"
  - name: "es"

Key installation guide changes:

LumenVox recommends a minimum of Kubernetes version 1.31.

The following versions of required software are supported:

PostgreSQL: 17.4 (recommended)
MongoDB: 8.0.11 (minimum)
Redis: 8.0.3 (minimum)
RabbitMQ: 4.1.1 (recommended)
NGINX: 1.12.4
Linkerd: edge 25.8.4

The full software support matrix can be found here: LumenVox Kubernetes software support matrix
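
Before installing, the versions in a cluster could be compared against the minimums stated above. This is a rough sketch under stated assumptions: the helper names are hypothetical, only the components marked as a minimum are checked, and versions are assumed to be plain dotted integers such as "8.0.3".

```python
# Minimum versions as stated in these release notes.
MINIMUMS = {
    "kubernetes": "1.31",
    "mongodb": "8.0.11",
    "redis": "8.0.3",
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '8.0.11' into (8, 0, 11)."""
    return tuple(int(part) for part in version.split("."))


def below_minimum(installed: dict[str, str]) -> list[str]:
    """Return the components that are missing or older than the minimum."""
    failures = []
    for name, minimum in MINIMUMS.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(minimum):
            failures.append(name)
    return failures
```

Tuple comparison handles differing lengths, so "1.31.2" counts as meeting the "1.31" minimum. Versions with suffixes (for example "edge" builds) would need extra parsing.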

Upgrade procedures

Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See notes above if upgrading from 4.7.

If you are performing an upgrade, you need to ensure that your NGINX version is updated from 1.11 to 1.12.

If upgrading neural TTS from version 6.0.0, the TTS cache folder and the Neural TTS models folder need to be cleared. Should you have any questions, reach out to LumenVox support at support@lumenvox.com.

If installing ITN, the following changes are required in the helm charts:

itnLanguages:
  - name: "en"
  - name: "es"

Updated API guide

APIs for all speech products available on version 6.3 can be obtained here: https://developer.lumenvox.com/6.3.0/

Information for voice biometric products relates to version 3.4.0-3.4.3

Model versions as part of the release

ASR - 4.1.0 (4.1.1 English)
5.0.0 New encoder & decoder English model released offering better accuracy but will require additional ASR pod scaling
TTS - 4.0 (Neural TTS), sample rates 24 kHz & 16 kHz - can be downsampled to 8 kHz (note change). Legacy TTS models will still run under version 1.0. Further voice enhancements are currently being released in TTS voices version 4.0.1, so LumenVox recommends that clients cater for 4.0.X.
VB - 2.1.15
VB incorporates Selene 2.4.3 which was integrated into the Container stack

Model version changes

4.0.0 Neural TTS model released for Galician (Xela) and Valencian (Marta)
4.0.0 Neural TTS voices for Swedish & Dutch updated to support SSML
