Release notes 6.1.0
Release date: 8th June 2025
Summary
This page highlights all the changes, new features and bugs addressed within the LumenVox Containers version 6.1.0 release. This release affects Speech products. It is not available for Voice Biometric products; support for these will be made available in upcoming releases.
This release builds upon the 6.0.0 release - see LumenVox Containers Release Notes 6.0.0.
Highlights
- Neural TTS now incorporates partial results allowing for faster TTS synthesis 
- New TTS Prometheus metrics added for time-to-first-byte 
- ASR grammar cache housekeeping introduced to clean up old grammar caches 
- Grammar ripcords for large grammars now configurable at a global/session/interaction level 
- Issue resolved for instances where VAD event offsets are longer than the audio duration in Redis resulting in empty transcription results 
- Various NLU issues resolved 
- Neural TTS memory leak issue resolved 
- For TTS, we recommend that text not exceed 4 MB
- For Transcription, we recommend that users not exceed 120 minutes of transcribed audio due to gRPC size limits.
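The two recommended limits above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the LumenVox API: the function names are hypothetical, and it assumes that "4 MB" means 4 MiB of UTF-8 encoded text.

```python
# Hypothetical client-side pre-flight checks; not part of the LumenVox API.
TTS_TEXT_LIMIT_BYTES = 4 * 1024 * 1024   # assumption: "4 MB" read as 4 MiB
TRANSCRIPTION_LIMIT_MINUTES = 120        # from the release notes


def tts_text_within_limit(text: str, limit_bytes: int = TTS_TEXT_LIMIT_BYTES) -> bool:
    """Return True if the UTF-8 encoded text is no larger than limit_bytes."""
    return len(text.encode("utf-8")) <= limit_bytes


def audio_within_limit(duration_seconds: float,
                       limit_minutes: float = TRANSCRIPTION_LIMIT_MINUTES) -> bool:
    """Return True if the audio duration does not exceed limit_minutes."""
    return duration_seconds <= limit_minutes * 60
```

The size is measured in encoded bytes rather than characters, since non-ASCII text takes more than one byte per character.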
What’s new on LumenVox Cloud 6.1.0
New features
- Partial results enabled for Neural TTS to speed up the return of TTS synthesis results. This is set in ttsSettings.enable_partial_results 
- New TTS Prometheus metrics added for time-to-first-byte:
  - tts_first_result_time_max 
  - tts_first_result_time_min 
  - tts_first_result_time_dist 
- ASR grammar cache housekeeping introduced to clean up old grammar caches. By default, this new housekeeping/cleanup mechanism scans the folders every 10 minutes and deletes any files that were last modified more than 1 month ago. Overrides can be applied through environment variables if required 
- Grammar ripcord now added to configurations for global, session or interaction settings and can be set within the deployment portal. If the grammar is above a specified size (GrammarSettings.grammar_threshold), a grammar load failure is raised
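To illustrate the grammar cache housekeeping behaviour described above, here is a minimal sketch of an age-based cleanup pass. It is not the LumenVox implementation: the function name is hypothetical, and "1 month" is approximated as 30 days.

```python
import time
from pathlib import Path

SCAN_INTERVAL_SECONDS = 10 * 60        # from the notes: scan every 10 minutes
MAX_AGE_SECONDS = 30 * 24 * 60 * 60    # assumption: "1 month" approximated as 30 days


def purge_stale_files(cache_dir, max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Delete files under cache_dir whose last modification is older than
    max_age_seconds, and return the list of removed paths."""
    now = time.time() if now is None else now
    removed = []
    # Materialise the listing first so deletions do not disturb the traversal.
    for path in list(Path(cache_dir).rglob("*")):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

A scheduler would call `purge_stale_files` once per `SCAN_INTERVAL_SECONDS`. Because the check uses the modification time, a cache file that is rewritten regularly is never removed.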
Updates
- Issue resolved for instances where VAD event offsets are longer than the audio duration in Redis resulting in empty transcription results 
- Various NLU issues resolved, e.g.:
  - Unable to process large text requests 
  - Language translation: if the input text contains "()", it prevents the full text from being translated 
  - Diarization and Language ID returning negative Prometheus request counters 
  - Resolved issues with translate_from_language 
- Issue resolved where reporting-api crashed on certain requests when x-scopes was not included 
- Issue resolved where SSML markers were sent out of MRCP faster than the outbound TTS audio stream. Markers are now sent when the corresponding part of the TTS stream is sent 
- Neural TTS memory leak issue resolved 
Installation notes
The following helm chart can be used
Note that for MRCP there is no helm chart but a docker compose file. MRCP will run on its own Docker virtual machine which will integrate into the Kubernetes cluster.
Run the following command to update the helm charts:
helm repo update
Note: if using TTS, we recommend adding the legacyEnabled toggle to the values file. Set it to true to enable legacy TTS, or to false to enable the new Neural TTS. The new Neural TTS voices must be listed in the values file so that the models can be retrieved from S3:
ttsLanguages:
  - name: "en_us"
    legacyEnabled: false
    voices:
      - name: "jeff"
        version: "4.0.0"
      - name: "megan"
        version: "4.0.0"
Note if installing from 4.7 or below: there have been helm chart changes. If you have custom helm charts, please take note of all the changes before installing/upgrading, e.g. licensing has moved from common to global (check for a custom license GUID).
If installing for MRCP - note that the conf file settings for MRCP API have been replaced with environment variables e.g. to enable compatibility mode.
Key installation guide changes:
LumenVox now recommends Kubernetes version 1.30 as a minimum.
Upgrade procedures
Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See notes above if upgrading from 4.7.
If you are performing an upgrade, you need to ensure that your NGINX versions are updated from 1.11 to 1.12.
If upgrading and using Neural TTS utilizing MRCP, edit the .env file making the following changes:
PRODUCT_VERSION=6.1
MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING=1
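For deployments where the .env file is maintained by script, the two settings above could be applied with a small helper such as the one below. This is a sketch only: the helper name is hypothetical, and it assumes a plain KEY=VALUE file with no quoting or multi-line values.

```python
from pathlib import Path


def set_env_values(env_path, updates):
    """Set KEY=VALUE pairs in a .env file: existing keys are replaced in
    place, missing keys are appended, and all other lines are kept as-is."""
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    remaining = dict(updates)
    result = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            result.append(f"{key}={remaining.pop(key)}")
        else:
            result.append(line)
    # Append any keys that were not already present in the file.
    result.extend(f"{key}={value}" for key, value in remaining.items())
    path.write_text("\n".join(result) + "\n")


# Example call (the path is an assumption):
# set_env_values(".env", {
#     "PRODUCT_VERSION": "6.1",
#     "MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING": "1",
# })
```

Running the helper twice leaves the file unchanged after the first run, so it is safe to include in a repeatable upgrade script.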
Edit the docker-compose.yml file making the following changes:
MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING: "${MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING}"
If upgrading Neural TTS from version 6.0.0, the TTS cache folder and the Neural TTS models folder need to be cleared. Should you have any questions, reach out to LumenVox support at support@lumenvox.com.
Updated API guide
APIs for all speech products available on version 6.1 can be obtained here: LumenVox API Documentation
Information for voice biometric products relates to version 3.4.0-3.4.3
Model versions as part of the release
ASR - 4.1.0
TTS - 4.0 (Neural TTS), sample rates 24 kHz and 16 kHz, can be downsampled to 8 kHz (note change). Legacy TTS models will still run under version 1.0. Further voice enhancements are currently being released in TTS voices version 4.0.1, so LumenVox recommends that clients cater for 4.0.X.
VB - 2.1.15
VB incorporates Selene 2.4.3 which was integrated into the Container stack
