Release notes 6.0.0
Release date: 2nd May 2025
Summary
This page highlights the changes, new features and bug fixes in the LumenVox Containers version 6.0.0 release. This release affects Speech products. It is not available for Voice Biometric products; support will be made available in upcoming releases.
This release builds upon the 5.4.0 release - see LumenVox Containers Release Notes 5.4.0.
Highlights
New Neural TTS released
Far more natural/human sounding voices
80 new voices across 30 languages/dialects - the new voices can be accessed at Text To Speech Software
Featuring over 8 emotive and bubbly voices for contact center use
Clients now have the ability to create custom voices or personas
Fully supports SSML markups
Includes caching to improve synthesis performance
Allows for full prosody control (not available in legacy)
New Voice of Capacity (Carol) available for Capacity Voice/ Agent Assistant client use
New NLU service released including the following text input-based products
Sentiment Analysis - determine overall sentiment of speaker
Call Summarization including topic detection and outcome detection
Language Detection
Language Translation (including auto language detection)
Speaker Diarization for identifying different speakers in mono audio files
Language Detection - audio based
ASR & Transcription models released for Danish, Norwegian and Swedish
ITN redaction support released for Spanish & French
Provide final results flag for partial results - continuous transcription (Apple request)
Improvements made to CPA handling of SIT tones and additional beep & malformed beep tones.
Configurations & deployment pods refactored to improve performance
Management API has been refactored to improve performance
OAuth added for gRPC and MRCP for Capacity Speech SaaS offering
Minimal Helm Chart stack for CPA customers who don’t require installation of ASR & TTS
Helm charts changed to support the latest versions of NGINX and to include custom annotation fields
For TTS, we recommend that text not exceed 4 MB
For Transcription, we recommend that users not exceed 120 minutes of transcribed audio due to gRPC size limits.
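The two recommended limits above could be enforced client-side before a request is sent. The following is a minimal sketch, not part of the LumenVox API: the function names are hypothetical, and it assumes that "4 MB" means 4 * 1024 * 1024 bytes of UTF-8 encoded text.

```python
# Pre-flight checks for the recommended request limits in these notes.
# Assumption: "4 MB" is read as 4 * 1024 * 1024 bytes of UTF-8 text.
MAX_TTS_TEXT_BYTES = 4 * 1024 * 1024
MAX_TRANSCRIPTION_MINUTES = 120


def tts_text_within_limit(text: str) -> bool:
    """Return True if the UTF-8 encoded text is at most 4 MB."""
    return len(text.encode("utf-8")) <= MAX_TTS_TEXT_BYTES


def audio_within_limit(duration_seconds: float) -> bool:
    """Return True if the audio is at most 120 minutes long."""
    return duration_seconds <= MAX_TRANSCRIPTION_MINUTES * 60


if __name__ == "__main__":
    print(tts_text_within_limit("Hello, world"))
    print(audio_within_limit(2.5 * 60 * 60))
```

Inputs that fail either check would need to be split into smaller requests by the caller.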
What’s new on LumenVox Cloud 6.0.0
New features
New Neural TTS released
New NLU including Sentiment Analysis, Call Summarization (including topic detection and outcome detection), Language Detection and Language Translation
Speaker Diarization
Language Detection - audio based
ASR & Transcription models released for Danish, Norwegian and Swedish
Improved French ASR/transcription encoder model released
ITN redaction support released for Spanish & French
Provide final results flag for partial results - continuous transcription
OAuth added for gRPC and MRCP for Capacity Speech SaaS offering
Minimal Helm Chart stack for CPA customers who don’t require installation of ASR & TTS
Helm charts changed to support the latest versions of NGINX and to include custom annotation fields
Add RabbitMQ diagnostic to deployment portal diagnostics check
Updates
Improved French ASR/transcription encoder model released
Improvements made to CPA handling of SIT tones and additional beep & malformed beep tones
Configurations & deployment pods refactored to improve performance
Management API has been refactored to improve performance
Added en-IE (Ireland) dialect support for ASR/Transcription
Australian English Neural TTS voice load failure resolved
Admin portal - fix deployment list ordering
ITN - fix timeout errors not reported to API
Fix issues with fine tune model in continuous transcription when running multiple session pods
Resolved transcription freezes when processing large audio files (e.g. 2 hours)
Resolved issues with inconsistent behavior in enhanced transcription
Resolved issue with SSN redaction inaccuracies in longer phrases.
Improved TTS connectivity & error handling with RabbitMQ
Improved handling of deleted deployments in the admin portal
Resolved MRCP issues in calling new neural TTS
Created welcome screen to create deployments (Admin portal)
Resolved transcription/ITN issue where words or sentences were being transcribed twice or returning “0”
Resolved “filename is too long” error message in TTS
New decoder model released for Dutch to resolve issues with transcription results (replaces existing 4.1.0)
Change made to only allow selection of the 8 kHz sample rate for Ulaw & Mulaw TTS synthesis using the deployment portal SSML tool
Installation notes
The following helm chart can be used
Note that for MRCP there is no helm chart but a docker compose file. MRCP will run on its own Docker virtual machine which will integrate into the Kubernetes cluster.
Run the following command to update the helm charts: helm repo update
Note: if using TTS, we recommend adding the legacyEnabled toggle to the values file. Set it to True to enable legacy TTS, or to False to enable the new neural TTS. The new neural TTS voices must be loaded in the values file in order for the models to be retrieved from S3.
If upgrading from 4.7 or below: the helm charts have changed. If you have custom helm charts, please take note of all the changes before installing/upgrading, e.g. licensing has moved from common to global (look for the custom license GUID).
If installing for MRCP: the conf file settings for the MRCP API have been replaced with environment variables, e.g. to enable compatibility mode.
Key installation guide changes:
LumenVox now recommends that Kubernetes version 1.30 or later is installed.
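A deployment script could check the cluster version against this recommendation before installing. This is a minimal sketch with hypothetical helper names; it assumes the version is available as a string such as "v1.30.2" (for example, as reported by the cluster), and it compares only the major and minor components.

```python
import re

# Recommended minimum Kubernetes version from these release notes.
MIN_KUBERNETES = (1, 30)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.30.2' or '1.31'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = MIN_KUBERNETES) -> bool:
    """Return True if the version is at least the recommended minimum."""
    return parse_major_minor(version) >= minimum
```

Comparing integer tuples rather than raw strings avoids the pitfall where "1.9" would sort after "1.30" lexicographically.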
Upgrade procedures
Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See notes above if upgrading from 4.7.
If you are performing an upgrade, you need to ensure that your NGINX versions are updated from 1.11 to 1.12.
Updated API guide
APIs for all speech products available on version 6.0 can be obtained here: LumenVox API Documentation
Information for voice biometric products relates to version 3.4.0-3.4.3
Model versions as part of the release
ASR - 4.1.0 (4.1.2 Acoustic model released for French to enhance recognition with audio containing single words)
TTS - 4.0 (Neural TTS), sample rates 24 kHz & 16 kHz - can be downsampled to 8 kHz (note change). Legacy TTS models will still run under version 1.0. Further voice enhancements are currently being released in version 4.0.1, so LumenVox recommends that clients cater for 4.0.X.
VB - 2.1.15
VB incorporates Selene 2.4.3 which was integrated into the Container stack
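Since the TTS entry above recommends that clients cater for 4.0.X, client code that validates the TTS model version may want to accept the whole 4.0 series rather than one exact value. The sketch below shows one way to do that; the function name is hypothetical, and it assumes versions are reported as dotted strings such as "4.0" or "4.0.1".

```python
import re

# Assumption: model versions are dotted strings such as "4.0" or "4.0.1".
_TTS_4_0_SERIES = re.compile(r"4\.0(\.\d+)?")


def is_supported_tts_version(version: str) -> bool:
    """Return True for 4.0 and any 4.0.X patch release."""
    return _TTS_4_0_SERIES.fullmatch(version.strip()) is not None
```

Using fullmatch means that strings such as "14.0" or "4.00" are rejected rather than partially matched.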
Model version changes
4.1.2 Acoustic model released for French
4.1.0 Dutch language model update