Release notes 6.0.0

Release date: 2nd May 2025

Summary

This page highlights the changes, new features and bugs addressed in the LumenVox Containers version 6.0.0 release. This release affects Speech products. It is not available for Voice biometric products; this will be made available in upcoming releases.

This release builds upon the 5.4.0 release - see LumenVox Containers Release Notes 5.4.0.

Highlights

New Neural TTS released
- Far more natural/human sounding voices
- 80 new voices across 30 languages/dialects - the new voices can be accessed on the Text To Speech Software page
- Featuring over 8 emotive and bubbly voices for contact center use
- Clients now have the ability to create custom voices or personas
- Fully supports SSML markups
- Includes caching to improve on synthesis performance
- Allows for full prosody control (not available in legacy)
- New Voice of Capacity (Carol) available for Capacity Voice/Agent Assistant client use
New NLU service released, including the following text input-based products:
- Sentiment Analysis - determine overall sentiment of speaker
- Call Summarization including topic detection and outcome detection
- Language Detection
- Language Translation (including auto language detection)
Speaker Diarization for identifying different speakers in mono audio files
Language Detection - audio based
ASR & Transcription models released for Danish, Norwegian and Swedish
ITN redaction support released for Spanish & French
Provide final results flag for partial results - continuous transcription (Apple request)
Improvements made to CPA handling of SIT tones and additional beep & malformed beep tones.
Configurations & deployment pods refactored to improve performance
Management API has been refactored to improve performance
OAuth added for gRPC and MRCP for Capacity Speech SaaS offering
Minimal Helm Chart stack for CPA customers who don’t require installation of ASR & TTS
Helm charts changed to support the latest versions of NGINX and to include custom annotation fields

For TTS, we recommend that text not exceed 4 MB.
For Transcription, we recommend that users not exceed 120 minutes of transcribed audio due to gRPC size limits.
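
The two recommended limits above can be checked on the client side before a request is sent. The following is a minimal Python sketch, not part of the LumenVox API: the function and constant names are hypothetical, and it assumes that the 4 MB recommendation refers to 4 * 1024 * 1024 bytes of UTF-8 encoded text and that the audio is single-channel.

```python
# Assumption: "4 MB" is read as 4 * 1024 * 1024 bytes of UTF-8 encoded text.
MAX_TTS_TEXT_BYTES = 4 * 1024 * 1024
# 120 minutes of audio per transcription request, expressed in seconds.
MAX_TRANSCRIPTION_SECONDS = 120 * 60


def tts_text_within_limit(text: str) -> bool:
    """Return True if the encoded text fits within the recommended TTS size."""
    return len(text.encode("utf-8")) <= MAX_TTS_TEXT_BYTES


def audio_within_limit(num_samples: int, sample_rate_hz: int) -> bool:
    """Return True if single-channel audio is within the recommended duration."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return num_samples / sample_rate_hz <= MAX_TRANSCRIPTION_SECONDS
```

A caller could run these checks first and split the text or audio into smaller requests when either returns False.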

What’s new on LumenVox Cloud 6.0.0

New features

New Neural TTS released
New NLU including Sentiment Analysis, Call Summarization (including topic detection and outcome detection), Language Detection and Language Translation
Speaker Diarization
Language Detection - audio based
ASR & Transcription models released for Danish, Norwegian and Swedish
ITN redaction support released for Spanish & French
Provide final results flag for partial results - continuous transcription
OAuth added for gRPC and MRCP for Capacity Speech SaaS offering
Minimal Helm Chart stack for CPA customers who don’t require installation of ASR & TTS
Helm charts changed to support the latest versions of NGINX and to include custom annotation fields
Added RabbitMQ diagnostic to the deployment portal diagnostics check

Updates

Improved French ASR/transcription encoder model released
Improvements made to CPA handling of SIT tones and additional beep & malformed beep tones
Configurations & deployment pods refactored to improve performance
Management API has been refactored to improve performance
Added en-IE (Ireland) dialect support for ASR/Transcription
Australian English Neural TTS voice load failure resolved
Admin portal - fixed deployment list ordering
ITN - fixed timeout errors not being reported to the API
Fixed issues with the fine-tune model in continuous transcription when running multiple session pods
Resolved transcription freezes when processing large audio files (e.g. 2 hours)
Resolved issues with inconsistent behavior in enhanced transcription
Resolved issue with SSN redaction inaccuracies in longer phrases
Improved TTS connectivity & error handling with RabbitMQ
Improved handling of deleted deployments in the admin portal
Resolved MRCP issues in calling new neural TTS
Created welcome screen to create deployments (Admin portal)
Resolved transcription/ITN issue where words or sentences were being transcribed twice or returned as “0”
Resolved “filename is too long” error message in TTS
New decoder model released for Dutch to resolve issues with transcription results (replaces existing 4.1.0)
The deployment portal SSML tool now only allows selection of the 8 kHz sample rate for Ulaw & Mulaw TTS synthesis

Installation notes

The following helm chart can be used

Helm Chart

Note that for MRCP there is no helm chart, only a Docker Compose file. MRCP will run on its own Docker virtual machine, which will integrate into the Kubernetes cluster.

Run the following command to update the helm charts: helm repo update

Note: if using TTS, we recommend adding the legacyEnabled toggle to the values file. Set it to True to enable legacy TTS, or to False to enable the new neural TTS. The new neural TTS voices must be listed in the values file in order for the models to be retrieved from S3.

If installing from 4.7 or below: the helm charts have changed. If you have custom helm charts, please take note of all the changes before installing or upgrading, e.g. licensing has moved from common to global (look for the custom license GUID).

If installing for MRCP: note that the conf file settings for the MRCP API have been replaced with environment variables, e.g. to enable compatibility mode.

Key installation guide changes:

LumenVox now recommends Kubernetes version 1.30 as a minimum.
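
A cluster can be checked against the recommended 1.30 minimum by comparing version components numerically rather than as strings (as text, "1.9" would sort above "1.30"). The following is a minimal Python sketch with hypothetical function names; it assumes the input is a Kubernetes version string of the form "v1.30.2", as reported in the gitVersion field of kubectl version output.

```python
import re

# Recommended minimum from the installation guide, as (major, minor).
MIN_KUBERNETES = (1, 30)


def parse_major_minor(git_version: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as 'v1.30.2' or '1.31.0-eks'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_recommended_minimum(git_version: str) -> bool:
    """Return True if the version is at least the recommended minimum."""
    # Tuples compare element by element, so (1, 9) < (1, 30) as intended.
    return parse_major_minor(git_version) >= MIN_KUBERNETES
```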

Upgrade procedures

Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See notes above if upgrading from 4.7.

If you are performing an upgrade, you need to ensure that your NGINX versions are updated from 1.11 to 1.12.

Updated API guide

APIs for all speech products available on version 6.0 can be obtained here: LumenVox API Documentation

Information for voice biometric products relates to versions 3.4.0 to 3.4.3

Model versions as part of the release

ASR - 4.1.0 (4.1.2 Acoustic model released for French to enhance recognition with audio containing single words)

TTS - 4.0 (Neural TTS), sample rates 24 kHz and 16 kHz, which can be downsampled to 8 kHz (note change). Legacy TTS models will still run under version 1.0. Further voice enhancements are currently being released in version 4.0.1, so LumenVox recommends that clients cater for 4.0.X.
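
The recommendation to cater for 4.0.X can be read as accepting any TTS model version whose major and minor components are 4 and 0, whatever the patch number. The following is a minimal Python sketch under that reading; the function name is hypothetical, and the assumption that versions arrive as dot-separated strings is not something these notes specify.

```python
def is_neural_tts_4_0(version: str) -> bool:
    """Return True for any 4.0.X model version, e.g. '4.0' or '4.0.1'."""
    parts = version.strip().split(".")
    # Compare whole components so that '40.0' or '4.01' are not accepted.
    return len(parts) >= 2 and parts[0] == "4" and parts[1] == "0"
```

Matching on components rather than on an exact string means a client keeps working when 4.0.1 or a later 4.0 patch is rolled out, while legacy 1.0 models and any future 4.1 are still told apart.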

VB - 2.1.15

VB incorporates Selene 2.4.3 which was integrated into the Container stack

Model version changes

4.1.2 Acoustic model released for French

4.1.0 Dutch language model update
