Release notes 6.2.0

Release date: 1st July 2025

Summary

This page highlights all the changes, new features and bugs addressed within the LumenVox Containers version 6.2.0 release. This release affects Speech products. It is not available for Voice biometric products; that support will be made available in upcoming releases.

This release builds upon the 6.1.0 release - see https://lumenvox.capacity.com/article/891698/release-notes-6-1-0.

Highlights

Modifications to VAD to calculate noise floor at various intervals implemented (Apple request)
Modifications made to enhanced transcription (Apple request)
ASR English encoder model files updated to resolve issues with models causing ASR pod restarts.
New Next Generation English encoder & decoder ASR model available
New Japanese encoder model available
New emotive and bubbly voice for Spanish Female Neural TTS released (Eva)
TTS voices for Portuguese and Italian updated to support SSML
Health checks in deployment portal modified to return version number & roundtrip time calculation for external services
Archiving of multichannel-audio files now supported

For TTS, we recommend that text not exceed 4 MB.
For Transcription, we recommend that users not exceed 120 minutes of transcribed audio due to gRPC size limits.
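The two recommendations above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the LumenVox API: the function names are hypothetical, and it assumes that "4 MB" means 4 * 1024 * 1024 bytes of UTF-8 encoded text.

```python
# Illustrative pre-flight checks for the recommended limits.
# Assumption: "4 MB" is interpreted as 4 * 1024 * 1024 bytes of UTF-8 text.
MAX_TTS_TEXT_BYTES = 4 * 1024 * 1024
MAX_TRANSCRIPTION_MINUTES = 120


def tts_text_within_limit(text: str) -> bool:
    """Return True if the UTF-8 encoded text is within the recommended size."""
    return len(text.encode("utf-8")) <= MAX_TTS_TEXT_BYTES


def audio_within_limit(duration_seconds: float) -> bool:
    """Return True if the audio is no longer than the recommended 120 minutes."""
    return duration_seconds <= MAX_TRANSCRIPTION_MINUTES * 60
```

The size is measured in encoded bytes rather than characters because non-ASCII text takes more than one byte per character.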

What’s new on LumenVox Cloud 6.2.0

New features

VAD modified to calculate noise floor levels at various intervals, which helps prevent noise artefacts from causing incorrect barge-in/barge-out. No configuration changes are required.
Modifications to enhanced transcription to identify all parse matches in the selected grammars and return all NLSML results. The user can determine which of the results they wish to use and can combine them if desired (e.g. longest match only or highest scoring result). This feature can be configured in grammar settings using max_enhanced_length (Maximum number of words to parse for enhanced transcription). This does not apply to legacy enhanced transcription. Range: 1-Unlimited, Default: 6
ASR English model files updated to resolve issues with models causing ASR pod restarts. Clients using the English models 4.0.0 - 4.1.0 should update the model to 4.1.1
New Next Generation English encoder & decoder ASR & Transcription model (version 5.0.0) available. The new model provides improved accuracy but will require additional scaling of ASR pods
New Japanese encoder model available
New emotive and bubbly voice for Spanish Female Neural TTS released (Eva)
TTS voices for Portuguese and Italian updated to support SSML
Health checks in deployment portal modified to return version number & roundtrip time calculation for external services
Archiving of multichannel-audio files supported. Clients can also create analysis sets and see different interactions created on the different audio channels
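Since enhanced transcription now returns every parse match, the client decides which result to use. The sketch below illustrates the two selection strategies mentioned above (longest match and highest scoring result). The ParseMatch class is a hypothetical, simplified stand-in for a parsed NLSML result, not a LumenVox type, and extracting these fields from the NLSML is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParseMatch:
    """Hypothetical, simplified view of one parse match (not the LumenVox API)."""
    grammar: str      # name of the grammar that produced the match
    words: List[str]  # words covered by the match
    score: float      # confidence score reported for the match


def longest_match(matches: List[ParseMatch]) -> Optional[ParseMatch]:
    """Pick the match covering the most words (first one wins on a tie)."""
    return max(matches, key=lambda m: len(m.words)) if matches else None


def highest_scoring(matches: List[ParseMatch]) -> Optional[ParseMatch]:
    """Pick the match with the highest score (first one wins on a tie)."""
    return max(matches, key=lambda m: m.score) if matches else None


# Example with made-up values
matches = [
    ParseMatch("digits", ["one", "two", "three"], 0.71),
    ParseMatch("date", ["one", "two"], 0.93),
]
print(longest_match(matches).grammar)    # digits
print(highest_scoring(matches).grammar)  # date
```

Both helpers return None for an empty list, so callers can fall back to the plain transcript when no grammar matched.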

Updates

Helm charts modified to cater for separate ITN language specifications
Issues with importing encrypted analysis sets from older versions resolved
Version string added to the SIP responses and to the MRCP log output
Issue with resource pod not properly unpacking the dist_package_model_asr file after helm uninstall/install or upgrade resolved
General updates to internal go modules made to address scanned vulnerabilities
NLU - Language code vs. language name translation result inconsistency resolved
For CPA / AMD interactions, configuration settings that are overridden by the client are now displayed at an interaction level in the analysis portal
Issue with server failures with status code 13 and a Redis key access error resolved

Installation notes

The following helm chart can be used

Helm Chart

Note that for MRCP there is no helm chart but a docker compose file. MRCP will run on its own Docker virtual machine which will integrate into the Kubernetes cluster.

Run the following command to update the helm charts:

    helm repo update

Note: if using TTS, we recommend adding the legacyEnabled toggle to the values file. Set it to true to enable legacy TTS, or to false to enable the new neural TTS. The new neural TTS voices must be listed in the values file in order for the models to be retrieved from S3, for example:

    ttsLanguages:
      - name: "en_us"
        legacyEnabled: false
        voices:
          - name: "jeff"
            version: "4.0.0"
          - name: "megan"
            version: "4.0.0"

Note if installing from 4.7 or below: there have been helm chart changes. If you have custom helm charts, please take note of all the changes before installing/upgrading, e.g. licensing has moved from common to global, which is where the custom license GUID is now looked up.

If installing for MRCP, note that the conf file settings for the MRCP API have been replaced with environment variables (e.g. to enable compatibility mode).

If installing ITN, the following changes are required in the helm charts:

    itnLanguages:
      - name: "en"
      - name: "es"

Key installation guide changes:

LumenVox recommends a minimum of Kubernetes version 1.30.

The following versions of required software are supported:

PostgreSQL: 17.2.0 (recommended)
MongoDB: 8.0.6 (minimum)
Redis: 7.4.1 (minimum)
RabbitMQ: 4.0.8 (recommended)
NGINX: 1.12
Linkerd: 2.14.9

The full software support matrix can be found here: LumenVox Kubernetes software support matrix
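Before installing, the versions listed above as minimums can be compared against what is deployed. The sketch below is illustrative only: the MINIMUMS table simply copies the minimum versions stated above (Kubernetes, MongoDB, Redis), and the helper names and the shape of the installed dictionary are assumptions.

```python
# Minimum versions copied from the list above (assumed keys, illustrative only).
MINIMUMS = {
    "kubernetes": "1.30",
    "mongodb": "8.0.6",
    "redis": "7.4.1",
}


def parse_version(text: str) -> tuple:
    """Turn '1.30.2' into (1, 30, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def below_minimum(installed: dict) -> list:
    """Return the names of components whose installed version is too old."""
    too_old = []
    for name, minimum in MINIMUMS.items():
        found = installed.get(name)
        if found is not None and parse_version(found) < parse_version(minimum):
            too_old.append(name)
    return too_old


# Example with made-up installed versions
print(below_minimum({"kubernetes": "1.29.4", "mongodb": "8.0.6", "redis": "7.4.1"}))
```

Versions are compared as integer tuples rather than strings, because a string comparison would rank "1.9" above "1.30".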

Upgrade procedures

Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See notes above if upgrading from 4.7.

If you are performing an upgrade, you need to ensure that your NGINX versions are updated from 1.11 to 1.12.

If upgrading neural TTS from version 6.0.0, the TTS cache folder and the Neural TTS models folder need to be cleared. Reach out to LumenVox support at support@lumenvox.com should you have any questions.

If installing ITN, the following changes are required in the helm charts:

    itnLanguages:
      - name: "en"
      - name: "es"

Updated API guide

APIs for all speech products available on version 6.2 can be obtained here: https://developer.lumenvox.com/6.2.0/

Information for voice biometric products relates to version 3.4.0-3.4.3

Model versions as part of the release

ASR - 4.1.0 (4.1.1 has been released for English). 4.1.1 addresses issues in the model which can cause ASR pods to crash
ASR - 5.0.0 New English encoder & decoder model released, offering better accuracy but requiring additional ASR pod scaling
TTS - 4.0 (Neural TTS), sample rates 24 kHz & 16 kHz, can be downsampled to 8 kHz (note change). Legacy TTS models will still run under version 1.0. Further voice enhancements are currently being released in TTS voices version 4.0.1, so LumenVox recommends that clients cater for 4.0.X
VB - 2.1.15
VB incorporates Selene 2.4.3 which was integrated into the Container stack

Model version changes

4.1.1 New encoder model released for English to resolve pod restarts
5.0.0 New encoder & decoder English model released offering better accuracy but will require additional ASR pod scaling
4.1.0 New Japanese encoder model released
4.0.0 Neural TTS model released for Spanish Female - Eva
4.0.0 Neural TTS voices for Italian and Portuguese updated to support SSML
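Two of the version rules in these notes lend themselves to a simple check: neural TTS voice versions should be accepted as 4.0.X, and English ASR models from 4.0.0 to 4.1.0 should be updated to 4.1.1. The following is a minimal sketch under those stated rules; the function names are hypothetical and it assumes version strings of the form major.minor.patch.

```python
def parse_version(text: str) -> tuple:
    """Turn '4.0.1' into (4, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_neural_tts_voice_version(version: str) -> bool:
    """Accept any 4.0.X voice version, as recommended for neural TTS."""
    parts = parse_version(version)
    return len(parts) == 3 and parts[:2] == (4, 0)


def english_asr_needs_update(version: str) -> bool:
    """True for English ASR models 4.0.0 through 4.1.0, which should move to 4.1.1."""
    return (4, 0, 0) <= parse_version(version) <= (4, 1, 0)


print(is_neural_tts_voice_version("4.0.1"))  # True
print(english_asr_needs_update("4.1.0"))     # True
print(english_asr_needs_update("4.1.1"))     # False
```

Matching on the major and minor parts only means a client keeps working when a 4.0.1 or later patch voice is released.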
