Release notes 6.1.0

Release date: 8th June 2025

Summary

This page highlights the changes, new features and bug fixes in the LumenVox Containers version 6.1.0 release. This release affects Speech products. It is not available for Voice Biometric products; support will be made available in an upcoming release.

This release builds upon the 6.0.0 release - see LumenVox Containers Release Notes 6.0.0.

Highlights

  1. Neural TTS now incorporates partial results, allowing for faster TTS synthesis

  2. New TTS Prometheus metrics added for time-to-first-byte 

  3. ASR grammar cache housekeeping introduced to clean up old grammar caches

  4. Grammar ripcords for large grammars now configurable at a global/session/interaction level

  5. Issue resolved where VAD event offsets longer than the audio duration in Redis resulted in empty transcription results

  6. Various NLU issues resolved

  7. Neural TTS memory leak issue resolved

What’s new on LumenVox Cloud 6.1.0 

New features

  • Partial results enabled for Neural TTS to speed up the return of TTS synthesis results. This is set in ttsSettings.enable_partial_results

  • New TTS Prometheus metrics added for time-to-first-byte

    • tts_first_result_time_max 

    • tts_first_result_time_min

    • tts_first_result_time_dist

  • ASR grammar cache housekeeping introduced to clean up old grammar caches. By default, this housekeeping mechanism scans the cache folders every 10 minutes and deletes any files last modified more than 30 days (43200 minutes) ago. The defaults can be overridden with the following environment variables if required


| Name                                                 | Description                                                | Default Value    |
|------------------------------------------------------|------------------------------------------------------------|------------------|
| `GRAMMAR_SETTINGS__FILE_CACHE_CLEANUP_PURGE_MINUTES` | Age in minutes after which files are deleted (0 = disable) | 43200 (30 days)  |
| `GRAMMAR_SETTINGS__FILE_CACHE_CLEANUP_SLEEP_SECONDS` | Sleep time in seconds between scans                        | 600 (10 minutes) |

  • Grammar ripcord for large grammars is now configurable in global, session or interaction settings and can be set within the deployment portal. If a grammar exceeds the size specified in GrammarSettings.grammar_threshold, a grammar load failure is raised
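
As an illustration of the grammar cache housekeeping described above, the following is a minimal sketch of an age-based purge loop. It is not the LumenVox implementation: the function names, the cache directory argument and the recursive scan are assumptions made for this example. Only the two environment variable names and their default values come from the table above.

```python
import os
import time
from pathlib import Path

# Variable names and defaults as documented in the table above.
PURGE_MINUTES = int(os.environ.get(
    "GRAMMAR_SETTINGS__FILE_CACHE_CLEANUP_PURGE_MINUTES", "43200"))
SLEEP_SECONDS = int(os.environ.get(
    "GRAMMAR_SETTINGS__FILE_CACHE_CLEANUP_SLEEP_SECONDS", "600"))


def purge_old_files(cache_dir, purge_minutes, now=None):
    """Delete files under cache_dir last modified more than purge_minutes ago.

    Returns the list of deleted paths. A value of 0 disables purging.
    (Hypothetical helper; the real service may scan differently.)
    """
    if purge_minutes <= 0:
        return []
    now = time.time() if now is None else now
    cutoff = now - purge_minutes * 60
    deleted = []
    # Materialise the listing first so deletions do not disturb the walk.
    for path in list(Path(cache_dir).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted


def run_housekeeping(cache_dir):
    """Hypothetical scan loop: purge, sleep, repeat."""
    while True:
        purge_old_files(cache_dir, PURGE_MINUTES)
        time.sleep(SLEEP_SECONDS)
```

With the defaults, a file is removed on the first scan after it has gone 30 days without modification, and scans run roughly every 10 minutes.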

Updates

  • Issue resolved where VAD event offsets longer than the audio duration in Redis resulted in empty transcription results

  • Various NLU issues resolved, for example:

    • Unable to process large text requests

    • Language translation: input text containing "()" prevented the full text from being translated

    • Diarization and Language ID returning negative Prometheus request counters

    • Issues with translate_from_language

  • Issue resolved where reporting-api crashed on certain requests when x-scopes was not included

  • Issue resolved where SSML markers were sent out of MRCP ahead of the outbound TTS audio stream. Markers are now sent when the corresponding part of the TTS stream is sent

  • Neural TTS memory leak issue resolved

Installation notes

The following helm chart can be used

Helm Chart

Note that for MRCP there is no helm chart but a docker compose file. MRCP will run on its own Docker virtual machine which will integrate into the Kubernetes cluster.

Run the command `helm repo update` to update the Helm charts.

 

Note: if using TTS, we recommend adding the legacyEnabled toggle to the values file. Set it to true to enable legacy TTS, or to false to enable the new Neural TTS. The new Neural TTS voices must be listed in the values file so that the models can be retrieved from S3:

ttsLanguages:
  - name: "en_us"
    legacyEnabled: false
    voices:
      - name: "jeff"
        version: "4.0.0"
      - name: "megan"
        version: "4.0.0"

 

Key installation guide changes:

LumenVox now recommends Kubernetes version 1.30 or later.

Upgrade procedures

Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See notes above if upgrading from 4.7.

If you are performing an upgrade, you need to ensure that your NGINX versions are updated from 1.11 to 1.12.

If upgrading and using Neural TTS with MRCP, edit the .env file and make the following changes:

PRODUCT_VERSION=6.1
MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING=1

Edit the docker-compose.yml file and make the following change:

MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING: "${MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING}"
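
Before restarting the MRCP container, it can help to confirm that the .env file carries both settings. The following is a minimal sketch of such a check. The helper names `parse_env` and `missing_settings` are hypothetical, and the parser handles only plain KEY=VALUE lines; the two keys and their values are the ones listed above.

```python
# Settings expected after the upgrade, as listed in the notes above.
REQUIRED = {
    "PRODUCT_VERSION": "6.1",
    "MEDIA_SERVER__ENABLE_TTS_PARTIAL_STREAMING": "1",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_settings(text):
    """Return the required settings that are absent or have another value."""
    settings = parse_env(text)
    return {k: v for k, v in REQUIRED.items() if settings.get(k) != v}
```

For example, `missing_settings(open(".env").read())` returns an empty dictionary when both settings are present with the expected values.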

Updated API guide

APIs for all speech products available on version 6.1 can be obtained here: LumenVox API Documentation      

Information for voice biometric products relates to version 3.4.0-3.4.3 

Model versions as part of the release

ASR - 4.1.0

TTS - 4.0 (Neural TTS), sample rates 24 kHz and 16 kHz, which can be downsampled to 8 kHz (note change). Legacy TTS models will still run under version 1.0. Further voice enhancements are currently being released in TTS voices version 4.0.1, so LumenVox recommends that clients cater for 4.0.X.

VB - 2.1.15

VB incorporates Selene 2.4.3 which was integrated into the Container stack
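
One way a client could cater for TTS voice versions 4.0.X, as recommended above, is to match on the major and minor version only. The following is a minimal sketch; the function name is hypothetical, and it assumes voice versions are three-part numeric strings such as "4.0.0", as in the values file example.

```python
def is_supported_tts_voice_version(version, major=4, minor=0):
    """Return True for any patch release of the given major.minor line.

    Assumes a three-part numeric version string such as "4.0.1";
    anything else is rejected.
    """
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return False
    return int(parts[0]) == major and int(parts[1]) == minor
```

With this check, "4.0.0" and "4.0.1" are both accepted, while "4.1.0" and the legacy "1.0.0" are not.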


Copyright (C) 2001-2025, Ai Software, LLC d/b/a LumenVox