Release notes 5.0.0

Release date: 20th June 2024  

Summary

This page highlights the changes, new features and bug fixes in the LumenVox Containers version 5.0.0 release, which incorporates the changes from 4.7.1 to 4.7.10. This release affects Speech products only. It is not available for Voice biometric products; support for these will follow in an upcoming release.

This release builds upon the 4.7.0 release. See the LumenVox Containers Release Notes 4.7.0.

Highlights

  1. Infosec improvements made to address CentOS 7 end of life

  2. Improved RabbitMQ messaging to enhance performance and support multi-tenancy

  3. Improvements to Redis to enhance performance and support multi-tenancy

  4. Improvements to Postgres, with a database restructure to support multi-tenancy

  5. Session manager was refactored to improve performance

  6. Analysis portal now supports exporting and importing analysis sets

  7. All built-in ABNF grammars converted to GRXML

  8. New media types supported for transcription: flac, mp4, mp3, opus, m4a. The release also supports the generation of .SRT and .VTT outputs for use in subtitling

  9. Multi-channel audio transcription is supported

  10. The platform now supports processing of 44 kHz and 48 kHz audio files

  11. A new Grammar & SSML test tool has also been added

 

*Note: For TTS, we recommend that text not exceed 4 MB (roughly 1,300 characters including spaces, or around 250 words).

**Note: For Transcription, we recommend that users not exceed 90 minutes of transcribed audio due to gRPC size limits.
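The two recommendations above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the LumenVox API: the function names are hypothetical, the limits are simply the figures quoted in the notes, and the audio check only handles PCM WAV files because it relies on Python's standard `wave` module. It also reports channel count and sample rate, which may be useful when working with multi-channel or 44/48 kHz audio.

```python
import wave

# Figures taken from the notes above; they are recommendations, not enforced limits.
TTS_MAX_CHARS = 1300
TTS_MAX_WORDS = 250
TRANSCRIPTION_MAX_MINUTES = 90


def tts_text_within_guideline(text: str) -> bool:
    """Return True if the text stays within both the character and word guideline.

    Applying both limits at once is a conservative assumption of this sketch.
    """
    return len(text) <= TTS_MAX_CHARS and len(text.split()) <= TTS_MAX_WORDS


def wav_summary(path: str) -> dict:
    """Read channel count, sample rate and duration from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "duration_minutes": frames / rate / 60,
        }


def audio_within_guideline(path: str) -> bool:
    """Return True if the WAV file is no longer than the recommended 90 minutes."""
    return wav_summary(path)["duration_minutes"] <= TRANSCRIPTION_MAX_MINUTES
```

Text or audio that fails these checks could be split into smaller requests before submission; how best to split depends on the application.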

What's new on LumenVox Cloud 5.0.0

New Features

  • Analysis portal now supports exporting and importing analysis sets

  • All built-in ABNF grammars converted to GRXML

  • New media types supported for transcription: flac, mp4, mp3, opus, m4a. The release also supports the generation of .SRT and .VTT outputs for use in subtitling

  • Multi-channel audio transcription is supported

  • The platform now supports processing of 44 kHz and 48 kHz audio files

  • A new Grammar & SSML test tool has also been added

Updates

  • Long audio files can be played in the analysis portal

  • Fix made to Session Prometheus metric "session_active_requests" 

  • TTS container memory leak resolved

  • Added decode/recognition timeout to ASR/Transcription requests

  • MRCP API: conf file settings have been replaced with environment variables, e.g. to enable compatibility mode

  • LV_API_SETTINGS__INCLUDE_TRANSCODER_LOGS environment variable added to provide additional logging for other media formats if required

  • HTTP health check added to assess the health of the LumenVox-api gRPC service

  • Additional caching added to ASR and grammar manager to handle large grammars.
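As an illustration of how the new HTTP health check might be consumed from a monitoring script, here is a minimal sketch using only the Python standard library. The endpoint URL is an assumption (these notes do not state the host, port or path), and `is_healthy` is a hypothetical helper name; consult the deployment documentation for the actual address.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts and
        # non-2xx responses (urllib raises HTTPError for those).
        return False


if __name__ == "__main__":
    # Placeholder URL: replace with the real health endpoint of your deployment.
    print(is_healthy("http://localhost:8080/health"))
```

A Kubernetes liveness or readiness probe would normally perform the same check natively; a script like this is mainly useful for ad hoc verification after an install or upgrade.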

Installation notes

The following helm chart can be used

Helm Chart

Note that for MRCP there is no Helm chart, only a Docker Compose file. MRCP runs on its own Docker virtual machine, which integrates into the Kubernetes cluster.

Note: The Helm charts have changed. If you use custom Helm charts, please take note of all the changes before installing or upgrading. For example, licensing settings (such as a custom license GUID) have moved from common to global.

If installing for MRCP, note that the conf file settings for the MRCP API have been replaced with environment variables, e.g. to enable compatibility mode.

Upgrade procedures

Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss, and see the installation notes above.

Updated API guide

APIs for all speech products available on version 5.0 can be obtained here: https://developer.lumenvox.com/4.7.0/   

Information for voice biometric products relates to version 3.4.0-3.4.3 

Model versions as part of the release

ASR - 4.1.0

TTS - 1.0 sample rate 22

VB - 2.1.15

VB incorporates Selene 2.4.3 which was integrated into the Container stack

Model version changes

None


Copyright (C) 2001-2024, Ai Software, LLC d/b/a LumenVox