Release notes 5.0.0
Release date: 20th June 2024
Summary
This page highlights all the changes, new features and bug fixes in the LumenVox Containers version 5.0.0 release, which incorporates the changes from 4.7.1 to 4.7.10. This release affects Speech products. It is not yet available for Voice Biometric products; that will be made available in an upcoming release.
This release builds upon the 4.7.0 release; see the LumenVox Containers Release Notes 4.7.0.
Highlights
Infosec improvements made to address CentOS 7 end of life
Improved RabbitMQ messaging to enhance performance and support multi-tenancy
Improvements to Redis to enhance performance and support multi-tenancy
Improvements to Postgres, with a database restructure to support multi-tenancy
Session manager was refactored to improve performance
Analysis portal now caters for the exporting & importing of analysis sets
All built-in ABNF grammars converted to GRXML
New media types supported for transcription: flac, mp4, mp3, opus, m4a. The release also supports the generation of .SRT and .VTT outputs for use in subtitling
Multi-channel audio transcription is catered for
Platform now caters for processing 44 kHz and 48 kHz audio files
A new Grammar & SSML test tool has also been added
*Note: For TTS, we recommend that text not exceed 4 MB (this is roughly 1300 characters with spaces, or around 250 words).
**Note: For Transcription, we recommend that users not exceed 90 minutes of transcribed audio due to gRPC size limits.
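The 90-minute recommendation can be checked on the client side before audio is submitted. The following is a minimal sketch, assuming uncompressed WAV input and using only the Python standard library; the function names are hypothetical and are not part of the LumenVox API. Other formats (flac, mp3 and so on) would need a different way to read the duration.

```python
import wave

# Recommended upper bound for transcribed audio, taken from the note above.
MAX_TRANSCRIPTION_MINUTES = 90


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def within_transcription_limit(path: str,
                               max_minutes: float = MAX_TRANSCRIPTION_MINUTES) -> bool:
    """Return True if the WAV file is no longer than max_minutes."""
    return wav_duration_seconds(path) <= max_minutes * 60
```

Files that fail this check could be split into shorter segments before being sent for transcription.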
What's new on LumenVox Cloud 5.0.0
New Features
Analysis portal now caters for the exporting & importing of analysis sets
All built-in ABNF grammars converted to GRXML
New media types supported for transcription: flac, mp4, mp3, opus, m4a. The release also supports the generation of .SRT and .VTT outputs for use in subtitling
Multi-channel audio transcription is catered for
Platform now caters for processing 44 kHz and 48 kHz audio files
A new Grammar & SSML test tool has also been added
Updates
Long audio files can be played in the analysis portal
Fix made to Session Prometheus metric "session_active_requests"
TTS container memory leak resolved
Added decode/recognition timeout to ASR/Transcription requests
MRCP API: conf file settings replaced with environment variables, e.g. to enable compatibility mode
LV_API_SETTINGS__INCLUDE_TRANSCODER_LOGS environment variable added to provide additional logging for other media formats if required
HTTP health check added to assess the health of the LumenVox-api gRPC service
Additional caching added to ASR and grammar manager to handle large grammars.
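The HTTP health check listed above could be polled from a monitoring script. The following is a minimal sketch using only the Python standard library. The endpoint URL shown in the usage comment is a placeholder, since the actual host, port and path are not given in these notes, and the function name is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or a non-2xx HTTP error.
        return False


# Usage (placeholder URL; substitute the endpoint of your own deployment):
# is_healthy("http://lumenvox-api.example.internal/health")
```

Any failure is treated as unhealthy here, which is a deliberately conservative choice for a monitoring probe.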
Installation notes
The following helm chart can be used
Note that for MRCP there is no Helm chart, only a Docker Compose file. MRCP runs on its own Docker virtual machine, which integrates into the Kubernetes cluster.
Note: There have been Helm chart changes. If you have custom Helm charts, please take note of all the changes before installing or upgrading, e.g. licensing has moved from common to global (look for any custom license GUID).
If installing for MRCP, note that the conf file settings for the MRCP API have been replaced with environment variables, e.g. to enable compatibility mode.
Upgrade procedures
Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss. See the notes above.
Updated API guide
APIs for all speech products available on version 5.0 can be obtained here: https://developer.lumenvox.com/4.7.0/
Information for voice biometric products relates to version 3.4.0-3.4.3
Model versions as part of the release
ASR - 4.1.0
TTS - 1.0 sample rate 22
VB - 2.1.15
VB incorporates Selene 2.4.3 which was integrated into the Container stack
Model version changes
None