Release notes 4.2.0
Release date: 30th June 2023
Summary
This page highlights the changes, new features and bugs addressed in the LumenVox Containers version 4.2.0 release. This release affects Speech products. It is not available for Voice biometric products; support for these will be made available in upcoming releases.
Highlights
Implementation of Text Normalization for English & Spanish transcription results including:
Inverse text normalization (converts spoken forms to written forms, e.g. "first" to "1st")
Punctuation & capitalization
Sensitive information redaction (e.g. passwords, PINs, bank account numbers, names)
Implementation of a new ASR engine providing better performance and accuracy. The new engine also offers enhanced features such as aliases, dialect-specific processing, an improved scoring algorithm, enhanced transcription, phrase lists and weighting. It is much faster and uses fewer resources than its predecessor, which makes scaling more manageable. Note that because of the new scoring mechanism, clients are advised to review any existing thresholds they have set.
Implementation of a new Grammar management service, which coordinates grammar compilation, caching, parsing & DTMF processing for ASR & Transcription transactions. It is also instrumental in CPA & AMD transactions; grammar storage, retrieval & housekeeping; and the handling of global grammars and phrase lists.
Introduction of Final Status field for all product transactions.
Introduction of Enhanced Transcription, allowing clients to use grammars within transcription for NLP-type processing.
Introduction of aliases and lexicons within grammar processing (ASR & Enhanced Transcription)
Introduction of dialect spelling in Transcription (English US/GB only)
Enhanced licensing counters available
Database consolidation under a single database for Postgres and Mongo. Clients also have the option to customize the database names within the connection strings.
Additional Prometheus counters added
Improvements to structured container logging
*Note: For TTS, we recommend that text not exceed 4 MB (roughly 1300 characters with spaces, or around 250 words).
**Note: For Transcription, we recommend that users not exceed 90 minutes of transcribed audio due to gRPC size limits.
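The two recommendations above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of any LumenVox SDK: the function names and constants are hypothetical, and the limits are simply the approximate figures quoted in the notes (about 1300 characters or 250 words for TTS, 90 minutes of audio for Transcription).

```python
# Hypothetical client-side pre-checks; these names are illustrative only
# and do not correspond to any LumenVox API.
# The limits are the approximate recommendations quoted in the notes above.

TTS_MAX_CHARS = 1300             # characters, including spaces
TTS_MAX_WORDS = 250              # approximate word count
TRANSCRIPTION_MAX_MINUTES = 90   # recommended maximum audio length


def tts_text_within_recommendation(text):
    """Return True if the text stays within both recommended TTS limits."""
    return len(text) <= TTS_MAX_CHARS and len(text.split()) <= TTS_MAX_WORDS


def audio_within_recommendation(duration_seconds):
    """Return True if the audio is no longer than the recommended 90 minutes."""
    return duration_seconds <= TRANSCRIPTION_MAX_MINUTES * 60


def split_text_for_tts(text):
    """Greedily split text on whitespace into chunks within the TTS limits.

    A single word longer than TTS_MAX_CHARS is still emitted as its own
    chunk, so callers should treat that case separately.
    """
    chunks, current = [], []
    for word in text.split():
        candidate = current + [word]
        if current and not tts_text_within_recommendation(" ".join(candidate)):
            chunks.append(" ".join(current))
            current = [word]
        else:
            current = candidate
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Longer TTS input could then be passed through `split_text_for_tts` and synthesized chunk by chunk. Splitting on whitespace is the simplest choice; splitting on sentence boundaries would likely give more natural prosody.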
What's new on LumenVox Cloud 4.2.0
New Features
Text normalization
New ASR model implemented for ASR & Transcription transactions
New Grammar management service introduced
Final status field introduced for all products
Enhanced Transcription implemented
Aliases and lexicons made available
Introduction of dialect spelling
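To make the three text normalization steps listed in the highlights more concrete (inverse text normalization, punctuation & capitalization, sensitive information redaction), here is a toy sketch of what each step does to a raw transcript. It is purely illustrative and assumes nothing about how LumenVox implements the feature: the word table, the redaction rule (masking runs of four or more digits) and all function names are hypothetical.

```python
import re

# Tiny illustrative table of spoken -> written forms (hypothetical).
SPOKEN_TO_WRITTEN = {"first": "1st", "second": "2nd", "third": "3rd"}


def inverse_normalize(text):
    """Replace known spoken forms with written forms, word by word."""
    return " ".join(SPOKEN_TO_WRITTEN.get(word, word) for word in text.split())


def redact_digits(text):
    """Mask any run of four or more digits (a stand-in for PINs or account numbers)."""
    return re.sub(r"\d{4,}", "[REDACTED]", text)


def punctuate(text):
    """Capitalize the first character and add a final period if one is missing."""
    if not text:
        return text
    text = text[0].upper() + text[1:]
    return text if text.endswith((".", "?", "!")) else text + "."


def normalize(text):
    """Apply the three toy steps in order."""
    return punctuate(redact_digits(inverse_normalize(text)))


print(normalize("my pin is 1234 and this is my first call"))
# prints: My pin is [REDACTED] and this is my 1st call.
```

A production system would rely on context rather than a fixed word table; for example, "second" in "one second" is a unit of time and should not become "2nd".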
Updates
Improvements have been made to the following:
Prometheus counters enhanced
Structured logging enhanced
Licensing counters enhanced
Database consolidation under a single database for Postgres and Mongo.
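As noted in the highlights, database names can be customized within the connection strings. The sketch below shows where the database name sits in a standard PostgreSQL or MongoDB connection URI. The hosts, credentials, database name and helper function are placeholders, and it is an assumption that the deployment uses URI-style strings; the exact settings that hold these strings are not shown here.

```python
from urllib.parse import quote, urlsplit


def build_connection_uri(scheme, user, password, host, port, database):
    """Build a URI of the form scheme://user:password@host:port/database."""
    # Percent-encode the credentials so characters such as '@' or '/'
    # do not break the URI structure.
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    return f"{scheme}://{user_part}:{password_part}@{host}:{port}/{database}"


# Placeholder values for illustration only.
postgres_uri = build_connection_uri(
    "postgresql", "lv_user", "s3cret@pw",
    "postgres.example.internal", 5432, "custom_lumenvox_db",
)
mongo_uri = build_connection_uri(
    "mongodb", "lv_user", "s3cret@pw",
    "mongo.example.internal", 27017, "custom_lumenvox_db",
)

# The database name is the path component of each URI.
print(urlsplit(postgres_uri).path.lstrip("/"))
print(urlsplit(mongo_uri).path.lstrip("/"))
```

Changing the final path segment is all that is needed to point either URI at a differently named database.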
Installation notes
The following Helm chart can be used
Note that for MRCP there is no Helm chart; a Docker Compose file is used instead. MRCP runs on its own Docker virtual machine, which integrates into the Kubernetes cluster.
Upgrade procedures
Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss.
Updated API guide
APIs for all speech products available on version 4.2 can be obtained here: LumenVox API Documentation
Information for voice biometric products relates to versions 3.4.0 to 3.4.3
Model versions as part of the release
ASR - 4.1.0
TTS - 1.0 sample rate 22
VB - 2.1.15
VB incorporates Selene 2.4.3 which was integrated into the Container stack
Model version changes
Major model upgrade for ASR & Transcription