Release notes 4.2.0

Release date: 30th June 2023

Summary

This page highlights the changes, new features and bug fixes in the LumenVox Containers version 4.2.0 release. This release affects Speech products. It is not available for Voice biometric products; support for these will be made available in upcoming releases.

Highlights

  1. Implementation of Text Normalization for English & Spanish transcription results including:

    1. Inverse text normalization (performs ‘spoken -> written’ form conversions, e.g. first to 1st)

    2. Punctuation & capitalization

    3. Sensitive information redaction (e.g. passwords, pins, bank accounts, names)

  2. Implementation of a new ASR engine providing better performance and accuracy. The new engine also offers enhanced features such as aliases, dialect-specific processing, an improved scoring algorithm, enhanced transcription, phrase lists and weighting. It is much faster and uses fewer resources than its predecessor, making scaling more manageable. Note that, because of the new scoring mechanism, clients are advised to review any existing thresholds.

  3. Implementation of a new Grammar management service, which coordinates grammar compilation, caching, parsing & DTMF processing for ASR & Transcription transactions. It is also instrumental in CPA & AMD transactions; grammar storage, retrieval & housekeeping; and the handling of global grammars and phrase lists.

  4. Introduction of Final Status field for all product transactions.

  5. Introduction of Enhanced Transcription, allowing clients to use grammars within transcription for NLP-type processing.

  6. Introduction of aliases and lexicons within grammar processing (ASR & Enhanced Transcription)

  7. Introduction of dialect spelling in Transcription (English US/GB only)

  8. Enhanced licensing counters available

  9. Database consolidation under a single database for Postgres and Mongo. Clients also have the option to customize the database names within the connection strings. 

  10. Additional Prometheus counters added

  11. Improvements to structured container logging
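As a toy illustration of the ‘spoken -> written’ conversion listed under Text Normalization above, the sketch below replaces a few spoken ordinal words with their written form. It is not the LumenVox implementation: the lookup table and the function name `inverse_normalize` are assumptions made for this example, and a real engine covers far more cases (numbers, dates, currencies and so on).

```python
import re

# Hypothetical lookup table for illustration only.
SPOKEN_TO_WRITTEN = {
    "first": "1st",
    "third": "3rd",
    "fifth": "5th",
}


def inverse_normalize(text: str) -> str:
    """Replace known spoken-form ordinal words with their written form."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        # Look the word up case-insensitively; leave unknown words untouched.
        return SPOKEN_TO_WRITTEN.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, text)


print(inverse_normalize("the first of may"))  # the 1st of may
```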

*Note: For TTS, we recommend that text not exceed 4 MB (roughly 1,300 characters with spaces, or around 250 words).

**Note: For Transcription, we recommend that users not exceed 90 minutes of transcribed audio due to gRPC size limits.
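A client could apply the two recommendations above as a pre-flight check before submitting a request. The following is a minimal sketch: the limits come from the notes above, while the constant and function names are hypothetical and not part of any LumenVox API.

```python
# Recommended limits from the release notes; the names are illustrative only.
TTS_MAX_CHARS = 1300               # characters, including spaces
TTS_MAX_WORDS = 250
TRANSCRIPTION_MAX_MINUTES = 90


def tts_text_within_limits(text: str) -> bool:
    """Return True if the text stays within both recommended TTS limits."""
    return len(text) <= TTS_MAX_CHARS and len(text.split()) <= TTS_MAX_WORDS


def audio_within_limits(duration_seconds: float) -> bool:
    """Return True if the audio is no longer than the recommended maximum."""
    return duration_seconds <= TRANSCRIPTION_MAX_MINUTES * 60


if __name__ == "__main__":
    print(tts_text_within_limits("Hello, and welcome."))  # True
    print(audio_within_limits(2 * 60 * 60))               # False (two hours)
```

Longer inputs would need to be split by the caller before submission; how best to split them depends on the application.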

What’s new on LumenVox Cloud 4.2.0 

New Features

  • Text normalization

  • New ASR model implemented for ASR & Transcription transactions

  • New Grammar management service introduced

  • Final status field introduced for all products

  • Enhanced Transcription implemented

  • Aliases and lexicons made available 

  • Introduction of dialect spelling

Updates

  • Improvements have been made to the following:

    • Prometheus counters enhanced

    • Structured logging enhanced

    • Licensing counters enhanced

    • Database consolidation under a single database for Postgres and Mongo. 
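The Highlights mention that database names can be customized within the connection strings. The sketch below shows one way to swap the database name in a URI-style connection string using Python's standard library. It assumes the common `postgresql://` and `mongodb://` URI forms; the hosts, credentials, database names and the helper `with_database_name` are placeholders for this example, and the exact connection-string format expected by the LumenVox configuration should be taken from the product documentation.

```python
from urllib.parse import urlsplit, urlunsplit


def with_database_name(connection_uri: str, database: str) -> str:
    """Return the URI with its path (the database name) replaced."""
    parts = urlsplit(connection_uri)
    # In URI-style connection strings the database name is the path component.
    return urlunsplit(parts._replace(path="/" + database))


# Placeholder values, not real LumenVox defaults.
postgres_uri = with_database_name(
    "postgresql://user:secret@db.example.internal:5432/postgres",
    "custom_speech_db",
)
mongo_uri = with_database_name(
    "mongodb://user:secret@mongo.example.internal:27017/admin?authSource=admin",
    "custom_speech_db",
)

print(postgres_uri)
print(mongo_uri)
```

Query parameters such as `authSource` are preserved; only the database portion of the URI changes.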

Installation notes

The following Helm chart can be used:

Helm Chart

Note that MRCP is deployed with a Docker Compose file rather than a Helm chart. MRCP runs on its own Docker virtual machine, which integrates into the Kubernetes cluster.

Upgrade procedures

Upgrade or migration from previous versions is supported. Please contact LumenVox to discuss.

Updated API guide

APIs for all speech products available on version 4.2 can be obtained here: LumenVox API Documentation 

Information for voice biometric products relates to version 3.4.0-3.4.3 

Model versions as part of the release

ASR - 4.1.0

TTS - 1.0 sample rate 22

VB - 2.1.15 

VB incorporates Selene 2.4.3 which was integrated into the Container stack

Model version changes

Major model upgrade for ASR & Transcription


Copyright (C) 2001-2024, Ai Software, LLC d/b/a LumenVox