Product Architecture

The following diagram depicts the high-level architecture of the solution. It shows how the client can integrate with LumenVox voice biometrics services through APIs: 

Figure 1: High-level solution architecture 

 

 

 

Product Components

APIs

Customers can consume the voice biometrics services through the following APIs: Assure Biometrics, Speech, Management, and Reporting.

Assure Biometrics API

This collection of APIs is used to create identities, enroll them into the platform, and verify them. It is also used to manage identities, e.g., disabling an identity or removing a voiceprint.

Speech API

This collection can be used to consume LumenVox speech products such as ASR (Automatic Speech Recognition), TTS (Text to Speech), CPA (Call Progress Analysis), Transcription, and NLU gateway (Natural Language Understanding). 

Management API

These APIs are used to manage configuration and deployment parameters.  

Reporting API

This collection of APIs is used to extract session and transactional data from the platform. It also provides various statistics, e.g., the number of enrollment transactions processed per month or the total number of voiceprints enrolled to date. The extracts can be used to obtain further information on biometric operations or audit trails.

 

Open-Source Software Components

RabbitMQ

This message queuing component is implemented independently by the customer/partner and is used by LumenVox software to queue all transaction, audit, and management information. Once a service has processed a data request, the request is removed from the queue. A distributed cluster of RabbitMQ instances is recommended in production for high availability and scalability.

Redis

This in-memory cache component is implemented independently by the customer/partner and is used by LumenVox software to cache session and transaction information across the whole solution while it is being processed by the various internal services. It is this component that receives and processes the audio for biometric processing. A distributed cluster of Redis instances is recommended in production for high availability and disaster recovery.

MongoDB

This document-oriented database is implemented independently by the customer/partner and is used by LumenVox software to store the enrollment and verification audio files along with the voiceprint models. Storing the audio is important for audit and troubleshooting purposes. Enrollment audio is also needed if voiceprints require re-enrollment (e.g., when a new DNN model is introduced). Audio storage is configurable. A distributed cluster of MongoDB instances is recommended in production for high availability and scalability.

PostgreSQL

This SQL-compliant relational database is implemented independently by the customer/partner and is used by LumenVox software to store all the transactional data. A distributed cluster of PostgreSQL instances is recommended in production for high availability and scalability.

 

Licensing Service

To consume LumenVox products, the containers must communicate with LumenVox’s cloud licensing service to submit product utilization information. The customer needs to ensure that firewall rules allow this outbound connection.
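A quick way to confirm that the firewall permits the outbound connection is to attempt a TCP connection from a host inside the deployment network. The sketch below is only an illustration: the function name `can_reach` is ours, and the host name and port in the usage comment are placeholders, not the real licensing endpoint. It checks TCP reachability only and does not validate TLS or proxy settings.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Usage (hypothetical host and port; substitute the licensing endpoint
# that applies to your deployment):
#   can_reach("licensing.example.invalid", 443)
```

A result of False indicates that DNS resolution, routing, or a firewall rule is blocking the connection from that host.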
 

LumenVox Portal 

The LumenVox Portal is a web-based interface that uses the LumenVox APIs to allow customers/partners to manage deployments, view or edit configurations, and perform health checks. The portal can also be used to access reports and manage voiceprints. A customer/partner who prefers not to use the portal provided by LumenVox may integrate the APIs into their own web portals and dashboards.

 

Solution Requirements

Hardware

The solution requires an environment in which the containers can be installed; this can be a Linux or Microsoft Windows environment. A Linux-based environment is preferred for optimal performance.

An example of the required hardware is provided below:

Kubernetes Environment

  • 3 nodes, each with 8 CPUs and 8 GB of memory
  • 100 concurrent requests

 

Service          # of Pods  CPU    Memory
assure-identity  2          140m   1850Mi
assure-api       2          980m   2050Mi
audit            1          10m    90Mi
binarystorage    1          170m   210Mi
configuration    1          10m    150Mi
deployment       1          10m    100Mi
engineresource   1          10m    890Mi
license          1          10m    20Mi
management-api   1          10m    100Mi
reporting        1          10m    70Mi
reporting-api    1          10m    90Mi
transaction      1          260m   390Mi
voice-verifier   2          2460m  1290Mi
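To compare the pod figures above with the example cluster, the values can be totalled. The following is a rough sketch, not an official sizing tool. It assumes that the CPU and memory figures are per pod (so they are multiplied by the pod count) and that "8 Gig" means 8 GiB per node; neither is stated explicitly in the table. The helper names are ours.

```python
# (service, pods, cpu, memory) copied from the table above.
PODS = [
    ("assure-identity", 2, "140m", "1850Mi"),
    ("assure-api", 2, "980m", "2050Mi"),
    ("audit", 1, "10m", "90Mi"),
    ("binarystorage", 1, "170m", "210Mi"),
    ("configuration", 1, "10m", "150Mi"),
    ("deployment", 1, "10m", "100Mi"),
    ("engineresource", 1, "10m", "890Mi"),
    ("license", 1, "10m", "20Mi"),
    ("management-api", 1, "10m", "100Mi"),
    ("reporting", 1, "10m", "70Mi"),
    ("reporting-api", 1, "10m", "90Mi"),
    ("transaction", 1, "260m", "390Mi"),
    ("voice-verifier", 2, "2460m", "1290Mi"),
]


def millicores(cpu: str) -> int:
    """Convert a Kubernetes CPU quantity such as '140m' or '2' to millicores."""
    return int(cpu[:-1]) if cpu.endswith("m") else int(float(cpu) * 1000)


def mebibytes(mem: str) -> float:
    """Convert a memory quantity with an Mi or Gi suffix to MiB."""
    if mem.endswith("Gi"):
        return float(mem[:-2]) * 1024
    if mem.endswith("Mi"):
        return float(mem[:-2])
    raise ValueError(f"unsupported memory unit: {mem}")


# Assumption: table values are per pod, so multiply by the pod count.
total_cpu_m = sum(n * millicores(cpu) for _, n, cpu, _ in PODS)
total_mem_mi = sum(n * mebibytes(mem) for _, n, _, mem in PODS)

# Example cluster: 3 nodes x 8 CPUs x 8 GiB (GiB is an assumption).
cluster_cpu_m = 3 * 8 * 1000
cluster_mem_mi = 3 * 8 * 1024

print(f"CPU:    {total_cpu_m}m of {cluster_cpu_m}m")
print(f"Memory: {total_mem_mi:.0f}Mi of {cluster_mem_mi}Mi")
```

Under these assumptions the LumenVox pods total roughly 7.7 CPU cores and 12.2 GiB of memory, which leaves headroom on the example cluster for system pods and load spikes.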

 

Provisioned service  CPU Usage  Memory Usage  Other
RabbitMQ             13%        1008Mi        Peak messages: 4800
Redis Cache          2.8%       390Mi         Calls: Get 174/s, Setex 101/s, Set 57/s; Network: 6MB/s
MongoDB              7%         14.48Gi
PostgreSQL           11%        4.91Gi

* Note that the measurements shown for the "provisioned" services above reflect only a standalone (non-clustered) test environment. Sizing for production will need to be determined based on your specific cluster requirements.

The client should consider creating both a test and a production system. It is recommended that the Redis, RabbitMQ, MongoDB and PostgreSQL components be provisioned outside of the Kubernetes cluster for performance purposes. 

 

Software

LumenVox software works well on Kubernetes infrastructure.

The following additional software components are necessary:

  • RabbitMQ
  • Redis 
  • MongoDB
  • PostgreSQL
  • Prometheus or a similar product, should the customer want to monitor the solution. LumenVox services expose Prometheus metrics endpoints, so they plug into Prometheus (or compatible tools) directly; however, LumenVox does not specialize in setting up Prometheus itself. A range of Prometheus metrics is available on each container for the client to monitor.
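For clients who want to inspect the container metrics without a full Prometheus installation, the text exposition format is simple enough to read directly. The sketch below parses a sample payload; the metric names in `SAMPLE` are illustrative placeholders, not actual LumenVox metric names, and the parser assumes samples carry no trailing timestamp.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplification: assumes no sample has a trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last whitespace-separated token on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# Illustrative payload; real metric names will differ per container.
SAMPLE = """\
# HELP process_cpu_seconds_total Total user and system CPU time.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 12.5
# HELP http_requests_total Hypothetical request counter.
# TYPE http_requests_total counter
http_requests_total{code="200"} 1027
http_requests_total{code="500"} 3
"""

metrics = parse_metrics(SAMPLE)
print(metrics['http_requests_total{code="500"}'])
```

In practice the payload would be fetched over HTTP from each container's metrics endpoint; the exact path and port depend on the deployment.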

 

Audio

Audio must be recorded in one of the following formats, then converted and submitted as a headerless, base64-encoded byte stream:

  • Linear signed PCM, 16-bit, 8 kHz sample rate (PCM-16)
  • A-law compressed, 8-bit, 8 kHz sample rate
  • μ-law (u-law) compressed, 8-bit, 8 kHz sample rate
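As an illustration of the "headerless, base64-encoded" requirement, the sketch below strips the WAV header from a 16-bit, 8 kHz PCM recording and encodes the raw samples. It uses only the Python standard library; the function names are ours, and the silent test clip stands in for a real recording. How the resulting string is placed in an API request is not covered here.

```python
import base64
import io
import wave


def wav_to_headerless_base64(wav_bytes: bytes) -> str:
    """Drop the WAV header and return the raw PCM frames as base64 text."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Check against the PCM-16 / 8 kHz option listed above.
        if wav.getsampwidth() != 2 or wav.getframerate() != 8000:
            raise ValueError("expected 16-bit PCM at 8 kHz")
        frames = wav.readframes(wav.getnframes())
    return base64.b64encode(frames).decode("ascii")


def make_test_wav(num_samples: int = 8000) -> bytes:
    """Build one second of silent 16-bit, 8 kHz mono WAV for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        wav.writeframes(b"\x00\x00" * num_samples)
    return buf.getvalue()


payload = wav_to_headerless_base64(make_test_wav())
# The decoded payload holds only sample data: 8000 samples x 2 bytes.
print(len(base64.b64decode(payload)))
```

Audio recorded at another sample rate or bit depth would need to be resampled or converted before this step; the function rejects it rather than converting silently.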

 

Model

The solution comes with an out-of-the-box model for voiceprint enrollment and verification. However, the best accuracy is achieved when the model is calibrated and tuned with a client’s own domain-specific data. To improve accuracy, the existing model can be recalibrated, which adjusts the score thresholds, or fine-tuned, which retrains the model’s scoring back-end with client production audio. The model can also be calibrated and fine-tuned before going to production if the client has either:

  1. Voice biometric recordings from an existing system, or
  2. Audio samples collected through a data collection exercise
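To give a sense of what adjusting a score threshold involves, the sketch below picks the lowest threshold that keeps the false accept rate (FAR) at or below a target, then reports the resulting false reject rate (FRR). This is a generic illustration, not the LumenVox calibration procedure: the scores are made-up values, the function names are ours, and a trial is assumed to be accepted when its score is at or above the threshold.

```python
def rate_at_or_above(scores, threshold):
    """Fraction of scores that would be accepted at this threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def calibrate_threshold(genuine, impostor, target_far):
    """Return (threshold, far, frr) for the lowest threshold meeting target_far."""
    # Every observed score is a candidate threshold; FAR can only fall
    # as the threshold rises, so the first match is the lowest one.
    for threshold in sorted(set(genuine) | set(impostor)):
        far = rate_at_or_above(impostor, threshold)
        if far <= target_far:
            frr = 1.0 - rate_at_or_above(genuine, threshold)
            return threshold, far, frr
    raise ValueError("no candidate threshold meets the target FAR")


# Made-up verification scores for illustration only.
genuine = [0.62, 0.71, 0.78, 0.83, 0.88, 0.91, 0.94, 0.97]
impostor = [0.05, 0.11, 0.18, 0.24, 0.29, 0.31, 0.38, 0.45, 0.52, 0.66]

threshold, far, frr = calibrate_threshold(genuine, impostor, target_far=0.10)
print(f"threshold={threshold} FAR={far:.2f} FRR={frr:.2f}")
```

Tightening the target FAR pushes the threshold up and typically raises the FRR, which is why calibration benefits from scores gathered on the client's own audio.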

 

Other Software and Activities in a Production Environment

Some or all of the following activities will take place in a production environment and are the responsibility of the customer or partner:

  • Provision of required hardware & host containerization software
  • Installation of LumenVox software
  • Management of Kubernetes, RabbitMQ, Redis, MongoDB, and PostgreSQL
  • Monitoring of hardware, software, and services, e.g., by using tools like Prometheus
  • Monitoring of log files, e.g., by using log analysis tools such as Datadog or Splunk
  • Stress testing of the full solution in the customer environment
  • Set up and monitoring of network/component latency
  • Database management including scheduling of database cleanups
  • Backup management

Copyright (C) 2001-2024, Ai Software, LLC d/b/a LumenVox