Product architecture

The following diagram depicts the high-level architecture of the solution. It shows how the client can integrate with LumenVox voice biometrics services through APIs:

Figure 1: High-level solution architecture

Product Components

APIs

Customers can consume the voice biometrics services through the following APIs: Assure Biometrics, Speech, Management, and Reporting.

Assure Biometrics API

This collection of APIs is used to create identities, enroll them into the platform, and verify them. It is also used to manage identities, e.g., disabling an identity or removing a voiceprint.

Speech API

This collection can be used to consume LumenVox speech products such as ASR (Automatic Speech Recognition), TTS (Text to Speech), CPA (Call Progress Analysis), Transcription, and NLU gateway (Natural Language Understanding).

Management API

These APIs are used to manage configuration and deployment parameters.

Reporting API

This collection of APIs is used to extract session and transactional data from the platform. It also provides various statistics, e.g., the number of enrollment transactions processed per month or the total number of voiceprints enrolled to date. The extracts can be used to obtain further information on biometric operations or audit trails.

Open-Source Software Components

RabbitMQ

This message queuing component is implemented independently by the customer/partner and is used by LumenVox software to queue all transaction, audit, and management information. Once a service has processed a data request, the request is removed from the queue. A distributed cluster of RabbitMQ instances is recommended in production for high availability and scalability.

Redis

This in-memory cache component is implemented independently by the customer/partner and is used by LumenVox software to cache session and transaction information across the whole solution while it is being processed by the various internal services. It is also the component that receives and processes the audio for biometric processing. A distributed cluster of Redis instances is recommended in production for high availability and disaster recovery.

MongoDB

This document-oriented database is implemented independently by the customer/partner and is used by LumenVox software to store the enrollment and verification audio files along with the voiceprint models. Storing the audio is important for audit and troubleshooting purposes. Enrollment audio is also needed if voiceprints require re-enrollment (e.g., when a new DNN model is implemented). The storing of audio files is configurable. A distributed cluster of MongoDB instances is recommended in production for high availability and scalability.

PostgreSQL

This SQL-compliant relational database is implemented independently by the customer/partner and is used by LumenVox software to store all transactional data. A distributed cluster of PostgreSQL instances is recommended in production for high availability and scalability.

Licensing Service

To consume LumenVox products, the containers must communicate with LumenVox’s cloud licensing service to submit information on product utilization. The customer needs to ensure that firewall rules are updated to allow this outbound connection.
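
A firewall change of this kind can be checked from inside the deployment environment with a simple outbound TCP test. The following is a minimal sketch only: the function name can_connect is illustrative, and the host and port shown in the comment are hypothetical placeholders, since the actual licensing endpoint is not given here and must be obtained from LumenVox.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (hypothetical placeholder values, not the real endpoint):
#   can_connect("licensing.example.com", 443)
```

A result of False only indicates that the connection could not be opened from that host; it does not say which firewall or network device blocked it.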

LumenVox Portal

The LumenVox Portal is a web-based interface that uses our APIs to allow customers/partners to manage deployments, view or edit configurations, and perform health checks. The portal can also be used to access reports and manage voiceprints. Customers/partners who prefer not to use the portal provided by LumenVox can integrate the APIs into their own web portals and dashboards.

Solution Requirements

Hardware

The solution requires an environment in which the containers can be installed. This can be a Linux or Microsoft Windows environment; a Linux-based environment is preferred for optimal performance.

An example of the required hardware is provided below:

Kubernetes Environment

Cluster: 3 nodes, each with 8 CPUs and 8 GB of memory
Load: 100 concurrent requests

Service	# of Pods	CPU	Memory
assure-identity	2	140m	1850Mi
assure-api	2	980m	2050Mi
audit	1	10m	90Mi
binarystorage	1	170m	210Mi
configuration	1	10m	150Mi
deployment	1	10m	100Mi
engineresource	1	10m	890Mi
license	1	10m	20Mi
management-api	1	10m	100Mi
reporting	1	10m	70Mi
reporting-api	1	10m	90Mi
transaction	1	260m	390Mi
voice-verifier	2	2460m	1290Mi
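
As a rough aid to reading the table, the figures can be totalled to estimate what the LumenVox pods consume across the cluster. The sketch below assumes that the CPU and Memory columns are per-pod values (the table does not state this), so each value is multiplied by the pod count. The names PODS and total_requests are illustrative only.

```python
# Pod table from above: service -> (pods, CPU in millicores, memory in Mi).
# Assumption: CPU and memory are per-pod values, so totals scale with pod count.
PODS = {
    "assure-identity": (2, 140, 1850),
    "assure-api": (2, 980, 2050),
    "audit": (1, 10, 90),
    "binarystorage": (1, 170, 210),
    "configuration": (1, 10, 150),
    "deployment": (1, 10, 100),
    "engineresource": (1, 10, 890),
    "license": (1, 10, 20),
    "management-api": (1, 10, 100),
    "reporting": (1, 10, 70),
    "reporting-api": (1, 10, 90),
    "transaction": (1, 260, 390),
    "voice-verifier": (2, 2460, 1290),
}


def total_requests(pods):
    """Return (total CPU in millicores, total memory in Mi) across all pods."""
    cpu = sum(count * cpu_m for count, cpu_m, _ in pods.values())
    mem = sum(count * mem_mi for count, _, mem_mi in pods.values())
    return cpu, mem


if __name__ == "__main__":
    cpu_m, mem_mi = total_requests(PODS)
    print(f"CPU: {cpu_m}m ({cpu_m / 1000:.2f} cores)")
    print(f"Memory: {mem_mi}Mi ({mem_mi / 1024:.2f}Gi)")
```

Under the per-pod assumption, the totals come to 7670m of CPU (about 7.67 cores) and 12490Mi of memory (about 12.2Gi) for the 100-concurrent-request test.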

RabbitMQ
CPU Usage: 13%
Memory Usage: 1008Mi
Peak Messages: 4800

Redis Cache
CPU Usage: 2.8%
Memory Usage: 390Mi
Calls: Get 174/s, Setex 101/s, Set 57/s
Network: 6MB/s

MongoDB
CPU Usage: 7%
Memory Usage: 14.48Gi

PostgreSQL
CPU Usage: 11%
Memory Usage: 4.91Gi

* Note: the measurements shown above for the provisioned services (RabbitMQ, Redis, MongoDB, and PostgreSQL) were taken in a standalone (non-clustered) test environment. Sizing for production will need to be determined based on your specific cluster requirements.

The client should consider creating both a test and a production system. It is recommended that the Redis, RabbitMQ, MongoDB and PostgreSQL components be provisioned outside of the Kubernetes cluster for performance purposes.

Software

LumenVox software is delivered as containers and is well suited to Kubernetes infrastructure.

The following additional software components are necessary:

RabbitMQ
Redis
MongoDB
PostgreSQL
Prometheus or a similar product, should the customer want to monitor their solution. Each container exposes a range of Prometheus metrics through its metrics endpoint, so our services plug into Prometheus (or similar tools) directly. Note, however, that we don’t specialize in setting up Prometheus.
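
For customers who want to inspect the metrics without a full Prometheus installation, the standard Prometheus text exposition format is straightforward to read. The sketch below is a simplified parser for that format, not a LumenVox tool: the function name parse_metrics is illustrative, labels are kept as raw text rather than parsed, and how the text is fetched (endpoint path and port) is left out because it is not specified here.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_LINE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)'  # metric name
    r'(\{.*\})?'                     # optional label set, kept as raw text
    r'\s+(\S+)'                      # sample value
    r'(?:\s+-?\d+)?$'                # optional timestamp (ignored)
)


def parse_metrics(text):
    """Return a list of (name, raw_labels, value) from exposition-format text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, labels or "", float(value)))
    return samples
```

In production, a Prometheus server or a compatible agent would normally scrape and store these values; a parser like this is only useful for quick spot checks.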

Audio

Audio must be recorded in one of the following formats, then converted to a headerless, base64-encoded byte stream of the same type for submission:

Linear signed PCM, 16-bit, 8 kHz sample rate (PCM-16)
A-law (alaw) compressed, 8-bit, 8 kHz sample rate
μ-law (ulaw) compressed, 8-bit, 8 kHz sample rate
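
As an illustration of the "headerless, base64-encoded" requirement, the sketch below takes a standard WAV file containing 16-bit, 8 kHz PCM, strips the WAV header, and base64-encodes the raw samples. It uses only the Python standard library. The function name wav_to_headerless_base64 is illustrative, the channel count is not checked because it is not stated above, and how the resulting string is submitted to the API is outside the scope of this sketch.

```python
import base64
import io
import wave


def wav_to_headerless_base64(wav_bytes: bytes) -> str:
    """Strip the WAV header from 16-bit 8 kHz PCM audio and base64-encode it."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Reject audio that is not in the PCM-16, 8 kHz format listed above.
        if wav.getsampwidth() != 2 or wav.getframerate() != 8000:
            raise ValueError("expected 16-bit PCM at an 8 kHz sample rate")
        # readframes() returns only the sample data, without the header.
        frames = wav.readframes(wav.getnframes())
    return base64.b64encode(frames).decode("ascii")


if __name__ == "__main__":
    # Build a tiny in-memory WAV file (two samples) to demonstrate the call.
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(8000)
        out.writeframes(b"\x01\x00\x02\x00")
    print(wav_to_headerless_base64(buffer.getvalue()))
```

Audio recorded at a different sample rate or bit depth would need to be resampled or converted first; that step is not shown here.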

Model

The solution comes with an out-of-the-box model for voiceprint enrollment and verification. However, the best accuracy is achieved when the model is calibrated and tuned with a client’s own domain-specific data. Two options are available: the existing model can be recalibrated, which adjusts the score thresholds, or it can be fine-tuned, which retrains the model’s scoring back-end with client production audio. Both can also be done prior to production, should the client have either:

Voice biometric recordings from an existing system or
Collected audio samples via a data collection exercise
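
To make the idea of adjusting score thresholds concrete, the sketch below shows one generic way a threshold can be chosen from scored trials: pick the lowest threshold at which the false accept rate (FAR) stays at or below a target. This is not LumenVox's calibration procedure. All function names are hypothetical, and the scores in the example are made-up toy values used purely for illustration.

```python
def false_accept_rate(impostor_scores, threshold):
    """Fraction of impostor trials scoring at or above the threshold."""
    return sum(s >= threshold for s in impostor_scores) / len(impostor_scores)


def false_reject_rate(genuine_scores, threshold):
    """Fraction of genuine trials scoring below the threshold."""
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)


def pick_threshold(genuine_scores, impostor_scores, target_far):
    """Lowest observed score usable as a threshold with FAR <= target_far.

    Returns None if no observed score is strict enough.
    """
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    for threshold in candidates:
        if false_accept_rate(impostor_scores, threshold) <= target_far:
            return threshold
    return None


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
    impostor = [0.12, 0.35, 0.48, 0.70, 0.22]
    threshold = pick_threshold(genuine, impostor, target_far=0.25)
    print("threshold:", threshold)
    print("FRR at threshold:", false_reject_rate(genuine, threshold))
```

The more representative the scored audio is of the client's own callers and channels, the more meaningful a threshold chosen this way will be, which is why domain-specific recordings are valuable for calibration.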

Other Software and Activities in a Production Environment

Some or all of the following activities will take place in a production environment and are the responsibility of the customer or partner:

Provision of required hardware & host containerization software
Installation of LumenVox software
Management of Kubernetes, RabbitMQ, Redis, MongoDB, and PostgreSQL
Monitoring of hardware, software, and services, e.g., by using tools such as Prometheus
Monitoring of log files, e.g., by using log analysis tools such as Datadog or Splunk
Stress testing of the full solution in customer environment
Set up and monitoring of network/component latency
Database management including scheduling of database cleanups
Backup management
