Introduction

This document explains the technical details of the Expertflow Voice Recording Solution (VRS). It covers the terminology and concepts behind the solution.

Terminology

The following terms are used in this document:

SIP Message	The type of message through which CUCM communicates with VRS. A SIP message is either a request or a response, and its body may carry content or be empty.
Call	Aggregates all sessions of an actual call. An actual call may have several sessions due to hold/resume or transfer/conference scenarios, and the call object contains all of them.
Session	A single recording that joins the voice streams of all participants.
Session Leg	The voice stream of one participant in the session. A session has at least two session legs.
Calling Number	This is the end-user/party who initiated the SIP call.
Called Number	This is the end-user/party who received the incoming call.
`FORCE_TERMINATION`	A flag indicating that the recording for this call is corrupted or incomplete. The recording might be empty or only partial.
Zombie Timer	If a call has been terminated for this interval or longer, it is marked Terminated.
Thread Timer	The time interval after which an internal process checks all terminated calls and sends them to the Mixer.
Call_Timeout	If no RTP packets are received from the socket for this amount of time, the call is marked “Force Terminated”.

Components

The Expertflow Voice Recording Solution consists of three main components: the Recorder, the Mixer, and the APIs. These components communicate with each other through Apache Kafka.

Recorder

Expertflow Recorder is the main component. It handles the handshake and the call recording between Cisco CUCM and the EF Voice Recording Solution.

The Recorder runs multiple internal processes so that every call is recorded without interruption or delay.

Correlation

Expertflow Recorder keeps every detail of a call in in-memory data structures, which makes it easy to correlate calls at runtime. A mapping stores the call sessions, the call legs, and the complete call, so that at the end of any call there is a fully correlated call object containing all of its sessions and legs.

Each call is identified by the Xrefci Id obtained from the SIP packets. In-memory storage keeps the whole process fast.
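The mapping described above could be sketched as follows. This is only an illustration: the class names, fields, and the `CallRegistry` helper are hypothetical, and how the Xrefci Id maps onto calls and sessions is not specified in this document, so the keys used here are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionLeg:
    """Voice stream of one participant (hypothetical fields)."""
    participant: str
    raw_file: str


@dataclass
class Session:
    """One recording that joins the legs of all participants."""
    session_id: str
    legs: List[SessionLeg] = field(default_factory=list)


@dataclass
class Call:
    """Aggregates every session that belongs to one actual call."""
    call_id: str
    sessions: Dict[str, Session] = field(default_factory=dict)


class CallRegistry:
    """In-memory map from a correlation key to its Call object."""

    def __init__(self) -> None:
        self._calls: Dict[str, Call] = {}

    def add_leg(self, call_id: str, session_id: str, leg: SessionLeg) -> Call:
        # Create the call and session on first sight, then attach the leg.
        call = self._calls.setdefault(call_id, Call(call_id))
        session = call.sessions.setdefault(session_id, Session(session_id))
        session.legs.append(leg)
        return call

    def pop_call(self, call_id: str) -> Call:
        # Remove and return the correlated call once it has ended.
        return self._calls.pop(call_id)
```

A hold/resume scenario would then show up as two sessions under the same call, each with its own legs, and `pop_call` would hand the complete object to the next stage.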

RTP Storage

Expertflow Recorder stores the voice RTP packets of each call leg in a separate raw file. The Recorder then decodes the raw RTP data to PCM according to the codec in use. Currently only G.711 is supported.
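To illustrate the decoding step, here is a minimal sketch of G.711 μ-law to 16-bit linear PCM conversion. It is not the Recorder's actual code: the function names are hypothetical, it assumes the RTP header has already been stripped from each packet, and it covers only the μ-law variant of G.711 (A-law is not shown).

```python
import struct


def ulaw_to_linear(u_val: int) -> int:
    """Expand one 8-bit mu-law code into a signed 16-bit PCM sample."""
    u_val = ~u_val & 0xFF                      # mu-law bytes are stored inverted
    t = ((u_val & 0x0F) << 3) + 0x84           # mantissa plus bias
    t <<= (u_val & 0x70) >> 4                  # scale by the segment number
    return (0x84 - t) if (u_val & 0x80) else (t - 0x84)


def decode_ulaw(payload: bytes) -> bytes:
    """Decode a mu-law payload into little-endian 16-bit PCM bytes."""
    samples = [ulaw_to_linear(b) for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each payload byte becomes two PCM bytes, so a decoded leg is twice the size of its μ-law payload.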

Tagging

The Recorder tags each call with its completion status: either properly recorded or forcefully terminated.

Calls tagged “Force Terminated” are those that were not recorded properly, due to a network glitch or some other cause. The recorded file for such a call may contain the complete recording, a partial recording, or no recording at all.
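The tagging decision can be pictured with the following sketch, which combines it with the Call_Timeout term defined earlier. The function, its parameters, and the `PROPERLY_RECORDED` status name are hypothetical; in particular, how the Recorder detects a normal call end is an assumption represented here by a simple boolean.

```python
import time
from typing import Optional

FORCE_TERMINATION = "FORCE_TERMINATION"
PROPERLY_RECORDED = "PROPERLY_RECORDED"  # hypothetical status name


def tag_call(last_rtp_at: float, ended_normally: bool,
             call_timeout: float, now: Optional[float] = None) -> Optional[str]:
    """Return the completion tag, or None while the call is still active."""
    now = time.monotonic() if now is None else now
    if ended_normally:
        return PROPERLY_RECORDED
    # No RTP packets for call_timeout seconds: give up on the call.
    if now - last_rtp_at >= call_timeout:
        return FORCE_TERMINATION
    return None
```

Passing `now` explicitly keeps the function easy to test; a real process would call it periodically with the current clock value.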

Metadata

Expertflow Recorder connects to a MySQL database to store the required metadata.

The stored metadata covers the correlated calls along with their required parameters.

Mixer

Expertflow Mixer mixes the individually recorded call-leg files into a single session file, based on the correlation information provided by the Recorder. After merging the relevant files into a single file, the Mixer can convert it into a .wav file, depending on the configuration.
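As a rough illustration of mixing, the sketch below sums two decoded legs sample by sample and writes the result as a .wav file using Python's standard `wave` module. The function names are hypothetical, and the sketch assumes both legs are 16-bit little-endian mono PCM at 8000 Hz (the G.711 sample rate) that start at the same instant; time alignment between legs is not handled here.

```python
import struct
import wave
from typing import List


def _unpack(pcm: bytes) -> List[int]:
    """Turn little-endian 16-bit PCM bytes into a list of samples."""
    n = len(pcm) // 2
    return list(struct.unpack("<%dh" % n, pcm[: n * 2]))


def mix_legs(leg_a: bytes, leg_b: bytes) -> bytes:
    """Sum two legs, padding the shorter one with silence and clipping."""
    a, b = _unpack(leg_a), _unpack(leg_b)
    length = max(len(a), len(b))
    a += [0] * (length - len(a))
    b += [0] * (length - len(b))
    mixed = [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]
    return struct.pack("<%dh" % length, *mixed)


def write_wav(path: str, pcm: bytes, sample_rate: int = 8000) -> None:
    """Write mono 16-bit PCM to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```

Summing into one mono channel is only one option; writing the agent and customer legs to separate stereo channels would be an equally valid design.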

APIs

The APIs component provides RESTful endpoints that allow any third-party application to fetch a list of recordings and to download and play individual recording files.

Front-end

A UI to search, play, and download recordings. The front-end fetches recordings from the database via the APIs component.

Archival Process

This feature archives recordings after a specified time by transferring them from the server that currently hosts the VRS to a remote server via SFTP.
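The selection step of such an archival job might look like the sketch below. It is an assumption-laden illustration: the function name and parameters are hypothetical, file age is judged by modification time, and the SFTP transfer itself is not shown because it depends on the SFTP client in use.

```python
import os
import time
from typing import List, Optional


def recordings_to_archive(directory: str, max_age_days: float,
                          now: Optional[float] = None) -> List[str]:
    """Return paths of files in `directory` older than `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    selected = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Only regular files whose last modification is at or before the cutoff.
            if entry.is_file() and entry.stat().st_mtime <= cutoff:
                selected.append(entry.path)
    return sorted(selected)
```

The returned list would then be handed to whatever uploads the files to the remote server.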

Component level network diagram

Recording Flow

The recording solution supports Built-in Bridge recording (“BIB recording”), where the recording streams are forked from the agent's IP phone to EF-Recorder. The agent's voice and the customer's voice are sent separately, i.e. stored as separate call legs, and then mixed by EF-Recorder.

EF-Recorder is configured in CUCM as a SIP trunk device so that it can receive calls and recording streams.

Recording is driven from CUCM using SIP. The Recording Solution acts as a SIP server for CUCM and captures every SIP event that CUCM generates. Based on those events, the voice is recorded over RTP.
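Since every SIP message is either a request or a response, and may or may not carry a body, the first step of a SIP server is to tell these cases apart. The sketch below does only that, based on the general SIP message layout (start line, headers, blank line, optional body). The `SipMessage` type and `parse_sip` function are hypothetical and deliberately simplified: folded headers, compact header names, and repeated headers are not handled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SipMessage:
    is_request: bool
    method: Optional[str]        # set for requests, e.g. "INVITE"
    status_code: Optional[int]   # set for responses, e.g. 200
    headers: Dict[str, str]
    body: str                    # empty string when there is no content


def parse_sip(raw: str) -> SipMessage:
    """Split a SIP message into start line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    start = lines[0]
    if start.startswith("SIP/"):
        # Status line of a response: "SIP/2.0 200 OK"
        return SipMessage(False, None, int(start.split(" ", 2)[1]), headers, body)
    # Request line: "INVITE sip:user@host SIP/2.0"
    return SipMessage(True, start.split(" ", 1)[0], None, headers, body)
```

A recorder would typically branch on `method` for requests such as INVITE and BYE to decide when to start and stop capturing RTP.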

Voice Recording Solution has three main components:

Recorder
Mixer
REST APIs

The Recorder keeps every detail of a call in in-memory data structures, which makes it easy to correlate calls at runtime. A mapping stores the call sessions, the call legs, and the complete call, so that at the end of any call there is a fully correlated call object containing all of its sessions and legs.