Deployment Considerations v13.2

Deployment Architecture

For high availability (HA), Expertflow deploys the solution on two VMs as Docker containers using Docker Compose. If one VM goes down, traffic is routed to the other VM.

Each solution component is deployed as a stateless microservice. Expertflow follows a service-per-container model, so each backend microservice runs in its own Docker container.

Each Docker container is deployed with the “Restart Always” policy, which restarts the container automatically on the same VM if it stops. The container may be unavailable for a few seconds while it restarts, and any requests received during that downtime are rejected.
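Because requests are rejected while a container restarts, a calling client may want to retry for a few seconds before reporting a failure. The following is a minimal sketch of that idea, not part of the product: the function name call_with_retry, the use of ConnectionError to represent a rejected request, and the attempt count and delays are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 0.5,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `operation` while the target container may be restarting.

    ConnectionError stands in for a rejected request. The attempt count
    and delays are illustrative assumptions, not documented product values.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # the outage outlasted the retry budget
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # exponential backoff between attempts
    raise ValueError("attempts must be at least 1")
```

With the default values the client waits 0.5, 1, 2, and 4 seconds between its five attempts, which should cover a restart that lasts only a few seconds.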

Services will be available via a virtual IP. If one of the VMs goes down, the virtual IP will route the traffic to the other VM.
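The routing decision behind the virtual IP can be pictured as "send traffic to the first healthy VM". The sketch below models only that decision. The mechanism that actually moves the virtual IP is not described here, and the function name select_vip_holder and the VM names are hypothetical.

```python
from typing import Mapping, Optional, Sequence


def select_vip_holder(
    health: Mapping[str, bool],
    priority: Sequence[str],
) -> Optional[str]:
    """Return the first healthy VM in priority order, or None if all are down."""
    for vm in priority:
        if health.get(vm, False):  # a VM with unknown health counts as down
            return vm
    return None  # no healthy VM is left to receive traffic
```

A result of None corresponds to the case where both VMs are down and nothing can serve traffic.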

Out of Scope

Component-Level Failover: Failover of individual components is not supported.
Network-Level Failover: The internal network link between the VMs (or the Docker network) is assumed to be available at all times.
DB Cluster: Setting up the DB cluster for MS SQL Server is the customer's responsibility.
Manual intervention will be required when both VMs are down.

Failover Scenarios

The following table lists the various failover scenarios and the behavior of the different solution components in each scenario:

Failover Scenario | Behavior
Synchronizer failover | The Synchronizer component runs in an active-active state on both VMs, using a heartbeat mechanism. The active synchronizer updates its state in the DB every 2 minutes. When the active synchronizer goes down or a failover occurs, it takes 2 minutes for the other synchronizer instance to take over. Meanwhile, the front-end keeps showing the last preserved stats. Once the newly active synchronizer takes control, all stats drop to zero for a second and are then refreshed.
Synchronizer restores after failover | When the previously active synchronizer comes back up after the failover has completed, the stats shown on the gadgets drop to zero for a second once again. Operation continues normally afterward.
VM failover | The application runs on both VMs. When one VM goes down or a failover occurs, users can still perform all CRUD operations on the application front-end. However, because the active synchronizer updates its state in the DB only every 2 minutes, stats stop syncing with the Cisco DB for 2 minutes. Once the newly active synchronizer instance takes full control, all stats drop to zero for a second and are then refreshed.
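The 2-minute takeover delay follows from the heartbeat: the other instance can only conclude that the active synchronizer is gone once its last state update is 2 minutes old. The sketch below illustrates that timing under stated assumptions. It keeps the shared state in memory instead of the DB, assumes one instance holds the active role at a time, ignores races between instances, and uses hypothetical names (SharedState, Synchronizer, tick) throughout.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed equal to the 2-minute state update interval described above.
TAKEOVER_AFTER_S = 120


@dataclass
class SharedState:
    """In-memory stand-in for the state the active synchronizer writes to the DB."""
    active_id: Optional[str] = None
    last_heartbeat: float = 0.0


class Synchronizer:
    def __init__(self, instance_id: str, state: SharedState) -> None:
        self.instance_id = instance_id
        self.state = state

    def tick(self, now: float) -> bool:
        """Run one cycle; return True if this instance is active afterwards."""
        owner = self.state.active_id
        stale = now - self.state.last_heartbeat >= TAKEOVER_AFTER_S
        if owner == self.instance_id or owner is None or stale:
            # Refresh our own heartbeat, or take over from a silent owner.
            self.state.active_id = self.instance_id
            self.state.last_heartbeat = now
            return True
        return False  # another instance has a fresh heartbeat
```

If the instance on one VM writes its heartbeat at time 0 and then stops, the instance on the other VM stays inactive until time 120 and only then takes over. A real implementation would need the check and the update to happen atomically in the DB.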