
Disaster Recovery

In the event of a catastrophic failure of a particular site, in which all Genesys components (including locally paired HA servers) become unavailable, peer site redundancy provides ongoing support for all logged-in agents. Agents logged in to the surviving SIP Server Peer remain logged in and can continue handling calls. Agents that were logged in to the failed site are logged in to the surviving site, and queue wait times increase temporarily while this happens. Some calls at the failed site may be lost.

Site Failure

The Site Failure figure illustrates what typically happens when one site in a SIP Server Peer group suffers a catastrophic failure.

[Figure: Site Failure]

The following steps describe how Business Continuity recovers from a catastrophic failure of a particular SIP Server Peer site:

  1. Site 1 suffers a catastrophic failure. All Genesys components, including paired HA servers, are unavailable.
  2. The media gateways detect a response timeout from Site 1. In response, the media gateways begin sending all new calls to Site 2.
    If the media gateway itself is affected by the outage, the PSTN should detect this, and load balancing at the gateway level should redirect calls to the surviving media gateways.
  3. Agents that are currently logged in to Site 2 continue to handle calls. Queue wait times increase temporarily.
  4. The agent's SIP phone responds in either of the following ways:
    • If the phone is configured to register on one site only, it re-registers now on the Site 2 SIP Server.
    • If the phone is configured for dual-registration, the phone automatically switches call handling from the local site to the backup site (Site 2).
    Agent desktops detect the Site 1 failure and automatically log in again to the SIP Server on Site 2. In addition, each desktop establishes connections to the Stat Server and Configuration Server Proxy on Site 2.
  5. The standby Configuration Server and Configuration Server Database at Site 2 are brought into service.
  6. When the surviving SIP Server detects that its peer has failed, it continues operation in single-site mode, stopping Business Continuity functions as follows:
    • It no longer applies the call forwarding procedure to new calls.
    • It allows agents to log in independently of the status of their endpoint.
    • It does not employ the forced logout procedure.
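
The site-failure sequence above can be sketched as a small Python model. This is an illustrative sketch only, not Genesys code: the names `Phone`, `SipServerPeer`, `route_new_call`, and the flag names in `business_continuity` are hypothetical, and the model assumes that all three Business Continuity functions are active while the peer is reachable.

```python
from dataclasses import dataclass
from enum import Enum


class Registration(Enum):
    SINGLE = "single"  # phone registers on one site only
    DUAL = "dual"      # phone registers on both sites


@dataclass
class Phone:
    registration: Registration
    active_site: str = "Site 1"

    def on_site_failure(self, failed: str, backup: str) -> str:
        """Return the action the phone takes when `failed` becomes unreachable (step 4)."""
        if self.active_site != failed:
            return "no change"
        self.active_site = backup
        if self.registration is Registration.SINGLE:
            return f"re-register on {backup}"
        return f"switch call handling to {backup}"


@dataclass
class SipServerPeer:
    site: str
    peer_alive: bool = True

    @property
    def business_continuity(self) -> dict:
        # Assumption: all three functions are on while the peer is reachable
        # and all three stop in single-site mode (step 6).
        return {
            "call_forwarding": self.peer_alive,
            "login_depends_on_endpoint": self.peer_alive,
            "forced_logout": self.peer_alive,
        }

    def on_peer_failure(self) -> None:
        self.peer_alive = False


def route_new_call(site_responding: dict, preferred: list) -> str:
    """Media gateway view (step 2): use the first site that still responds."""
    for site in preferred:
        if site_responding[site]:
            return site
    raise RuntimeError("no site available")
```

For example, `route_new_call({"Site 1": False, "Site 2": True}, ["Site 1", "Site 2"])` selects Site 2, and calling `on_peer_failure()` on the Site 2 `SipServerPeer` turns off every entry in `business_continuity`.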

Networking Failure Between Sites

The Networking Failure figure illustrates what typically happens when a networking failure occurs between SIP Server Peer sites.

[Figure: Networking Failure]

The following steps describe how Business Continuity recovers from a networking failure between the SIP Server Peer sites:

  1. In this example, network connectivity between the two data center sites is lost. SIP Server detects this failure through Active Out of Service detection (options oos-check and oos-force) of the inter-site Trunk DN. Connectivity between the media gateways and contact centers at each site is still available.
  2. The SIP Server instances at each site revert to their normal non-peered operation.
  3. Incoming calls at each site are routed only to agents logged in at that site; Business Continuity Forwarding does not apply.
  4. In this case, the Business Continuity solution avoids any "split-brain" problems because there are no longer any inter-dependencies between the sites.
  5. For short-term outages, the Configuration Server Proxy on Site 2 provides configuration data to local Site 2 applications. For longer outages, Site 2 Configuration Server and Configuration Server Database can be brought into service.
  6. When the surviving SIP Server detects that its peer has failed, it continues operation in single-site mode, stopping Business Continuity functions as follows:
    • It no longer applies the call forwarding procedure to new calls.
    • It allows agents to log in independently of the status of their endpoint.
    • It does not employ the forced logout procedure.
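
The out-of-service detection in step 1 can be illustrated with a minimal Python sketch. It assumes that oos-check is the interval in seconds between health probes on the inter-site Trunk DN and that oos-force is the time in seconds without a reply after which the trunk is marked out of service; confirm the exact semantics and valid values in the SIP Server documentation. `TrunkMonitor` and `simulate` are hypothetical names, and the values 5 and 10 are illustrative, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class TrunkMonitor:
    oos_check: int           # assumed: seconds between probes
    oos_force: int           # assumed: seconds without a reply before marking OOS
    last_reply: int = 0
    in_service: bool = True

    def on_reply(self, now: int) -> None:
        """Record a probe reply from the peer site."""
        self.last_reply = now
        self.in_service = True

    def on_probe_unanswered(self, now: int) -> bool:
        """Handle a probe with no reply; return whether the trunk is still in service."""
        if now - self.last_reply >= self.oos_force:
            self.in_service = False
        return self.in_service


def simulate(monitor: TrunkMonitor, reply_times, until: int):
    """Probe every oos_check seconds; return the time the trunk goes OOS, or None."""
    replies = set(reply_times)
    for now in range(0, until + 1, monitor.oos_check):
        if now in replies:
            monitor.on_reply(now)
        elif not monitor.on_probe_unanswered(now):
            return now
    return None


# The peer answers at t=0 and t=5, then the inter-site link is lost.
monitor = TrunkMonitor(oos_check=5, oos_force=10)
went_oos_at = simulate(monitor, reply_times=[0, 5], until=30)
```

In this run the probe at t=10 is unanswered but still inside the oos-force window, so the trunk is marked out of service at t=15. From that point each SIP Server would operate in non-peered mode, as described in steps 2 and 3.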
This page was last modified on September 22, 2015, at 19:30.
