Hive off recovery process for Oracle

Process Summary

The purpose of this memo is to give background information on the Genesys Intelligent Automation (GIA) hive off process and to provide instructions on the most efficient way to recover from a failure in the hive off process in an Oracle environment.

Due to the large amount of reporting data captured and processed by the Genesys Intelligent Automation application, it is imperative that, should a failure in the hive off process occur, the system is quickly brought back into a state where the next night’s hive off can run successfully.

Version Compatibility Notice

Beginning with GIA version 9.0.120, the behavior of the hive-off process underwent significant changes.

In legacy versions (pre-9.0.120), the hive-off process copies reporting data from the Daily tables into the Historical tables each night, deletes the copied records from the Daily tables, and manages historical data retention according to the configuration.

In modern versions (9.0.120 and later), the hive-off process no longer copies data from Daily tables into Historical tables. This behavior is expected and correct. Historical tables no longer grow daily, and the hive-off process focuses on purging Daily data and generating reporting aggregates.

This document supports both legacy and modern architectures. Version-specific behavior is explicitly called out under each section where it differs.

Important

If corrective action is not taken within several hours of the original failure, then data loss is likely to result.

Version-Specific Impact

In legacy versions, data loss may occur if the hive-off does not complete successfully, resulting in missing or incomplete historical reporting data.

In modern versions, historical tables are no longer populated as part of hive-off. As a result, any data loss impacts only the Daily tables and may affect reporting aggregates generated for the affected period.

Definitions and Background

Term

Definition

Hive off

The nightly batch job runs on the primary Genesys Intelligent Automation

GUI server to manage the large amounts of reporting data that are captured during the day. The hive off job typically runs in the early hours of each morning and can take several hours to complete.

Behaviour	Legacy Mode (versions earlier than 9.0.120)	Modern Mode (versions after 9.0.120 and later)
Copying of records from the Daily tables to the Historical tables.	Records are copied from the Daily tables into the corresponding Historical tables during the hive off process	Records are not copied from the Daily tables into the Historical tables.
Deletion of copied Daily records	Records are deleted from the Daily tables after they have been copied to Historical tables.	Records are deleted from the Daily tables without being copied to the Historical tables
Purging of Historical data.	Expired data is deleted from the Historical tables based on the configured retention period.	Expired data is deleted only if legacy Historical data exists

Daily tables

The DB tables that receive details about incoming calls and Control

Centre auditing for the current day. These tables are constantly receiving new reporting data from the Genesys Intelligent Automation servers:callscall_ sitescall_stepsbusiness_tasksgui_actions.

Behaviour	Legacy Mode (versions earlier than 9.0.120)	Modern Mode (versions after 9.0.120 and later)
Reception of call handling and auditing data.	Daily tables receive call handling and Control Centre auditing data throughout the day.	Daily tables receive call handling and Control Centre auditing data throughout the day.
Copying of Daily data to Historical tables	Daily table data is copied into Historical tables during hive off.	Daily table data is not copied into Historical tables.
Deletion of Daily data.	Daily data is deleted after it has been copied to Historical tables.	Daily data is deleted after it has been copied to Historical tables.

Historical tables

Read-only data for previous days, that has been moved from the ‘daily’ tables

to their ‘historical’ equivalents as part of the hive off: historical_callshistorical_call_siteshistorical_call_stepshistorical_business_taskshistorical_gui_actions.

Behaviour	Legacy Mode (versions earlier than 9.0.120)	Modern Mode (versions after 9.0.120 and later)
Population of Historical Tables.	Historical tables are populated nightly during the hive off process.	Historical tables are not populated with new data.
Synchronisation with Daily tables.	Historical tables must match the Daily tables after the data is copied.	Historical tables are not synchronised with the daily tables
Use for reporting.	Historical tables are used for reporting.	Historical tables are retained only for legacy data retention.
Growth of table size	Historical tables grow as new data is added daily.	Historical tables do not grow because no new data is added.

Aggregate tables

Read-only data for previous days, that has been moved from the ‘daily’ tables as part

of the hive off process and report generation: HIST_STATS_RECOGNITION1HIST_STATS_BLOCKRESULTSHIST_ STATS_BUSINESSTASKSLAST31DAYS_SLOTSDAILY_HEALTH_AGGREGATES_STARTDAILY_HEALTH_AGGREGATESDAILY_ DASHBOARD_AGGREGATESHIST_STATS_BUSINESSTASKS1HIST_STATS_BUSINESSTASKS2HIST_STATS_CALLJOURNEYSHIST _STATS_CALLSPERHOURHIST_STATS_CALLSPERDAYHIST_STATS_LASTMENUHIST_STATS_RECOGNITIONHIST_STATS_CALLSUMMARY

Behaviour	Legacy Mode (versions earlier than 9.0.120)	Modern Mode (versions after 9.0.120 and later)
Generation of aggregate data	Aggregate data is generated during the hive off process.	Aggregate data is generated during the hive off process.
Source of aggregate data	Aggregate data is generated using the Daily and Historical tables.	Aggregate data is generated directly from the daily tables.
Use for reporting	Aggregate tables are used for reporting.	Aggregate tables are used for reporting.

Steps involved in hive off process

The Genesys Intelligent Automation hive off process consists of four distinct steps. The behavior of these steps differs depending on the GIA version.

Legacy Mode (versions earlier than 9.0.120)

Start of step #1: Calculating aggregates and copying records from the ‘daily’ tables into the ‘historical’ tables. Aggregate reporting data is calculated, and records are copied from the Daily tables into their corresponding Historical tables. 2. Deleting those records from the ‘daily’ tables that were copied to the ‘historical’ tables in step 1. Once the copy is complete, the copied rows are removed from the Daily tables. 3. Deleting any expired data from the ‘historical’ tables as per the NumDaysHistoricalDataToKeep setting. Historical records older than the configured retention period are deleted. 4. Re-enabling indexes on the ‘daily’ tables. Indexes that were disabled earlier in the process are re-enabled to restore normal database performance.

Important

Indexes do not get disabled or re-enabled in the Oracle version of Genesys Intelligent Automation; however, this step is still part of the hive off flow and may appear in logs or troubleshooting references.

Modern Mode (versions 9.0.120 and later)

1. Calculating aggregates using the modern reporting pipeline. Aggregate reporting data is generated directly from the Daily tables using the modern reporting architecture. 2. Purging expired data from the ‘daily’ tables. Daily table records are deleted as part of hive off processing, without being copied to Historical tables. 3. Deleting expired legacy data from the ‘historical’ tables, if present. Only pre-existing legacy Historical records are purged according to retention settings. 4. Completing the hive off process. No index disabling or re-enabling is performed as part of hive off execution.

Important

In Modern Mode, the hive off process does not populate the Historical tables.

This is expected and correct behaviour.

Handling hive off failure

Should the hive off process fail for any reason, the corrective actions to ensure that the next night’s hive off runs successfully will be different depending on which step the hive off failed.

Determining at which step the hive off failed

This can be best determined by looking at the fish.log file (and, perhaps, fish.log.1, fish.log.2, etc.) in the primary GUI server’s following directory:

The names of the steps in the log file appear as: SpeechStorm\Platform\TomcatGUI\logs The names of the steps in the log file appear as:

populateAndHiveOff
deleteBatchAfterHiveOff
deleteOldHistoricalData
rebuildIndexes

The following entries should be written to the logs during each hive off run, and the presence or absence of certain of the entries below will indicate which step of the hive off failed, thus allowing you to follow the correct path in the Recovery Process in the next section.

Log entries for hive off steps Start of step #1 [INFO ] 2016-06-15 00:15:00,833 (Timer-1::) com.speechstorm.fish.reporting.PopulateAggregatesAndHiveOffHistoricalDataTask.populateAggregatesAndHiveOff() starting at ...

Completion of step #1 / start of step #2 The following entry indicates that step #1 has finished and step #2 is about to begin: [INFO ] 2016-06-15 00:37:48,557 (Timer-1::) com.speechstorm.fish.reporting.PopulateAggregatesAndHiveOffHistoricalDataTask.populateAggregatesAndHiveOff() finished populateAndHiveOff at ...

Completion of step #2 / start of step #3 This entry indicates that step #2 has finished and step #3 is about to begin: [INFO ] 2016-06-15 01:26:43,590 (Timer-1::) com.speechstorm.fish.reporting.PopulateAggregatesAndHiveOffHistoricalDataTask.populateAggregatesAndHiveOff() finished all deleteBatchAfterHiveOff steps at ...

Completion of step #3 / start of step #4 This entry indicates that step #3 has finished and step #4 is about to begin: [INFO ] 2016-06-15 02:06:24,271 (Timer-1::) com.speechstorm.fish.reporting.PopulateAggregatesAndHiveOffHistoricalDataTask.deleteOldHistoricalData() finished at ...

Completion of all hive off steps This entry indicates that all steps have completed: [INFO ] 2016-06-15 02:06:45,571 (Timer-1::) com.speechstorm.fish.reporting.PopulateAggregatesAndHiveOffHistoricalDataTask.rebuildIndexes() finished rebuildIndexes at ...

Version-specific behaviour notes

Legacy Mode (versions earlier than 9.0.120)

All four steps listed above are executed.
- Log entries for:
- populateAndHiveOff
- deleteBatchAfterHiveOff
- deleteOldHistoricalData
- rebuildIndexes

are expected during a successful hive off run.

Failures commonly occur due to:
- Duplicate data between the Daily and Historical tables
- Partial data copies
- Long-running purge or index operations

Modern Mode (versions 9.0.120 and later)

The hive off process does not copy data from the Daily tables to the Historical tables.

Log entries related to:
- copying data to Historical tables may not appear.
Failures typically occur during:
- Aggregate calculation
- Purging of Daily table data
- Purging of legacy Historical data only
Index disabling and re-enabling is not performed.

Recovery process

While it is important to understand and fix the problem that caused the hive off to fail, if this cannot be done immediately (for example, it requires more investigation), then it is better to act immediately to ensure that the next night’s hive off still runs.

There is a short window of time each night in which to run the hive off (for example, between midnight and 7 am). If the hive off does not run successfully for several days in a row, the amount of data in the daily tables will become prohibitive and will have an impact on both call handling and reporting. The only practical solution, should this be allowed to happen, is to perform a bulk delete of the daily tables, which will result in reporting data being lost.

The main factors to take into account are: •There must be no duplicate records in the daily and historical tables when the hive off runs. (Applies to Legacy Mode only – Daily → Historical copying does not occur in Modern Mode) •If the hive off has not run for several days and the value of the DBOvernightJobs.NumDaysHistoricalDataToKeep setting is not increased accordingly, the hive off will attempt to delete a much larger volume of records from the historical tables. This can take significantly longer to execute and may lead to exhaustion of transaction log space on the database server.

The overall hive off recovery process is described in the following diagram:

Important

Modern Mode note Branches involving Daily → Historical synchronisation do not apply in versions 9.0.120 and later.

Corrective actions in detail

One or more of the following actions may be required according to the diagram above.

(A) Delete duplicate records from ‘daily’ tables

This action is required if the hive off stopped during step 2. There will be no loss of reporting data when you execute this step, as the data has already been copied to the ‘historical’ tables.

The SQL to be run against the ‘fishreports’ database is as follows:

DELETE FROM call_steps WHERE id IN (SELECT id FROM historical_call_steps) AND rownum <= 20000

DELETE FROM call_sites WHERE id IN (SELECT id FROM historical_call_sites) AND rownum <= 20000

DELETE FROM business_tasks WHERE id IN (SELECT id FROM historical_business_tasks) AND rownum <= 20000

DELETE FROM calls WHERE call_id IN (SELECT call_id FROM historical_calls) AND rownum <= 20000

DELETE FROM gui_actions WHERE id IN (SELECT id FROM historical_gui_actions) AND rownum <= 20000

Important

This SQL should be run multiple times until there are no more duplicate records reported.

The max rownum limit can be increased with caution, but care needs to be taken that this does not have an adverse effect on table locking; monitoring is key here.

Modern Mode (versions 9.0.120 and later) ❌ Do not run this step. Daily → Historical copying does not occur in Modern Mode, therefore duplicate records between these tables cannot exist.

The max rownum limit can be increased with caution, but care needs to be taken that this does not have an adverse effect on table locking; monitoring is key here.

(B) Truncate data in ‘daily’ tables

This action is required if the amount of data in the ‘daily’ tables is such that the hive off is likely to run into peak business hours.

Important

Be aware that this step will result in a loss of all reporting data in the ‘daily’ tables, although the reporting data in the ‘historical’ tables will not be affected. We recommend taking a backup of the database just before running it in order to preserve the data in the ‘daily’ tables for further analysis of the root cause.

The SQL to run against the ‘fishreports’ database is as follows:

truncate table calls;

truncate table call_sites;

truncate table call_steps;

truncate table business_tasks;

truncate table gui_actions;

Modern Mode note In versions 9.0.120 and later, truncated Daily data is not repopulated into Historical tables. This is expected behaviour.

(C) Delete orphan records from ‘daily’ tables

This action is required once the ‘daily’ tables have been truncated, to ensure that there are no child records (for example, call_steps) whose parent records (i.e. calls) have been truncated. This situation can occur due to the timing of the truncate statements. It will result in the loss of a minute amount of reporting data and can be viewed as the completion of the ‘truncate’ operation above.

There should be a one-minute pause between executing action (B) and this action. The SQL to run against the ‘fishreports’ database is as follows:

delete from call_sites where call_id not in (select call_id from calls);

delete from call_steps where call_id not in (select call_id from calls);

delete from business_tasks where call_id not in (select call_id from calls);

(D) Increase ‘numdayshistoricaldatatokeep’ setting

Legacy Mode only

Run the following SQL against the ‘fishreports’ database to determine what value should be used for the NumDaysHistoricalDataToKeep setting in order to only hive off a maximum of 2 days’ old historical data at once:

DECLARE mindate DATE;

begin

select (SELECT min(call_start_date) FROM historical_calls) into mindate from DUAL;

dbms_output.put_line((SYSDATE-mindate) - 1);

end;

You need only change the value of the setting if the result of the above SQL is greater than the original/current value of the setting.

The setting can be changed via the Genesys Intelligent Automation Control Centre, in the Administration -> Default Server Settings

The value should be decreased each day from this point forward until it reaches its original value. In this way each night’s hive off will involve deleting a maximum of 2 days old historical data.

Important

Modern Mode note This setting applies only to the purge of legacy historical data.

No new historical rows are added in Modern Mode.

Next Steps

1. Ensuring that the next hive off will run

Ensure that the hive off job is enabled and that the 'in-progress flag is set to false.

These settings can be changed via the Genesys Intelligent Automation Control Centre, in the: Administration → Default Server Settings page.

You might also need to change the DBOvernightJobs.Enabled setting to true on the Primary GUI via the Genesys Intelligent Automation Control Centre, in the Administration -> Servers -> Edit (for the Primary GUI).

2. (Optional) Manually re-running the hive-off

In high-call traffic scenarios, it is best not to run the hive off during the day but instead to wait for it to run automatically the following night.

If the recovery steps have been undertaken shortly after a failure, or in a low-call traffic scenario, however, it is possible to kick off the hive off process manually. To do this, change the following setting (in the same location as above) from 1 to 0: (Aggregate hive off setting)

Now restart the primary GUI server and the hive-off will begin immediately, even though the current time is after the schedule hour and minute that are defined in the server settings.

When the hive-off has finished, be sure to change this setting back to 1 so that any future restarts of the GUI server will not trigger an immediate hive-off. Note: It is not required to restart the GUI again at this point.

3. Identifying the root cause

If the underlying cause for the failure has not yet been identified and resolved, then you should take steps to record more detailed log events to better troubleshoot if the issue recurs. On the Genesys Intelligent Automation primary GUI server, enable full SQL debug logs by editing: SpeechStorm\Platform\TomcatGUI\webapps\fish-gui\WEB-INF\classes\log4j.properties Ensure the following line is not commented out:

log4j.logger.java.sql=DEBUG

Important

Enabling full SQL debug logs will create a lot of log files on the primary GUI server
Care must be taken to perform a daily archive of these logs stretching back to at least 3 months

4. Managing the ongoing deletion of old historical records

Remember to decrease NumDaysHistoricalDataToKeep by 1 each day if it was increased as per action (D) above. This should be done daily until the setting is back to its original value.

Contents

Hive off recovery process for Oracle

Process Summary

Version Compatibility Notice

Version-Specific Impact

Definitions and Background

Steps involved in hive off process

Legacy Mode (versions earlier than 9.0.120)

Modern Mode (versions 9.0.120 and later)

Handling hive off failure

Determining at which step the hive off failed

Version-specific behaviour notes

Recovery process

Corrective actions in detail

(A) Delete duplicate records from ‘daily’ tables

(B) Truncate data in ‘daily’ tables

(C) Delete orphan records from ‘daily’ tables

(D) Increase ‘numdayshistoricaldatatokeep’ setting

Next Steps

1. Ensuring that the next hive off will run

2. (Optional) Manually re-running the hive-off

3. Identifying the root cause

4. Managing the ongoing deletion of old historical records

Contact

Genesys

Customer Care

Legal