Jump to: navigation, search

IWD ETL Jobs

The data in iWD Data Mart is made available through a process that is called Extract, Transform and Load—or ETL, for short. The system that is used to create, configure, and execute the ETL process is Kettle, which is part of the Pentaho reporting suite. Kettle top-level objects are known as jobs. Jobs are a sequence of steps that are executed according to success/failure criteria. One of the steps that is used by iWD reporting is to transform steps.

The following list describes the preconfigured ETL jobs that are responsible for creating and populating the iWD Data Mart and for calculating the various aggregates and dimensions that are described in the iWD Data Mart schema. Jobs names are those displayed in the iWD GAX Plug-in's Data Mart Dashboard.

Preconfigured ETL Jobs
Job Function Attribute Description
Initialize iWD Data Job Name Initialize
Function Initializes the necessary data structures and populates static dimensions, such as the AGE, BUSINESS_VALUE, EVENT_DATE, EVENT_TIME, DATE_TIME, PRIORITY, STATUS, TASK_EVENT_TYPE, and TIMEZONE dimensions.
Schedule Runs once.
Load Configuration Job Name Load Configuration
Function Loads iWD configuration updates from the following ETL parameters:
  • customTaskAttributeMapping
  • customTaskDimensionMapping
  • customTenantAttributeMapping
  • customDepartmentAttributeMapping
  • customProcessMapping

into the ETL_CUSTOM_MAP and CUSTOM_DIM Data Mart tables. This function loads iWD configuration updates that have been gathered from the configuration database into the following Data Mart tables:

Schedule Configurable through service properties; typically, runs on a 15-minute cycle, but not more frequently than a 15-minute cycle.
Load Intraday Job Name Load Intraday
Function Loads updates from tables in the Interaction Server database and event log into the following core fact tables:

as well as the following dimensions:

Schedule Configurable through Service Properties in iWD GAX Plug-in; recommended that it be scheduled to run after the Load Configuration job ends through the Job Dependency scheduling option.
Aggregate Intraday Job Name Aggregate Intraday
Function Aggregates data that previously was loaded into fact tables by the Load Intraday job into the aggregation tables.
Schedule It is recommended that this job be scheduled immediately after the Load Intraday job has completed—typically, running every 15 minutes. The frequency of running this aggregate job does not have any bearing on the current 15-minute aggregate that is being populated.
Aggregate Statistics Job Name Aggregate Statistics
Function Generate extended statistics by executing statistics plug-ins.
Schedule It is recommended that this job be scheduled immediately following completion of the Aggregate Intraday job, because most of the statistics plug-ins user aggregated facts.
Load Historical Job Name Load Historical
Function Moves noncurrent data from the intraday fact tables to their corresponding historical fact tables. (“Noncurrent” refers to data other than today’s data [CREATED_DATE_KEY < today].)
  • Noncurrent I_TASK_EVENT_FACT data is moved to H_TASK_EVENT_FACT.
  • Noncurrent I_TASK_WORK_FACT data is moved to H_TASK_WORK_FACT.
  • Noncurrent I_TASK_FACT data is moved to H_TASK_FACT.
  • Noncurrent I_TASK_aggr_FACT_15min data is moved to H_TASK_aggr_FACT_15min.
Important
Task facts must also be finalized (having reached Completed, Canceled, or Rejected state) before they can be moved from the intraday fact tables regardless of duration in the intraday tables.
Schedule Runs daily through the schedule that is defined in Service Properties.
Aggregate Historical Job Name Aggregate Historical
Function Aggregates data for the historical DAY aggregation tables (H_TASK_aggr_FACT_DAY).
Schedule Runs once a day, after the Load Historical job.
Maintain iWD Job Name Maintain
Function Removes expired facts from the following tables based on the values that are set in the detailsExpirationDays and aggregation15min ExpirationDays parameters:
  • H_TASK_EVENT_FACT
  • H_TASK_WORK_FACT
  • H_TASK_FACT
  • H_TASK_aggr_FACT_15min

(These parameters are defined as rules on the ETL Service property in iWD GAX Plug-in.) This job also adds to DATE_TIME ensuring that next-year values are present in this table.

Schedule Runs once a day, after the Aggregate Historical job.
Prune events and interactions Job Name Prune
Function Removes events from the Event Log database when tasks are archived—that is, when they are moved or deleted from Interaction Server.
Schedule Runs once a day, after the Maintain job.

You can schedule ETL jobs to run:

  • On a recurring basis by using a CRON expression.
  • Manually.
  • Upon the successful completion of a dependent service.

For more information on the configuration of scheduling ETL jobs, refer to the iWD 8.5 GAX Plug-in Help.

This page was last edited on November 29, 2017, at 09:11.
Comments or questions about this documentation? Contact us for support!