
Importing and Managing Datasets

The data you want to import must be collected in a CSV file. For a detailed discussion of the types of data you might use and how it is processed in Predictive Routing, see the Genesys Predictive Routing Deployment and Operations Guide.

To open the configuration menu, click the Settings gear icon, located on the right side of the top menu bar.

To view updates to the dataset, such as appended data, reload the page.
  • Because a large, complex dataset takes significant time to import and display, Genesys recommends that you start a new deployment by importing a small test dataset, which loads quickly and enables you to troubleshoot any issues efficiently.
  • If you use a Microsoft editor to create your CSV file, remove the carriage return (^M) character before uploading. Microsoft editors such as Excel, WordPad, and Notepad automatically insert this character. For tips on removing the character from Excel files, refer to How to remove carriage returns (line breaks) from cells in Excel 2016, 2013, 2010.
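
If you prefer to remove the carriage returns with a script rather than in the editor, the following minimal Python sketch shows one way to do it. It is not a Genesys tool; the function name strip_carriage_returns is hypothetical, and the sketch assumes the file uses UTF-8 or another ASCII-compatible encoding. It removes every carriage return, including any inside quoted cells.

```python
from pathlib import Path


def strip_carriage_returns(src, dst):
    """Copy src to dst with every carriage return (\\r, shown as ^M) removed.

    Hypothetical helper: assumes an ASCII-compatible encoding such as UTF-8,
    so the raw byte 0x0D always represents a carriage return.
    Returns the number of carriage returns that were removed.
    """
    data = Path(src).read_bytes()
    cleaned = data.replace(b"\r", b"")
    Path(dst).write_bytes(cleaned)
    return len(data) - len(cleaned)
```

Writing to a separate output file keeps the original export intact in case you need to compare the two.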

Unsupported Characters in Agent and Customer Profiles and Datasets

Certain characters in column names are ignored, are unsupported, or cause an upload to fail, as explained in the following points:

  • Columns with the following symbols in their column names are not added to Agent Profiles or Customer Profiles:
    *, !, %, ^, (, ), ', &, /, â, è, ü, ó, â, ï
  • The following symbols are ignored when they appear in a column name. The column is added with the symbol dropped, as though it had not been entered:
    [Space], -, <
  • Non-ASCII characters are not supported. How they are handled differs depending on what data you are uploading:
    • In Agent Profiles and Customer Profiles, columns with non-ASCII characters in the column name are not added.
    • In Datasets, when a column name contains a mix of ASCII and non-ASCII characters, GPR removes the non-ASCII characters from the column name as though they had not been entered and correctly uploads all column values.
    • In Datasets, when a column name contains only non-ASCII characters, the column name is entirely omitted. All the column values are preserved, but you cannot modify or save the schema. In this scenario, GPR generates the following error message: An unhandled exception has occurred: KeyError('name').
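
To catch these problems before uploading, you could check your column names locally. The following Python sketch is an illustration only, not part of GPR: the names REJECTED_SYMBOLS, DROPPED_SYMBOLS, and check_column_name are hypothetical, and the symbol sets are copied from the lists above. The accented characters in the first list are covered by the general non-ASCII check.

```python
# Symbols that, per the lists above, prevent a column from being added
# to Agent Profiles or Customer Profiles.
REJECTED_SYMBOLS = set("*!%^()'&/")

# Symbols that are dropped from the column name on upload.
DROPPED_SYMBOLS = set(" -<")


def check_column_name(name):
    """Return a list of warnings for one column name (empty if none apply)."""
    chars = set(name)
    rejected = sorted(ch for ch in chars if ch in REJECTED_SYMBOLS)
    dropped = sorted(ch for ch in chars if ch in DROPPED_SYMBOLS)
    non_ascii = sorted(ch for ch in chars if ord(ch) > 127)

    warnings = []
    if rejected:
        warnings.append(f"symbols that block profile upload: {rejected}")
    if dropped:
        warnings.append(f"symbols that are dropped from the name: {dropped}")
    if non_ascii:
        warnings.append(f"non-ASCII characters: {non_ascii}")
    return warnings
```

For example, you might read the header row of your CSV file and print the warnings for each column name before starting an import.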

Logs for Unsupported Characters

The following Agent State Connector log messages record issues with unsupported characters:

  • <datetime> [47] ERROR <BOTTLE> schema_based.py:63 Invalid expression while parsing: <fieldname> = None
  • <datetime> [47] ERROR <BOTTLE> agents.py:172 Fields set([u'<fieldname>']) were ignored because names were invalid.

Create a new dataset


To import data:

  1. Select Dataset from the left-side navigation bar.
  2. Click Create dataset.

Name your dataset and select the data file


In the Create Dataset window that opens, perform the following steps:

  1. Enter a name for your dataset.
  2. Select the separator type for your CSV file. You can choose either TAB or COMMA.
  3. Click Select File. Navigate to your CSV file and select it.
  4. Click Create.
    GPR automatically determines the data types of the columns in your dataset during dataset initialization by analyzing the first 1000 rows of each column. To ensure that GPR can make a correct determination, Genesys recommends that you insert a "dummy" row at the beginning of your dataset that contains values that can be unambiguously interpreted as the expected data types for each column. This prevents cases in which the first 1000 rows may contain all NULL or 0 values, which might lead to an incorrect data type assignment (since 0 can be a valid integer, float, or Boolean value). If a column does contain meaningful values, the dummy row is analyzed along with the other values and contributes to the data type determination. Genesys recommends you use the following data type specifications in your dummy row:
    • ‘a_string’ - Is recognized as a string.
    • 2.1, or any integer or float value > 1 - Is recognized as a float.
    • False or True - Is recognized as Boolean.
    • Unix Timestamp (such as 1535538976) - Is recognized as a timestamp.
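
The dummy row can be added with a short script instead of by hand. The following Python sketch makes several assumptions that are not stated above: the first row of the file is a header row, the dummy row belongs directly after it, and the quotation marks around a_string are typographic rather than part of the value. The names DUMMY_VALUES and add_dummy_row are hypothetical, and you supply the expected type for each column yourself.

```python
import csv

# Dummy values taken from the recommendations above, keyed by expected type.
DUMMY_VALUES = {
    "string": "a_string",
    "float": "2.1",
    "boolean": "True",
    "timestamp": "1535538976",
}


def add_dummy_row(src, dst, column_types, delimiter=","):
    """Copy src to dst, inserting one dummy row directly after the header.

    column_types maps each column name to a key of DUMMY_VALUES.
    Use delimiter="\\t" for a TAB-separated file.
    """
    with open(src, newline="", encoding="utf-8") as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin, delimiter=delimiter)
        # lineterminator="\n" avoids writing carriage returns.
        writer = csv.writer(fout, delimiter=delimiter, lineterminator="\n")
        header = next(reader)
        writer.writerow(header)
        writer.writerow([DUMMY_VALUES[column_types[col]] for col in header])
        writer.writerows(reader)
```

A call might look like add_dummy_row("calls.csv", "calls_with_dummy.csv", {"agent": "string", "score": "float"}), where the file and column names are placeholders.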

Select a timestamp field and review your data schema


When your data has been uploaded, perform the following steps:

  1. Click the name of your new dataset to open it.
  2. Select the checkbox next to the field which contains timestamp data for the dataset. Then, click Set as Created Time. In the graphic, the INTERACTION_DATE field, which displays the CREATED_AT FIELD identifier, contains the timestamp data for the dataset.
    GPR supports the following timestamp formats:
    • Unix timestamp format, such as: 1493325496
    • %Y-%m-%d %H:%M:%S.%f format, such as: 2017-05-13 21:11:01.436757
  3. Review your dataset to make sure that each field contains the correct datatype. If necessary, change it by selecting the correct datatype from the drop-down lists in the Type column of the table.
  4. Optionally, add descriptions for your data fields.
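
Before choosing the Created Time field, it can help to confirm that its values match one of the two supported timestamp formats. The following Python sketch is a local check only and does not reproduce GPR's own parsing. The names GPR_DATETIME_FORMAT and timestamp_kind are hypothetical, and the sketch assumes that Unix timestamps are whole numbers of seconds.

```python
from datetime import datetime

# The second supported format listed above.
GPR_DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def timestamp_kind(value):
    """Classify a value as 'unix', 'datetime', or None if it matches neither.

    Assumption: a Unix timestamp is a string of ASCII digits only.
    """
    text = value.strip()
    if text.isascii() and text.isdigit():
        return "unix"
    try:
        datetime.strptime(text, GPR_DATETIME_FORMAT)
    except ValueError:
        return None
    return "datetime"
```

Note that a value without fractional seconds, such as 2017-05-13 21:11:01, does not match the second format and is reported as None by this check.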

Save changes and synchronize


Once you have reviewed your data schema for accuracy, perform the following steps:

  1. Click Save Changes.
  2. Click Sync Schema.

When the schema has been synced, the Schema out of synchronization message changes to Schema synchronized and the associated icon turns green.

Viewing and updating datasets


On the Schema > Datasets tab, you can view a list of your datasets. The Status column shows whether the schema is synchronized, while the Created, Updated, and Description columns enable you to see more information about your dataset. The following actions are also available from this list:

  • Search - To locate a specific dataset or field in the associated list, type the dataset or field name into the Search field on the upper right side of the tab.
  • Delete - To delete a dataset, select the check box in the row for that dataset. Then click Delete Selected.
  • View dataset fields - Click the name of a dataset in the list to view a table showing all of its fields, and giving the visibility status, type, cardinality, and (optionally) description for each.
    • Toggle field visibility - Click the radio toggle switch to show or hide fields. When a dataset has many fields, you might want to hide some to view the most relevant fields more easily. Hiding fields only removes them from your view. Hidden fields are still used in Feature Analysis reports for predictors and datasets.
      Viewing a dataset with a large number of columns on the Datasets tab can make the page respond very slowly. To improve performance, leave no more than about 20 fields visible.
    • Append data - Click the name of a dataset in the list to append data to that dataset.
      Your appended data must have the same schema structure as the existing data. You can add fields and values, but you cannot change the existing schema. If you need to change the structure of your schema, delete the existing schema and upload your corrected data as a new dataset.
    • View field values - From the list of field names, click the name of a field to see a complete list of all values for that field.


This page was last modified on November 12, 2018, at 12:36.