Migrate data from Cassandra database versions

Cassandra versions 2.x and later are not backward compatible with Cassandra versions 1.x, so data must be migrated when you upgrade Feature Server's Cassandra database backend from version 1.x to version 2.x or 3.x. Feature Server release 8.1.202.02 includes the following Python scripts for migrating data from Cassandra version 1.x to versions 2.x and 3.x:

  • copyKeyspaceSchema.py—Creates a keyspace and its column families in the destination Cassandra cluster.
  • copyKeyspaceColumnFamilies.py—Copies content of source keyspace column families to the destination keyspace column families.

Prerequisites for data migration

Before you migrate data from Cassandra version 1.x to version 2.x or 3.x, make sure that the following prerequisites are met:

  • The destination Cassandra cluster must be deployed, and all of its nodes must be up and running.
  • In terms of Feature Server deployment, the destination Cassandra cluster must be deployed in external mode.
  • The destination Cassandra cluster must not have any Feature Servers assigned to it until the data copy from the source Cassandra cluster is complete.
  • SIP Feature Server must run in ReadOnly mode so that data is copied correctly while Feature Servers keep running during the migration. Turn on ReadOnly mode before you deploy the Python migration scripts. Use the following configuration:
    [Cassandra]readOnly=true
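
As a rough pre-flight check for the first prerequisite, the sketch below tests whether a destination node accepts TCP connections on its Thrift port. It is only an illustration and is not part of the shipped migration scripts: the function name is hypothetical, and an open port shows only that something is listening there, not that the Cassandra node is healthy.

```python
import socket


def node_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
    sock.close()
    return True
```

For example, node_is_reachable("CassNode01", 9160) checks the sample destination host used on this page. Repeat the call for every node in the destination cluster.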

Run Cassandra database migration scripts

To migrate data from Cassandra version 1.x to version 2.x or 3.x, complete the following steps:

  1. Deploy the Python scripts.
  2. Run the Python scripts.
  3. Connect Feature Server nodes to migrated Cassandra cluster.

Deploy the Python scripts

  1. Install the 32-bit version of Python 2.7.5 and the Pycassa library on the destination Cassandra host where the scripts will run.
  2. Copy the Python scripts copyKeyspaceSchema.py and copyKeyspaceColumnFamilies.py, together with the sample JSON input file copyKeyspaceInput.json, to a directory on the destination Cassandra host. These files are located in the Python utilities folder of the Feature Server deployment: FS installation path/Python/util/.
  3. Navigate to that directory and run the scripts.

For more details, refer to Python Scripts.
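
The copy step above can be sketched as a small helper. This is a minimal illustration, assuming the Python utilities folder is readable from the destination host (for example through a mounted share, or after the folder has been transferred there). The function name and both directory arguments are hypothetical; only the three file names come from this page.

```python
import os
import shutil

# File names as listed in the deployment step above.
MIGRATION_FILES = (
    "copyKeyspaceSchema.py",
    "copyKeyspaceColumnFamilies.py",
    "copyKeyspaceInput.json",
)


def stage_migration_files(util_dir, work_dir):
    """Copy the migration files from util_dir into work_dir.

    Raises IOError naming every missing file before copying anything.
    """
    missing = [name for name in MIGRATION_FILES
               if not os.path.isfile(os.path.join(util_dir, name))]
    if missing:
        raise IOError("missing in %s: %s" % (util_dir, ", ".join(missing)))
    if not os.path.isdir(work_dir):
        os.makedirs(work_dir)
    copied = []
    for name in MIGRATION_FILES:
        target = os.path.join(work_dir, name)
        shutil.copy2(os.path.join(util_dir, name), target)
        copied.append(target)
    return copied
```

Checking for all three files up front avoids a half-populated working directory when one of them is missing.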

Run the Python scripts

The following is a sample copyKeyspaceInput.json input file:

{"sourceHostPort": "FsNode01:9160",
 "sourceHostUserName": "",
 "sourceHostPassword": "",
 "sourceHostTls": "false",
 "destinationHostPort": "CassNode01:9160",
 "destinationHostUserName": "",
 "destinationHostPassword": "",
 "destinationHostTls": "false",
 "replicationStrategyClassName": "NetworkTopologyStrategy",
 "replicationOptions": {"DC1": "2", "DC2": "2"},
 "sourceKeyspace": "sipfs",
 "destinationKeyspace": "sipfs",
 "excludedCFs": [ ],
 "includedCFs": [ ] }

Copy keyspace schema

The following steps show the procedure to copy the keyspace schema:

  1. Verify that the input JSON file has the following parameters:

    | Parameter | Description | Sample | Mandatory |
    |---|---|---|---|
    | sourceHostPort | Host and Thrift port of the source Cassandra database, in the format host IP:port | FsNode01:9160 | Yes |
    | destinationHostPort | Host and Thrift port of the destination Cassandra database, in the format host IP:port | CassNode01:9160 | Yes |
    | sourceKeyspace | Name of the source keyspace | sipfs | Yes |
    | destinationKeyspace | Name of the destination keyspace | sipfs | Yes |
    | replicationStrategyClassName | Replication strategy class name | NetworkTopologyStrategy | Yes |
    | replicationOptions | Replication options for the destination keyspace. Configure this value according to the cassandra-topology.properties file. | {"DC1": "2", "DC2": "2"} | Yes |
    | sourceHostUserName | Username for the source Cassandra cluster | FSadmin | Yes, if authentication is enabled in the source Cassandra cluster |
    | sourceHostPassword | Password for the source Cassandra cluster | FSadmin | Yes, if authentication is enabled in the source Cassandra cluster |
    | sourceHostTls | Set this option to true when SSL is enabled for the source Cassandra connection | true | Yes, if SSL is enabled for the source Cassandra cluster |
    | destinationHostUserName | Username for the destination Cassandra cluster | FSadmin | Yes, if authentication is enabled in the destination Cassandra cluster |
    | destinationHostPassword | Password for the destination Cassandra cluster | FSadmin | Yes, if authentication is enabled in the destination Cassandra cluster |
    | destinationHostTls | Set this option to true when SSL is enabled for the destination Cassandra connection | true | Yes, if SSL is enabled for the destination Cassandra cluster |

  2. Run the copyKeyspaceSchema.py script.

    Sample command line:
    python ./copyKeyspaceSchema.py -i ./copyKeyspaceInput.json -o ./copyKeyspaceSchema_`date +%y%m%d-%H:%M`.log
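
Before running the script, it can help to confirm that the input file parses and that the mandatory parameters listed above are filled in. The following is a minimal sketch, not part of the shipped scripts: the function names and the two authentication flags are hypothetical, and the TLS parameters are not checked.

```python
import json

# Parameters described above as always mandatory for copyKeyspaceSchema.py.
REQUIRED = (
    "sourceHostPort", "destinationHostPort",
    "sourceKeyspace", "destinationKeyspace",
    "replicationStrategyClassName", "replicationOptions",
)


def check_schema_input(cfg, source_auth=False, destination_auth=False):
    """Return a list of problems found in a parsed input dictionary."""
    problems = []
    for key in REQUIRED:
        if not cfg.get(key):
            problems.append("missing or empty: " + key)
    # Host values are documented as host:port, so expect a numeric port.
    for key in ("sourceHostPort", "destinationHostPort"):
        value = cfg.get(key) or ""
        host, sep, port = value.rpartition(":")
        if value and not (host and sep and port.isdigit()):
            problems.append("not in host:port form: " + key)
    # Credentials are mandatory only when authentication is enabled.
    for enabled, side in ((source_auth, "source"),
                          (destination_auth, "destination")):
        if enabled:
            for suffix in ("HostUserName", "HostPassword"):
                if not cfg.get(side + suffix):
                    problems.append("missing or empty: " + side + suffix)
    return problems


def load_and_check(path, **auth_flags):
    """Parse the JSON file at path and run check_schema_input on it."""
    with open(path) as handle:
        return check_schema_input(json.load(handle), **auth_flags)
```

An empty list from load_and_check("./copyKeyspaceInput.json") means that no problem was detected. A syntax error in the file surfaces earlier, as an exception from json.load.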

Copy keyspace column families

  1. Verify that the input JSON file has the following parameters:

    | Parameter | Description | Sample | Mandatory |
    |---|---|---|---|
    | sourceHostPort | Host and Thrift port of the source Cassandra database, in the format host IP:port | FsNode01:9160 | Yes |
    | destinationHostPort | Host and Thrift port of the destination Cassandra database, in the format host IP:port | CassNode01:9160 | Yes |
    | sourceKeyspace | Name of the source keyspace | sipfs | Yes |
    | destinationKeyspace | Name of the destination keyspace | sipfs | Yes |
    | excludedCFs | Comma-separated list of column family names to exclude from copying when the copyKeyspaceColumnFamilies.py script runs | message_bytes, device | No |
    | includedCFs | Comma-separated list of column family names to copy when the copyKeyspaceColumnFamilies.py script runs | message_bytes, device | No |
    | sourceHostUserName | Username for the source Cassandra cluster | FSadmin | Yes, if authentication is enabled in the source Cassandra cluster |
    | sourceHostPassword | Password for the source Cassandra cluster | FSadmin | Yes, if authentication is enabled in the source Cassandra cluster |
    | sourceHostTls | Set this option to true when SSL is enabled for the source Cassandra connection | true | Yes, if SSL is enabled for the source Cassandra cluster |
    | destinationHostUserName | Username for the destination Cassandra cluster | FSadmin | Yes, if authentication is enabled in the destination Cassandra cluster |
    | destinationHostPassword | Password for the destination Cassandra cluster | FSadmin | Yes, if authentication is enabled in the destination Cassandra cluster |
    | destinationHostTls | Set this option to true when SSL is enabled for the destination Cassandra connection | true | Yes, if SSL is enabled for the destination Cassandra cluster |

    If one or more source column families contain very large volumes of data, run the copyKeyspaceColumnFamilies.py script for those column families separately from the rest of the source column families. Use the excludedCFs and includedCFs parameters to exclude or include specific column families. When the includedCFs list is not empty, the excludedCFs parameter is ignored and only the column families in the includedCFs list are copied.

    For example, to copy the content of all column families except the message_bytes column family, provide the following JSON file as input to the copyKeyspaceColumnFamilies.py script:
    {"sourceHostPort": "FsNode01:9160",
     "sourceHostUserName": "",
     "sourceHostPassword": "",
     "sourceHostTls": "false",
     "destinationHostPort": "CassNode01:9160",
     "destinationHostUserName": "",
     "destinationHostPassword": "",
     "destinationHostTls": "false",
     "replicationStrategyClassName": "NetworkTopologyStrategy",
     "replicationOptions": {"DC1": "2", "DC2": "2"},
     "sourceKeyspace": "sipfs",
     "destinationKeyspace": "sipfs",
     "excludedCFs": [ "message_bytes" ],
     "includedCFs": [ ] }
    To copy the content of only the message_bytes column family, provide the following JSON file as input to the copyKeyspaceColumnFamilies.py script:
    {"sourceHostPort": "FsNode01:9160",
     "sourceHostUserName": "",
     "sourceHostPassword": "",
     "sourceHostTls": "false",
     "destinationHostPort": "CassNode01:9160",
     "destinationHostUserName": "",
     "destinationHostPassword": "",
     "destinationHostTls": "false",
     "replicationStrategyClassName": "NetworkTopologyStrategy",
     "replicationOptions": {"DC1": "2", "DC2": "2"},
     "sourceKeyspace": "sipfs",
     "destinationKeyspace": "sipfs",
     "excludedCFs": [ ],
     "includedCFs": [ "message_bytes" ] }
  2. Run the copyKeyspaceColumnFamilies.py script.

    Sample command line:
    python ./copyKeyspaceColumnFamilies.py -i ./copyKeyspaceInput.json -o ./copyKeyspaceContent_`date +%y%m%d-%H:%M`.log
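
    The way includedCFs takes precedence over excludedCFs can be summarized in a few lines. This sketch only illustrates the rule as documented above; it is not the code of copyKeyspaceColumnFamilies.py. The function name and the column family name other_cf are hypothetical, and how the real script treats a listed name that does not exist in the source keyspace is not covered here.

```python
def select_column_families(all_cfs, included_cfs, excluded_cfs):
    """Return the column families to copy, in their original order."""
    if included_cfs:
        # A non-empty includedCFs list wins; excludedCFs is ignored.
        wanted = set(included_cfs)
        return [cf for cf in all_cfs if cf in wanted]
    skipped = set(excluded_cfs)
    return [cf for cf in all_cfs if cf not in skipped]


cfs = ["message_bytes", "device", "other_cf"]
print(select_column_families(cfs, [], ["message_bytes"]))
# prints ['device', 'other_cf']
print(select_column_families(cfs, ["message_bytes"], ["message_bytes"]))
# prints ['message_bytes']
```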

    Important
    If there are regional keyspaces to be copied, copy all keyspaces (the global keyspace and every regional keyspace) one after the other. To do so, run the scripts once for each keyspace.
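
When several keyspaces have to be copied, one way to keep the runs organized is to derive a separate input file and log file for each keyspace from a common base configuration. The sketch below is an illustration under stated assumptions: it assumes each keyspace keeps the same name in the destination cluster (as in the sample input file), the per-keyspace file names and the keyspace name sipfs_region1 are hypothetical, and replicationOptions may still need adjusting for each keyspace. Only the -i and -o options come from the sample command lines on this page.

```python
import copy
import json


def plan_keyspace_runs(base_cfg, keyspaces, stamp):
    """Return one (input_file, config, commands) tuple per keyspace."""
    runs = []
    for name in keyspaces:
        cfg = copy.deepcopy(base_cfg)
        # Assumption: the keyspace keeps its name in the destination cluster.
        cfg["sourceKeyspace"] = name
        cfg["destinationKeyspace"] = name
        input_file = "copyKeyspaceInput_%s.json" % name
        commands = [
            ["python", "./" + script, "-i", "./" + input_file,
             "-o", "./%s_%s_%s.log" % (log_prefix, name, stamp)]
            for script, log_prefix in (
                ("copyKeyspaceSchema.py", "copyKeyspaceSchema"),
                ("copyKeyspaceColumnFamilies.py", "copyKeyspaceContent"),
            )
        ]
        runs.append((input_file, cfg, commands))
    return runs


def write_inputs(runs):
    """Write each planned configuration as a JSON file in the current directory."""
    for input_file, cfg, _ in runs:
        with open(input_file, "w") as handle:
            json.dump(cfg, handle, indent=1)
```

Generating the files with json.dump also avoids hand-editing mistakes such as a missing comma or typographic quotation marks. The returned command lists can be reviewed and then run one at a time, schema first and column families second for each keyspace.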

Connecting Feature Server nodes to migrated Cassandra cluster

The following steps should be performed for every Feature Server node involved:

  1. Disable the ReadOnly mode in Feature Server. Use the configuration: [Cassandra]readOnly=false
  2. Stop Feature Server node.
  3. Edit the <FS installation path>\launcher.xml file and set the startCassandra property to false:

    <parameter name="startCassandra" displayName="com.genesyslab.common.application.cassandraServer" hidden="true" mandatory="false">
    <description><![CDATA[ Start Cassandra Server]]></description>
    <valid-description><![CDATA[]]></valid-description>
    <effective-description/>
    <format type="string" default="false"/>
    <validation>
    </validation>
    </parameter>
  4. Update the [Cassandra] section of the Feature Server application as shown in the following table:

    | [Cassandra] section option | Default value | Feature Server application value | Mandatory |
    |---|---|---|---|
    | nodes | NA | Configure the IP addresses of all Cassandra nodes that belong to the data center where Feature Server is installed. | Yes |
    | nodeFailureTolerance | Replication factor of the Feature Server data center minus 1. If a regional keyspace is used, the lower of the global keyspace and regional keyspace replication_factor values for its data center, minus 1. For example, if DC1 contains 4 nodes and the replication_factor is 3 for the global keyspace and 2 for the regional keyspace, then the value is 1. | | No |
    | keyspace | sipfs | Name of the global keyspace. This option must have the same value as the keyspace name parameter for the copyKeyspaceSchema.py script when copying the global keyspace. | No |
    | replicationStrategyClassName | NA | This option must have the same value as the replication strategy class name parameter for the copyKeyspaceSchema.py script when copying both the global keyspace and the regional keyspace. | Yes |
    | replicationOptions | NA | This option must have the same value as the replication options parameter for the copyKeyspaceSchema.py script. | Yes |
    | regionalKeyspace | sipfs_<region> | Name of the regional keyspace. This option must have the same value as the keyspace name parameter for the copyKeyspaceSchema.py script when copying the regional keyspace. | Mandatory if regional keyspaces are enabled |
    | regionalReplicationOptions | NA | This option must have the same value as the replication options parameter for the copyKeyspaceSchema.py script. | Mandatory if regional keyspaces are enabled |
    | username | cassandra | Cassandra username | Mandatory if authentication is enabled in the Cassandra cluster |
    | password | cassandra | Cassandra password | Mandatory if authentication is enabled in the Cassandra cluster |

  5. Start the Feature Server node.
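
Because several [Cassandra] options must match the values that were passed to copyKeyspaceSchema.py, a quick comparison can catch a mismatch before the node is started. The sketch below is illustrative only and rests on assumptions: both sides are supplied as Python dictionaries that the caller has already parsed into comparable values, the keyspace options are compared with destinationKeyspace, and replicationStrategyClassName is compared with the input parameter of the same name. The function and constant names are hypothetical.

```python
# Assumed pairing of ([Cassandra] option, input-file parameter) for the
# global keyspace and for a regional keyspace.
GLOBAL_PAIRS = (
    ("keyspace", "destinationKeyspace"),
    ("replicationStrategyClassName", "replicationStrategyClassName"),
    ("replicationOptions", "replicationOptions"),
)
REGIONAL_PAIRS = (
    ("regionalKeyspace", "destinationKeyspace"),
    ("regionalReplicationOptions", "replicationOptions"),
)


def find_mismatches(fs_options, migration_input, pairs=GLOBAL_PAIRS):
    """Return (option, fs_value, input_value) for every pair that differs."""
    mismatches = []
    for option, parameter in pairs:
        fs_value = fs_options.get(option)
        input_value = migration_input.get(parameter)
        if fs_value != input_value:
            mismatches.append((option, fs_value, input_value))
    return mismatches
```

Call find_mismatches with GLOBAL_PAIRS and the input file used for the global keyspace, then with REGIONAL_PAIRS and the input file used for each regional keyspace. An empty list means that the compared values agree.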

Deactivating Embedded Cassandra module for version 8.1.203 and later

If you have upgraded SIP Feature Server to version 8.1.203 or later and your deployment uses an external Cassandra cluster, you can deactivate the Embedded Cassandra module. Deactivating the module is recommended but optional.

To deactivate the Embedded Cassandra module:

  1. Locate the start.ini file at <FS installation folder>/start.ini.
  2. Open the file with a text editor and remove the line --module=fs-cass11.
  3. Save the file.
  4. Restart SIP Feature Server if it is running.
  5. After restart, remove the installation files from the <FS installation folder>/lib/fs-cass11 folder.
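
The edit to start.ini in steps 1 through 3 can also be scripted. The following is a minimal sketch rather than a supported tool: the function names are hypothetical, and keeping a .bak copy of the original file is an added precaution that the procedure above does not require.

```python
MODULE_LINE = "--module=fs-cass11"


def remove_embedded_cassandra(text):
    """Return start.ini content without the embedded Cassandra module line."""
    kept = [line for line in text.splitlines(True)
            if line.strip() != MODULE_LINE]
    return "".join(kept)


def deactivate_in_file(path):
    """Rewrite the file at path in place, keeping a .bak copy of the original.

    Returns True if the module line was found and removed.
    """
    with open(path) as handle:
        original = handle.read()
    updated = remove_embedded_cassandra(original)
    if updated == original:
        return False
    with open(path + ".bak", "w") as handle:
        handle.write(original)
    with open(path, "w") as handle:
        handle.write(updated)
    return True
```

A return value of False means that the line was not present, so the file was left untouched.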
This page was last edited on January 23, 2024, at 06:37.