Migrate data from Cassandra database versions
Cassandra versions 2.x and higher are not backward compatible with Cassandra versions 1.x, so data migration is required when upgrading Feature Server's Cassandra database backend from version 1.x to version 2.x or 3.x. Feature Server release 8.1.202.02 includes the following Python scripts for migrating data from Cassandra database version 1.x to versions 2.x and 3.x:
- copyKeyspaceSchema.py—Creates a keyspace and its column families in the destination Cassandra cluster.
- copyKeyspaceColumnFamilies.py—Copies content of source keyspace column families to the destination keyspace column families.
Pre-requisites for data migration
The following are the pre-requisites for the data migration from versions 1.x to versions 2.x and/or 3.x.
- The destination Cassandra cluster must be deployed, and all of its nodes must be up and running.
- In terms of Feature Server deployment, the destination Cassandra cluster must be deployed in external mode.
- The destination Cassandra cluster must not have any Feature Servers assigned to it before the copying of data from the source Cassandra cluster is completed.
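One way to sanity-check the first pre-requisite is to confirm that each destination node accepts TCP connections on its Thrift port. The following is a minimal sketch and is not part of the Feature Server scripts; the function names are illustrative. An open port only shows that something is listening, not that the node has fully joined the cluster.

```python
import socket


def thrift_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Covers refused connections, timeouts, and name resolution failures.
        return False
    sock.close()
    return True


def unreachable_nodes(nodes, timeout=3.0):
    """Return the 'host:port' entries that did not accept a connection."""
    failed = []
    for node in nodes:
        host, _, port = node.rpartition(":")
        if not thrift_port_open(host, int(port), timeout):
            failed.append(node)
    return failed
```

For example, `unreachable_nodes(["CassNode01:9160"])` returns an empty list when the sample destination node is reachable on port 9160.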
Run Cassandra database migration scripts
The following steps show how to migrate data from Cassandra v1.x to v2.x and v3.x:
- Deploy the Python scripts.
- Run the Python scripts.
- Connect Feature Server nodes to the migrated Cassandra cluster.
Deploy the Python scripts
- Install the 32-bit version of Python 2.7.5 and the Pycassa libraries on the destination Cassandra host where the scripts will be run.
- The Python scripts copyKeyspaceSchema.py and copyKeyspaceColumnFamilies.py and the sample JSON input file copyKeyspaceInput.json are located in the Python utilities folder of the Feature Server deployment: <FS installation path>/Python/util/. Copy these files to a directory on the destination Cassandra host.
- Navigate to that directory and run the scripts.
For more details, refer to Python Scripts.
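Before running the scripts, it can help to confirm that the interpreter on the destination host matches the requirements above. The following is a minimal sketch, not part of the Feature Server distribution; the function name and the report keys are illustrative. It only reports what it finds and does not install anything.

```python
import struct
import sys


def check_environment():
    """Collect interpreter facts that the deployment step cares about."""
    try:
        __import__("pycassa")
        has_pycassa = True
    except ImportError:
        has_pycassa = False
    return {
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        "is_python_2_7": sys.version_info[:2] == (2, 7),
        "pointer_bits": struct.calcsize("P") * 8,  # 32 on a 32-bit build
        "has_pycassa": has_pycassa,
    }


if __name__ == "__main__":
    for name, value in sorted(check_environment().items()):
        print("%s: %s" % (name, value))
```

Run it with the same interpreter that will run the migration scripts, so that the report describes the right installation.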
Run the Python scripts
The following is a sample copyKeyspaceInput.json input file:
{"sourceHostPort": "FsNode01:9160",
"sourceHostUserName": "",
"sourceHostPassword": "",
"sourceHostTls": "false",
"destinationHostPort": "CassNode01:9160",
"destinationHostUserName": "",
"destinationHostPassword": "",
"destinationHostTls": "false",
"replicationStrategyClassName": "NetworkTopologyStrategy",
"replicationOptions": {"DC1":"2", "DC2":"2"},
"sourceKeyspace": "sipfs",
"destinationKeyspace": "sipfs",
"excludedCFs": [ ],
"includedCFs": [ ] }
Copy keyspace schema
To copy the keyspace schema:
- Verify that the input JSON file contains the parameters listed in the following table.
- Run the copyKeyspaceSchema.py script.
Parameters | Description | Sample | Mandatory |
sourceHostPort | Host and the Thrift port of source Cassandra DB in the URL format: host IP:port | FsNode01:9160 | Yes |
destinationHostPort | Host and the Thrift port of destination Cassandra database in the URL format: host IP:port | CassNode01:9160 | Yes |
sourceKeyspace | Name of the source keyspace | sipfs | Yes |
destinationKeyspace | Name of the destination keyspace | sipfs | Yes |
replicationStrategyClassName | Replication Strategy Class Name | NetworkTopologyStrategy | Yes |
replicationOptions | Replication Options for the destination keyspace | {"DC1": "2", "DC2": "2"} | Yes |
sourceHostUserName | The username of source Cassandra. | FSadmin | Yes, if authentication is enabled in the source Cassandra Cluster. |
sourceHostPassword | The password of source Cassandra. | FSadmin | Yes, if authentication is enabled in the source Cassandra Cluster. |
sourceHostTls | Set this option to true when SSL is enabled for the source Cassandra connection. | true | Yes, if SSL is enabled for the source Cassandra. |
destinationHostUserName | The username of destination Cassandra. | FSadmin | Yes, if authentication is enabled in the destination Cassandra Cluster. |
destinationHostPassword | The password of destination Cassandra. | FSadmin | Yes, if authentication is enabled in the destination Cassandra Cluster. |
destinationHostTls | Set this option to true when SSL is enabled for the destination Cassandra connection. | true | Yes, if SSL is enabled for the destination Cassandra. |
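The unconditional "Yes" entries in the table above can be checked mechanically before the script is run. The following is a minimal sketch under stated assumptions: the helper names are illustrative, the check covers only the six parameters marked as always mandatory, and the host:port test looks only at the general shape of the value. It does not reproduce any validation that copyKeyspaceSchema.py itself may perform.

```python
import json

# Parameters the table marks as unconditionally mandatory for copyKeyspaceSchema.py.
MANDATORY_KEYS = (
    "sourceHostPort", "destinationHostPort", "sourceKeyspace",
    "destinationKeyspace", "replicationStrategyClassName", "replicationOptions",
)


def find_input_problems(config):
    """Return a list of human-readable problems found in the parsed input."""
    problems = []
    for key in MANDATORY_KEYS:
        if not config.get(key):
            problems.append("missing or empty: " + key)
    for key in ("sourceHostPort", "destinationHostPort"):
        value = config.get(key)
        if value:
            host, sep, port = str(value).rpartition(":")
            if not (sep and host and port.isdigit()):
                problems.append(key + " is not in host:port form")
    return problems


def load_input(path):
    """Parse the JSON input file; json.load raises ValueError on bad syntax."""
    with open(path) as handle:
        return json.load(handle)
```

For example, `find_input_problems(load_input("./copyKeyspaceInput.json"))` returns an empty list for the sample file shown earlier.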
Sample command line
python ./copyKeyspaceSchema.py -i ./copyKeyspaceInput.json -o ./copyKeyspaceSchema_`date +%y%m%d-%H:%M`.log
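The sample command relies on shell backticks to timestamp the log file. Where a shell with that syntax is not available, the same invocation can be assembled in Python. This is a minimal sketch: `build_command` and `run_script` are illustrative names, only the `-i` and `-o` options shown above are used, and it assumes that the script signals failure through a non-zero exit code, which the source does not confirm.

```python
import subprocess
import sys
import time


def build_command(script, input_path, log_prefix, now=None):
    """Build the argument list with a log name stamped like `date +%y%m%d-%H:%M`."""
    stamp = time.strftime("%y%m%d-%H:%M", now or time.localtime())
    log_path = "./%s_%s.log" % (log_prefix, stamp)
    # sys.executable is the interpreter running this wrapper, so start the
    # wrapper with the Python installation prepared for the migration scripts.
    return [sys.executable, script, "-i", input_path, "-o", log_path]


def run_script(script, input_path, log_prefix):
    """Run one migration script and raise CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_command(script, input_path, log_prefix))
```

For example, `run_script("./copyKeyspaceSchema.py", "./copyKeyspaceInput.json", "copyKeyspaceSchema")` mirrors the command above. The colon in the timestamp is valid in file names on Linux but not on Windows, so the format string may need adjusting there.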
Copy keyspace column families
- Verify that the input JSON file contains the parameters listed in the following table.
- Run the copyKeyspaceColumnFamilies.py script.
Parameters | Description | Sample | Mandatory |
sourceHostPort | Host and the Thrift port of source Cassandra database in the URL format: host IP:port | FsNode01:9160 | Yes |
destinationHostPort | Host and the Thrift port of destination Cassandra database in the URL format: host IP:port | CassNode01:9160 | Yes |
sourceKeyspace | Name of the source keyspace | sipfs | Yes |
destinationKeyspace | Name of the destination keyspace | sipfs | Yes |
excludedCFs | List of comma-separated column family names to be excluded from copying while running the copyKeyspaceColumnFamilies.py script. | message_bytes, device | No |
includedCFs | List of comma-separated column family names to be copied while running the copyKeyspaceColumnFamilies.py script. | message_bytes, device | No |
sourceHostUserName | The username of source Cassandra. | FSadmin | Yes, if authentication is enabled in the source Cassandra Cluster. |
sourceHostPassword | The password of source Cassandra. | FSadmin | Yes, if authentication is enabled in the source Cassandra Cluster. |
sourceHostTls | Set this option to true when SSL is enabled for the source Cassandra connection. | true | Yes, if SSL is enabled for the source Cassandra. |
destinationHostUserName | The username of destination Cassandra. | FSadmin | Yes, if authentication is enabled in the destination Cassandra Cluster. |
destinationHostPassword | The password of destination Cassandra. | FSadmin | Yes, if authentication is enabled in the destination Cassandra Cluster. |
destinationHostTls | Set this option to true when SSL is enabled for the destination Cassandra connection. | true | Yes, if SSL is enabled for the destination Cassandra. |
If one or more source column families contain huge volumes of data, then run the copyKeyspaceColumnFamilies.py script to copy these column families separately from the rest of the source column families. Use the excludedCFs and includedCFs parameters to exclude or include a specific column family. When the includedCFs list is not empty, the excludedCFs parameter is ignored and only the column families in the includedCFs list are copied.
Sample input file that copies all column families except message_bytes:
{"sourceHostPort": "FsNode01:9160",
"sourceHostUserName": "",
"sourceHostPassword": "",
"sourceHostTls": "false",
"destinationHostPort": "CassNode01:9160",
"destinationHostUserName": "",
"destinationHostPassword": "",
"destinationHostTls": "false",
"replicationStrategyClassName": "NetworkTopologyStrategy",
"replicationOptions": {"DC1": "2", "DC2": "2"},
"sourceKeyspace": "sipfs",
"destinationKeyspace": "sipfs",
"excludedCFs": [ "message_bytes" ],
"includedCFs": [ ] }
Sample input file that copies only the message_bytes column family:
{"sourceHostPort": "FsNode01:9160",
"sourceHostUserName": "",
"sourceHostPassword": "",
"sourceHostTls": "false",
"destinationHostPort": "CassNode01:9160",
"destinationHostUserName": "",
"destinationHostPassword": "",
"destinationHostTls": "false",
"replicationStrategyClassName": "NetworkTopologyStrategy",
"replicationOptions": {"DC1": "2", "DC2": "2"},
"sourceKeyspace": "sipfs",
"destinationKeyspace": "sipfs",
"excludedCFs": [ ],
"includedCFs": [ "message_bytes" ] }
Sample command line
python ./copyKeyspaceColumnFamilies.py -i ./copyKeyspaceInput.json -o ./copyKeyspaceContent_`date +%y%m%d-%H:%M`.log
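The precedence between includedCFs and excludedCFs described in this section can be restated as a small function. This is only an illustration of the documented rule, not the code used by copyKeyspaceColumnFamilies.py. The function name is hypothetical, and ignoring included names that do not exist in the source keyspace is an assumption.

```python
def select_column_families(all_cfs, included_cfs, excluded_cfs):
    """Return the column families that would be copied, in source order."""
    if included_cfs:
        # A non-empty includedCFs list wins; excludedCFs is ignored.
        wanted = set(included_cfs)
        return [cf for cf in all_cfs if cf in wanted]
    skipped = set(excluded_cfs)
    return [cf for cf in all_cfs if cf not in skipped]
```

With source column families `["device", "message_bytes"]`, an excludedCFs value of `["message_bytes"]` and an empty includedCFs list selects `["device"]`, while an includedCFs value of `["message_bytes"]` selects `["message_bytes"]` regardless of excludedCFs. Running the script once with each setting copies the large column family separately from the rest.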
Connect Feature Server nodes to the migrated Cassandra cluster
Perform the following steps on every Feature Server node involved:
- Stop the Feature Server node.
- Edit the <FS installation path>\launcher.xml file and set the startCassandra property to false.
- Update the [Cassandra] section of the Feature Server application as shown in the following table:
- Start the Feature Server node.
<parameter name="startCassandra" displayName="com.genesyslab.common.application.cassandraServer" hidden="true" mandatory="false">
<description><![CDATA[ Start Cassandra Server]]></description>
<valid-description><![CDATA[]]></valid-description>
<effective-description/>
<format type="string" default="false"/>
<validation>
</validation>
</parameter>
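After editing launcher.xml on several nodes, a read-only check can confirm that the change was made on each of them. The sketch below assumes that the setting lives in the default attribute of the format element, as in the fragment above. The function names are illustrative, and nothing is written back to the file.

```python
import xml.etree.ElementTree as ET


def start_cassandra_default(launcher_xml_text):
    """Return the default value of the startCassandra parameter, or None."""
    root = ET.fromstring(launcher_xml_text)
    for param in root.iter("parameter"):
        if param.get("name") == "startCassandra":
            fmt = param.find("format")
            return None if fmt is None else fmt.get("default")
    return None


def cassandra_start_disabled(launcher_xml_text):
    """True only when the parameter is present and set to false (any case)."""
    value = start_cassandra_default(launcher_xml_text)
    return value is not None and value.strip().lower() == "false"
```

Read the contents of launcher.xml on each node and pass the text to `cassandra_start_disabled`. A False result means that the parameter is missing or still enables the embedded Cassandra server.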
[Cassandra] section
Option | Default Value | Feature Server Application Value | Mandatory |
nodes | NA | Configure the IP addresses of all the Cassandra nodes that belong to the data center where Feature Server is installed. | Yes |
nodeFailureTolerance | Replication factor of Feature Server data center is 1. | | No |
keyspace | sipfs | Name of the 'global' keyspace. This option must have the same value as the keyspace name parameter for the copyKeyspaceSchema.py script when copying the global keyspace. | No |
replicationStrategyClassName | NA | This option must have the same value as the replication strategy class name parameter for the copyKeyspaceSchema.py script when copying both the global keyspace and the regional keyspace. | Yes |
replicationOptions | NA | This option must have the same value as the replication options parameter for the copyKeyspaceSchema.py script. | Yes |
regionalKeyspace | sipfs_<region> | Name of the regional keyspace. This option must have the same value as the keyspace name parameter for the copyKeyspaceSchema.py script when copying the regional keyspace. | Mandatory if regional keyspace(s) is enabled. |
regionalReplicationOptions | NA | This option must have the same value as the replication options parameter for the copyKeyspaceSchema.py script. | Mandatory if regional keyspace(s) is enabled. |
username | cassandra | Cassandra username | Mandatory if authentication is enabled in Cassandra Cluster. |
password | cassandra | Cassandra password | Mandatory if authentication is enabled in Cassandra Cluster. |