Using the Data Migration Tool
Hazelcast Enterprise Feature
This guide explains how to migrate data from Hazelcast 4.x or 5.x clusters to newer 5.x clusters. The purpose of the Data Migration Tool (DMT) is to migrate data from your Open Source or Enterprise clusters to either a newer Enterprise cluster or a cluster on Hazelcast Viridian.
Currently, the Data Migration Tool (DMT) supports:
- migrating data only for maps and replicated maps
- migrating data from Hazelcast 4.x.y and 5.x.y clusters to Hazelcast 5.3.2 and newer clusters.
Before You Begin
- The Data Migration Tool is referred to as DMT throughout this guide.
- The DMT is available as a separate download from Hazelcast Viridian and the Hazelcast website.
- The steps in this guide require that you have Docker and the Hazelcast Command-Line Client (CLC) installed.
This guide contains two sets of instructions: one for trying out the DMT, and one for using it with your existing cluster and maps.
Trying Out the DMT
The instructions below for trying out the DMT migrate the data of a map in a Hazelcast 4.x.y cluster to a Hazelcast 5.3.2 cluster.
The steps can be summarized as follows:
- Start a source cluster.
- Create a map and put data into it.
- Start the target cluster.
- Start the migration cluster.
- Migrate map data from the source cluster to the target.
- Verify the data in the target cluster.
Step 1. Start the Source Cluster
Run the following Docker command to start a one-member source cluster.
$ docker run -p 127.0.0.1:5701:5701 -e HZ_CLUSTERNAME=source hazelcast/hazelcast:4.2.7
The Hazelcast version can be any 4.x.y or 5.x.y release. We use 4.2.7 as an example here.
This member is started using an internal Docker IP and the port 5701 on that IP address, something like 172.12.0.1:5701. Since the Docker container is in a separate network, it does not collide with the running processes on your local machine. For example, in this case, the member can run on 127.0.0.1:5701, and our container can run on 172.12.0.1:5701.
Instead of using Docker to start a member, you can also download one of the 4.x.y packages from the Hazelcast website and start a member. Make sure to set the cluster’s name to source.
INFO: Since the member started with Docker runs in a virtual network, we need to make it accessible to your local processes, because the migration and target clusters, CLC, and DMT will be processes outside the Docker environment, running directly on your computer. The -p option in the Docker command allows you to map a container’s port to the host machine. The command maps the container’s port 5701 to 127.0.0.1:5701 on the host, so 127.0.0.1:5701 becomes the Hazelcast member’s address. 127.0.0.1 refers to your local machine’s address. Make sure there is nothing else running on port 5701 on your local machine.
Step 2. Create a Map and Put Data
The source.yaml file, which is contained in the DMT download package, is the Hazelcast CLC configuration file used to access the source cluster and populate it with data. You can modify the address field in this file if you used a port mapping other than 5701 when starting the member, or if you started a cluster without using Docker and it started on a port other than 5701.
The content of source.yaml:
cluster:
  name: "source"
  address: "127.0.0.1:5701"
Check that you can put an entry into the source cluster by running the following CLC command in your terminal.
clc -c source.yaml map --name my-map set key-1 value-1
If you get an error about CLC not being able to connect to the source cluster, check that your port mapping is correct, the source cluster container is running, and the source.yaml configuration is correct.
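A freshly started member may need a few seconds before it accepts connections, so the connectivity check above can fail on the first attempt even when everything is configured correctly. The following is a minimal sketch of a retry wrapper for that case; the function name `retry` and its arguments are hypothetical helpers and are not part of CLC or the DMT.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS runs are used up.
# Hypothetical helper; not part of CLC or the DMT.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    n=$((n + 1))
    # Sleep only if another attempt follows.
    if [ "$n" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (the clc command is the connectivity check from this step):
# retry 10 2 clc -c source.yaml map --name my-map set key-1 value-1
```

If the wrapper still fails after all attempts, the port mapping, the container, and source.yaml are the things to check, as described above.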
If there are no errors, you can run one of the following scripts in your terminal to populate the source cluster with 1000 entries. The first is for Bash (Linux and macOS); the second is for the Windows Command Prompt.
for i in {1..1000}; do clc -c source.yaml map --name my-map set key-$i value-$i --quiet; done && echo OK
for /l %x in (1, 1, 1000) do clc -c source.yaml map --name my-map set key-%x value-%x --quiet
You can also use alternative ways to create a map and put data.
Step 3. Start the Target Cluster
The target cluster must be of version Hazelcast 5.3.2 or newer.
Run the following Docker command to start a one-member target cluster.
docker run -e HZ_NETWORK_PORT_PORT=5901 -p 127.0.0.1:5901:5901 \
    -e HZ_LICENSEKEY="<license>" \ (1)
    -e HZ_CLUSTERNAME=target hazelcast/hazelcast-enterprise:{full-version}-slim (2)
(1) If you don’t have a Hazelcast Enterprise license key, you can either use a Hazelcast Viridian trial cluster or request a trial license.
(2) If you want to use the full distribution, remove the -slim suffix. See Distributions to learn about the differences between full and slim distributions.
The cluster tries to start on port 5901 if it’s available; otherwise, it chooses another port. You can check the port in the cluster logs shown in the terminal.
The configuration file for the target cluster is target.yaml, which can be found in the folder where you extracted the DMT package. If the cluster logs show that the cluster started on a different port, change the address field in target.yaml to match that port number.
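Editing the address field can also be scripted. The sketch below assumes that target.yaml has the same layout as the source.yaml shown in Step 2, with a single `address: "..."` line; the function name `set_cluster_address` is a hypothetical helper, not part of CLC or the DMT.

```shell
# set_cluster_address FILE HOST:PORT
# Rewrites the value of the "address:" line in a CLC configuration file,
# keeping the line's indentation. Assumes the file has a single
# `address: "..."` line, as in the source.yaml shown in Step 2.
set_cluster_address() {
  file=$1
  addr=$2
  tmp="$file.tmp.$$"
  # Write to a temporary file first, then replace the original.
  sed "s|^\([[:space:]]*address:[[:space:]]*\).*|\1\"$addr\"|" "$file" > "$tmp" &&
    mv "$tmp" "$file"
}

# Example: the logs show that the target member started on port 5902.
# set_cluster_address target.yaml 127.0.0.1:5902
```

The same helper would apply to migration.yaml, since both files use the CLC configuration format.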
Instead of using Docker, you can also download Hazelcast 5.3.2 or newer, request a trial license, and start a cluster.
Step 4. Start the Migration Cluster
Run the following command in the folder where you extracted the DMT package.
$ HZ_NETWORK_PORT_PORT=5702 HZ_CLUSTERNAME=migration ./bin/hz start
The cluster tries to start on port 5702 if it’s available; otherwise, it chooses another port. You can check the port in the cluster logs shown in the terminal.
The configuration file for the migration cluster is migration.yaml, which can be found in the folder where you extracted the DMT package. If the cluster logs show that the cluster started on a different port, change the address field in migration.yaml to match that port number.
The migration.yaml file used by DMT is in the Hazelcast CLC configuration format; that is, DMT and CLC use the same configuration format. DMT uses this configuration file to connect to the migration cluster, as described in the next step.
Step 5. Start the Migration using DMT
Go to the bin directory in the folder where you extracted the DMT package; it contains DMT binaries named in the format dmt_[platform]_[arch]. Note the binary suitable for your machine; for this, you need to know your operating system and your processor architecture. For Arm processors, choose the suitable arm64 binary, and for Intel processors, choose the suitable amd64 binary.
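If you are unsure about your processor architecture, `uname -m` reports it on Linux and macOS. The sketch below maps common `uname -m` values to the arm64 and amd64 names used by the DMT binaries; the function name `dmt_arch` is a hypothetical helper, and the [platform] part of the file name is deliberately left to a listing of the bin directory rather than guessed.

```shell
# dmt_arch [MACHINE]
# Prints the [arch] part of the DMT binary name for a machine type as
# reported by `uname -m` (defaults to the current machine).
# Hypothetical helper; not part of the DMT package.
dmt_arch() {
  machine=${1:-$(uname -m)}
  case "$machine" in
    arm64 | aarch64) echo arm64 ;;
    x86_64 | amd64) echo amd64 ;;
    *)
      echo "no known DMT binary for machine type: $machine" >&2
      return 1
      ;;
  esac
}

# Example: list the binaries in bin/ that match this machine's architecture.
# ls bin/dmt_*_"$(dmt_arch)"*
```

From the listed candidates, pick the one whose [platform] part matches your operating system.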
Run the following command in the folder where you extracted the DMT package, to start the migration.
$ ./bin/dmt_[platform]_[arch] --config migration.yaml start migration_config --yes
If you are on macOS and the above command is rejected by the operating system, click OK, go to the Privacy & Security settings of the machine, and allow the dmt* binary to run. Then retry and click Open in the operating system’s dialog.
Step 6. Verify the Data in Target Cluster
You can verify the size of the map in the target cluster using Hazelcast CLC. Run the following CLC commands to see the size of the map, and the value of a random key from the data we put in Step 2.
$ clc -c target.yaml map size --name my-map
1000
OK
$ clc -c target.yaml map get key-42 --name my-map
value-42
OK
The target.yaml file is used by Hazelcast CLC to connect to the target cluster and verify that the data has been migrated.
Alternatively, you can use the Hazelcast Management Center to verify the data in the target cluster.
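The size check above can also be scripted so that it fails loudly when the count is wrong. This is a minimal sketch that assumes `clc ... map size` prints the count on a line of its own, as in the sample output above; the function names `map_size` and `check_map_size` are hypothetical helpers, not CLC commands.

```shell
# map_size CONFIG MAP_NAME
# Prints the size of a map using the `clc ... map size` command from this step.
# Keeps only the first all-digit output line, so a trailing "OK" is ignored.
map_size() {
  clc -c "$1" map size --name "$2" | grep -E '^[0-9]+$' | head -n 1
}

# check_map_size CONFIG MAP_NAME EXPECTED
# Succeeds only if the map holds exactly EXPECTED entries.
check_map_size() {
  actual=$(map_size "$1" "$2")
  if [ "$actual" = "$3" ]; then
    echo "OK: $2 has $actual entries"
  else
    echo "MISMATCH: $2 has '$actual' entries, expected $3" >&2
    return 1
  fi
}

# Example: check_map_size target.yaml my-map 1000
```

Running `map_size` against both source.yaml and target.yaml gives a quick way to compare the two clusters, although equal sizes alone do not prove that every value was copied.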
Using the DMT for an Existing Cluster
To migrate data from your existing cluster (source), skip Step 1 and Step 2 in Trying Out the DMT above.
Before continuing with Step 3 and the steps that follow, you need to:
- Check the migration_config/source/hazelcast-client.yaml file in the folder where you extracted the DMT package. Modify the cluster-name and cluster-members fields in this file so that they match the name of your existing cluster and the addresses of the cluster members.
- Check the migration_config/data/imap_names.txt and migration_config/data/replicated_map_names.txt files in the folder where you extracted the DMT package. Modify the content of these files so that they match the names of your existing maps and replicated maps. If you have multiple maps or replicated maps, put one map name per line.
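The one-name-per-line layout of the two name files can be produced from the command line. The following sketch writes each given name on its own line; the function name `write_names` and the map names in the example are hypothetical, and whether the DMT accepts an empty name file is not covered by this guide.

```shell
# write_names FILE NAME...
# Writes each NAME on its own line to FILE, replacing the file's content.
# Hypothetical helper; not part of the DMT package.
write_names() {
  file=$1
  shift
  if [ "$#" -eq 0 ]; then
    # No names given: leave an empty file.
    : > "$file"
  else
    printf '%s\n' "$@" > "$file"
  fi
}

# Example with hypothetical map names:
# write_names migration_config/data/imap_names.txt orders customers
# write_names migration_config/data/replicated_map_names.txt countries
```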