Synchronize Data between Two Hazelcast Clusters using WAN Sync
Learn how to synchronize data across two Hazelcast clusters using WAN Sync.
Context
In this tutorial, you’ll do the following:
- Deploy two Hazelcast clusters.
- Create a Hazelcast map configuration on one of the clusters.
- Synchronize map data between the two Hazelcast clusters using Full and Delta WAN Sync.
Before you Begin
Before starting this tutorial, make sure you have the following:
- A running Kubernetes cluster
- The Kubernetes command-line tool, kubectl
- A deployed Hazelcast Platform Operator
Step 1. Start the Hazelcast Clusters
In this step, you’ll start two Hazelcast clusters, one called hazelcast-first and one called hazelcast-second.
- Create a secret with your Hazelcast Enterprise License.
kubectl create secret generic hazelcast-license-key --from-literal=license-key=<hz-license-key>
- Start the first Hazelcast cluster.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: Hazelcast
metadata:
  name: hazelcast-first
spec:
  licenseKeySecretName: hazelcast-license-key
  exposeExternally:
    type: Unisocket
    discoveryServiceType: LoadBalancer
EOF
- Check the status of the cluster to ensure it is running.
kubectl get hazelcast
NAME              STATUS    MEMBERS
hazelcast-first   Running   3/3
- Start the second cluster.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: Hazelcast
metadata:
  name: hazelcast-second
spec:
  licenseKeySecretName: hazelcast-license-key
  exposeExternally:
    type: Unisocket
    discoveryServiceType: LoadBalancer
EOF
- Check the status to ensure that both clusters are running.
kubectl get hazelcast
NAME               STATUS    MEMBERS
hazelcast-first    Running   3/3
hazelcast-second   Running   3/3
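The status check above can also be scripted. The following is a minimal Python sketch, assuming the table layout shown in the sample output (NAME, STATUS and MEMBERS columns). The function names parse_table and all_clusters_ready are hypothetical and not part of any Hazelcast tooling.

```python
def parse_table(text):
    """Parse whitespace-separated kubectl table output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def all_clusters_ready(text, expected_names):
    """Return True if every expected cluster is Running with all members up."""
    rows = {row["NAME"]: row for row in parse_table(text)}
    for name in expected_names:
        row = rows.get(name)
        if row is None or row.get("STATUS") != "Running":
            return False
        # MEMBERS looks like "3/3": ready members, then desired members.
        ready, _, total = row.get("MEMBERS", "").partition("/")
        if not ready or ready != total:
            return False
    return True


# Sample copied from the expected output of `kubectl get hazelcast`.
sample = """\
NAME               STATUS    MEMBERS
hazelcast-first    Running   3/3
hazelcast-second   Running   3/3
"""

print(all_clusters_ready(sample, ["hazelcast-first", "hazelcast-second"]))
```

To use it against a live cluster, pass in the captured output of kubectl get hazelcast instead of the sample string.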
Step 2. Create and Populate a Map for WAN Replication
In this step, you will create and populate a map on the hazelcast-first cluster. This map will be the source for WAN Replication.
- Create the configuration for a map called map4Sync on the hazelcast-first cluster.
If you need Full WAN Synchronization, use the following command.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: Map
metadata:
  name: map4Sync
spec:
  hazelcastResourceName: hazelcast-first
EOF
If you need Delta WAN Synchronization, use the following command.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: Map
metadata:
  name: map4Sync
spec:
  hazelcastResourceName: hazelcast-first
  merkleTree:
    depth: 10
EOF
Not sure whether you need Full or Delta? See Full WAN Synchronization and Delta WAN Synchronization for details.
- Find the address of the first cluster.
kubectl get hazelcastendpoint --selector="app.kubernetes.io/instance in (hazelcast-first)"
NAME                  TYPE        ADDRESS
hazelcast-first       Discovery   34.123.9.149:5701
hazelcast-first-wan   WAN         34.123.9.149:5710
The ADDRESS column displays the external address of the first Hazelcast cluster.
- If you want to use a client other than Hazelcast CLC, download the repository for the client code:
git clone https://github.com/hazelcast-guides/hazelcast-platform-operator-wan-sync.git
cd hazelcast-platform-operator-wan-sync
You will find language-specific client code in the docs/modules/ROOT/examples/operator-wan-sync directory.
- Configure the Hazelcast client to connect to the first cluster using its address.
CLC:
CLC must be installed on your system before you use it; see Installing the Hazelcast CLC for instructions. Run the following command to add the first cluster's configuration to CLC.
clc config add hz-1 cluster.name=dev cluster.address=<FIRST-CLUSTER-EXTERNAL-IP>
Java:
package com.hazelcast;

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

import java.util.Random;

public class Main {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.");
        } else if (!(args[0].equals("fill") || args[0].equals("size"))) {
            System.out.println("Wrong argument, you should pass: fill or size");
        } else {
            ClientConfig config = new ClientConfig();
            config.getNetworkConfig().addAddress("<EXTERNAL-IP>");
            HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
            System.out.println("Successful connection!");
            String mapName = args[1];
            IMap<String, String> map = client.getMap(mapName);
            if (args[0].equals("fill")) {
                System.out.printf("Starting to fill the map (%s) with random entries.\n", mapName);
                Random random = new Random();
                while (true) {
                    int randomKey = random.nextInt(100);
                    map.put("key-" + randomKey, "value-" + randomKey);
                    System.out.println("Current map size: " + map.size());
                }
            } else {
                System.out.printf("The map (%s) size: (%d)\n\n", mapName, map.size());
                client.shutdown();
            }
        }
    }
}
Node.js:
'use strict';

const { Client } = require('hazelcast-client');

const clientConfig = {
    network: {
        clusterMembers: [
            '<EXTERNAL-IP>'
        ]
    }
};

(async () => {
    try {
        if (process.argv.length !== 4) {
            console.error('You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.');
        } else if (!(process.argv[2] === 'fill' || process.argv[2] === 'size')) {
            console.error('Wrong argument, you should pass: fill or size');
        } else {
            const client = await Client.newHazelcastClient(clientConfig);
            const mapName = process.argv[3];
            const map = await client.getMap(mapName);
            // Round-trip a test entry to confirm the connection works.
            await map.put('key', 'value');
            const res = await map.get('key');
            if (res !== 'value') {
                throw new Error('Connection failed, check your configuration.');
            }
            console.log('Successful connection!');
            if (process.argv[2] === 'fill') {
                console.log(`Starting to fill the map (${mapName}) with random entries.`);
                while (true) {
                    const randomKey = Math.floor(Math.random() * 100);
                    await map.put('key' + randomKey, 'value' + randomKey);
                    const size = await map.size();
                    console.log(`Current map size: ${size}`);
                }
            } else {
                const size = await map.size();
                console.log(`The map (${mapName}) size: ${size}`);
                // Shut down the client so the process can exit.
                await client.shutdown();
            }
        }
    } catch (err) {
        console.error('Error occurred:', err);
    }
})();
Go:
package main

import (
    "context"
    "fmt"
    "math/rand"
    "os"

    "github.com/hazelcast/hazelcast-go-client"
)

func main() {
    if len(os.Args) != 3 {
        fmt.Println("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.")
        return
    }
    if os.Args[1] != "fill" && os.Args[1] != "size" {
        fmt.Println("Wrong argument, pass `fill` or `size` instead.")
        return
    }
    config := hazelcast.Config{}
    cc := &config.Cluster
    // Replace the placeholder with the external address of your cluster.
    cc.Network.SetAddresses("<EXTERNAL-IP>:5701")
    cc.Unisocket = true
    ctx := context.TODO()
    client, err := hazelcast.StartNewClientWithConfig(ctx, config)
    if err != nil {
        panic(err)
    }
    fmt.Println("Successful connection!")
    mapName := os.Args[2]
    m, err := client.GetMap(ctx, mapName)
    if err != nil {
        panic(err)
    }
    if os.Args[1] == "fill" {
        fmt.Printf("Starting to fill the map (%s) with random entries.\n", mapName)
        // Loops until the process is interrupted.
        for {
            num := rand.Intn(100)
            key := fmt.Sprintf("key-%d", num)
            value := fmt.Sprintf("value-%d", num)
            if _, err = m.Put(ctx, key, value); err != nil {
                fmt.Println("ERR:", err.Error())
                continue
            }
            mapSize, err := m.Size(ctx)
            if err != nil {
                fmt.Println("ERR:", err.Error())
                continue
            }
            fmt.Println("Current map size:", mapSize)
        }
    }
    mapSize, err := m.Size(ctx)
    if err != nil {
        fmt.Println("ERR:", err.Error())
        return
    }
    fmt.Printf("The map (%s) size: %v", mapName, mapSize)
}
Python:
import logging
import random
import sys

import hazelcast

logging.basicConfig(level=logging.INFO)

if len(sys.argv) != 3:
    print("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.")
elif not (sys.argv[1] == "fill" or sys.argv[1] == "size"):
    print("Wrong argument, you should pass: fill or size")
else:
    client = hazelcast.HazelcastClient(
        cluster_members=["<EXTERNAL-IP>"],
        use_public_ip=True,
    )
    print("Successful connection!", flush=True)
    mapName = sys.argv[2]
    m = client.get_map(mapName).blocking()
    if sys.argv[1] == "fill":
        print(f'Starting to fill the map ({mapName}) with random entries.', flush=True)
        while True:
            random_number = str(random.randrange(0, 100000))
            m.put("key-" + random_number, "value-" + random_number)
            print("Current map size:", m.size())
    else:
        print(f'The map ({mapName}) size: {m.size()}')
.NET:
using System;
using System.Threading.Tasks;
using Hazelcast;
using Microsoft.Extensions.Logging;

namespace Client
{
    public class Program
    {
        static async Task Main(string[] args)
        {
            if (args.Length != 2)
            {
                Console.WriteLine("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.");
                return;
            }
            if (!(args[0] == "fill" || args[0] == "size"))
            {
                Console.WriteLine("Wrong argument, you should pass: fill or size");
                return;
            }
            var mapName = args[1];
            var options = new HazelcastOptionsBuilder()
                .With(args)
                .With((configuration, options) =>
                {
                    options.LoggerFactory.Creator = () => LoggerFactory.Create(loggingBuilder =>
                        loggingBuilder
                            .AddConsole());
                    options.Networking.UsePublicAddresses = true;
                    options.Networking.SmartRouting = false;
                    options.Networking.Addresses.Add("<EXTERNAL-IP>:5701");
                })
                .Build();
            await using var client = await HazelcastClientFactory.StartNewClientAsync(options);
            Console.WriteLine("Successful connection!");
            var map = await client.GetMapAsync<string, string>(mapName);
            var random = new Random();
            if (args[0] == "fill")
            {
                // Printed only in fill mode, and includes the map name to match the expected output.
                Console.WriteLine($"Starting to fill the map ({mapName}) with random entries.");
                while (true)
                {
                    var num = random.Next(100);
                    var key = $"key-{num}";
                    var value = $"value-{num}";
                    await map.PutAsync(key, value);
                    var mapSize = await map.GetSizeAsync();
                    Console.WriteLine($"Current map size: {mapSize}");
                }
            }
            else
            {
                var mapSize = await map.GetSizeAsync();
                Console.WriteLine($"Current map size: {mapSize}");
                await client.DisposeAsync();
            }
        }
    }
}
- Add entries to the map.
CLC:
Execute the following command to populate the map with entries, replacing <MAP-NAME> with the actual map name, map4Sync.
for i in {1..10}; do clc -c hz-1 map set --name <MAP-NAME> key-$i value-$i; done
Verify that the map has the expected number of entries.
clc -c hz-1 map size --name <MAP-NAME>
Java:
Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map4Sync.
cd java
mvn package
java -jar target/*jar-with-dependencies*.jar fill <MAP-NAME>
You should see the following output.
Successful connection!
Starting to fill the map (<MAP-NAME>) with random entries.
Current map size: 2
Current map size: 3
Current map size: 4
....
....
Node.js:
Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map4Sync.
cd nodejs
npm install
npm start fill <MAP-NAME>
You should see the following output.
Successful connection!
Starting to fill the map (<MAP-NAME>) with random entries.
Current map size: 2
Current map size: 3
Current map size: 4
....
....
Go:
Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map4Sync.
cd go
go run main.go fill <MAP-NAME>
You should see the following output.
Successful connection!
Starting to fill the map (<MAP-NAME>) with random entries.
Current map size: 2
Current map size: 3
Current map size: 4
....
....
Python:
Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map4Sync.
cd python
pip install -r requirements.txt
python main.py fill <MAP-NAME>
You should see the following output.
Successful connection!
Starting to fill the map (<MAP-NAME>) with random entries.
Current map size: 2
Current map size: 3
Current map size: 4
....
....
.NET:
Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map4Sync.
cd dotnet
dotnet build
dotnet run fill <MAP-NAME>
You should see the following output.
Successful connection!
Starting to fill the map (<MAP-NAME>) with random entries.
Current map size: 2
Current map size: 3
Current map size: 4
....
....
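Step 2 offered a choice between Full and Delta WAN Synchronization when creating the map. The following Python sketch illustrates the general idea behind a Merkle-tree comparison: both sides hash their entries into buckets, and only the buckets whose hashes differ need to be transferred. It is a conceptual illustration only, not Hazelcast's implementation. It flattens the tree to its leaf level, and the function names and the leaves parameter are assumptions made for the example.

```python
import hashlib


def bucket_of(key, leaves):
    """Assign a key to a leaf bucket using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % leaves


def bucket_hashes(entries, leaves):
    """Compute one hash per leaf bucket over the entries that fall into it."""
    buckets = [[] for _ in range(leaves)]
    for key, value in entries.items():
        buckets[bucket_of(key, leaves)].append((key, value))
    hashes = []
    for items in buckets:
        h = hashlib.sha256()
        # Sort so that insertion order does not affect the bucket hash.
        for key, value in sorted(items):
            h.update(f"{key}={value};".encode("utf-8"))
        hashes.append(h.hexdigest())
    return hashes


def differing_buckets(source, target, leaves=8):
    """Return the indices of leaf buckets whose contents differ."""
    src = bucket_hashes(source, leaves)
    dst = bucket_hashes(target, leaves)
    return [i for i in range(leaves) if src[i] != dst[i]]


def delta_entries(source, target, leaves=8):
    """Select only the source entries that live in differing buckets."""
    changed = set(differing_buckets(source, target, leaves))
    return {k: v for k, v in source.items() if bucket_of(k, leaves) in changed}


source = {f"key-{i}": f"value-{i}" for i in range(100)}
target = dict(source)
target["key-7"] = "stale"  # one entry is out of date on the target

delta = delta_entries(source, target)
print(f"full sync would send {len(source)} entries, delta sync {len(delta)}")
```

In this sketch, more leaves mean smaller buckets and therefore fewer entries sent per difference, at the cost of more hashes to keep and compare.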
Step 3. Enable WAN Replication and Replicate Entries
In this step, you’ll first verify that the second cluster does not contain a map4Sync structure. You’ll then enable WAN Replication and verify that the map and all entries have been copied to the second cluster.
- Find the address of the second cluster.
kubectl get hazelcastendpoint --selector="app.kubernetes.io/instance in (hazelcast-second)"
NAME                   TYPE        ADDRESS
hazelcast-second       Discovery   34.16.0.16:5701
hazelcast-second-wan   WAN         34.16.0.16:5710
The ADDRESS column displays the external address of the second Hazelcast cluster.
- Repeat the client configuration from Step 2, using the address of the second cluster, so that the client can connect to the second cluster. For CLC, name the configuration hz-2, which the commands below use.
- Connect to the second cluster and verify that the map named map4Sync contains no data.
CLC:
clc -c hz-2 map size --name <MAP-NAME>
Java:
cd clients/java
mvn package
java -jar target/*jar-with-dependencies*.jar size <MAP-NAME>
You should see the following output:
Successful connection!
Current map (<MAP-NAME>) size: 0
Node.js:
cd clients/nodejs
npm install
npm start size <MAP-NAME>
You should see the following output:
Successful connection!
Current map (<MAP-NAME>) size: 0
Go:
cd clients/go
go run main.go size <MAP-NAME>
You should see the following output:
Successful connection!
Current map (<MAP-NAME>) size: 0
Python:
cd clients/python
pip install -r requirements.txt
python main.py size <MAP-NAME>
You should see the following output:
Successful connection!
Current map (<MAP-NAME>) size: 0
.NET:
cd clients/dotnet
dotnet build
dotnet run size <MAP-NAME>
You should see the following output:
Successful connection!
Current map (<MAP-NAME>) size: 0
- Modify the configuration of the first cluster to add the address of the second cluster as the WAN Replication event target.
If you need a Full WAN Sync, run the following command to apply the configuration.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: WanReplication
metadata:
  name: wan-replication
spec:
  resources:
    - name: hazelcast-first
      kind: Hazelcast
  targetClusterName: dev
  endpoints: "<SECOND-CLUSTER-EXTERNAL-IP>"
EOF
If you need a Delta WAN Sync, run the following command to apply the configuration.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: WanReplication
metadata:
  name: wan-replication
spec:
  resources:
    - name: hazelcast-first
      kind: Hazelcast
  targetClusterName: dev
  syncConsistencyCheckStrategy: "MERKLE_TREES"
  endpoints: "<SECOND-CLUSTER-EXTERNAL-IP>"
EOF
- Create a WAN Sync resource within the Kubernetes cluster, using the existing WanReplication CR.
kubectl apply -f - <<EOF
apiVersion: hazelcast.com/v1alpha1
kind: WanSync
metadata:
  name: wan-sync
spec:
  wanReplicationResourceName: wan-replication
EOF
WAN Sync ensures data consistency between the two Hazelcast clusters. Full WAN Sync transmits all data from the source cluster to the target cluster, aligning the state of the target IMap with the source IMap. This method is particularly beneficial when the synchronization between two remote clusters is lost due to WAN queue overflows or cluster restarts.
- Run the following command to see the WAN synchronization status:
kubectl get wansync wan-sync
The output should be similar to the following:
NAME       STATUS
wan-sync   Completed
- Repeat the map size check against the second cluster (step 3.3) to verify that the map4Sync structure on the hazelcast-second cluster now contains data.
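Waiting for the synchronization to finish can be automated with a small polling loop. The sketch below is one possible approach in Python, assuming that the STATUS column eventually reads Completed as in the sample output. Treating Failed as a terminal state and using Pending as an intermediate value are assumptions, and the helper names wait_for_sync and kubectl_status are hypothetical.

```python
import subprocess
import time


def kubectl_status(name="wan-sync"):
    """Read the STATUS column for a WanSync resource via kubectl."""
    out = subprocess.run(
        ["kubectl", "get", "wansync", name, "--no-headers"],
        check=True, capture_output=True, text=True,
    ).stdout
    fields = out.split()
    # Columns are NAME then STATUS, as in the sample output above.
    return fields[1] if len(fields) > 1 else ""


def wait_for_sync(fetch_status=kubectl_status, timeout=120.0, interval=2.0,
                  sleep=time.sleep):
    """Poll until a terminal status or the timeout; return the last status seen."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # "Failed" as a terminal state is an assumption, not taken from the tutorial.
        if status in ("Completed", "Failed"):
            return status
        if time.monotonic() >= deadline:
            return status
        sleep(interval)


# Demonstration with a canned sequence instead of a live cluster.
# "Pending" is an assumed intermediate value used only for illustration.
canned = iter(["Pending", "Pending", "Completed"])
print(wait_for_sync(lambda: next(canned), sleep=lambda _: None))
```

Against a real cluster, call wait_for_sync() with its defaults and run the map size check only once it returns Completed.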