Synchronize Data between Two Hazelcast Clusters using WAN Sync

Learn how to synchronize data across two Hazelcast clusters using WAN Sync.

Context

In this tutorial, you’ll do the following:

  • Deploy two Hazelcast clusters.

  • Create a Hazelcast map configuration on one of the clusters.

  • Synchronize map data between the two Hazelcast clusters using Full and Delta WAN Sync.

Before you Begin

Before starting this tutorial, make sure you have the following:

Step 1. Start the Hazelcast Clusters

In this step, you’ll start two Hazelcast clusters, one called hazelcast-first and one called hazelcast-second.

  1. Create a secret with your Hazelcast Enterprise License.

    kubectl create secret generic hazelcast-license-key --from-literal=license-key=<hz-license-key>
  2. Start the first Hazelcast cluster.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: Hazelcast
    metadata:
      name: hazelcast-first
    spec:
      licenseKeySecretName: hazelcast-license-key
      exposeExternally:
        type: Unisocket
        discoveryServiceType: LoadBalancer
    EOF
  3. Check the status of the cluster to ensure it is running.

    kubectl get hazelcast
    NAME               STATUS    MEMBERS
    hazelcast-first    Running   3/3
  4. Start the second cluster.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: Hazelcast
    metadata:
      name: hazelcast-second
    spec:
      licenseKeySecretName: hazelcast-license-key
      exposeExternally:
        type: Unisocket
        discoveryServiceType: LoadBalancer
    EOF
  5. Check the status to ensure that both clusters are running.

    kubectl get hazelcast
    NAME               STATUS    MEMBERS
    hazelcast-first    Running   3/3
    hazelcast-second   Running   3/3
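
Reading the status table by eye works for two clusters, but the same check can be scripted. The following is a minimal sketch, not part of the operator tooling: the helper name all_clusters_ready is hypothetical, and it assumes the three-column NAME, STATUS, MEMBERS layout shown in the output above.

```shell
# Hypothetical helper: reads `kubectl get hazelcast` output on stdin.
# Succeeds only if there is at least one cluster row and every row
# reports STATUS "Running" with all members ready (for example 3/3).
all_clusters_ready() {
  awk '
    NR > 1 && NF >= 3 {
      rows++
      split($3, members, "/")
      if ($2 != "Running" || members[1] != members[2]) bad = 1
    }
    END { exit (rows == 0 || bad) ? 1 : 0 }
  '
}

# Sample input copied from the output shown above.
printf '%s\n' \
  'NAME               STATUS    MEMBERS' \
  'hazelcast-first    Running   3/3' \
  'hazelcast-second   Running   3/3' |
  all_clusters_ready && echo "all clusters ready"
```

Against a live Kubernetes cluster, the helper could be used in a loop such as: until kubectl get hazelcast | all_clusters_ready; do sleep 5; done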

Step 2. Create and Populate a Map for WAN Replication

In this step, you will create and populate a map on the hazelcast-first cluster. This map will be the source for WAN Replication.

  1. Create the configuration for a map called map-sync on the hazelcast-first cluster.

    If you need Full WAN Synchronization, use the following command.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: Map
    metadata:
      name: map-sync
    spec:
      hazelcastResourceName: hazelcast-first
    EOF

    If you need Delta WAN Synchronization, use the following command.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: Map
    metadata:
      name: map-sync
    spec:
      hazelcastResourceName: hazelcast-first
      merkleTree:
        depth: 10
    EOF

    Not sure whether you need Full or Delta? See Full WAN Synchronization and Delta WAN Synchronization for details.

  2. Find the address of the first cluster.

    kubectl get hazelcastendpoint --selector="app.kubernetes.io/instance in (hazelcast-first)"
    NAME                   TYPE        ADDRESS
    hazelcast-first        Discovery   34.123.9.149:5701
    hazelcast-first-wan    WAN         34.123.9.149:5710

    The ADDRESS column displays the external address of the first Hazelcast cluster.

  3. If you want to use a client other than Hazelcast CLC, download the repository for the client code:

    git clone https://github.com/hazelcast-guides/hazelcast-platform-operator-wan-sync.git
    cd hazelcast-platform-operator-wan-sync

    You will find language-specific client code in the docs/modules/ROOT/examples/operator-wan-sync directory.

  4. Configure the Hazelcast client to connect to the first cluster using its address.

    • CLC

    • Java

    • NodeJS

    • Go

    • Python

    • .NET

    CLC

    Install CLC on your system before you use it. For instructions, see Installing the Hazelcast CLC.

    Run the following command to add the first cluster config to the CLC.

    clc config add hz-1 cluster.name=dev cluster.address=<FIRST-CLUSTER-EXTERNAL-IP>

    Java

    package com.hazelcast;
    
    import com.hazelcast.client.HazelcastClient;
    import com.hazelcast.client.config.ClientConfig;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;
    
    import java.util.Random;
    
    public class Main {
        public static void main(String[] args) throws Exception {
            if(args.length != 2) {
                System.out.println("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.");
            } else if (!((args[0].equals("fill") || args[0].equals("size")))) {
                System.out.println("Wrong argument, you should pass: fill or size");
            } else{
                ClientConfig config = new ClientConfig();
                config.getNetworkConfig().addAddress("<EXTERNAL-IP>");
    
                HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
                System.out.println("Successful connection!");
    
                String mapName = args[1];
                IMap<String, String> map = client.getMap(mapName);
    
                if (args[0].equals("fill")) {
                    System.out.printf("Starting to fill the map (%s) with random entries.\n", mapName);
    
                    Random random = new Random();
                    while (true) {
                        int randomKey = random.nextInt(100);
                        map.put("key-" + randomKey, "value-" + randomKey);
                        System.out.println("Current map size: " + map.size());
                    }
                } else {
                    System.out.printf("The map (%s) size: (%d)\n\n", mapName, map.size());
                    client.shutdown();
                }
            }
    
        }
    }

    NodeJS

    'use strict';
    
    const { Client } = require('hazelcast-client');
    
    const clientConfig = {
        network: {
            clusterMembers: [
                '<EXTERNAL-IP>'
            ]
        }
    };
    
    (async () => {
        try {
            if (process.argv.length !== 4) {
                console.error('You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.');
            } else if (!(process.argv[2] === 'fill' || process.argv[2] === 'size')) {
                console.error('Wrong argument, you should pass: fill or size');
            } else {
                const client = await Client.newHazelcastClient(clientConfig);
                const mapName = process.argv[3]
                const map = await client.getMap(mapName);
                await map.put('key', 'value');
                const res = await map.get('key');
                if (res !== 'value') {
                    throw new Error('Connection failed, check your configuration.');
                }
                console.log('Successful connection!');
                if (process.argv[2] === 'fill'){
                    console.log(`Starting to fill the map (${mapName}) with random entries.`);
                    while (true) {
                        const randomKey = Math.floor(Math.random() * 100);
                        await map.put('key' + randomKey, 'value' + randomKey);
                        const size = await map.size();
                        console.log(`Current map size: ${size}`);
                    }
                } else {
                    const size = await map.size();
                    console.log(`The map (${mapName}) size: ${size}`);
                    // Shut down the client so that the process can exit.
                    await client.shutdown();
                }
            }
        } catch (err) {
            console.error('Error occurred:', err);
        }
    })();

    Go

    package main
    
    import (
    	"context"
    	"fmt"
    	"math/rand"
    	"os"
    
    	"github.com/hazelcast/hazelcast-go-client"
    )
    
    func main() {
    	if len(os.Args) != 3 {
    		fmt.Println("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.")
    		return
    	}
    	if os.Args[1] != "fill" && os.Args[1] != "size" {
    		fmt.Println("Wrong argument, pass `fill` or `size` instead.")
    		return
    	}
    
    	config := hazelcast.Config{}
    	cc := &config.Cluster
    	cc.Network.SetAddresses("<EXTERNAL-IP>:5701")
    	cc.Unisocket = true
    	ctx := context.TODO()
    	client, err := hazelcast.StartNewClientWithConfig(ctx, config)
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println("Successful connection!")
    
    	mapName := os.Args[2]
    	m, err := client.GetMap(ctx, mapName)
    	if err != nil {
    		panic(err)
    	}
    	if os.Args[1] == "fill" {
    		fmt.Printf("Starting to fill the map (%s) with random entries.\n", mapName)
    		for {
    			num := rand.Intn(100)
    			key := fmt.Sprintf("key-%d", num)
    			value := fmt.Sprintf("value-%d", num)
    			if _, err = m.Put(ctx, key, value); err != nil {
    				fmt.Println("ERR:", err.Error())
    				continue
    			}
    			mapSize, err := m.Size(ctx)
    			if err != nil {
    				fmt.Println("ERR:", err.Error())
    				continue
    			}
    			fmt.Println("Current map size:", mapSize)
    		}
    		return
    	}
    	mapSize, err := m.Size(ctx)
    	if err != nil {
    		fmt.Println("ERR:", err.Error())
    		return
    	}
    	fmt.Printf("The map (%s) size: %v", mapName, mapSize)
    }

    Python

    import logging
    import random
    import sys
    
    import hazelcast
    
    logging.basicConfig(level=logging.INFO)
    
    if len(sys.argv) != 3:
        print("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.")
    elif not (sys.argv[1] == "fill" or sys.argv[1] == "size"):
        print("Wrong argument, you should pass: fill or size")
    else:
        client = hazelcast.HazelcastClient(
            cluster_members=["<EXTERNAL-IP>"],
            use_public_ip=True,
        )
        print("Successful connection!", flush=True)
    
        mapName = sys.argv[2]
        m = client.get_map(mapName).blocking()
    
        if sys.argv[1] == "fill":
            print(f'Starting to fill the map ({mapName}) with random entries.', flush=True)
            while True:
                random_number = str(random.randrange(0, 100000))
                m.put("key-" + random_number, "value-" + random_number)
                print("Current map size:", m.size())
        else:
            print(f'The map ({mapName}) size: {m.size()}')

    .NET

    using System;
    using System.Threading.Tasks;
    using Hazelcast;
    using Microsoft.Extensions.Logging;
    
    namespace Client
    {
        public class Program
        {
            static async Task Main(string[] args)
            {
                if (args.Length != 2)
                {
                    Console.WriteLine("You need to pass two arguments. The first argument must be `fill` or `size`. The second argument must be `mapName`.");
                    return;
                }
                if (!(args[0] == "fill" || args[0] == "size"))
                {
                    Console.WriteLine("Wrong argument, you should pass: fill or size");
                    return;
                }
    
                var mapName = args[1];
                var options = new HazelcastOptionsBuilder()                
                    .With(args)                
                    .With((configuration, options) =>
                    {
                        options.LoggerFactory.Creator = () => LoggerFactory.Create(loggingBuilder =>
                            loggingBuilder
                                .AddConsole());
    
                        options.Networking.UsePublicAddresses = true;
                        options.Networking.SmartRouting = false;
                        options.Networking.Addresses.Add("<EXTERNAL-IP>:5701");
                        
                    })
                    .Build();
    
    
    
                await using var client = await HazelcastClientFactory.StartNewClientAsync(options);
                
                Console.WriteLine("Successful connection!");
    
                var map = await client.GetMapAsync<string, string>(mapName);
                var random = new Random();
    
                if (args[0] == "fill")
                {
                    Console.WriteLine("Starting to fill the map with random entries.");
                    while (true)
                    {
                        var num = random.Next(100);
                        var key = $"key-{num}";
                        var value = $"value-{num}";
                        await map.PutAsync(key, value);
                        var mapSize = await map.GetSizeAsync();
                        Console.WriteLine($"Current map size: {mapSize}");
                    }
                }
                else
                {
                    var mapSize = await map.GetSizeAsync();
                    Console.WriteLine($"Current map size: {mapSize}");
                    await client.DisposeAsync();
                }
            }
        } 
    }
  5. Add entries to the map.

    • CLC

    • Java

    • NodeJS

    • Go

    • Python

    • .NET

    CLC

    Execute the following command to populate the map with entries, replacing <MAP-NAME> with the actual map name, map-sync.

    for i in {1..10};
    do
       clc -c hz-1 map set --name <MAP-NAME> key-$i value-$i;
    done

    Verify that the map has the expected number of entries.

    clc -c hz-1 map size --name <MAP-NAME>

    Java

    Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map-sync.

    cd java
    mvn package
    java -jar target/*jar-with-dependencies*.jar fill <MAP-NAME>

    You should see the following output.

    Successful connection!
    Starting to fill the map (<MAP-NAME>) with random entries.
    Current map size: 2
    Current map size: 3
    Current map size: 4
    ....
    ....

    NodeJS

    Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map-sync.

    cd nodejs
    npm install
    npm start fill <MAP-NAME>

    You should see the following output.

    Successful connection!
    Starting to fill the map (<MAP-NAME>) with random entries.
    Current map size: 2
    Current map size: 3
    Current map size: 4
    ....
    ....

    Go

    Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map-sync.

    cd go
    go run main.go fill <MAP-NAME>

    You should see the following output.

    Successful connection!
    Starting to fill the map (<MAP-NAME>) with random entries.
    Current map size: 2
    Current map size: 3
    Current map size: 4
    ....
    ....

    Python

    Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map-sync.

    cd python
    pip install -r requirements.txt
    python main.py fill <MAP-NAME>

    You should see the following output.

    Successful connection!
    Starting to fill the map (<MAP-NAME>) with random entries.
    Current map size: 2
    Current map size: 3
    Current map size: 4
    ....
    ....

    .NET

    Start the application to populate the map with entries, replacing <MAP-NAME> with the actual map name, map-sync.

    cd dotnet
    dotnet build
    dotnet run fill <MAP-NAME>

    You should see the following output.

    Successful connection!
    Starting to fill the map (<MAP-NAME>) with random entries.
    Current map size: 2
    Current map size: 3
    Current map size: 4
    ....
    ....
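
The steps above copy the cluster address by hand from the kubectl get hazelcastendpoint output. As a rough sketch of how that lookup could be scripted, the hypothetical helper below (endpoint_address is not part of kubectl or the operator) picks the ADDRESS column for a given endpoint TYPE. It assumes the three-column layout shown in the sample output.

```shell
# Hypothetical helper: reads `kubectl get hazelcastendpoint` output on stdin
# and prints the ADDRESS of the first row whose TYPE column equals $1.
endpoint_address() {
  awk -v type="$1" 'NR > 1 && $2 == type { print $3; exit }'
}

# Sample input copied from the output shown in this step.
sample='NAME                   TYPE        ADDRESS
hazelcast-first        Discovery   34.123.9.149:5701
hazelcast-first-wan    WAN         34.123.9.149:5710'

printf '%s\n' "$sample" | endpoint_address Discovery   # prints 34.123.9.149:5701
printf '%s\n' "$sample" | endpoint_address WAN         # prints 34.123.9.149:5710
```

With a live cluster, the kubectl command from this step can be piped into endpoint_address Discovery and the result passed as the cluster.address value of clc config add.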

Step 3. Enable WAN Replication and Replicate Entries

In this step, you’ll first verify that the second cluster does not contain a map-sync structure. You’ll then enable WAN Replication and verify that the map and all entries have been copied to the second cluster.

  1. Find the address of the second cluster.

    kubectl get hazelcastendpoint --selector="app.kubernetes.io/instance in (hazelcast-second)"
    NAME                   TYPE        ADDRESS
    hazelcast-second       Discovery   34.16.0.16:5701
    hazelcast-second-wan   WAN         34.16.0.16:5710

    The ADDRESS column displays the external address of the second Hazelcast cluster.

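  2. Repeat steps 2.3 to 2.4, using the address of the second cluster so that the client connects to it. If you use CLC, add the configuration under the name hz-2, which the commands in the next step assume.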

  3. Connect to the second cluster and verify that the map named map-sync contains no data.

    • CLC

    • Java

    • NodeJS

    • Go

    • Python

    • .NET

    CLC

    clc -c hz-2 map size --name <MAP-NAME>

    Java

    cd clients/java
    mvn package
    java -jar target/*jar-with-dependencies*.jar size <MAP-NAME>

    You should see the following output:

    Successful connection!
    Current map (<MAP-NAME>) size: 0

    NodeJS

    cd clients/nodejs
    npm install
    npm start size <MAP-NAME>

    You should see the following output:

    Successful connection!
    Current map (<MAP-NAME>) size: 0

    Go

    cd clients/go
    go run main.go size <MAP-NAME>

    You should see the following output:

    Successful connection!
    Current map (<MAP-NAME>) size: 0

    Python

    cd clients/python
    pip install -r requirements.txt
    python main.py size <MAP-NAME>

    You should see the following output:

    Successful connection!
    Current map (<MAP-NAME>) size: 0

    .NET

    cd clients/dotnet
    dotnet build
    dotnet run size <MAP-NAME>

    You should see the following output:

    Successful connection!
    Current map (<MAP-NAME>) size: 0
  4. Modify the configuration of the first cluster to add the address of the second cluster as the WAN Replication event target.

    If you need a Full WAN Sync, run the following command to apply the configuration.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: WanReplication
    metadata:
      name: wan-replication
    spec:
      resources:
        - name: hazelcast-first
          kind: Hazelcast
      targetClusterName: dev
      endpoints: "<SECOND-CLUSTER-EXTERNAL-IP>"
    EOF

    If you need a Delta WAN Sync, run the following command to apply the configuration.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: WanReplication
    metadata:
      name: wan-replication
    spec:
      resources:
        - name: hazelcast-first
          kind: Hazelcast
      targetClusterName: dev
      syncConsistencyCheckStrategy: "MERKLE_TREES"
      endpoints: "<SECOND-CLUSTER-EXTERNAL-IP>"
    EOF
  5. Create a WAN Sync resource within the Kubernetes cluster, using the existing WanReplication CR.

    kubectl apply -f - <<EOF
    apiVersion: hazelcast.com/v1alpha1
    kind: WanSync
    metadata:
      name: wan-sync
    spec:
      wanReplicationResourceName: wan-replication
    EOF

    WAN Sync ensures data consistency between the two Hazelcast clusters. Full WAN Sync transmits all data from the source cluster to the target cluster, aligning the state of the target IMap with the source IMap. This method is particularly beneficial when the synchronization between two remote clusters is lost due to WAN queue overflows or cluster restarts.

  6. Run the following command to see the WAN synchronization status:

    kubectl get wansync wan-sync

    The output should be similar to the following:

    NAME       STATUS
    wan-sync   Completed
  7. Repeat step 3.3 to verify that the map-sync structure on the hazelcast-second cluster now contains data.
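
Synchronization may take a while on larger maps, so the status check in step 6 often has to be repeated. The following is a minimal polling sketch under stated assumptions: both helper names are hypothetical, the two-column NAME, STATUS layout is taken from the sample output in step 6, and Completed is the only status value this tutorial confirms.

```shell
# Hypothetical helper: reads `kubectl get wansync` output on stdin and
# prints the STATUS of the resource named $1.
wansync_status() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $2; exit }'
}

# Hypothetical helper: polls until the WanSync named $1 reports "Completed",
# giving up after $2 attempts (default 30) spaced 5 seconds apart.
wait_for_wansync() {
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    status=$(kubectl get wansync "$1" | wansync_status "$1")
    [ "$status" = "Completed" ] && return 0
    attempts=$((attempts - 1))
    sleep 5
  done
  return 1
}

# Sample input copied from the output shown in step 6.
printf '%s\n' 'NAME       STATUS' 'wan-sync   Completed' | wansync_status wan-sync
# prints: Completed
```

For example, wait_for_wansync wan-sync returns success once the resource reports Completed, after which the size check from step 3.3 can be run against the second cluster.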

Clean Up

To remove all custom resources, run the following commands:

    kubectl delete secret hazelcast-license-key
    kubectl delete $(kubectl get wansync,wanreplications,map,hazelcast -o name)