Restore a Cluster from Cloud Storage with Hazelcast Platform Operator

Learn how to back up data in Hazelcast maps to cloud storage and restore a cluster from that backup data.

Context

In this tutorial, you’ll do the following:

  • Deploy Hazelcast with persistence enabled.

  • Create a Hazelcast map that has persistence enabled.

  • Back up all map entries to external storage in the cloud.

  • Restart the Hazelcast cluster and restore the backup map entries from the cloud.

Before you Begin

Before starting this tutorial, make sure that you meet the following prerequisites:

Step 1. Start the Hazelcast Cluster

  1. Create a license secret

    Create a secret with your Hazelcast Enterprise License.

    kubectl create secret generic hazelcast-license-key --from-literal=license-key=<hz-license-key>
  2. Create the Hazelcast Cluster

    Run the following command to create the Hazelcast cluster with Persistence enabled using External type.

    cat <<EOF | kubectl apply -f -
    apiVersion: hazelcast.com/v1alpha1
    kind: Hazelcast
    metadata:
      name: my-hazelcast
    spec:
      clusterSize: 3
      repository: 'docker.io/hazelcast/hazelcast-enterprise'
      version: '5.1-slim'
      licenseKeySecret: hazelcast-license-key
      persistence:
        backupType: "External"
        baseDir: "/data/hot-restart/"
        clusterDataRecoveryPolicy: "FullRecoveryOnly"
        pvc:
          accessModes: ["ReadWriteOnce"]
          requestStorage: 8Gi
      agent:
        repository: hazelcast/platform-operator-agent
        version: 0.1.0
      exposeExternally:
        type: Smart
        discoveryServiceType: LoadBalancer
        memberAccess: NodePortExternalIP
    EOF
    The agent configuration is optional. If you do not pass the agent configuration, the operator directly use the latest stable version of the agent.
  3. Check the Cluster Status

    Run the following commands to see the cluster status

    $ kubectl get hazelcast my-hazelcast
    NAME           STATUS    MEMBERS   EXTERNAL-ADDRESSES
    my-hazelcast   Running   3/3       34.70.165.31:5701,34.70.165.31:8080
    $ kubectl get pods -l app.kubernetes.io/instance=my-hazelcast
    
    NAME             READY   STATUS    RESTARTS   AGE
    my-hazelcast-0   2/2     Running   0          3m43s
    my-hazelcast-1   2/2     Running   0          3m16s
    my-hazelcast-2   2/2     Running   0          2m50s

    As you can see from the pod states, when external backup is used, the Backup Agent container will be deployed with the Hazelcast container in the same Pod. The agent is responsible for backing data up into the external storage.

  4. Get the Address of the Hazelcast Cluster

    After verifying that the cluster is Running and all the members are ready, run the following command to find the discovery service address.

    $ kubectl get service my-hazelcast
    NAME           TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                         AGE
    my-hazelcast   LoadBalancer   10.168.25.113   34.70.165.31   5701:32653/TCP,8080:32162/TCP   69m

    The field EXTERNAL-IP is the address of your Hazelcast cluster.

Step 2. Create Persistent Map and Put Data

  1. Create Persistent Map

    Run the following command to create the Map resource with Persistence enabled.

    cat <<EOF | kubectl apply -f -
    apiVersion: hazelcast.com/v1alpha1
    kind: Map
    metadata:
      name: persistent-map
    spec:
      hazelcastResourceName: my-hazelcast
      persistenceEnabled: true
    EOF
  2. Configure the Hazelcast client to connect to the cluster external address.

    • Java

    • NodeJS

    • Go

    • Python

    ClientConfig config = new ClientConfig();
    config.getNetworkConfig().addAddress("<EXTERNAL-IP>");
    const { Client } = require('hazelcast-client');
    
    const clientConfig = {
        network: {
            clusterMembers: [
                '<EXTERNAL-IP>'
            ]
        }
    };
    const client = await Client.newHazelcastClient(clientConfig);
    import (
    	"log"
    
    	"github.com/hazelcast/hazelcast-go-client"
    )
    
    func main() {
    	config := hazelcast.Config{}
    	cc := &config.Cluster
    	cc.Network.SetAddresses("<EXTERNAL-IP>")
    	ctx := context.TODO()
    	client, err := hazelcast.StartNewClientWithConfig(ctx, config)
    	if err != nil {
    		panic(err)
    	}
    }
    import logging
    import hazelcast
    
    logging.basicConfig(level=logging.INFO)
    
    client = hazelcast.HazelcastClient(
        cluster_members=["<EXTERNAL-IP>"],
        use_public_ip=True,
    )

    Now you can start the application to fill the map.

    • Java

    • NodeJS

    • Go

    • Python

    cd clients/java
    mvn package
    java -jar target/*jar-with-dependencies*.jar fill
    cd clients/nodejs
    npm install
    npm start fill
    cd clients/go
    go run main.go fill
    cd clients/python
    pip install -r requirements.txt
    python main.py fill

    You should see the following output.

    Successful connection!
    Starting to fill the map with random entries.
    Current map size: 2
    Current map size: 3
    Current map size: 4
    Current map size: 5
    Current map size: 6
    Current map size: 7
    Current map size: 8
    Current map size: 9
    Current map size: 10

Step 3. Trigger External Backup

For triggering backup, you need bucketURI where backup data will be stored in and secret with credentials for accessing the given Bucket URI.

  1. Create Secret

    Run one of the following command to create the secret according to the cloud provider you want to backup.

    • AWS

    • GCP

    • Azure

    kubectl create secret generic <secret-name> --from-literal=region=<region> \
    	--from-literal=access-key-id=<access-key-id> \
    	--from-literal=secret-access-key=<secret-access-key>
    kubectl create secret generic <secret-name> --from-file=google-credentials-path=<service_account_json_file>
    kubectl create secret generic <secret-name> \
    	--from-literal=storage-account=<storage-account> \
    	--from-literal=storage-key=<storage-key>
  2. Trigger Backup

    Run the following command to trigger backup

    cat <<EOF | kubectl apply -f -
    apiVersion: hazelcast.com/v1alpha1
    kind: HotBackup
    metadata:
      name: hot-backup
    spec:
      hazelcastResourceName: my-hazelcast
      bucketURI: "<bucketURI>"
      secret: <secret-name>
    EOF
  3. Check the Status of the Backup

    Run the following command to check the status of the backup

    kubectl get hotbackup hot-backup

    The status of the backup is displayed in the output.

    NAME         STATUS
    hot-backup   Success

Step 4. Restore from External Backup

  1. Delete the Hazelcast Cluster

    Run the following command to delete the Hazelcast cluster

    kubectl delete hazelcast my-hazelcast
  2. Create new Hazelcast Cluster

    For restoring you will use the secret that you already created. Also you should pass the bucketURI with exact path of the backup

    Example URI → "s3://operator-backup?prefix=hazelcast/2022-06-08-17-01-20/"

    Run the following command to create the Hazelcast cluster. Before the Hazelcast cluster is started, the operator starts the Restore Agent(InitContainer) which restores the backup data.

    cat <<EOF | kubectl apply -f -
    apiVersion: hazelcast.com/v1alpha1
    kind: Hazelcast
    metadata:
      name: my-hazelcast
    spec:
      clusterSize: 3
      repository: 'docker.io/hazelcast/hazelcast-enterprise'
      version: '5.1-slim'
      licenseKeySecret: hazelcast-license-key
      persistence:
        baseDir: "/data/hot-restart/"
        clusterDataRecoveryPolicy: "FullRecoveryOnly"
        pvc:
          accessModes: ["ReadWriteOnce"]
          requestStorage: 8Gi
        restore:
          bucketURI: "<bucketURI>"
          secret: <secret-name>
      exposeExternally:
        type: Smart
        discoveryServiceType: LoadBalancer
        memberAccess: NodePortExternalIP
    EOF

    As you may see, the agent configuration is not set. Thus, the operator directly use the latest stable version of the agent.

  3. Check the Cluster Status

    Run the following commands to see the cluster status

    $ kubectl get hazelcast my-hazelcast
    NAME           STATUS    MEMBERS   EXTERNAL-ADDRESSES
    my-hazelcast   Running   3/3       34.70.165.31:5701,34.70.165.31:8080

    Since we recreate the Hazelcast cluster, services are also recreated. The EXTERNAL-IP may change.

    After verifying that the cluster is Running and all the members are ready, run the following command to find the discovery service address.

    $ kubectl get service my-hazelcast
    NAME           TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                         AGE
    my-hazelcast   LoadBalancer   10.168.25.113   34.70.165.31   5701:32653/TCP,8080:32162/TCP   69m

    The field EXTERNAL-IP is the address of your Hazelcast cluster.

  4. Check Map Size

    Configure the Hazelcast client to connect to the cluster external address as you did in Configure the Hazelcast Client.

    Now you can start the application to check the map size and see if the restore is successful.

    • Java

    • NodeJS

    • Go

    • Python

    cd clients/java
    mvn package
    java -jar target/*jar-with-dependencies*.jar size
    cd clients/nodejs
    npm install
    npm start size
    cd clients/go
    go run main.go size
    cd clients/python
    pip install -r requirements.txt
    python main.py size

    You should see the following output.

    Successful connection!
    Current map size: 12

Clean Up

To clean up the created resources remove the all Custom Resources and PVCs.

kubectl delete secret <secret-name>
kubectl delete secret hazelcast-license-key
kubectl delete $(kubectl get hazelcast,hotbackup,map -o name)
kubectl delete pvc -l "app.kubernetes.io/managed-by=hazelcast-platform-operator"