Restart a cluster
Static configuration is configuration that cannot be changed while members are running. To apply updates to static configuration, you must restart the cluster.
Restart options
There are two options for restarting a cluster: a rolling restart and a whole cluster restart.
Rolling restart
Where possible, you should perform a rolling restart. In a rolling restart, members are stopped one at a time, and each member's data is automatically repartitioned across the remaining cluster members before that member shuts down. This minimizes the risk of service interruptions and data loss. When the replacement member joins the cluster, data is repartitioned again.
Depending on the scope of your configuration changes, you may need to update a single member, a cluster, or multiple clusters. The process is the same in all cases.
You can also use a rolling restart for canarying: testing changes on a single member before updating the rest of the cluster.
Whole cluster restart
Some configuration needs to be consistent across all members in a cluster. To update this configuration, you must restart the whole cluster simultaneously. By default, this causes all data in the cluster to be lost. To avoid data loss, you must enable persistence. To avoid downtime, you must divert traffic to another Hazelcast cluster before restarting. Hazelcast supports blue-green deployments to facilitate this.
Note that member processes are not reused. In all cases, members are terminated, and new members are created to replace them.
Persistence
If persistence is enabled and the cluster enters NO_MIGRATION or FROZEN state, replacement members will adopt the UUIDs of the terminated members and attempt to load persisted data from disk. This avoids the repartitioning process, helping the cluster recover more quickly during a rolling restart.
Persistence is required to avoid data loss during a whole cluster restart. See Configuring Persistence for details.
Perform a restart
This section describes how to restart a cluster using a choice of common tools. Hazelcast supports a wide range of deployment options that may require other tools or allow you to automate parts of the process, but the required steps will be similar.
You should perform restarts during a low-utilization period or maintenance window to reduce the risk of service impacts.
Deprecation Notice for the Community Edition REST API
The Community Edition REST API has been deprecated and will be removed as of Hazelcast version 7.0. An improved Enterprise version of this feature is available and actively being developed. For more information, see Enterprise REST API.
Perform a rolling restart
You should check that the cluster is healthy before and after making changes. The best way to do this is to run a Healthcheck in Management Center. For other options, see Monitoring.
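If Management Center is not available, one scripted alternative is to read a member's HTTP health endpoint and evaluate the result. The following is a minimal sketch, not a definitive implementation. It assumes that the REST health check endpoint is enabled on the member, that it is served at `/hazelcast/health`, and that it returns a JSON object with the fields `nodeState`, `clusterState`, `clusterSafe`, `migrationQueueSize`, and `clusterSize`. Verify the path and field names against the documentation for your Hazelcast version. The function names are illustrative.

```python
import json
from urllib.request import urlopen


def is_cluster_healthy(health: dict, expected_size: int) -> bool:
    """Return True if a parsed health payload describes a settled cluster.

    The field names are assumptions about the health endpoint's JSON payload.
    """
    return (
        health.get("nodeState") == "ACTIVE"
        and health.get("clusterState") == "ACTIVE"
        and health.get("clusterSafe") is True
        and health.get("migrationQueueSize") == 0
        and health.get("clusterSize") == expected_size
    )


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch and parse the health payload from one member.

    The /hazelcast/health path is an assumption; adjust it for your deployment.
    """
    with urlopen(f"{base_url}/hazelcast/health", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `is_cluster_healthy(fetch_health("http://127.0.0.1:5701"), expected_size=3)` would check a three-member cluster through a member at a hypothetical local address. The check is deliberately strict: a missing field counts as unhealthy.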
To perform a rolling restart:
-
Update the static configuration as required.
-
Gracefully shut down the first member.
-
Wait until all partition migrations are completed. You can monitor the progress of migrations on the cluster dashboard in Management Center.
-
Start a new member.
-
Wait until the new member joins the cluster and all partition migrations are completed. Again, you can monitor progress using the cluster dashboard in Management Center.
-
Repeat this process until all members have been restarted with the updated configuration.
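The rolling restart steps above can be sketched as a small orchestration loop. This is an illustration under stated assumptions rather than a Hazelcast API: `shut_down`, `start`, and `is_settled` are hypothetical hooks that you would implement with whatever tooling you use to stop members, start members, and check that the expected number of members is present with no partition migrations in progress.

```python
import time
from typing import Callable, Sequence


def wait_until(condition: Callable[[], bool], timeout_s: float, poll_s: float = 1.0) -> None:
    """Poll `condition` until it returns True, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError("cluster did not settle in time")
        time.sleep(poll_s)


def rolling_restart(
    members: Sequence[str],
    shut_down: Callable[[str], None],
    start: Callable[[str], None],
    is_settled: Callable[[int], bool],
    timeout_s: float = 600.0,
    poll_s: float = 1.0,
) -> None:
    """Restart members one at a time, waiting for the cluster to settle after each step.

    All three callables are hypothetical hooks supplied by the caller.
    `is_settled(n)` should return True once the cluster has `n` members
    and all partition migrations are completed.
    """
    total = len(members)
    for member in members:
        # Gracefully shut down one member, then wait for migrations to finish.
        shut_down(member)
        wait_until(lambda: is_settled(total - 1), timeout_s, poll_s)
        # Start the replacement with the updated static configuration,
        # then wait until it has joined and migrations have finished again.
        start(member)
        wait_until(lambda: is_settled(total), timeout_s, poll_s)
```

Waiting after both the shutdown and the start keeps at most one member out of the cluster at any time. If a wait times out, the loop stops with an error instead of moving on to the next member.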
Perform a whole cluster restart
You should check that the cluster is healthy before and after making changes. The best way to do this is to run a Healthcheck in Management Center. For other options, see Monitoring.
To perform a whole cluster restart:
-
Update the static configuration as required.
-
Divert traffic to another Hazelcast cluster. How to do this depends on how your Hazelcast deployment is architected. Management Center provides an easy way to manage client connections using filtering rules.
-
If you have enabled persistence, change the cluster state to NO_MIGRATION or FROZEN.
-
Gracefully shut down the cluster.
-
Check that all members are shut down. The cluster dashboard in Management Center shows which members are in SHUT_DOWN state. If all members are shut down, Management Center will disconnect from the cluster.
-
Recreate the cluster by starting new members with the updated configuration.
-
Check that all members are in ACTIVE state using the cluster dashboard in Management Center. If NO_MIGRATION or FROZEN was used for the restart, change the cluster state back to ACTIVE.
-
Confirm the cluster is healthy, for example by running a Healthcheck.
-
Restore traffic to the cluster.
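The ordering of the whole cluster restart steps above can be sketched as follows. This is a sketch under stated assumptions, not a Hazelcast API: `ClusterOps` and every method on it are hypothetical hooks that wrap whichever tool you use for each step. Updating the static configuration is assumed to have been done before the function is called.

```python
import time
from typing import Callable, Protocol


class ClusterOps(Protocol):
    """Hypothetical hooks; each one wraps the tool you use for that step."""

    def divert_traffic(self) -> None: ...
    def change_state(self, state: str) -> None: ...
    def shut_down_cluster(self) -> None: ...
    def all_members_down(self) -> bool: ...
    def recreate_cluster(self) -> None: ...
    def current_state(self) -> str: ...
    def is_healthy(self) -> bool: ...
    def restore_traffic(self) -> None: ...


def _wait(condition: Callable[[], bool], timeout_s: float, poll_s: float) -> None:
    """Poll `condition` until it returns True, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError("step did not complete in time")
        time.sleep(poll_s)


def whole_cluster_restart(
    ops: ClusterOps,
    persistence_enabled: bool,
    restart_state: str = "FROZEN",
    timeout_s: float = 600.0,
    poll_s: float = 1.0,
) -> None:
    """Run the whole cluster restart steps in order."""
    if restart_state not in ("NO_MIGRATION", "FROZEN"):
        raise ValueError("restart_state must be NO_MIGRATION or FROZEN")

    ops.divert_traffic()
    if persistence_enabled:
        # Lets replacement members reload persisted data instead of repartitioning.
        ops.change_state(restart_state)
    ops.shut_down_cluster()
    _wait(ops.all_members_down, timeout_s, poll_s)

    ops.recreate_cluster()
    if ops.current_state() != "ACTIVE":
        # Needed when NO_MIGRATION or FROZEN was used for the restart.
        ops.change_state("ACTIVE")
    _wait(ops.is_healthy, timeout_s, poll_s)
    ops.restore_traffic()
```

Traffic is restored only after the health check passes, so a timeout at any earlier step leaves clients pointed at the other cluster.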