Cluster Administration

On the Administration page, you can change the state of a cluster, shut it down, update the Management Center license, trigger a rolling upgrade, or handle clusters with persisted data.

The Administration menu item is available only to admin users.

Changing the Cluster State

Go to Administration > Cluster State and select a state from the dropdown menu.

See the Hazelcast documentation to learn more about cluster states.

Shutting Down the Cluster

Go to Administration > Cluster State and click Shutdown.

Triggering a Rolling Upgrade

  1. Upgrade the codebase of each Hazelcast member. Follow the steps described in the Hazelcast documentation.

  2. Go to Administration > Rolling Upgrade and click Upgrade Cluster.

When the cluster is upgraded, a success notification is displayed.

See the Hazelcast documentation to learn more about rolling upgrades.
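The order of the two steps matters: the cluster version is raised only after every member runs the new codebase. The following is a minimal sketch of that pre-check, assuming you can obtain each member's codebase version as a string. The function name and its inputs are hypothetical and are not part of Management Center.

```python
def ready_for_cluster_upgrade(member_versions, target_version):
    """Return True if every member's codebase matches the target major.minor.

    member_versions: iterable of version strings such as "5.3.1" (hypothetical input).
    target_version:  the cluster version to upgrade to, such as "5.3".
    """
    def major_minor(version):
        # Only the major and minor parts are compared; patch levels may differ.
        major, minor = version.split(".")[:2]
        return int(major), int(minor)

    versions = list(member_versions)
    target = major_minor(target_version)
    # An empty member list is treated as "not ready" to avoid a vacuous pass.
    return bool(versions) and all(major_minor(v) == target for v in versions)
```

For example, `ready_for_cluster_upgrade(["5.3.1", "5.2.4"], "5.3")` reports that one member still runs the old codebase, so step 2 should wait.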

Working with Persistence

In the Persistence tab, you can do the following:

  • Trigger a force start.

  • Trigger a partial start.

  • See the status of cluster members.

  • Create backups of data in the persistence store (hot backup).

See the Hazelcast documentation to learn more about these topics.

Triggering a Force Start

The restart process cannot complete if a member crashes permanently and cannot recover, either because it cannot start or because it fails to load its own data. In that case, you can force the cluster to delete its persisted data and make a fresh start. This process is called force start.

To trigger a force start, click Force Start and confirm that you want to continue by clicking Confirm in the dialog.

If everything goes well, a success message is displayed.

If an exception occurs, an error notification is displayed.

Triggering a Partial Start

Partial start allows a cluster to start with an incomplete set of members. Data belonging to the missing members is assumed lost, and Hazelcast tries to recover the missing data using the restored backups. For example, if you have at least two backups configured for all maps and caches, then a partial start with up to two missing members is safe against data loss. If more than two members are missing, or if there are maps or caches with fewer than two backups, then data loss is expected.
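The backup rule above reduces to comparing the number of missing members with the smallest backup count of any map or cache. The following sketch illustrates only that rule of thumb. The function name and the shape of `backup_counts` are hypothetical, and it does not model how partitions are actually placed.

```python
def partial_start_is_safe(backup_counts, missing_members):
    """Return True if no data loss is expected under the rule of thumb above.

    backup_counts:   dict mapping a map/cache name to its configured backup count
                     (hypothetical input, gathered from your own configuration).
    missing_members: number of members absent from the partial start.
    """
    if missing_members < 0:
        raise ValueError("missing_members must not be negative")
    if not backup_counts:
        # No data structures are configured, so there is nothing to lose.
        return True
    # The structure with the fewest backups decides how many members can be lost.
    return missing_members <= min(backup_counts.values())
```

With two backups everywhere, two missing members pass the check and three do not. A single map with one backup makes two missing members unsafe.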

To trigger a partial start, your cluster must be configured with one of the following cluster-data-recovery-policy options:

  • PARTIAL_RECOVERY_MOST_RECENT

  • PARTIAL_RECOVERY_MOST_COMPLETE

To perform a partial start on the cluster, click the Partial Start button. A notice dialog appears.

You can also see two fields related to the partial start operation: Remaining Data Load Time and Remaining Validation Time.

Remaining Data Load Time is a countdown from the value configured in the data-load-timeout-seconds configuration option.

Remaining Validation Time is a countdown from the value configured in the validation-timeout-seconds configuration option.
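Both fields follow the same arithmetic: the configured timeout minus the time elapsed since the phase began. The following is a minimal sketch of that calculation. The function name is hypothetical, the timeout values in the usage note are illustrative only, and clamping at zero is an assumption about how an expired countdown is shown.

```python
def remaining_seconds(timeout_seconds, elapsed_seconds):
    """Return the seconds left in a countdown that started at timeout_seconds.

    The result is clamped at zero (an assumption) so that an expired
    countdown never reports a negative value.
    """
    if timeout_seconds < 0 or elapsed_seconds < 0:
        raise ValueError("timeout and elapsed time must not be negative")
    return max(0, timeout_seconds - elapsed_seconds)
```

For instance, with an illustrative data-load-timeout-seconds of 900 and 240 seconds elapsed, the countdown stands at 660 seconds.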

Creating Backups of Persisted Data

You can take a snapshot of the persistence store at a certain point in time. This is useful when you want to bring up a new cluster with the same data or parts of the data. The new cluster can then be used to share load with the original cluster, for testing or quality assurance, or to reproduce an issue using the production data.

If the backup directory is configured, you can trigger the backup by clicking Hot Backup.

You can see the progress of the backup operation under the Last Hot Backup Task Status field.

Status Information

At the bottom of the Persistence tab, you can see the following statuses of cluster members:

  • Last Hot Backup Task Status

  • Persistence Status

CP Subsystem

CP subsystem management operations require the REST API to be enabled in the Hazelcast cluster. See the Hazelcast documentation for more information.

The CP Subsystem tab can be used to monitor the overall status of the CP subsystem in the current cluster and to perform certain management operations.

Status

The Status field shows a summary of the current CP subsystem status. It may have one of the following values:

  • CP Subsystem is not supported by this cluster: Shown for Hazelcast clusters running a version earlier than 3.12.

  • CP Subsystem is not enabled: Shown if CP subsystem is not enabled for the current cluster.

  • All CP members are accessible: Shown if there are at least as many accessible CP members as the configured CP member count.

  • CP Subsystem warning: one CP member is not accessible: Shown if there is one missing CP member and the minority count in the CP subsystem is greater than 1. For example, this value is shown when there are 6 accessible CP members and the configured count is 7. In this example, the minority count is 3 members and the majority count is 4 members.

  • CP Subsystem alert: multiple CP members are not accessible: Shown if there are multiple missing CP members, but their count is less than the minority.

  • CP Subsystem error: minority of the CP members are not accessible: Shown if the minority of CP members are missing.

  • CP Subsystem error: majority of the CP members are not accessible: Shown if the majority of CP members are missing.

The CP Members (Accessible/Configured) field shows the current count of accessible CP members and the configured CP member count.

You may promote additional members or remove inaccessible CP members, so the total count of members that participate in the CP subsystem may be greater or less than the configured CP member count. As the Status field considers the configured CP member count as the total CP member count, it should be treated only as a basic health indicator for the CP subsystem.
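The count-based status values listed above can be derived from the accessible and configured CP member counts. The following sketch is one reading of those rules, not the Management Center implementation. The function name is hypothetical, the majority is assumed to be `configured // 2 + 1` with the minority as the remainder (which matches the 7-member example: majority 4, minority 3), and any case with more missing members than the minority is assumed to report the majority error. The "not supported" and "not enabled" values are not modeled.

```python
def cp_status(accessible, configured):
    """Map CP member counts to one of the count-based status messages."""
    if configured < 1 or accessible < 0:
        raise ValueError("configured must be positive and accessible non-negative")

    majority = configured // 2 + 1   # assumption: smallest group larger than half
    minority = configured - majority
    missing = configured - accessible

    if missing <= 0:
        # Promoted members can push the accessible count above the configured count.
        return "All CP members are accessible"
    if missing > minority:
        # Assumption: the remaining members can no longer form a majority.
        return "CP Subsystem error: majority of the CP members are not accessible"
    if missing == minority:
        return "CP Subsystem error: minority of the CP members are not accessible"
    if missing == 1:
        # Reached only when the minority count is greater than 1.
        return "CP Subsystem warning: one CP member is not accessible"
    return "CP Subsystem alert: multiple CP members are not accessible"
```

With 6 of 7 members accessible, the sketch returns the warning. With 2 of 3 accessible, the minority count is 1, so the single missing member already yields the minority error.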

Promoting Members to CP Subsystem

To promote one of the AP members to become a CP member, click the Promote button. A confirmation dialog appears.

The dialog asks you to choose one of the AP members, that is, one of the members that do not participate in the CP subsystem. Lite members are not shown in the dropdown list because they do not store data. Once you press the Promote button, the CP subsystem starts the promote operation for the given member.
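The dropdown filtering described above amounts to excluding current CP members and lite members from the member list. The following is a small sketch of that selection. The `Member` type and the function name are hypothetical and only illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical view of a cluster member for this illustration."""
    address: str
    lite: bool = False


def promotion_candidates(members, cp_addresses):
    """Return the members eligible for promotion to the CP subsystem.

    A member is eligible if it is not a lite member and does not
    already participate in the CP subsystem.
    """
    cp = set(cp_addresses)
    return [m for m in members if not m.lite and m.address not in cp]
```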

Removing CP Members

To remove one of the inaccessible CP members from the CP subsystem, click the Remove button. A confirmation dialog appears.

The dialog asks you to choose one of the members that is not connected to the Management Center but is known by the cluster's CP subsystem. Once you press the Remove button, the CP subsystem starts the remove operation for the given member.
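The set of removable members is the difference between the CP members known to the CP subsystem and the members currently connected to the Management Center. The following is a minimal sketch of that selection. The function name and the address strings are hypothetical.

```python
def removable_cp_members(cp_addresses, connected_addresses):
    """Return CP members known to the CP subsystem but not connected.

    cp_addresses:        addresses the CP subsystem lists as CP members.
    connected_addresses: addresses of members the Management Center can reach.
    The order of cp_addresses is preserved in the result.
    """
    connected = set(connected_addresses)
    return [address for address in cp_addresses if address not in connected]
```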

Restarting the CP Subsystem

To wipe and restart the whole CP subsystem of the cluster, click the Restart button. A confirmation dialog appears.

Once you press the Restart button, the CP subsystem proceeds with the restart operation.

The CP subsystem restart operation is NOT idempotent, and multiple invocations can break the whole system! After using this dialog, you must observe the system to see whether the restart process completed successfully or failed before starting this operation again.
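If you script around this operation, one way to respect the warning is to refuse a second invocation until the outcome of the first has been observed. The following is a sketch of such a client-side guard. The `RestartGuard` class and the `invoke` callable are hypothetical and are not part of Management Center or Hazelcast.

```python
class RestartGuard:
    """Allow a new restart only after the previous outcome was recorded."""

    def __init__(self):
        self._pending = False

    def request_restart(self, invoke):
        """Call invoke() once, unless an earlier restart is still unobserved."""
        if self._pending:
            raise RuntimeError("previous restart outcome has not been observed yet")
        # Mark as pending before invoking: if invoke() raises, the outcome is
        # unknown, so the guard stays closed until an operator records it.
        self._pending = True
        invoke()

    def record_outcome(self, succeeded):
        """Record the observed result (success or failure) and reopen the guard."""
        self._pending = False
        return succeeded
```

The guard deliberately stays closed when the invocation itself fails, because an unknown outcome is exactly the case in which a repeated restart is risky.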